14th Aug 2020 12 minutes read

What Is the Difference Between WHERE and HAVING Clauses in SQL?

As the amount of data in the world grows day by day, the need increases for managing and querying data efficiently. SQL offers ways to manipulate and query data at any scale. As an important part of this, the WHERE and HAVING clauses allow you to filter data as you need, but they are not the same. We will explain the differences between them in detail.

One of the most common tasks in manipulating and querying data is filtering using the WHERE and HAVING clauses. Although they both filter data based on user-defined condition(s) and/or function(s), there are important differences in how each clause executes.

This article explains in detail when to use WHERE or HAVING. We will briefly explain the logical process flow of an SQL query that includes WHERE and/or HAVING clauses, followed by some practice scenarios and examples for us to solve together.

Logical Process Flow for Queries With WHERE and HAVING

The order in which you write an SQL query differs from the order of execution. Let’s look at the following as an example:

SELECT country_id, city, MIN(salary), MAX(salary) 
FROM hr.emp_details_view
WHERE country_id IN ('US', 'UK')
GROUP BY country_id, city
HAVING MIN(salary) < 15000
ORDER BY country_id;

The logical process flow of an SQL query is diagrammed below.

Figure 1: Order of execution in SQL queries.

The logical processing of an SQL query starts with a FROM statement, which gathers data from the tables listed in the query. It also may contain a JOIN clause that combines two or more tables using the ON operator.

Note that the conditions used in the WHERE clause can also be used in the ON operator of the JOIN clause. Also, the tables combined using conditions in the ON operator of the JOIN clause can be used in the WHERE clause. However, you should use a WHERE only for conditions that filter individual rows. It improves the readability of the query and allows the ON operator to handle just the combining of the tables, which is its purpose.
The next step in the logical process flow is the WHERE clause, which filters the records to the ones that meet the user-defined condition(s) and/or function(s) and pass them on.
The WHERE clause is followed by the GROUP BY clause, which groups the records received from the WHERE condition. For example, the column(s) listed in GROUP BY clause may be the teams, and other columns are aggregated by the aggregate function(s) and assigned to corresponding group(s)/team(s).
Next is the HAVING clause, which filters the groups created in GROUP BY rather than individual records.
At this point, the logical processing goes to the SELECT command. It evaluates which columns are going to be sent to the output. It also evaluates any keywords such as UNIQUE, DISTINCT, and TOP, if included.
The ORDER BY clause is executed at the end of the logical process flow. It sorts the data by the column(s) specified in it and in ascending order by default.

First, the Essentials: Specifying Conditions in WHERE and HAVING

There is a wide range of operators that help us create and combine conditions in WHERE and HAVING clauses:

comparison signs: <, >, <=, >=, =, <>
null value testing: IS NULL, IS NOT NULL
wildcards: LIKE, '%=_'
range testing: BETWEEN
testing presence in a set: IN
existence testing: EXISTS
quantified comparison signs: ALL, ANY, SOME
logical combination of the conditions: AND, OR, NOT

The operands in a WHERE clause can include columns listed in SELECT, except for the columns used in aggregate functions. In contrast, HAVING filters by resulting values of the aggregate function such as SUM(), COUNT(), and AVG(), among others.

Examples

The tables employee and emp_details_view, shown below partially, are used throughout the upcoming examples.

SELECT * 
FROM hr.employees;

EMPLOYEE_ID	LAST_NAME	MANAGER_ID	HIRE_DATE	SALARY	JOB_ID
101	Kochhar	100	21-SEP-05	17000	AD_VP
102	De Haan	100	13-JAN-01	17000	AD_VP
108	Greenberg	101	17-AUG-02	12008	FI_MGR
103	Hunold	102	03-JAN-06	9000	IT_PROG
105	Austin	103	25-JUN-05	4800	IT_PROG
116	Baida	114	24-DEC-05	2900	PU_CLERK
117	Tobias	114	24-JUL-05	2800	PU_CLERK
145	Russell	100	01-OCT-04	14000	SA_MAN
148	Cambrault	100	15-OCT-07	11000	SA_MAN

SELECT * 
FROM hr.emp_details_view;

EMPLOYEE_ID	LAST_NAME	FIRST_NAME	SALARY	DEPARTMENT_ID	COUNTRY_ID	CITY	LOCATION_ID
103	Hunold	Alexander	9000	60	US	Southlake	1400
105	Austin	David	4800	60	US	Southlake	1400
101	Kochhar	Neena	17000	90	US	Seattle	1700
102	De Haan	Lex	17000	90	US	Seattle	1700
108	Greenberg	Nancy	12008	100	US	Seattle	1700
116	Baida	Shelli	2900	30	US	Seattle	1700
117	Tobias	Sigal	2800	30	US	Seattle	1700
145	Russell	John	14000	80	UK	Oxford	2500
148	Cambrault	Gerald	11000	80	UK	Oxford	2500

Scenario I: The WHERE Clause

The WHERE clause is used to filter individual rows of data fetched from the database by SELECT. The records not meeting the WHERE condition(s) are not included in the output.

Let’s look at an example:

where age > 25 AND city='New York'

It passes only the records whose age is greater than 25 and whose city is New York.

In the query syntax, the WHERE clause is used directly after SELECT and FROM. It filters individual rows—the raw, ungrouped records— fetched by the query, to determine which records should be passed on to the GROUP BY clause.

Let’s examine the output of the query below which uses the WHERE clause.

SELECT last_name, hire_date, manager_id 
FROM hr.employees
WHERE last_name LIKE '%a%' 
AND (hire_date BETWEEN '01-JAN-05' AND '01-JAN-07') 
AND manager_id IS NOT NULL;

LAST_NAME	HIRE_DATE	MANAGER_ID
Baida	24-DEC-05	114
Kochhar	21-SEP-05	100
Tobias	24-JUL-05	114

The example above outputs the columns last_name, hire_date, and manager_id from the table hr.employee. The WHERE clause filters the output as follows:

last_name must include a letter “a” in it;
hire_date must be greater than or equal to 01-JAN-05 and less than or equal to 01-JAN-07; and
manager_id must have a value and not be null.

Only the records meeting the WHERE conditions are presented in the output.

Let’s look at another example from the same scenario:

SELECT employee_id, salary, job_id 
FROM hr.employees
WHERE (salary < 3000 OR salary = 9000) 
AND job_id IN ('IT_PROG', 'FI_MGR', 'PU_CLERK');

EMPLOYEE_ID	SALARY	JOB_ID
103	9000	IT_PROG
116	2900	PU_CLERK
117	2800	PU_CLERK

This example outputs the columns employee_id, salary, and job_id from the table hr.employee. The WHERE clause filters the output as follows:

salary must be less than 3000, or it must be equal to 9000; and
job_id must be equal to any value listed in the IN operator.

As in the previous query, the output contains only the records that meet the WHERE condition.

Keep in mind that WHERE cannot contain conditions that include aggregate functions. That is the job for HAVING.

Scenario II: The HAVING Clause

HAVING is used to filter groups of records created by the GROUP BY clause. For this reason, the HAVING clause must follow the GROUP BY clause. It is similar to the WHERE clause that filters the SELECT output, only that WHERE filters individual records while HAVING filters groups.

Aggregate functions such as SUM(), MAX(), MIN(), AVG(), and COUNT() are the focus of the HAVING clause.

Let’s see it in action in the examples.

SELECT SUM(salary), department_id 
FROM hr.emp_details_view
GROUP BY department_id
HAVING SUM(salary) > 10000;

SUM(SALARY)	DEPARTMENT_ID
34000	90
13800	60
12008	100
25000	80

The example above lists the departments along with the sum of all salaries in each department. Only the department_ids whose total salaries are greater than 10000 are listed in the output. Those not meeting the condition set by the HAVING clause are filtered out.

Let’s look at another example:

SELECT COUNT(employee_id), job_id, salary
FROM hr.employees
WHERE salary > 12000
GROUP BY job_id, salary
HAVING COUNT(employee_id) < 10;

COUNT(EMPLOYEE_ID)	JOB_ID	SALARY
1	SA_MAN	14000
2	AD_VP	17000
1	FI_MGR	12008

This query lists the number of employees for each combination of job_id and salary. The WHERE clause filters the records to those with salaries greater than 12000. The GROUP BY clause, which follows a WHERE, specifies grouping by non-aggregated columns job_id and salary. Finally, the HAVING clause specifies the aggregated value COUNT(employee_id) to be less than 10.

Scenario III: The GROUP BY Clause

WHERE With a GROUP BY Clause

The WHERE clause must always be placed before GROUP BY. This is because WHERE filters individual rows of data, not groups of rows. The GROUP BY clause takes the individual rows from the result of the row-level filter WHERE to create groups of rows.

Here is an example:

SELECT job_id, SUM(salary) 
FROM hr.employees
WHERE manager_id IN (100, 101, 102, 103)
GROUP BY job_id;

JOB_ID	SUM(SALARY)
AD_VP	34000
FI_MGR	12008
IT_PROG	13800
SA_MAN	25000

The query above is evaluated as follows:

The WHERE clause first filters out the records with manager_id that are not found in the list following the IN operator.
The GROUP BY clause then groups by job_id the records that passed the WHERE condition.

The query calculates the total salaries of employees managed by each of the specified managers (manager_ids 100, 101, 102, 103). The salaries of the employees reporting to other managers are not included in computing this total.

HAVING With a GROUP BY Clause

The GROUP BY clause is often used with aggregate functions. It creates summary values for the columns listed in GROUP BY.

In contrast, HAVING always follows a GROUP BY clause, since HAVING works with the groups created by the GROUP BY.

Let’s consider a case using HAVING in which the GROUP BY cannot be omitted. This is true if there are columns listed in SELECT that are not used by the aggregate functions, such as department_id in the example below. These non-aggregated columns must be listed in GROUP BY for grouping data.

SELECT avg(salary), department_id 
FROM hr.emp_details_view
GROUP BY department_id
HAVING avg(salary) < 15000;

AVG(SALARY)	DEPARTMENT_ID
2850	30
6900	60
12008	100
12500	80

However, HAVING can be used without an accompanying GROUP BY. Note that in this case, HAVING is applied to the entire SELECT output, treating it as a single group. The below is an example:

SELECT round(avg(salary))
FROM hr.emp_details_view
HAVING avg(salary) < 11000;

ROUND(AVG(SALARY))
10056

The query returns a single value containing the average of all salaries. Note that the HAVING clause puts a limit to this aggregated value. Had the calculated average been greater than 11000, the query would have returned no records.

The next is an example of pseudo-aggregation, in which aggregate functions are used in place of a GROUP BY:

SELECT MIN(first_name), MIN(department_id), MAX(salary)
FROM hr.emp_details_view
HAVING MIN(salary) > 1000;

MIN(FIRST_NAME)	MIN(DEPARTMENT_ID)	MAX(SALARY)
Alexander	30	17000

The MIN() function is used here on columns first_name and department_id. It is not being called in a typical usage of aggregate functions, but rather for the purpose of avoiding a GROUP BY. In this case, the output is just a single record of data that consists of the minimum value for first_name, the minimum value for department_id, and maximum value for salary, each of the entire table. Note that these 3 values might be coming from 3 different records as it is in this example. Furthermore, in this case, had the minimum salary for the entire table been 1000 or less, the query would not return any record.

This particular example is a bit artificial so that we can show you something simpler. While less common than GROUP BY, pseudo-aggregation may be used in queries that treat all included rows as one group.

The usage for HAVING with or without a GROUP BY may vary by database. If you want to learn more about the GROUP BY clause, please visit the article GROUP BY Clause.

Scenario IV: Using Both WHERE and HAVING Clauses Together

The syntax for using WHERE and HAVING in one query follows the specific order presented below:

SELECT: selects the columns from the database to output to the result.
FROM: lists the tables to be used in the query.
WHERE: filters individual records.
GROUP BY: groups the records based on the column(s) specified.
HAVING: filters the groups defined by GROUP BY.
ORDER BY: sorts the output records by the column(s) specified.

Let’s look at an example that uses all of the above.

SELECT country_id, city, MIN(salary), MAX(salary) 
FROM hr.emp_details_view
WHERE country_id IN ('US', 'UK')
GROUP BY country_id, city
HAVING MIN(salary) < 15000
ORDER BY country_id;

COUNTRY_ID	CITY	MIN(SALARY)	MAX(SALARY)
UK	Oxford	11000	14000
US	Seattle	2800	17000
US	Southlake	4800	9000

The query fetches the minimum salary and maximum salary for each combination of city and country_id, the latter limited to the US and the UK only by the WHERE condition.

Suppose you have 10 records with the US as the country_id, and within it, there are 5 records with New York as the city and another 5 with Los Angeles as the city. When you run the query above, these 10 records become 2 records:

one record for the US and New York, with its minimum and maximum salaries,
another record for the US and Los Angeles, also with its minimum and maximum salaries.

For more information on the differences between WHERE and HAVING, please visit the article on SQL HAVING vs. WHERE.

Scenario V: WHERE and HAVING With JOINs

The following example takes it a step further to use a JOIN with both WHERE and HAVING. It joins the data from two tables based on a common column, employee_id. The WHERE and HAVING clauses follow the order of execution we just discussed.

SELECT e.job_id, edv.location_id, SUM(e.salary)
FROM hr.employees e
JOIN hr.emp_details_view edv
ON e.employee_id=edv.employee_id
WHERE e.job_id IN ('IT_PROG', 'SA_MAN')
GROUP BY e.job_id, edv.location_id
HAVING SUM(e.salary) > 5000
ORDER BY e.job_id;

JOB_ID	LOCATION_ID	SUM(E.SALARY)
IT_PROG	1400	13800
SA_MAN	2500	25000

This example selects the column job_id from the table hr.employee, the column location_id from the table hr.emp_details_view, and the total salaries for each combination of job_id and location_id. The condition for joining the tables together is listed in the ON operator. This condition is based on the common column, employee_id.

Next, the WHERE clause filters the records to only job_ids that are either IT_PROG or SA_MAN. The GROUP BY clause groups records by the combination of the columns job_id and location_id. Finally, the HAVING clause filters the aggregated value SUM(e.salary) to those greater than 5000.

The ORDER BY clause sorts the output records (in ascending order by default) by the column job_id. For more discussion on the GROUP BY and ORDER BY clauses, visit the article on GROUP BY vs. ORDER BY.

Difference Between WHERE and HAVING Clauses in SQL

Here is a summary at a glance of the differences between WHERE and HAVING clauses in SQL:

	`WHERE` clause	`HAVING` clause
Filtering	Filters individual rows fetched by `SELECT`, i.e., the output of the `SELECT` command	Filters groups of rows created by `GROUP BY`, i.e., the output of the `GROUP BY` clause
Conditions	Cannot have conditions containing aggregate functions	Focuses on conditions containing aggregate functions
Syntax	`WHERE` comes before `GROUP BY` in the syntax	`HAVING` comes after `GROUP BY` in the syntax
Order of evaluation	`WHERE` clause is evaluated before `GROUP BY` in the process flow, before aggregation is performed	`HAVING` clause is evaluated after `GROUP BY` in the process flow, after aggregation is performed
With/without GROUP BY	Can be used with or without `GROUP BY`, since `WHERE` and `GROUP BY` are not related	Should be used with `GROUP BY`; otherwise, the output of `SELECT` is treated as one group

WHERE and HAVING are a big part of learning SQL. Check out the SQL Basics course to help develop your SQL skills.

Tags: