14th Aug 2020 12 minutes read What Is the Difference Between WHERE and HAVING Clauses in SQL? Martyna Sławińska sql learn sql GROUP BY As the amount of data in the world grows day by day, the need increases for managing and querying data efficiently. SQL offers ways to manipulate and query data at any scale. As an important part of this, the WHERE and HAVING clauses allow you to filter data as you need, but they are not the same. We will explain the differences between them in detail. One of the most common tasks in manipulating and querying data is filtering using the WHERE and HAVING clauses. Although they both filter data based on user-defined condition(s) and/or function(s), there are important differences in how each clause executes. This article explains in detail when to use WHERE or HAVING. We will briefly explain the logical process flow of an SQL query that includes WHERE and/or HAVING clauses, followed by some practice scenarios and examples for us to solve together. Logical Process Flow for Queries With WHERE and HAVING The order in which you write an SQL query differs from the order of execution. Let’s look at the following as an example: SELECT country_id, city, MIN(salary), MAX(salary) FROM hr.emp_details_view WHERE country_id IN ('US', 'UK') GROUP BY country_id, city HAVING MIN(salary) < 15000 ORDER BY country_id; The logical process flow of an SQL query is diagrammed below. Figure 1: Order of execution in SQL queries. The logical processing of an SQL query starts with a FROM statement, which gathers data from the tables listed in the query. It also may contain a JOIN clause that combines two or more tables using the ON operator. Note that the conditions used in the WHERE clause can also be used in the ON operator of the JOIN clause. Also, the tables combined using conditions in the ON operator of the JOIN clause can be used in the WHERE clause. However, you should use a WHERE only for conditions that filter individual rows. It improves the readability of the query and allows the ON operator to handle just the combining of the tables, which is its purpose. The next step in the logical process flow is the WHERE clause, which filters the records to the ones that meet the user-defined condition(s) and/or function(s) and pass them on. The WHERE clause is followed by the GROUP BY clause, which groups the records received from the WHERE condition. For example, the column(s) listed in GROUP BY clause may be the teams, and other columns are aggregated by the aggregate function(s) and assigned to corresponding group(s)/team(s). Next is the HAVING clause, which filters the groups created in GROUP BY rather than individual records. At this point, the logical processing goes to the SELECT command. It evaluates which columns are going to be sent to the output. It also evaluates any keywords such as UNIQUE, DISTINCT, and TOP, if included. The ORDER BY clause is executed at the end of the logical process flow. It sorts the data by the column(s) specified in it and in ascending order by default. First, the Essentials: Specifying Conditions in WHERE and HAVING There is a wide range of operators that help us create and combine conditions in WHERE and HAVING clauses: comparison signs: <, >, <=, >=, =, <> null value testing: IS NULL, IS NOT NULL wildcards: LIKE, '%=_' range testing: BETWEEN testing presence in a set: IN existence testing: EXISTS quantified comparison signs: ALL, ANY, SOME logical combination of the conditions: AND, OR, NOT The operands in a WHERE clause can include columns listed in SELECT, except for the columns used in aggregate functions. In contrast, HAVING filters by resulting values of the aggregate function such as SUM(), COUNT(), and AVG(), among others. Examples The tables employee and emp_details_view, shown below partially, are used throughout the upcoming examples. SELECT * FROM hr.employees; EMPLOYEE_IDLAST_NAMEMANAGER_IDHIRE_DATESALARYJOB_ID 101Kochhar10021-SEP-0517000AD_VP 102De Haan10013-JAN-0117000AD_VP 108Greenberg10117-AUG-0212008FI_MGR 103Hunold10203-JAN-069000IT_PROG 105Austin10325-JUN-054800IT_PROG 116Baida11424-DEC-052900PU_CLERK 117Tobias11424-JUL-052800PU_CLERK 145Russell10001-OCT-0414000SA_MAN 148Cambrault10015-OCT-0711000SA_MAN SELECT * FROM hr.emp_details_view; EMPLOYEE_IDLAST_NAMEFIRST_NAMESALARYDEPARTMENT_IDCOUNTRY_IDCITYLOCATION_ID 103HunoldAlexander900060USSouthlake1400 105AustinDavid480060USSouthlake1400 101KochharNeena1700090USSeattle1700 102De HaanLex1700090USSeattle1700 108GreenbergNancy12008100USSeattle1700 116BaidaShelli290030USSeattle1700 117TobiasSigal280030USSeattle1700 145RussellJohn1400080UKOxford2500 148CambraultGerald1100080UKOxford2500 Scenario I: The WHERE Clause The WHERE clause is used to filter individual rows of data fetched from the database by SELECT. The records not meeting the WHERE condition(s) are not included in the output. Let’s look at an example: where age > 25 AND city='New York' It passes only the records whose age is greater than 25 and whose city is New York. In the query syntax, the WHERE clause is used directly after SELECT and FROM. It filters individual rows—the raw, ungrouped records— fetched by the query, to determine which records should be passed on to the GROUP BY clause. Let’s examine the output of the query below which uses the WHERE clause. SELECT last_name, hire_date, manager_id FROM hr.employees WHERE last_name LIKE '%a%' AND (hire_date BETWEEN '01-JAN-05' AND '01-JAN-07') AND manager_id IS NOT NULL; LAST_NAMEHIRE_DATEMANAGER_ID Baida24-DEC-05114 Kochhar21-SEP-05100 Tobias24-JUL-05114 The example above outputs the columns last_name, hire_date, and manager_id from the table hr.employee. The WHERE clause filters the output as follows: last_name must include a letter “a” in it; hire_date must be greater than or equal to 01-JAN-05 and less than or equal to 01-JAN-07; and manager_id must have a value and not be null. Only the records meeting the WHERE conditions are presented in the output. Let’s look at another example from the same scenario: SELECT employee_id, salary, job_id FROM hr.employees WHERE (salary < 3000 OR salary = 9000) AND job_id IN ('IT_PROG', 'FI_MGR', 'PU_CLERK'); EMPLOYEE_IDSALARYJOB_ID 1039000IT_PROG 1162900PU_CLERK 1172800PU_CLERK This example outputs the columns employee_id, salary, and job_id from the table hr.employee. The WHERE clause filters the output as follows: salary must be less than 3000, or it must be equal to 9000; and job_id must be equal to any value listed in the IN operator. As in the previous query, the output contains only the records that meet the WHERE condition. Keep in mind that WHERE cannot contain conditions that include aggregate functions. That is the job for HAVING. Scenario II: The HAVING Clause HAVING is used to filter groups of records created by the GROUP BY clause. For this reason, the HAVING clause must follow the GROUP BY clause. It is similar to the WHERE clause that filters the SELECT output, only that WHERE filters individual records while HAVING filters groups. Aggregate functions such as SUM(), MAX(), MIN(), AVG(), and COUNT() are the focus of the HAVING clause. Let’s see it in action in the examples. SELECT SUM(salary), department_id FROM hr.emp_details_view GROUP BY department_id HAVING SUM(salary) > 10000; SUM(SALARY)DEPARTMENT_ID 3400090 1380060 12008100 2500080 The example above lists the departments along with the sum of all salaries in each department. Only the department_ids whose total salaries are greater than 10000 are listed in the output. Those not meeting the condition set by the HAVING clause are filtered out. Let’s look at another example: SELECT COUNT(employee_id), job_id, salary FROM hr.employees WHERE salary > 12000 GROUP BY job_id, salary HAVING COUNT(employee_id) < 10; COUNT(EMPLOYEE_ID)JOB_IDSALARY 1SA_MAN14000 2AD_VP17000 1FI_MGR12008 This query lists the number of employees for each combination of job_id and salary. The WHERE clause filters the records to those with salaries greater than 12000. The GROUP BY clause, which follows a WHERE, specifies grouping by non-aggregated columns job_id and salary. Finally, the HAVING clause specifies the aggregated value COUNT(employee_id) to be less than 10. Scenario III: The GROUP BY Clause WHERE With a GROUP BY Clause The WHERE clause must always be placed before GROUP BY. This is because WHERE filters individual rows of data, not groups of rows. The GROUP BY clause takes the individual rows from the result of the row-level filter WHERE to create groups of rows. Here is an example: SELECT job_id, SUM(salary) FROM hr.employees WHERE manager_id IN (100, 101, 102, 103) GROUP BY job_id; JOB_IDSUM(SALARY) AD_VP34000 FI_MGR12008 IT_PROG13800 SA_MAN25000 The query above is evaluated as follows: The WHERE clause first filters out the records with manager_id that are not found in the list following the IN operator. The GROUP BY clause then groups by job_id the records that passed the WHERE condition. The query calculates the total salaries of employees managed by each of the specified managers (manager_ids 100, 101, 102, 103). The salaries of the employees reporting to other managers are not included in computing this total. HAVING With a GROUP BY Clause The GROUP BY clause is often used with aggregate functions. It creates summary values for the columns listed in GROUP BY. In contrast, HAVING always follows a GROUP BY clause, since HAVING works with the groups created by the GROUP BY. Let’s consider a case using HAVING in which the GROUP BY cannot be omitted. This is true if there are columns listed in SELECT that are not used by the aggregate functions, such as department_id in the example below. These non-aggregated columns must be listed in GROUP BY for grouping data. SELECT avg(salary), department_id FROM hr.emp_details_view GROUP BY department_id HAVING avg(salary) < 15000; AVG(SALARY)DEPARTMENT_ID 285030 690060 12008100 1250080 However, HAVING can be used without an accompanying GROUP BY. Note that in this case, HAVING is applied to the entire SELECT output, treating it as a single group. The below is an example: SELECT round(avg(salary)) FROM hr.emp_details_view HAVING avg(salary) < 11000; ROUND(AVG(SALARY)) 10056 The query returns a single value containing the average of all salaries. Note that the HAVING clause puts a limit to this aggregated value. Had the calculated average been greater than 11000, the query would have returned no records. The next is an example of pseudo-aggregation, in which aggregate functions are used in place of a GROUP BY: SELECT MIN(first_name), MIN(department_id), MAX(salary) FROM hr.emp_details_view HAVING MIN(salary) > 1000; MIN(FIRST_NAME)MIN(DEPARTMENT_ID)MAX(SALARY) Alexander3017000 The MIN() function is used here on columns first_name and department_id. It is not being called in a typical usage of aggregate functions, but rather for the purpose of avoiding a GROUP BY. In this case, the output is just a single record of data that consists of the minimum value for first_name, the minimum value for department_id, and maximum value for salary, each of the entire table. Note that these 3 values might be coming from 3 different records as it is in this example. Furthermore, in this case, had the minimum salary for the entire table been 1000 or less, the query would not return any record. This particular example is a bit artificial so that we can show you something simpler. While less common than GROUP BY, pseudo-aggregation may be used in queries that treat all included rows as one group. The usage for HAVING with or without a GROUP BY may vary by database. If you want to learn more about the GROUP BY clause, please visit the article GROUP BY Clause. Scenario IV: Using Both WHERE and HAVING Clauses Together The syntax for using WHERE and HAVING in one query follows the specific order presented below: SELECT: selects the columns from the database to output to the result. FROM: lists the tables to be used in the query. WHERE: filters individual records. GROUP BY: groups the records based on the column(s) specified. HAVING: filters the groups defined by GROUP BY. ORDER BY: sorts the output records by the column(s) specified. Let’s look at an example that uses all of the above. SELECT country_id, city, MIN(salary), MAX(salary) FROM hr.emp_details_view WHERE country_id IN ('US', 'UK') GROUP BY country_id, city HAVING MIN(salary) < 15000 ORDER BY country_id; COUNTRY_IDCITYMIN(SALARY)MAX(SALARY) UKOxford1100014000 USSeattle280017000 USSouthlake48009000 The query fetches the minimum salary and maximum salary for each combination of city and country_id, the latter limited to the US and the UK only by the WHERE condition. Suppose you have 10 records with the US as the country_id, and within it, there are 5 records with New York as the city and another 5 with Los Angeles as the city. When you run the query above, these 10 records become 2 records: one record for the US and New York, with its minimum and maximum salaries, another record for the US and Los Angeles, also with its minimum and maximum salaries. For more information on the differences between WHERE and HAVING, please visit the article on SQL HAVING vs. WHERE. Scenario V: WHERE and HAVING With JOINs The following example takes it a step further to use a JOIN with both WHERE and HAVING. It joins the data from two tables based on a common column, employee_id. The WHERE and HAVING clauses follow the order of execution we just discussed. SELECT e.job_id, edv.location_id, SUM(e.salary) FROM hr.employees e JOIN hr.emp_details_view edv ON e.employee_id=edv.employee_id WHERE e.job_id IN ('IT_PROG', 'SA_MAN') GROUP BY e.job_id, edv.location_id HAVING SUM(e.salary) > 5000 ORDER BY e.job_id; JOB_IDLOCATION_IDSUM(E.SALARY) IT_PROG140013800 SA_MAN250025000 This example selects the column job_id from the table hr.employee, the column location_id from the table hr.emp_details_view, and the total salaries for each combination of job_id and location_id. The condition for joining the tables together is listed in the ON operator. This condition is based on the common column, employee_id. Next, the WHERE clause filters the records to only job_ids that are either IT_PROG or SA_MAN. The GROUP BY clause groups records by the combination of the columns job_id and location_id. Finally, the HAVING clause filters the aggregated value SUM(e.salary) to those greater than 5000. The ORDER BY clause sorts the output records (in ascending order by default) by the column job_id. For more discussion on the GROUP BY and ORDER BY clauses, visit the article on GROUP BY vs. ORDER BY. Difference Between WHERE and HAVING Clauses in SQL Here is a summary at a glance of the differences between WHERE and HAVING clauses in SQL: WHERE clauseHAVING clause FilteringFilters individual rows fetched by SELECT, i.e., the output of the SELECT commandFilters groups of rows created by GROUP BY, i.e., the output of the GROUP BY clause ConditionsCannot have conditions containing aggregate functionsFocuses on conditions containing aggregate functions SyntaxWHERE comes before GROUP BY in the syntaxHAVING comes after GROUP BY in the syntax Order of evaluationWHERE clause is evaluated before GROUP BY in the process flow, before aggregation is performedHAVING clause is evaluated after GROUP BY in the process flow, after aggregation is performed With/without GROUP BYCan be used with or without GROUP BY, since WHERE and GROUP BY are not relatedShould be used with GROUP BY; otherwise, the output of SELECT is treated as one group WHERE and HAVING are a big part of learning SQL. Check out the SQL Basics course to help develop your SQL skills. Tags: sql learn sql GROUP BY