11th Jun 2021 6 minutes read What Is the SQL HAVING Clause? András Novoszáth sql learn sql HAVING GROUP BY Are you learning SQL? Are you wondering what you can use the HAVING clause for? Or, perhaps, have you tried to use WHERE on your GROUP BY aggregates? You are in the right place! In this article, we explain how to use HAVING with plenty of examples. Have you tried to filter your metrics you created with GROUP BY with a WHERE clause and found out that it did not work? This is because this is not what WHERE does. You need a HAVING clause! In this article, we will explain how to use HAVING to filter your aggregated data. You will learn about when and how to use it, its syntax, and the difference from WHERE. We will show these in detailed examples from a movie rental company. Are you ready? Let’s jump right in! The Dataset and the Problem In this article, we use the film table from the Sakila sample dataset. It contains a list of movies you can rent from a fictional DVD rental service. Here is an example query and its result. SELECT title, release_year, language_id, rental_duration, length, rating FROM film LIMIT 10; titlerelease_yearlanguage_idrental_durationlengthrating ACADEMY DINOSAUR20061686PG ACE GOLDFINGER20061348G ADAPTATION HOLES20061750NC-17 AFFAIR PREJUDICE200615117G AFRICAN EGG200616130G AGENT TRUMAN200613169PG AIRPLANE SIERRA20061662PG-13 AIRPORT POLLOCK20061654R ALABAMA DEVIL200613114PG-13 ALADDIN CALENDAR20061663NC-17 With this data, we want to see what types of movies people rent for shorter or longer periods. This information will allow us to identify the type of films and audiences we might focus on. GROUP BY: A Summary We use HAVING on groups created with a GROUP BY clause. Let’s do a quick reminder of how we aggregate data with GROUP BY. We use GROUP BY to summarize our data into aggregated metrics. We need two things for it: An aggregation function like COUNT, MAX, or AVG. A column to aggregate on. For example, we want to know how many movies we have in each rating category. To answer this question, we use COUNT(*) to count the number of movies in each group, and we use GROUP BY to aggregate this metric according to the values of the rating column: SELECT rating, COUNT(*) AS films FROM film GROUP BY rating ORDER BY rating; ratingfilms G178 PG194 PG-13223 R195 NC-17210 The result shows the number of movies by rating. The distribution of movies by rating is relatively even. The most popular rating is PG-13, while the least popular one is G. To learn more about GROUP BY, check out this in-depth article or this list of practical examples. Now that we have reviewed what GROUP BY is, we can combine it with HAVING! When to Use SQL HAVING The primary use of the HAVING operation is to filter aggregated data. You can use it when you summarize your data with GROUP BY into new metrics, and you want to select the results based on these new values. It is similar to WHERE, but you use them in different circumstances. At the end of this article, you will read more about this difference in detail. The SQL HAVING Syntax In terms of syntax, HAVING just requires a line after the GROUP BY. The following snippet summarizes the syntax required for using a HAVING clause: SELECT aggregate_function(column) FROM table GROUP BY column_to_aggregate_on HAVING condition_on_aggregate_metric; This syntax is concise, but let’s see how it works through examples! Examples of HAVING Say we want to know what movies people rent for longer or shorter durations. We might be interested in the rental duration by rating. For example, focusing on movies with shorter rentals might allow us to have a faster overall turnover. Let’s see what the data can tell us about this question! We create two aggregate metrics grouped by rating. Then, we use HAVING to filter the resulting table on the average rental duration (avg_rentdur). SELECT rating, COUNT(*) AS films, AVG(rental_duration), AVG(length) FROM film GROUP BY rating HAVING AVG(rental_duration) > 5 HAVING here selects data based on a condition on a group that we created with an aggregate function (AVG). ratingfilmsAVG(rental_duration)AVG(length) PG1945.0825112.0052 NC-172105.1429113.2286 PG-132235.0538120.4439 The result does not include G and R ratings, because their average rental durations are shorter than 5 days. People rent movies rated PG, PG-13, or NC-17 for longer periods. Interesting! Since we are looking at categories for their rental durations, let’s filter our results further. We will also remove ratings whose average length is above two hours. We do this by adding another group condition to the HAVING clause: SELECT rating, COUNT(*) AS films, AVG(rental_duration), AVG(length) FROM film GROUP BY rating HAVING AVG(rental_duration) > 5 AND AVG(length) < 120; The additional condition on AVG(length) filters out the PG-13 rating. ratingfilmsAVG(rental_duration)AVG(length) PG1945.0825112.0052 NC-172105.1429113.2286 Well done! You Can't Use WHERE to Filter Groups We mentioned that HAVING is similar to WHERE, but we use them in different scenarios. Let’s see how this plays out in our example. Let’s replace HAVING with WHERE in the query we just wrote: SELECT rating, COUNT(*) AS films, AVG(rental_duration), AVG(length) FROM film GROUP BY rating WHERE AVG(rental_duration) > 5 However, this results in the following error message: ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WHERE avg_rentdur > 5' at line 9 This is because GROUP BY clauses run after WHERE clauses. This query tries to apply WHERE on avg_rentdur when avg_rentdur does not yet exist! This means that we can use WHERE only on “raw” data and not on aggregated values. We need to use HAVING on aggregated metrics. Using Both WHERE and HAVING in one Query We saw that we use WHERE to filter individual rows and HAVING to filter on groups. To demonstrate their difference further, let’s see an example in which we use both HAVING and WHERE! We want to know how our previous findings change when we focus on short movies. For this, we limit the movie length to 60 minutes. In our query, we do this by applying WHERE on the length column: SELECT rating, COUNT(*) AS films, AVG(rental_duration), AVG(length) FROM film WHERE length < 60 GROUP BY rating HAVING AVG(rental_duration) < 5; ratingfilmsAVG(rental_duration)AVG(length) G214.381053.0952 NC-17214.904851.3810 R114.636455.0000 The results show that if we focus only on short movies, we find several films rated NC-17 with short rentals on average. This finding can give us ideas about the type of movies we can introduce into our future selection of offerings. Want to learn more about the differences between HAVING and WHERE? Check out our detailed article on the topic! Learn More About SQL HAVING! Do you like what you just learned? Make sure you don’t forget it! You best learn SQL by solving problems through new and challenging examples. You can set up the same testing environment we use here, import the same dataset, and rewrite the queries you just learned. However, that might require a lot of work, and you would just be recreating the same examples. Instead, we recommend that you check out our practice course! It will help you practice the SQL HAVING and other clauses through new and challenging examples! What do you think? Are you up for it? Tags: sql learn sql HAVING GROUP BY