Back to articles list Articles Cookbook
10 minutes read

An Overview of Aggregate Functions in SQL

Aggregate functions are commonly used in SQL. This article will lead you through their use and show examples of how they work.

SQL aggregate functions are a useful tool, especially for creating reports. They’re not difficult to understand, especially if you have some experience with Excel or similar programs. You’ve probably used aggregate functions like SUM or AVERAGE in a spreadsheet. Even if you only occasionally use them, knowing aggregate functions can help you better understand your data and work more efficiently.

Everything that I’ll cover in this article is explained in more detail in our Creating Basic SQL Reports course, where you can also practice all the functions you learn here.

What Are SQL Aggregate Functions?

In SQL, aggregate functions perform a calculation on multiple rows and return one value. They’re often used in the GROUP BY statement, but they can be used without it as well. There are five aggregate functions in SQL:

  • COUNT()
  • SUM()
  • AVG()
  • MIN()
  • MAX()

I won’t be explaining the GROUP BY statement here. There’s no need, since we already have an excellent article on how GROUP BY works. If you’re not familiar with GROUP BY, I suggest you read that article and then come back here.

What Does Each Aggregate Function Do?

The names of the above functions are self-explanatory, at least in my opinion. Maybe you’ve already figured out what they do just by looking at them. Nevertheless, a short explanation will do you no harm:

FunctionExplanationIgnores NULL values
COUNT()Counts the number of rows in a table
SUM()Calculates the sum of column values
AVG()Calculates the average column value
MIN()Returns the minimum value from a set of values
MAX()Returns the maximum value from a set of values

The Sales Table

I'll be using only one table to show you how aggregate functions work. It’s named sales and consists of the following attributes:

  • id – The salesperson’s ID.
  • first_name – The salesperson’s first name.
  • last_name – The salesperson’s last name.
  • items_sold – The number of items sold.
  • product – The name of the product sold.
  • date – The date of the sale.

Here are several rows to show you what the data looks like:

idfirst_namelast_nameitems_soldproductdate
1FrankCoyle42.00Product 12020-12-01
2FrankCoyle81.00Product 22020-12-01
3FrankCoyle14.00Product 32020-12-01
4NatashaHorvat69.00Product 12020-12-01
5NatashaHorvat44.00Product 22020-12-01

Notice that one salesperson can appear in several rows. The same goes for the product and the date. This means one salesperson can sell multiple products on multiple dates. It also means that on one date the same product can be sold by various salespersons. This is important to remember for the examples that’ll follow.

Using COUNT()

You’ve already learned that this function is used for counting the rows in a table. So, let’s count them!

COUNT() Without GROUP BY

First, let’s count the number of rows in the sales table. Here’s the code:

SELECT COUNT (id) AS number_of_columns
FROM sales;

This code uses the COUNT() function to count the number of rows in the column id. If you count the number of rows in this column, it’s also the total number of rows in the table. Running the code will return the results in the column number_of_columns. There are 27 rows:

number_of_columns
27

Hopefully, you’re not having trouble with this simple SELECT statement. If you are, the SQL Basics course can be beneficial. It’ll teach you the fundamental principles of databases, aggregation, and querying on one or multiple tables.

COUNT() with GROUP BY

Your next task is to count the number of different products sold by each salesperson. Think about how the data is presented in the table sales. Having considered that, your code should look like this:

SELECT	first_name,
		last_name,
		COUNT (DISTINCT product) AS number_of_products
FROM sales
GROUP BY first_name, last_name;

The query first selects the salesperson’s first name and the last name. Then it uses the COUNT() function to count the number of products, with the result shown in the column number_of_products.

Notice there’s a DISTINCT clause. This means the code will count only distinct products – i.e. it will count a specific product only the first time it appears for a particular salesperson. The DISTINCT clause is essential in this query because the same product can appear multiple times on different dates. Otherwise, the COUNT() function would count a product every time it appears in the table, which is not the result you want.

Finally, the code’s output is grouped by the columns first_name and last_name because I want to see the result for every salesperson. Here’s the result:

first_namelast_namenumber_of_products
FrankCoyle3
NatashaHorvat3
YolandaMartinez3

There are three salespeople and each of them sells three different products.

The COUNT() function is interesting in that it ignores NULL values. Because of this characteristic, you should be careful when deciding what you want to count and how. Here’s an article that discusses the nuances of the COUNT() function. It can help you with those decisions.

If you want to strengthen your knowledge of the GROUP BY statement, try our Creating Basic SQL Reports course as a practice set for GROUP BY. In this course, GROUP BY is explained in detail, which might be helpful.

SUM() Without GROUP BY

After counting rows, now’s the time that you learn to sum all the values in a table. This time, your task is to get the total number of items sold. Do you have an idea of how to do it? Don’t rush it; take your time before you take a look at my solution.

OK, I’ll trust you’ve taken your time; here’s the code:

SELECT SUM(items_sold) AS total_items_sold
FROM sales;

This simple query sums the column items_sold from the table sales. The result will show up in the column total_items_sold; here it is:

total_items_sold
1275.00

SUM() with GROUP BY

Now that you’re familiar with the SUM() function, let’s complicate things a bit. How about you make everybody happy and calculate the number of items sold by product? Here’s how to do that:

SELECT	product,
		SUM(items_sold) AS items_sold_per_product
FROM sales
GROUP BY product;

This code selects the column product from the table sales. Then it sums the number of items sold and shows the result in the column items_sold_per_product. Since your task is to show the number of items per product, you should group the result by product. Voila, the result is:

productitems_sold_per_product
Product 1442.00
Product 2639.00
Product 3194.00

If you’re interested in checking the result, add all the values above and you’ll get 1,275. This sum is precisely the result you got in the previous example.

Let me now show you what the AVG() function does.

AVG() Without GROUP BY

As you already know, the AVG() function calculates the average value of a set of values. To show you how it works, let’s imagine you need to calculate the average number of items sold. Here’s a query that’ll give you the correct result:

SELECT AVG(items_sold) AS avg_number_of_items_sold
FROM sales;

This code is similar to the SUM() function example. It now uses the AVG() function to calculate the average values in the column items_sold. The result of the code will show up in the column avg_number_of_items_sold.

Small code returns a small table:

avg_number_of_items_sold
47.222222

Be careful when you’re using AVG() on a column with NULL values. This function will not take into account the rows which contain the NULL values, so the average value might be different than you expect. Let me show you what I mean. I’ve modified the sales table to show you how this works:

idfirst_namelast_nameitems_soldproductdate
1FrankCoyle42Product 12020-12-01
2FrankCoyle81Product 22020-12-01
3FrankCoyle14Product 32020-12-01
4NatashaHorvatNULLProduct 12020-12-01

What do you think the average value of items_sold will be? Is your hunch saying the AVG() function would treat the NULL value as zero? Something like this:

AVG = (42+81+14+0)/4 = 34.25

Nope, your hunch is wrong! Ignoring the NULL values means the row is treated like it doesn’t exist at all. Like this:

AVG = (42+81+14)/3 = 45.67

AVG() with GROUP BY

This time, you need a report that’ll show the average items sold by date. How would you do that using the AVG() function with GROUP BY? You’ve probably figured it out by yourself. In case you didn’t, here’s the solution:

SELECT	date,
		AVG(items_sold) AS avg_items_per_date
FROM sales
GROUP BY date;

This query selects the column date from the table sales. Again, the average of the items sold is calculated and the result is shown in the column avg_items_per_date. You want the result to be displayed by date, so you need to group the result by the date column.

dateavg_items_per_date
2020-12-0147
2020-12-0259
2020-12-0335.666666

Three aggregate functions down, two to go. Let’s now have a shot at the MIN() and MAX() functions. You can almost think of them as one function.

MIN() and MAX() Without GROUP BY

The MIN() and MAX() functions can be seen as the opposite poles of one function. They work the same way, only one function returns the minimum and the other maximum value in a set of values.

To show you how those two functions work, let’s show the minimum and the maximum number of items sold in one day. How would you do that? Since the data in the table sales is on a date level, it’s simple:

SELECT	MIN(items_sold) AS min_daily_sale,
		MAX(items_sold) AS max_daily_sale
FROM sales;

The code first uses the MIN() function on the items_sold column to find the smallest value. The result will appear in the column min_daily_sale. The MAX() function finds the largest value, with the result being shown in the column max_daily_sale. Run the code and this is the result you’ll get:

min_daily_salemax_daily_sale
7.00122.00

MIN() and MAX() with GROUP BY

In the previous example, we get nothing but the smallest and largest daily sale values. We don’t know anything besides that. To spice this report up, let’s show the minimum and maximum of the sold items by salesperson and by product. Ready to see the solution? Here it is:

SELECT	first_name,
		last_name,
		product,
		MIN(items_sold) AS min_sold_per_product,
		MAX(items_sold) AS max_sold_per_product
FROM sales
GROUP BY first_name, last_name, product;

Your last query for today deserves to be the longest. Don’t let it scare you; there’s nothing you don’t know already. The query selects the columns first_name, last_name, and the product from the table sales. Then comes the MIN() function; it’s used to calculate the smallest amount of items sold per product; the result is shown in the column min_sold_per_product. Then there’s the greatest number of items sold per product in the column max_sold_per_product. Finally, the result needs to be grouped by the salespersons’ first and last names and the name of the product. Here’s the report:

first_namelast_nameproductmin_sold_per_productmax_sold_per_product
FrankCoyleProduct 121.0066.00
FrankCoyleProduct 267.0099.00
FrankCoyleProduct 314.0025.00
NatashaHorvatProduct 112.0069.00
NatashaHorvatProduct 244.00122.00
NatashaHorvatProduct 324.0031.00
YolandaMartinezProduct 128.00112.00
YolandaMartinezProduct 230.0067.00
YolandaMartinezProduct 37.0033.00

As I promised, this is your last task! Now the best thing you can do is practice the SQL aggregate functions on your own. Maybe try our SQL Practice Set, which has a nice section on GROUP BY. Or just take a look at another article, which gives five examples of GROUP BY; maybe that’s exactly what you need.

Do You Find SQL Aggregate Functions Useful?

Reading this article is not all you need to be proficient in aggregate functions. However, I tried to give you a practical overview of SQL’s aggregate functions – what they do and how they do it. I’ve also shown you how to use aggregate functions with and without GROUP BY. Now might be a good time to delve deeper into GROUP BY, considering how useful GROUP BY and its extensions can be in the working world.

Feel free to share your experience with the SQL aggregate functions in the comments section.