6th Jun 2023 15 minutes read How to Use SUM() with OVER(PARTITION BY) in SQL Martyna Sławińska sql PARTITION BY window functions Discover real-world use cases of the SUM() function with OVER(PARTITION BY) clause. Learn the syntax and check out 5 different examples. We use SQL window functions to perform operations on groups of data. These operations include the mathematical functions SUM(), COUNT(), AVG(), and more. In this article, we will explain what SUM() with OVER(PARTITION BY) does in SQL. We’ll show you the most common use cases in real-world applications to determine the ratio of the individual row value to the total value, calculate running totals, and find a custom order total that includes discounts for certain products. To get an in depth guide to using SUM() with OVER() and other window functions, take our interactive Window Functions course. It explains all SQL window function concepts in detail and has over 200 exercises. Let’s get started. The OVER() Clause in SQL In SQL, the OVER() clause is used to introduce window functions. The general syntax is: SELECT … <window function> OVER(...) … OVER() tells the database that we want to use window functions. The window function can be an aggregate function, like SUM(), or another window function. A window function works on a “window frame”, or a set of rows related to the current row. OVER() defines the window frame for each row. An empty OVER() clause tells the database that the whole result set is the window frame. Let’s look at an example of the SUM() OVER() syntax. We’ll use the employees table from the sample data provided by the Oracle Live SQL. This table consists of the following columns: emp_id is the employee identifier. name is the employee's name. job is the job title. dept_id is the department identifier. salary is the employee’s salary. EMP_IDNAMEJOBDEPT_IDSALARY 7839KINGPRESIDENT105000 7698BLAKEMANAGER302850 … 7900JAMESCLERK30950 7934MILLERCLERK101300 We can use the SUM() function with the OVER() clause to get the total salary of all employees and display the total salary next to the salary of each employee. SELECT emp_id, name, job, dept_id, salary, SUM(salary) OVER() AS total_salary FROM employees; emp_idNAMEJOBDEPT_IDSALARYTOTAL_SALARY 7839KINGPRESIDENT10500029025 7698BLAKEMANAGER30285029025 … 7900JAMESCLERK3095029025 7934MILLERCLERK10130029025 The SUM() function is executed for each row of the table. Every time it is called, it fetches the salary information from all the rows in the table to calculate the total salary amount. The total salary is displayed next to the details of each individual row. The details of individual rows are preserved and displayed next to the total salary. In this example the window frame (the set of rows that SUM() is operating on) is the whole data set. You can add additional clauses in OVER() to change the window frame. Example 1A: Computing the Sum for Each Group - SUM() with OVER(PARTITION BY …) The OVER() clause can contain details about the way we want to partition the data. We use the PARTITION BY clause in OVER() to divide the data into partitions, or groups. The use of PARTITION BY is similar to using GROUP BY in that the rows are partitioned into groups based on the value of some columns. When we use SUM() OVER(PARTITION BY …), we can calculate the sum of values for each group, or partition, of data. For example, we can calculate the total salary for each department: SELECT emp_id, name, job, dept_id, salary, SUM(salary) OVER(PARTITION BY dept_id) AS dept_total_salary FROM employees; Here’s the result: EMP_IDNAMEJOBDEPT_IDSALARYDEPT_TOTAL_SALARY 7782CLARKMANAGER1024508750 7934MILLERCLERK1013008750 7839KINGPRESIDENT1050008750 7902FORDANALYST20300010875 7788SCOTTANALYST20300010875 7566JONESMANAGER20297510875 7369SMITHCLERK2080010875 7876ADAMSCLERK20110010875 7521WARDSALESMAN3012509400 7654MARTINSALESMAN3012509400 7844TURNERSALESMAN3015009400 7900JAMESCLERK309509400 7499ALLENSALESMAN3016009400 7698BLAKEMANAGER3028509400 We use SUM(salary) OVER(PARTITION BY dept_id) to get the total salary per department. The PARTITION BY clause divides the rows into groups based on the dept_id column. Rows with dept_id equal to 10 are put into one group (marked in yellow in the table above), the rows with dept_id equal to 20 are put into another group (marked in green), and finally the rows with dept_id equal to 30 are put in yet another group (marked in red). The SUM() function computes the sum of rows in each group. Note that when using SUM() OVER(PARTITION BY), you keep the details of individual rows. You can, for example, see the details of the employee named Ford: his position, his salary, and how it compares to the total salaries in his department. This is the most typical use of SUM() OVER(PARTITION BY): you compute the sum value for each data group and you keep the details of individual rows. Let’s see a similar example. Example 1B: Computing Salary for Each Job Title We can use SUM(salary) OVER(PARTITION BY job) to get the total salary per job title. Let’s take a look: SELECT emp_id, name, job, dept_id, salary, SUM(salary) OVER(PARTITION BY job) AS job_total_salary FROM employees; Here’s the result: EMP_IDNAMEJOBDEPT_IDSALARYJOB_TOTAL_SALARY 7782CLARKMANAGER1024508275 7698BLAKEMANAGER3028508275 7566JONESMANAGER2029758275 7934MILLERCLERK1013004150 7369SMITHCLERK208004150 7876ADAMSCLERK2011004150 7900JAMESCLERK309504150 7902FORDANALYST2030006000 7788SCOTTANALYST2030006000 7521WARDSALESMAN3012505600 7654MARTINSALESMAN3012505600 7844TURNERSALESMAN3015005600 7499ALLENSALESMAN3016005600 7839KINGPRESIDENT1050005000 This time, rows are grouped based on the job value rather than the department ID. Employees with the same job position are put in one group and we compute the total salary of people in this position. The SUM() function is applied to all salaries in each group: the total salary for the “Manager” group is the sum of 2450, 2850, and 2975, which are the salaries of the three managers that are in our table. Example 2: Compute the Ratio of a Single Row Value to the Total Typically, we want to see the comparison between each individual row and the total sum. Let’s calculate each individual salary’s percentage of the total salaries in their department. SELECT emp_id, name, job, dept_id, salary, ROUND(100.0 * salary / SUM(salary) OVER(PARTITION BY dept_id), 2) AS percentage_of_dept_total_salary FROM employees; EMP_IDNAMEJOBDEPT_IDSALARYPERCENTAGE_OF_DEPT_TOTAL_SALARY 7782CLARKMANAGER10245028 7934MILLERCLERK10130014.86 7839KINGPRESIDENT10500057.14 7902FORDANALYST20300027.59 7788SCOTTANALYST20300027.59 7566JONESMANAGER20297527.36 7369SMITHCLERK208007.36 7876ADAMSCLERK20110010.11 7521WARDSALESMAN30125013.3 7654MARTINSALESMAN30125013.3 7844TURNERSALESMAN30150015.96 7900JAMESCLERK3095010.11 7499ALLENSALESMAN30160017.02 7698BLAKEMANAGER30285030.32 This time, rows are grouped based on the job value rather than the department ID. Employees with the same job position are put in one group and we compute the total salary of people in this position. The SUM() function is applied to all salaries in each group: the total salary for the “Manager” group is the sum of 2450, 2850, and 2975, which are the salaries of the three managers that are in our table. Example 2: Compute the Ratio of a Single Row Value to the Total Typically, we want to see the comparison between each individual row and the total sum. Let’s calculate each individual salary’s percentage of the total salaries in their department. SELECT emp_id, name, job, dept_id, salary, ROUND(100.0 * salary / SUM(salary) OVER(PARTITION BY dept_id), 2) AS percentage_of_dept_total_salary FROM employees; EMP_IDNAMEJOBDEPT_IDSALARYPERCENTAGE_OF_DEPT_TOTAL_SALARY 7782CLARKMANAGER10245028 7934MILLERCLERK10130014.86 7839KINGPRESIDENT10500057.14 7902FORDANALYST20300027.59 7788SCOTTANALYST20300027.59 7566JONESMANAGER20297527.36 7369SMITHCLERK208007.36 7876ADAMSCLERK20110010.11 7521WARDSALESMAN30125013.3 7654MARTINSALESMAN30125013.3 7844TURNERSALESMAN30150015.96 7900JAMESCLERK3095010.11 7499ALLENSALESMAN30160017.02 7698BLAKEMANAGER30285030.32 We can see, for example, that the analyst named Scott has an individual salary of 3000; this is 27.59% of the total salaries in his department. Here is a breakdown of the functions used to accomplish this task: We take each individual salary and divide it by the total salary for the department: salary / SUM(salary) OVER(PARTITION BY dept_id) To get a percentage, we multiply it by 100%: 0 * salary / SUM(salary) OVER(PARTITION BY dept_id) Next, we use the ROUND() function to get two decimal digits: ROUND(100.0 * salary / SUM(salary) OVER(PARTITION BY dept_id), 2) Computing the ratio of the individual row to the sum for a group is another very common use case of SUM() with OVER(PARTITION BY). In a similar way, you could compute the difference between the total for the group and the individual row. What’s the difference between SUM() with OVER(PARTITION BY …) and using SUM() with GROUP BY? The PARTITION BY clause accomplishes a similar task to the GROUP BY clause. Both OVER(PARTITION BY) and GROUP BY divide the dataset into partitions, or groups. When using the SUM() function, both expressions compute the sum for each group. So, which one should you choose? When using GROUP BY, the details of individual rows are collapsed. Take a look: SELECT job, SUM(salary) AS total_salary FROM employees GROUP BY job; JOBTOTAL_SALARY ANALYST6000 CLERK4150 SALESMAN5600 MANAGER8275 PRESIDENT5000 Here, we get the total salary value for each job title without considering individual employees’ salaries. However, if you want to compare the total sum value with individual row values (e.g. to compute the ratio of the individual value to the total), the way to go is OVER(PARTITION BY …). It returns all individual rows along with the total sum value for each row. This total sum value may differ, depending to which partition the row belongs. SELECT name, job, salary, SUM(salary) OVER(PARTITION BY job) AS total_salary FROM employees; NAMEJOBSALARYTOTAL_SALARY FORDANALYST30006000 SCOTTANALYST30006000 SMITHCLERK8004150 JAMESCLERK9504150 ADAMSCLERK11004150 MILLERCLERK13004150 BLAKEMANAGER28508275 JONESMANAGER29758275 CLARKMANAGER24508275 KINGPRESIDENT50005000 TURNERSALESMAN15005600 ALLENSALESMAN16005600 WARDSALESMAN12505600 MARTINSALESMAN12505600 The total_salary value matches the results of the previous query with GROUP BY. But here, you can see individual salaries as well. For example, there are two analysts that earn 6000 in total; each of them earns 3000. The rule of thumb is: If you want to know just the sum value for each group and you are not interested in the details for each individual row, you should use the GROUP BY clause. If you’re interested both in the sum for each group and the details of individual rows, you should use SUM() OVER(PARTITION BY). Example 3: Computing a Running Total Using SUM() with OVER(PARTITION BY) Another common usage of the SUM() OVER(...) syntax is to calculate the running total. A running total is the cumulative sum of the previous numbers in a column. We use running totals to calculate values that accumulate over time. For example, with a running total you can calculate the monthly mobile data usage by adding each subsequent day’s value to the sum of the previous days’ values. Similarly, you can compute how the number of registered users increases each day or how the total revenue increases with each transaction. We use the SUM() with OVER(PARTITION BY … ORDER BY …) syntax to compute the running total. Let’s look at an example. We’ll use the orders table from sample data provided by Oracle Live SQL. This table stores historical order data. Each order has its date (order_date), sales representative (sales_rep_id), and total value (order_total). Here we compute the running total value for each sales representative: SELECT order_date, sales_rep_id, order_total, SUM(order_total) OVER(PARTITION BY sales_rep_id ORDER BY order_date) AS running_total FROM orders; ORDER_DATESALES_REP_IDORDER_TOTALRUNNING_TOTAL 29-MAR-07 02.22.40.536996 PM15310794.610794.6 16-AUG-07 03.34.12.234359 PM15378279.689074.2 04-OCT-07 09.53.34.362632 PM15312989203.2 21-NOV-07 10.22.33.263332 AM15313824103027.2 16-DEC-07 08.19.55.462332 PM15311188.5114215.7 27-JUL-06 12.22.59.662632 PM15452471.952471.9 27-JUL-06 01.34.16.562632 PM154364656117.9 29-JUN-07 09.53.41.984501 AM1544856165.9 01-JUL-07 04.49.13.615512 PM15422056385.9 02-JUL-07 03.34.44.665170 AM15460056985.9 01-SEP-07 09.53.26.934626 AM154545162436.9 02-OCT-07 05.49.34.678340 PM1546653.469090.3 10-NOV-07 03.49.25.526321 AM15450125119215.3 19-NOV-07 02.41.54.696211 PM15442283.2161498.5 17-DEC-07 05.03.52.562632 PM15410474.6171973.1 To do so, we partition our table by sales representative and then order each partition by date: SUM(order_total) OVER(PARTITION BY sales_rep_id ORDER BY order_date) The new syntax element here is ORDER BY. It defines how the rows are sorted in each window frame. The SUM() function is applied to each row; it adds up the current order_total value to the previous running_total value; running totals are calculated separately for each partition (here, for each sales rep ID). Let’s take a closer look at the sales representative with an ID of 153. Their first order was on March 29th and had a total value of 10794.6. At this point, the running total (the sum) is equal to the order value. Their second order was on August 16th for 78279.6; now the running total is equal to the sum of their first and second order values (10794.6 + 78279.6 = 89074.2). After their third order, the running total is equal to the previous running total plus the third order value (89074.2 + 129 = 89203.2). This process is similar for the sales representative with the ID of 154. Computing a running total is a common pattern when using SQL for data analysis. You can read how to compute a running total in SQL elsewhere in our blog. Example 4: Counting Objects in Custom Categories The SUM() function is often teamed with the CASE WHEN statement to count objects in custom categories. For example, you may want to compute the total salary of employees in management positions in one department and display it next to each employee’s details. You can achieve this by using SUM() OVER(PARTITION BY) in combination with CASE WHEN. Let’s first review the CASE WHEN statement. It is similar to the if statement common to many programming languages. We use it to define the value of an expression in different situations, or cases. In the following example, we use the CASE WHEN statement to identify each employee as management (managers and presidents) or as a regular employee (all other job positions). Take a look: SELECT emp_id, name, job, dept_id, CASE WHEN job IN (‘PRESIDENT’, ‘MANAGER’) THEN ‘MANAGEMENT’ ELSE ‘REGULAR’ END AS emp_type salary FROM employees; The CASE statement evaluates all WHEN conditions. If it finds the matching condition, it returns the value in the THEN branch. If it doesn’t find the matching condition, it returns the value given after ELSE. In our example, employees in the president or manager position are labeled as ‘Management’. All other job positions are assigned the label ‘Regular’. Here’s the result of the query: EMP_IDNAMEJOBDEPT_IDEMP_TYPESALARY 7782CLARKMANAGER10MANAGEMENT2450 7934MILLERCLERK10REGULAR1300 7839KINGPRESIDENT10MANAGEMENT5000 7902FORDANALYST20REGULAR3000 7788SCOTTANALYST20REGULAR3000 7566JONESMANAGER20MANAGEMENT2975 7369SMITHCLERK20REGULAR800 7876ADAMSCLERK20REGULAR1100 7521WARDSALESMAN30REGULAR1250 7654MARTINSALESMAN30REGULAR1250 7844TURNERSALESMAN30REGULAR1500 7900JAMESCLERK30REGULAR950 7499ALLENSALESMAN30REGULAR1600 7698BLAKEMANAGER30MANAGEMENT2850 You can read more about CASE in our article How to Use CASE in SQL. You can also use the CASE statement with SUM() to sum values in custom categories. Here’s the query: SELECT dept_id, SUM( CASE WHEN job IN (‘PRESIDENT’, ‘MANAGER’) THEN salary ELSE 0 END) AS dept_management_salary, SUM( CASE WHEN job IN (‘PRESIDENT’, ‘MANAGER’) THEN 0 ELSE salary END) AS dept_regular_salary FROM employees GROUP BY dept_id; DEPT_IDDEPT_MANAGEMENT_SALARYDEPT_REGULAR_SALARY 1074501300 2029757900 3028506550 For employees in management positions, the first CASE statement returns the value of the column salary. The SUM() function in combination with GROUP BY then sums all salaries for management employees with the same dept_id. This way, we compute the total management salary per department. The second CASE statement computes the total department salary for all regular employees. You can compute the total salary for employees in management position in a department and display it next to employee details using SUM() with PARTITION BY: SELECT emp_id, name, job, dept_id, CASE WHEN job IN (‘PRESIDENT’, ‘MANAGER’) THEN ‘MANAGEMENT’ ELSE ‘REGULAR’ END AS emp_type salary, SUM( CASE WHEN job IN (‘PRESIDENT’, ‘MANAGER’) THEN salary ELSE 0 END) OVER(PARTITION BY dept_id) AS dept_management_salary FROM employees; EMP_IDNAMEJOBDEPT_IDEMP_TYPESALARYDEPT_MANAGEMENT_SALARY 7782CLARKMANAGER10MANAGEMENT24507450 7934MILLERCLERK10REGULAR13007450 7839KINGPRESIDENT10MANAGEMENT50007450 7902FORDANALYST20REGULAR30002975 7788SCOTTANALYST20REGULAR30002975 7566JONESMANAGER20MANAGEMENT29752975 7369SMITHCLERK20REGULAR8002975 7876ADAMSCLERK20REGULAR11002975 7521WARDSALESMAN30REGULAR12502850 7654MARTINSALESMAN30REGULAR12502850 7844TURNERSALESMAN30REGULAR15002850 7900JAMESCLERK30REGULAR9502850 7499ALLENSALESMAN30REGULAR16002850 7698BLAKEMANAGER30MANAGEMENT28502850 The rows are partitioned into groups based on the dept_id column. There are three groups, one for each department. The SUM() function is applied to the CASE WHEN expression. Instead of computing the total salary in each department, we compute the total salary for employees in management positions in the department. The CASE returns 0 for regular employees (the sum is not increased) and the salary value for management employees. You can use this pattern in many different situations. In fact, in the next section we shall see another example of combining CASE WHEN with SUM() and OVER(PARTITION BY). Example 5: Calculating an Order Total with Discounts Using SUM() with CASE WHEN In this example, we want to calculate the total value for each order, including discounts for some order items. We’ll use the SUM() OVER(PARTITION BY …) syntax to sum all the item values per order. And to incorporate the product discounts, we’ll use the CASE WHEN statement. We use the order_items table from sample data provided by the Oracle Live SQL. This table stores all order items (product_id) belonging to each order (order_id). It contains information on the unit price of the product (unit_price) and the order quantity (quantity). Now we want to calculate the order values: one per each product in the order and a total for the order; the order total should include any discounts applied to the products. The order total is equal to the unit price multiplied by the order quantity. However, when there is a discount applied to a product, we’ll use a CASE WHEN statement to account for it. SELECT order_id, product_id, unit_price, quantity, SUM(CASE -- 20% discount for this product WHEN product_id = 3143 THEN unit_price*quantity*0.8 -- 40% discount for this product WHEN product_id = 3106 THEN unit_price*quantity*0.6 ELSE unit_price*quantity END) OVER(PARTITION BY order_id, product_id) AS order_product_total_with_discount, SUM(CASE -- 20% discount for this product WHEN product_id = 3143 THEN unit_price*quantity*0.8 -- 40% discount for this product WHEN product_id = 3106 THEN unit_price*quantity*0.6 ELSE unit_price*quantity END) OVER(PARTITION BY order_id) AS order_total_with_discount FROM order_items; ORDER_IDPRODUCT_IDUNIT_PRICEQUANTITYORDER_PRODUCT_TOTAL_WITH_DISCOUNTORDER_TOTAL_WITH_DISCOUNT 2354310648611756.844916.2 2354311496.8434162.444916.2 235431237947371344916.2 235431294147192744916.2 235431392148100844916.2 235431431653678.444916.2 23543150175898644916.2 235431633061183044916.2 235431653764236844916.2 235431675168346844916.2 23543170145.2701016444916.2 23543176113.3728157.644916.2 235431826177469744916.2 2355228946200920094513.5 23552308571851054594513.5 2355231186.918816337.294513.5 2355232219188357294513.5 2355232317190323094513.5 235523261.1192211.294513.5 235523301.1197216.794513.5 2355233925199497594513.5 23552359226.620446226.494513.5 To calculate the total per product in an order, we use the following syntax: SUM(CASE WHEN product_id=3143 THEN unit_price*quantity*0.8 -- 20% discount WHEN product_id=3106 THEN unit_price*quantity*0.6 -- 40% discount ELSE unit_price*quantity END) OVER(PARTITION BY order_id, product_id) AS order_product_total_with_discount We pass the CASE WHEN statement as an argument to the SUM() function; if the current row’s product ID is 3143, then we apply a discount of 20%; for the product ID of 3106, the discount is 40%. Next, we partition the dataset by order ID and product ID to get total values for each product in an order. Note that we use two different PARTITION BY clauses in the query. To calculate the product total value, we partition the dataset by order ID and product ID. To calculate the total per order, we partition the dataset by order ID only. The order total value is equal for all rows with the same order ID. That is, if you sum up all product total values of a certain order, you get the order total value. For example, for the order ID of 2355, we get the following: 9200 + 10545 + 16337.2 + 3572 + 3230 + 211.2 + 216.7 + 4975 + 46226.4 = 94513.5. Here are some examples of the SUM() function with CASE WHEN. Go Ahead and SUM() with OVER() and PARTITION BY Now that you know how to use the SUM() window function and its various syntax options, you can try it out yourself. We presented examples on how to compute a percentage of a single row value to the total of the partition and showed how to calculate the running total and the custom order total. We encourage you to practice with your own data. To learn more, try our interactive Window Functions course, which explains all window functions concepts in detail. Or check out our Window Functions Cheat Sheet if you want a quick and easy reference for SQL window functions. Good luck! Tags: sql PARTITION BY window functions