23rd May 2023 11 minutes read How to Write a Recursive CTE in SQL Server Nicole Darnley CTE SQL Server recursion A guide to understanding and using recursive CTEs in SQL Server SQL Server offers a lot of powerful tools for working with data, including Common Table Expressions (CTEs). A CTE is a temporary named result set that you can reference within a SELECT, INSERT, UPDATE, or DELETE statement. CTEs can break down long queries into smaller, more manageable pieces of logic and make them more readable. SQL Server offers recursive Common Table Expressions. A recursive Common Table Expression (CTE) in SQL Server allows you to perform recursive queries on hierarchical or graph-based data structures, such as organizational charts, family trees, transportation networks, etc. Recursive queries are used to loop through relationships between the data elements. In this article, we'll start with standard CTE syntax and then explore how to write recursive CTEs in SQL Server. A great way to further your understanding of recursive CTEs is to take our course on Recursive Queries in MS SQL Server. It will teach you how to master the most challenging type of query and effectively organize long SQL queries. Basics of CTEs in SQL Server First, let's review the basics of CTEs. A CTE begins with the WITH clause, followed by the name of the CTE and the SELECT statement that defines it. The syntax of a CTE looks like this: WITH cte_name AS ( SELECT column1, column2, … FROM table … ) SELECT … FROM cte_name, …; Let’s start at the top. The first thing we are doing is giving our CTE a name, cte_name. This CTE selects some columns (column1, column2, ...) from table. The rest of the query (the outer query) can refer to the CTE and its columns as if the CTE were a regular table. You can think of a CTE as a temporary view that you can reference in the outer query, as we’re doing in the above example. Let’s say you wanted to write a query that returns the purchases managed by the five employees with the highest salary in the Sales department. The tables we will be using in this example are called employees (includes the columns employee_id, employee_name, and salary) and purchase (contains the columns id, date, customer_id, and total_amount). WITH top_5_sales_employees AS ( SELECT TOP 5 employee_id, employee_name FROM employees WHERE department_id = ‘Sales’ ORDER BY salary DESC ) SELECT p.id AS purchase_id, p.date, p.total_amount, e.id AS employee_id e.employee_name FROM top_5_sales_employees e JOIN purchase p ON p.employee_id = e.id; This CTE starts with the keyword WITH and the name of the CTE, top_5_sales_employees. In the parentheses, we select the columns we want to include in the CTE and specify the conditions for the query in the WHERE clause. Finally, we use the SELECT statement to select appropriate columns from the CTE as if it were a regular table. To read more about what CTEs are and how you write them, check out our article What Is a CTE in SQL Server? If you’re looking for more CTE examples, you’ll find them in this article. How to Use Recursive CTEs in SQL Server So, what is a recursive CTE in SQL Server? A recursive CTE is a type of CTE that references itself in the SELECT statement, creating a loop. Recursive CTEs are used to traverse hierarchical data structures, such as organizational charts or networks. Let's say we have a table called employees with columns for the employee's name, department, and manager. This is shown in the table below, where each record includes an employee and who they report to within the organization. idnamedepartmentmanager_idmanager_name 124John DoeIT135Jane Miller 135Jane MillerHR146Sarah Smith 146Sarah SmithCEONULLNULL At a glance, it’s pretty easy to see who reports to who and what this organizational hierarchy looks like. However, if we had hundreds of employees, it would be much more difficult to make sense of the data. We can use a recursive CTE to generate a hierarchical tree of the company's employees. To do this, we would run this query: WITH employee_manager_cte AS ( SELECT id, name, department, manager_id, manager_name, 1 AS level FROM employees WHERE manager_id IS NULL UNION ALL SELECT e.id, e.name, e.department, e.manager_id, e.manager_name, level + 1 FROM employees e INNER JOIN employee_manager_cte r ON e.manager_id = r.id ) SELECT * FROM employee_manager_cte; Let's break down this query step by step. First, we define the recursive CTE with the name employee_manager_cte. We select the columns we want to include in the query: id, name, department, manager_id, manager_name, and level. The level column is used to track the depth of the tree. We’ll start with the level 1; as we progress through the loop, this number will increase. This section before UNION ALL is referred to as the anchor member. In the anchor member, we start our loop. In our example, we select all the employees whose manager is NULL. In our organization chart, this will be the employees at the very top. In this case, there is only one employee at this level: Sarah Smith, the CEO. The piece after the UNION ALL is called the recursive member. In the recursive member, we add new rows to the rows that have already been computed. In our example, we join the employees table with the employee_manager_cte CTE on the manager_id column. This creates a loop that traverses the tree from the top down. We add 1 to the level column to track the depth of each node. Finally, we select all the columns from the employee_manager_cte CTE. When you execute this query, SQL Server first processes the anchor member, which selects Sarah Smith as the root of the tree. It then processes the recursive member, which joins Sarah Smith with her direct report (Jane Miller). It then joins Jane Miller with her direct report (John Doe), and John Doe with his direct report (none). Since there are no more rows to add to the result set, SQL Server stops processing the CTE and returns the final result. Here's what the result set looks like: idnamedepartmentmanager_idmanagerlevel 146Sarah SmithCEONULLNULL1 135Jane MillerHR146Sarah Smith2 124John DoeIT135Jane Miller3 Another great explanation article you can check out is What Is a Recursive CTE in SQL? There you will find more practical examples of recursive CTEs in SQL. Tips for Writing Recursive CTE in SQL Server 1. Start with the anchor member The anchor member is the starting point of the recursive CTE. It's the part of the query that defines the base case, or the first set of rows that will be returned. In our organization chart example, this is the highest level of management. Make sure the anchor member returns all the columns you need in the final result set. 2. Ensure the number of columns match The recursive member and anchor members must have the same number of columns and the same data types corresponding columns: In a recursive CTE, the recursive member references the CTE itself, so it's important to ensure that the recursive member and anchor member have the same number of columns and the same data types in corresponding columns. 3. Always UNION ALL In SQL Server you may only use UNION ALL to combine an anchor member and a recursive member: When combining the anchor member and recursive member, you must use UNION ALL, not UNION. UNION ALL preserves all rows, including duplicates, whereas UNION removes duplicates. UNION is not allowed in recursive queries in SQL Server. 4. Beware Infinite Loops! Make sure to write a termination check in your query. A termination check is a condition that stops the recursive CTE from looping indefinitely. Without a termination check, the query will by default run a maximum of 100 recursions and then error out. The termination check is typically included in the WHERE clause of the recursive member, and it specifies when the recursion should stop. By following these tips, you can write efficient and effective recursive CTEs that help you solve complex hierarchical problems in SQL Server. Be sure to check out this article on CTE best practices for more details. Recursive CTEs in SQL Server – More Examples Example 1: Recursive CTE for a Transportation Network Another example of when you might use a recursive CTE is in showing a transportation network. Suppose we have a table called routes that stores information about transportation routes between cities. It includes the source city, the destination city, and the distance between the cities. We want to write a query that returns all the cities that are reachable from a given starting city, along with the total distance to each city. Our data lives in a table called routes: source_citydestination_citydistance New YorkBoston215 New YorkPhiladelphia95 PhiladelphiaWashington140 BostonChicago985 WashingtonAtlanta640 AtlantaMiami660 Here is the query: WITH recursive_cte AS ( SELECT source_city, destination_city, distance, source_city AS visited_cities FROM routes WHERE source_city = ‘New York’ UNION ALL SELECT r.source_city, r.destination_city, r.distance + rc.distance, rc.visited_cities + ‘,’ + r.destination_city FROM routes r INNER JOIN recursive_cte rc ON r.source_city = rc.destination_city WHERE rc.distance < 2000 AND CHARINDEX(',' + r.destination_city + ',', ',' + rc.visited_cities + ',') = 0 ) SELECT destination_city, distance FROM recursive_cte This recursive CTE starts with the anchor member, which selects all the routes that start at New York. In the recursive member, we join the routes table with the recursive_cte CTE on the source_city column to find all the cities reachable from New York. We have added a new column called visited_cities that stores the list of visited cities as a comma-separated string. We initialize this column in the anchor member of the CTE by setting it to the source city. In the recursive member, we concatenate the current city to the list of visited cities and check if the destination city has already been visited using the CHARINDEX function. If the destination city has not been visited, we add it to the list of visited cities and continue the recursion We keep adding rows to the result set until there are no more cities to add or until we hit our termination check. The result set shows all the cities that are reachable from New York and the total distance to each city. destination_citydistance Boston215 Philadelphia95 Chicago1200 Washington235 Atlanta875 Miami1535 Example 2: Using Recursive CTE for Task Dependencies in a Project Another scenario in which we could use a recursive CTE would be to understand task dependencies for a project. Suppose we have a table called tasks that stores information about tasks in a project, including the task_id, the task_name, the ID of the task it depends on (depends_on_task_id), and the time_required to complete the task. We want to write a query that calculates the total time required to complete a given task, including all its dependent tasks. The data is shown below: task_idtask_namedepends_on_task_idtime_required 1DesignNULL5 2Development110 3Testing25 4Documentation13 5Deployment32 6MarketingNULL7 Let’s write a query that pulls the total time required for the development task and its dependencies. WITH recursive_cte AS ( SELECT task_id, task_name, depends_on_task_id, time_required FROM tasks WHERE task_id = 2 UNION ALL SELECT t.task_id, t.task_name, t.depends_on_task_id, t.time_required + rc.time_required FROM tasks t INNER JOIN recursive_cte rc ON t.depends_on_task_id = rc.task_id WHERE rc.total_time < 20) SELECT task_name, time_required as total_time FROM recursive_cte; This recursive CTE starts with the anchor member, which selects the row in the tasks table with task_id = 2 (Development). In the recursive member, we join the tasks table with the recursive_cte CTE on the depends_on_task_id and task_id columns to find all the tasks that depend on Development. We calculate the total time required for each task by adding the time required for the current task to the total time required for its previous tasks. Finally, we query the results by task_name and the time_required for each task. Here are the results: Task_nameTotal_time Development10 Testing15 Deployment17 The task_name column shows the name of each task, and the total_time column shows the total time required to complete the task and all its previous tasks. For example, the first row shows that the total time required to complete Development and all its dependencies is 15, which is the sum of the time required for Development (10) and Testing (5). The other rows show the same concept, with the task_name and total_time columns indicating the name of each task and its total time requirement. Learn More About Recursive CTEs in SQL Server Recursive CTEs in SQL Server can be useful in a variety of scenarios such as: Employee hierarchies: A recursive CTE can be used to traverse a tree of employees and their managers to calculate the total compensation of all employees in a given branch. Product categories: A recursive CTE can be used to traverse a tree of product categories to find all the products in a given category and its subcategories. Social networks: A recursive CTE can be used to traverse a graph of social network connections to find all the friends of a given user and their connections. Task dependencies: A recursive CTE can be used to traverse a tree of task dependencies to calculate the time required to complete a project. In general, any scenario where data is organized hierarchically or in a graph structure can potentially benefit from using a recursive CTE. By following best practices, such as starting with the anchor member, ensuring that the recursive and anchor members have the same number of columns and data types, and writing a termination check, you can write efficient and accurate queries that traverse hierarchical data structures and solve complex problems. Reinforce your understanding of recursive CTEs by taking our Recursive Queries in MS SQL Server course. Its 112 interactive exercises will help you become a master of this advanced topic! Tags: CTE SQL Server recursion