Even more on SQL Subqueries

Correlated Subquery and Common Table Expression (CTE).

In a previous blog, we covered simple subqueries.

This blog will cover a couple more examples of SQL Subquery: The Correlated Subquery and Common Table Expression.

Image for post
Image for post
Correlated | Photo by Lucas George Wendt on Unsplash

Correlated Subquery

A correlated subquery compares values in each row against a table referenced in the main or outer query. It is like a simple subquery in that it has its own SELECT clause and FROM clause. It is different from a simple subquery in that you can’t run it independently. You can’t execute a correlated subquery on its own because it’s dependent on values in the outer query.

A correlated subquery is evaluated in loops, once for each row generated by the data set. With a simple subquery, the inner query runs first and executes one time, returning values to be used by the main query. A correlated subquery, however, executes once for each row used by the outer query.

Adding correlated subqueries will slow down your query’s performance because the correlated has to run once for each row, recalculating information repeatedly.

It is useful for solving problems that you can’t easily solve with just one query. Although, like many things in SQL, correlated subqueries are not your only possible solution.

Here are two examples where you would use a correlated subquery, using an imaginary basketball database of games and players.

What’s the second highest salary? This problem doesn’t seem so hard on the surface. You could solve with a nested subquery. However, what if you need to find the 4th, 5th, …, nth highest salary? These higher-level problems are where a correlated subquery is helpful.

SELECT salary
FROM players p1
WHERE n-1 =
(SELECT COUNT(*)
FROM players p2
WHERE p1.salary < p2.salary);

This query will find the nth largest salary from the players table.

Image for post
Image for post
Show me the money| Photo by Jp Valery on Unsplash

Another example is to find all the rows in the outer query which have no rows in the related table. For example, say we want to find players on the roster who have played no minutes this season to give them some game time.

SELECT first_name, last_name
FROM players p
WHERE NOT EXISTS
(SELECT *
FROM game g
WHERE g_player_id = p.id
AND season = ‘2020/2021’);

This query finds the first and last names of players who have not been active in any games this season.

Image for post
Image for post
Get him in the game! Photo by Channey on Unsplash

Look at the WHERE clause in both of the code examples above. Notice that the salary and the id are matching on tables referenced in the outer query.

Image for post
Image for post
Common Table | Photo by Elliott Stallion on Unsplash

Common Table Expression

Common table expressions (CTE) are a subquery that you write before the main query. CTE’s are similar to a simple subquery, except instead of placing the subquery inside the WHERE or FROM … clause, you name the CTE ahead of time and then reference it using that name. CTE’s are referenced just like with any other table in your database.

We build CTE’s using the WITH keyword, followed by a CTE name, and AS and then the subquery body. A generalized example looks like this:

WITH cte_name (Column1, Column2 …)
AS
(Subquery body)

There are two parts to the CTE. In the first part, we set the CTE name and the columns within the CTE. The second part is the subquery body, with the SELECT / FROM statements.

You can have multiple CTEs. You only need one WITH statement. List the CTEs one after another with a comma in between each CTE. Just like columns in a SELECT statement, there is no comma after the last CTE.

CTE’s organize long and complicated code and make it more readable. Instead of creating one large query, you can create several CTE’s and combine them later in the query.

CTE’s can reference information from earlier CTE’s. CTE’s can also refer to themselves through a process called recursion. Here, the CTE runs repeatedly, returning subsets of data each time, until it completes. Recursion is useful when working with hierarchical data, such as organization charts or bill of materials.

CTE’s run once and get stored in memory. The CTE can be used multiple times in the query by referencing the CTE name. This use of memory improves performance efficiency compared with some, but not all, other methods.

CTE’s are also useful in other ways. You can substitute a CTE for a view (or virtual table), and use CTE’s in ranking functions like ROW_NUMBER(), RANK() …

Thanks for reading. I hope this helps you build your own correlated subqueries and common table expressions.

https://www.w3resource.com/sql/subqueries/corelated-subqueries-using-aliases.php

https://www.essentialsql.com/introduction-common-table-expressions-ctes/

https://www.oreilly.com/library/view/head-first-sql/9780596526849/

Andrew has an analytics background with over 20 years of experience in various industries, working with world-leading brands.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store