SQL Intermediate Cheatsheet — Joins

After covering aggregation in the previous section, I will now go over one of the most important SQL concepts — joins. Often times, the data we need are not just from one table but instead span over multiple ones. We use joins to combine multiple tables containing common identifiers (i.e. unique ID). Note that we can also join a table to itself.

I will first go over the core concepts of SQL joins, which include some venn diagram visual representation that will hopefully help with your understanding. Then, I will cover some specific examples using the different types of joins. Lastly, I will introduce UNION, a concept similar to joins.

The Syntax of JOIN

The above query combines the person and the address tables with person_ID as the common identifier. Note: we rename (provide aliases) both tables as “p” and “a” respectively, which can be especially useful when dealing with long table names.

Joins can be broken down into two types: inner joins and outer joins. Mastering the key differences can come a long way when dealing with different use cases. I will provide some examples that illustrate why one type of join might be the superior option in certain use cases.


INNER JOIN outputs the intersection of 2 tables, which includes only the elements that are common in both tables. In other words, rows from either table that are unmatched in the other table are not returned.

Here’s an example where using INNER JOIN is the appropriate method. Let’s say we want to count the number of unique users/day who logged in from both iPhone and web.

Why do we use INNER JOIN? Because we want to display the users from both iPhone and web (i.e. elements that are common in both tables). Note that JOIN and INNER JOIN are equivalent and can be used interchangeably.


Unmatched rows in one or both tables can be returned in an outer join. We will cover the three types of outer joins: LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN.


LEFT JOIN returns all rows in table1 in the FROM clause and only matching rows from table2 (after the LEFT JOIN clause).

Let’s say we want to return a list of cities without any users:

Why not use an INNER JOIN here? Because joining by city_ID will display all of the cities that have users, and all of the cities without users will be filtered out. Using a LEFT JOIN will keep all rows from the cities table, even if a city doesn’t contain any user (in which user_ID is NULL). Ultimately, we use the WHERE clause to include only those cities that do not have any user.


Similarly, RIGHT JOIN returns all rows from table2 (after the RIGHT JOIN clause) and only matching rows from table1 in the FROM clause. We rarely use RIGHT JOIN in practice because we can achieve the same results by switching the order of the two tables in a LEFT JOIN.


FULL JOIN displays the union of 2 tables, which includes all unmatched rows from both tables. Regardless of which rows have null values, all of our data will be displayed.


While joins allow you to combine two tables horizontally (side-by-side), UNION will stack tables vertically. More specifically, UNION allows you to stack one SELECT statement on top of another.

There are two types of UNION: UNION only appends distinct values while UNION ALL appends all values including duplicates.

SQL has two strict rules for appending data using UNION: there must be the same number of columns, and the selected columns must have the same data types in the same order.

The above query returns the number of messages by each user by date.



M.S.E Data Science @Johns Hopkins University with a B.S. in Applied Mathematics. Previously @MongoDB, current Data Science Intern @EA