SQL stands for Structured Query Language, which is one of the most important coding languages to learn especially for Data Scientists and Data Analysts. Its main use case is to query and analyze data from a relational database, which consists of rows (records) and columns (features). For example, we can have a table of transactions, with each row indicating a different customer transaction, and columns such as user_ID, transaction_date, and transaction_amount.

I’ll be breaking down some important SQL concepts into beginner, intermediate, and advanced sections. In the beginner portion, we will cover the basic structures such as SELECT, WHERE, and ORDER BY. We’ll also go over various comparison and logical operators used in the WHERE clause to filter your data.

These basic concepts are the foundational building blocks for writing complex queries, so mastering the basic syntax and structure is definitely important but don’t worry, they’re relatively intuitive.

Basic SQL Syntactic Structure

Does this look confusing? Let’s go over what each line means in terms of querying the specific data we want.


We use SELECT to specify the data we would like to pull from a database.

SELECT all columns

SELECT specific columns


We use the WHERE clause to set up specified filters for the values we want to pull.

Filtering results can be done in a number of ways using comparison and logical operators. Let’s dive into what we can specify in the WHERE condition.

Comparison Operators

Comparison operators are used to compare values- for example, if we only want to look at the rows where the customer’s first name is Peter. I’ll demonstrate examples of comparison operators using a table of customer transactions.

Logical Operators

We can use logical operators to apply multiple filters/operators (i.e. using AND/OR). We’ll also go over some crucial logical operators, which are LIKE (match similar values), IN (specify a list of values to include), BETWEEN (select rows within a certain range), IS NULL (select rows with no data in a given column), and NOT (selects rows that do not match some condition).


ORDER BY is a useful function to present your results in sorted order.

Here we’re pulling all transaction details from the highest to the lowest transaction amount (in descending order). Note that ORDER BY sorts values in ascending order by default.

Now we’re sorting the results by transaction amount, and then by year (in ascending order).


We use LIMIT to specify how many rows we want to display in the final results. Let’s say we want to output the top 5 customers who spent the most money in a single transaction:


Hopefully you found this tutorial helpful in beginning your SQL journey. In the next intermediate section, I’ll go over two very important concepts: aggregation and joins. Aggregate functions (such as COUNT, SUM, and GROUP BY) are used to aggregate data similar to how you would in an Excel pivot table, which helps to process data using summary statistics. Joins are used to combine multiple tables that are related to one another- this is especially applicable, since we rarely only work with one table in the real world.

M.S.E Data Science @Johns Hopkins University with a B.S. in Applied Mathematics. Previously @MongoDB, current Data Science Intern @EA