Download as pdf or txt
Download as pdf or txt
You are on page 1of 27

datagy

Data Science Tutorials for Python, Excel, and SQL!

SQL for Beginners Tutorial (Learn SQL in 2020)


by Nik / April 29, 2020

Welcome to our SQL for Beginners Tutorial! In this guide, you’ll learn everything you need to know to
get started with SQL for data analysis.

We cover off fundamental concepts of the SQL language, such as creating databases and tables,
select records, updating and deleting records, etc.
To download the full guide in a beautiful PDF, a SQL
Cheat Sheet, and a database file to play along with,
click here!

We also cover off some more intermediate concepts such as joining tables. We do this by providing
many SQL examples to guide you through the process.

Table of Contents – SQL for Beginners Tutorial

What is SQL?

How is SQL Used?

Why Should You Learn SQL?

How Long Does it Take to Learn SQL?

What is SQLite?

Starting Our Tutorial

How do You Create Tables in SQL?

How do You Modify Records with SQL?

How do You Select Records with SQL?

How do You Limit SQL Outputs?

How do You Select Records Conditionally with SQL?


How do You Aggregate Data with SQL?

How Do You Change a Column Name in SQL?

How do You Join Tables in SQL?

Conclusion
The main SQL commands you need to know to get started!

What is SQL?
SQL stands for Structured Query Language and is a standard programming language designed for
storing, retrieving, and managing data stored in a relational database management system (RDBMS).

SQL is the most popular database language but has been implemented differently by different
database systems. For the purposes of this tutorial, we’ll use SQLite3 – a trimmed down version of
SQL that is easier to implement if you want to follow along.

SQL can be pronounced as both sequel or S-Q-L.

How is SQL Used?


In short, SQL is used where databases are used. SQL is used, for example, by music streaming
applications to provide information on songs, albums, artists and playlists. It’s also used in the finance
industry to update bank accounts, for example.

SQL is used to create, maintain, and update databases. Because databases are everywhere in
technology, whether on your iPhone or on this website, SQL is used almost everywhere.

Why Should You Learn SQL?


SQL is one of the key languages to learn on your journey to becoming a data analyst or data scientist.
It’s used everywhere, it’s in high demand, and it isn’t showing any sign of going anywhere.

It’s also in incredibly high demand in terms of data jobs, as this Indeed study found:
Nearly a quarter of tech jobs require a knowledge of SQL according to an Indeed study. Source.

How Long Does it Take to Learn SQL?


It’s possible to learn the fundamentals of SQL in a matter of days. This post will walk you through
everything you need to get started with analyzing data using SQL.

A more complete answer would be: it depends on what your previous knowledge is. If you have an
understanding of relational databases or other programming languages, you might have an easier
time.

The best way to learn is to dive into it with beginner exercises. Later, you can apply what you’ve
learned to large, more complex examples to better prepare you for the real world.

What is SQLite?
SQLite is a relational database management system that is embedded in the end program. This makes
it an easier solution for this tutorial to follow along with, as it’s something you can set up immediately
on your machine. It’s quite similar in structure to another iteration of SQL called PostgreSQL.

Tip! If you want to follow along with entering code, you can download the free DB Browser
for SQLite.
SQL for Beginners Tutorial – What We’re Creating

Following along with this SQL for beginners tutorial, I’ll walk you through all the code you need to
create the following database. It’s a simple one, but it’ll teach you basic and intermediate SQL skills!

The entity relationship diagram of the database we’ll be creating


during this tutorial.

If you’re not familiar with database structures, let’s go over a few key elements before diving in:

Table Names are listed in blue. In this database, we have two tables: clients and orders.

Primary Keys of tables are in bold. Primary keys uniquely identify a record in a table.

A line is drawn between columns that have a relationship. In this case, the client_id in the clients
table connects with userid in the orders table. Each client can have multiple orders – this means
that the client table has a one-to-many relationship.

How Do You Create Tables in SQL?


To create a table in SQL, you following the structure below:

CREATE TABLE [IF NOT EXISTS] tableName (


column1 data_type PRIMARY KEY,
column2 data_type NOT NULL,
column3 data_type DEFAULT 0,
table_constraints
);

Let’s take a look at these commands in a little bit more detail:

CREATE TABLE is the command used to instruct SQL to create a new table,

IF NOT EXISTS only makes SQL create the table is the table doesn’t already exist,

tableName reflects the name of the table to be created,

Within brackets, columns are defined by providing: the column name and any constraints.

SQLite supports PRIMARY KEY, UNIQUE, NOT NULL, CHECK column constraints.

Within the brackets, table constraints such as PRIMARY KEY, FOREIGN KEY, and UNIQUE.

We end with a semi-colon, which let’s SQL know that the command is complete.

Assigning a PRIMARY KEY value to a column or multiple columns means that the column(s) uniquely
identify a record in the table.

In order to create the two tables for our sample database, we would write the following code:

CREATE TABLE IF NOT EXISTS clients (


client_id INTEGER PRIMARY KEY,
fname TEXT,
lname TEXT,
dob TEXT,
gender TEXT,
join_date TEXT NOT NULL
);

CREATE TABLE IF NOT EXISTS orders (


orderid INT PRIMARY KEY,
order_date TEXT NOT NULL,
userid INT NOT NULL,
total INT NOT NULL
);

When we run this command, we create our two tables. We’re including the IF NOT EXISTS command
so that we can re-run the command without SQL throwing an error that the tables already exist.
Back to the Table of Contents of our SQL for Beginners Tutorial?

How Do You Insert Data with SQL?


In this section, you’ll learn how to insert data with SQL. We’ll be loading data into the tables that we
generated in the previous section. Inserting data with SQL is a straightforward process. Let’s get
started!

The general process looks like this:

INSERT INTO table (column1,column2 ,..) VALUES( value1, value2 ,...);

Let’s look at the INSERT statement in a bit more detail:

We first specify the name of the table we want to add values into

We then specify a list of all the columns in the table. While this list is optional, it’s good practice to
include it.

We then follow with a list of values we want to include. If we don’t spell out all the column names,
we have to include a value for each column.

If the table has some constraints, such as UNIQUE or NOT NULL, these need to be maintained in our
INSERT statement.

Let’s now insert a few records into both of our tables:

INSERT INTO clients(client_id, fname, lname, dob, gender, join_date) VALUES


(1, 'Bruce', 'Wayne', 19880101, 'male', 20200201),
(2, 'Jane', 'Doe', 19891223, 'female', 20200302),
(3, 'Peter', 'Parker', 19801209, 'male', 20200401);

INSERT INTO orders(orderid, order_date, userid, total) VALUES


(1, 20200201, 1, 134),
(2, 20200203, 1, 267),
(3, 20200302, 2, 99),
(4, 20200401, 1, 341),
(5, 20200401, 3, 87),
(6, 20200403, 2, 95);

We’ve now successfully inserted records into both of our tables!


I'm learning SQL with @datagyio!
Click to Tweet

How Do You Modify Records with SQL?


To modify records in SQL, you use the UPDATE statement to update existing rows within a table.

The UPDATE statement works like below:

UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;

Let’s explore this in a bit more detail:

We follow UPDATE with the name of the table where we want to update records,

SET is followed by a list of column = value pairings of which columns we want data to be updated
in

The WHERE statement identifies the records where we want data to be updated

Note! The WHERE statement identifies the records to be updated. While this field is
optional, if it’s left blank, it causes all records in that table to be overwritten.

Let’s try this out to update one of our records in our client table:

UPDATE clients
SET fname = 'Jean', lname = 'Grey'
WHERE client_id = 2;

In this example, we updated our second record in the clients table to change the client’s first name
from Jane to Jean the last name from Doe to Grey.

If we had left the WHERE statement blank, all first names in the table would have become Jean and all
last names would have become Grey!

How Do You Delete Records with SQL?


To delete a record, or multiple records, from a table in SQL, you use the DELETE statement. This
follows a similar structure to the UPDATE statement:

DELETE FROM table_name WHERE condition;

Let’s explore this in more detail:

DELETE FROM is followed by the table in which we want to delete,

WHERE is followed by the condition(s) which are used to tell SQL which records to delete

Note! The WHERE statement is optional, but if it’s left blank, all the records in the table will
be deleted.

Let’s now practice by deleting a record from our orders table:

DELETE FROM orders WHERE orderid = 6;

In the above example, we specified that we wanted to delete the record where order_id is equal to 6,
from the orders table.

Back to the Table of Contents of our SQL for Beginners Tutorial?

How Do You Select Records with SQL?


The SELECT statement is used to select and retrieve data from one or multiple tables. The SELECT
statement can be modified to either select everything from a table or only a subset of records. Later
on, we’ll also cover off how to select data from multiple tables using JOIN statements.

Selecting and retrieving data is an important skill for data analysis and data science. Because of this,
we’ll dedicate a significant amount of time to this to provide helpful examples!

Selecting All Records in a Table with SQL

The most straightforward to select data with SQL is to select all the records in a table. This is
accomplished using the structure below:

SELECT * FROM table_name;


Let’s explore this in more detail:

The asterisk (*) is used as a wildcard character in SQL. We ask SQL to return all columns in a table.

FROM is used to specify from which table we want to return data.

Let’s try this with one of our tables:

SELECT * FROM clients;

This would provide the following output:

client_id fname lname dob gender join_date


1 Bruce Wayne 19880101 male 20200201
2 Jean Grey 19891223 female 20200302
3 Peter Parker 19801209 male 20200401

Select Only Some Columns from a Table

If we wanted to only return a number of columns from a table, we could specify the column names in
our SELECT statement. This follows the structure below:

SELECT column1, column2, ... FROM table_name;

Let’s try this out with one of our tables:

SELECT client_id, fname, lname FROM clients;

client_id fname lname


1 Bruce Wayne
2 Jean Grey
3 Peter Parker

How Do You Limit SQL Outputs?


There may be times when you’re only interested in a smaller subset of data from your query and want
to only select a number of a rows.

In true databases, tables will have many, many more rows than our sample tables. By limiting outputs,
you can also improve the performance of your queries, which is especially useful when you’re testing
a query.

Let’s see what this looks like! For the purposes of our SQL for beginners tutorial, we will follow SQLite
syntax, which follows the MySQL syntax:

SELECT column_names
FROM table_name
LIMIT num_of_rows;

This follows a similar structure to a regular select statement, except we add a LIMIT clause at the end
with a number of rows we want to limit the query to.

Let’s true this out on our database:

SELECT orderid, order_date, total


FROM orders
LIMIT 2;

This returns the following:

orderid order_date total


1 20200201 134
2 20200203 267

If we were writing this in Microsoft SQL-Server, we would write the following:

SELECT TOP num_of_rows


column_names
FROM table_name

Back to the Table of Contents of our SQL for Beginners Tutorial?


SQL WHERE Clause: How Do You Select Records Conditionally with SQL?
The WHERE clause is used in many different places in SQL, including the SELECT, UPDATE, and DELETE
statements. In the SELECT statement, the WHERE clause extracts only the records that meet the
conditions specified in the WHERE clause.

The WHERE clause is used in the SELECT statement in the following way:

SELECT column_list FROM table_name WHERE condition;

Let’s break this down further:

The SELECT statement is used to identify the column(s) to be selected,

The FROM statement is used to identify the table from which to extract records,

The WHERE statement is followed by either a single condition or multiple conditions.

Filter Records with WHERE Clause in SQL

Let’s say that we only wanted to select records from our orders table where the total price was higher
than 100, we could write:

SELECT *
FROM orders
WHERE total > 100;

This returns the following table:

orderid order_date userid total


1 20200201 1 134
2 20200203 1 267
4 20200401 1 341

Filter Records with Different Operators

To be able to more accurately filter data, we can use different operators, which are listed out below.
Since this is a SQL for beginners tutorial, we’ll only cover off some of them in this tutorial.
Operator Description Example

= Equal to WHERE client_id = 1

> Greater than WHERE total > 100

< Less than WHERE total < 100

>= Greater than or equal to WHERE client_id >= 3

<= Less than or equal to WHERE total <=100

LIKE Matching a pattern WHERE fname LIKE ‘Pe’

IN Matches values in a list or subquery WHERE name IN (‘Peter’, Jean’)

BETWEEN Matches a value between a range of values WHERE total BETWEEN 50 AND 100

The different operators available in SQL.

SQL AND & OR Operators

We can also apply multiple conditions to a WHERE clause. Within this, we can use the different
operators that we showed above. We can apply this with AND and OR statements to further filter
data.

Combining Conditions with AND Statements

The AND operator is used to evaluate whether two conditions are TRUE. It’s used in combination with
the WHERE statement. This follows the format below:

SELECT column_names
FROM table_name
WHERE condition1 AND condition2;

This returns only records where both condition1 and condition2 are met.

If we wanted to, for example, return all orders where the user_id is equal to 1 and the order total is
greater or equal to 200. We could do this using the following code:

SELECT *
FROM orders
WHERE userid = 1 AND total >= 200;

This returns the following table:

orderid order_date userid total


2 20200203 1 267
4 20200401 1 341

Combining Conditions with OR Statements

OR statements are used when only one condition needs to be true. This is helpful in situations where
it doesn’t matter which condition is true.

This follows the format below:

SELECT column_names
FROM table_name
WHERE condition1 OR condition2;

Let’s look at an example:

SELECT *
FROM orders
WHERE userid = 1 OR total >= 90;

This returns the following table:

orderid order_date userid total


1 20200201 1 134
2 20200203 1 267
3 20200302 2 99
4 20200401 1 341

The table above includes any record where there userid is equal to 1 or where the total is greater or
equal to 90.

Combining Conditions with both AND & OR Statements


Conditions can also be combined with both AND & OR statements to further refine our queries. In
the sample below, make note of how the brackets are used to contain the OR statement:

SELECT column_names
FROM table_name
WHERE condition1 AND (condition2 OR condition3);

Let’s try this out with another example:

SELECT *
FROM orders
WHERE total < 200 AND (userid = 1 OR userid = 3);

This returns the following table:

orderid order_date userid total


1 20200201 1 134
5 20200401 3 87

I'm learning SQL with @datagyio!


Click to Tweet

How Do You Aggregate Data with SQL?


SQL is not only useful for selecting data or maintaining databases, but also for aggregating data. SQL
has a number of helpful aggregation functions, including COUNT, SUM, AVG, MIN, and MAX.

As part of our SQL for beginners tutorial, let’s take a look at an example. We may be asked, “What is
the average value of each order?”. We can do this easily in SQL using our sample database by writing
the following code:

SELECT AVG(total) FROM orders;

This returns:
AVG(total)
185.6

How Do You Group Data with SQL?


GROUP BY is used with the SELECT statement and aggregation functions to group records by a
common value.

The code for this follows the convention below:

SELECT column1, ..., function1(column_name), ...


FROM table_name
GROUP BY column_name;

Let’s break this down a little:

The SELECT statement lists out columns and aggregate functions applied to columns.

The FROM statement identifies which table to pull data from,

The GROUP BY statement identifies which column to group by. It’s helpful to have this column in
the SELECT statement.

Let’s try this with an example. Say we wanted to know what the total value of orders and count of
orders were, by client, we could write:

SELECT userid, SUM(total), COUNT(total)


FROM orders
GROUP BY userid;

This returns the following table:

userid SUM(total) COUNT(total)


1 742 3
2 99 1
3 87 1

How Do You Change a Column Name in SQL?


To change a column name in SQL, an alias is used.

In the example above, we can see that the column names of SUM(total) and COUNT(total) are
accurate, but not easy to understand. We may want to change the column names to
“Total_Sale_Value” and “Total_Number_of_Sales”.

Similarly, we may want to capitalize userid.

In SQL, this is done with what is known as an alias. Let’s see how this is accomplished:

SELECT column1 as alias1, ... FROM table_name

The “as” is optional, but makes the code easier to read. The same would be accomplished using:

SELECT column1 alias1, ... FROM table_name

If we wanted to apply this to our query from the Aggregating Data example from earlier in our SQL
for beginners tutorial, we could write:

SELECT userid as UserID, SUM(total) as Total_Sale_Value, COUNT(total) as


Total_Number_of_Sales
FROM orders
GROUP BY userid;

This returns the table below:

UserID Total_Sale_Value Total_Number_of_Sales


1 742 3
2 99 1
3 87 1

Back to the Table of Contents of our SQL for Beginners Tutorial?

How Do You Join Tables in SQL?


So far, all the queries we’ve looked at have retrieved data from a single table. However, most
databases have many more tables and queries often require joining data from multiple tables. This is
done using the JOIN statement.

Let’s take a quick look at our database we created for this SQL tutorial for beginners:

The diagram above shows that client_id in the clients table has a one-to-many relationship with the
userid field in the orders table. Practically, this means that a single client can have multiple orders.

In terms of databases, this means that userid is a foreign key for the client_id field. Because this
relationship exists, we know that we can join these two tables.

There are a number of different types of joins. Let’s take a look at these now.
The different types of joins available in SQL

Inner Join

An Inner Join only the rows of tables that exist in both tables. Take the two tables as an example. If we
created a new client that did not yet have any orders, that new client would not show up as he or she
would not be represented within the orders table.

Let’s go through the syntax of how to write these joins:

SELECT table1name.column1, ... table2name.column1


FROM table1name
INNER JOIN table2name
ON tablename1.column1=table2name.column1;

Let’s explore this a little more:

In the SELECT statement, we include all the fields we want to bring in, from both tables. We prefix
the column names with the table name as best practice, in case there is an overlap between
column names.

If you knew that you wanted to return all records from one table, you could write table_name.*

The FROM statement is followed by an INNER JOIN statement that identifies the table the join.

The ON statement identifies which fields to merge on. This identifies the two fields in each table
that have a foreign key relationship.

Let’s demonstrate this with an example. Say we wanted to join in the first and last names of clients
onto the orders table. To demonstrate this better, let’s create a customer in the clients table, but not
any orders for that customer.

INSERT INTO clients(client_id, fname, lname, dob, gender, join_date) VALUES


(4, 'Claire', 'Voyant', 19890617, 'female', 20200407);

Now, let’s do an inner join of the two tables. If this runs correctly, we should not see our new client
returned in the table.

SELECT orders.*, clients.fname, clients.lname


FROM orders
INNER JOIN clients
ON orders.userid = clients.client_id;

This returns the following table:

orderid order_date userid total fname lname


1 20200201 1 134 Bruce Wayne
2 20200203 1 267 Bruce Wayne
3 20200302 2 99 Jean Grey
4 20200401 1 341 Bruce Wayne
5 20200401 3 87 Peter Parker

Outer Joins

There are three types of outer joins: left join, right join, and outer (or full) join.

Left Join

A left join includes all the records from the table on the “left” and only matching records from the
table on the right. If you’re familiar with VLOOKUP in Excel, you can think of a left join as being a
VLOOKUP from one table to another.

Let’s take a look how to write a left join in SQL:

SELECT table1name.column1, ... table2name.column1


FROM table1name
LEFT JOIN table2name
ON tablename1.column1=table2name.column1;

The format is quite similar to an inner join. Let’s explore this in more detail:

In the SELECT statement, we list out all the fields we want to bring in. We prefix the column names
with the table name.

If you knew that you wanted to return all records from one table, you could write table_name.*

The FROM statement is followed by a LEFT JOIN statement that identifies the table the join.

The ON statement identifies which fields to merge on.

Let’s now write a statement that merges in order data into the client table. What we would expect to
see is that any client that does not yet have any orders would still exist in the returned data, but not
have any data in the columns relating to orders.

SELECT clients.*, orders.*


FROM clients
LEFT JOIN orders
ON clients.client_id=orders.userid;

This returns the following:

client_id fname lname dob gender join_date total


1 Bruce Wayne 19880101 male 20200201 134
1 Bruce Wayne 19880101 male 20200201 267
1 Bruce Wayne 19880101 male 20200201 341
2 Jean Grey 19891223 female 20200302 99
3 Peter Parker 19801209 male 20200401 87
4 Claire Voyant 19890617 female 20200407

Note here that client_id 4 exists in the table, but does not return any values for the total column. This
is because that client hasn’t placed any orders (therefore doesn’t exist in the right table), but exists in
the left table.

Right Join

A right join includes all the records from the table on the “right” and only matching records from the
table on the left. However, note that SQLite doesn’t support a right join, but other implementations of
SQL do. As this is a SQL for beginners tutorial, we won’t cover off other languages here.

Full Join / Full Outer Join

A full outer join will return records from both tables, regardless if they exist in one and not in the
other. However, note that SQLite doesn’t support a right join, but other implementations of SQL do.

Enjoyed This Tutorial? Share it!

Conclusion: SQL for Beginners Tutorial


This brings us to the end of our SQL for beginners tutorial! Thanks so much for reading and I hope
you learned everything you need to get started!

You can download our printable PDF of this guide along with a complete database file by signing up
for our mailing list below!

OK, You’ve Learned SQL – Want to Learn Python?


We provide tons of free Python resources – check them out here! If you’re ready to dive into learning
everything you need to know to get started with Python for Data Science, check out our eBook here!
Want to learn Python for
Data Science? Check out my
ebook for as little as $10!
BUY NOW! LEARN MORE!
Tags: mysql SQL sqlite tutorial

Check Out My eBook!


datagy on YouTube!

Python Pivot Table…


Table…
y

00:0 09:2
0 7

Download Your FREE Pandas Guide!

Get your FREE download now!

email address

SUBSCRIBE

Download Your FREE Guide to Learn SQL!

You might also like