MATH6183 Workshop 1 SQL
Southampton Sciences
MATH6183
Data Mining and Analytics
Introduction to
SQL: Workshop 1
Contents
1 Introduction
1.1 Installing SQLite on Your Computer
1.2 Working with SQLite
1.3 Examples
3 Exercise
1 Introduction
In this workshop, we will be using SQL to extract information from a database. We will be working
in SQLite, but the SQL commands should work in other database applications, e.g. Access, MySQL,
PostgreSQL, Oracle, etc. There is a wide range of literature on SQL and you will need to find a book or
e-book that suits you best. The following two books take their readers from the very basics of SQL and
databases up to some relatively advanced ideas.
• SQL for Dummies, 8th Edition: Allen G. Taylor, John Wiley and Sons, New Jersey, USA, 2013.
• A Beginner’s Guide to SQL, 3rd Edition: Andy Oppel and Robert Sheldon, McGraw Hill, USA,
2009.
Both contain descriptions of all of the SQL commands that you are likely to need for this module. In
addition, they are able to place SQL in a wider context so that you can read about the other issues that
need to be taken into account when building databases that we do not have time to cover in this module,
e.g. keeping them secure.
We will be working with two different databases in the SQL workshops: the library database and the fruit
and vegetables database. Both are described below. There is an exercise at the end of this worksheet that
you should work through to check your understanding of what is covered in this tutorial.
• SQLiteStudio https://sqlitestudio.pl/
• DBeaver https://dbeaver.io/
• DB Browser for SQLite https://sqlitebrowser.org/
While we will not be working with these directly in the tutorials, you may find them useful when working
with SQLite on your own machines.
1. Open the shell or command prompt (search on the task bar if you are unsure where to find this) and
type sqlite3. You should then see some text appear to show that sqlite3 has opened.
2. Double click on sqlite3.exe in File Explorer.
We go through some of the most useful commands in SQLite below but for a more detailed description,
including a list of commands, see https://www.sqlite.org/cli.html.
Introduction to SQL 4
To create a new database, or to open an existing one, called test.db, type the following command at the
sqlite> prompt.
.open test.db
If test.db does not exist, SQLite will create a new database with that name.
To make sure that the database is created in, or opened from, a particular folder, include the full
pathname. For example, to open test.db in the folder called “work” on the C drive, type the following.
.open c:/work/test.db
It is not straightforward to check what your working directory is. If you are working within the Command
Prompt on Windows, the following command should return the path of the working directory.
.shell cd
(The command .shell will allow you to run any command normally available in the program/shell you are
using to access SQLite. In the Windows Command Prompt, cd either changes the directory or, when given
no argument, lists the current directory.)
Our aim in this part of the course is to write SQL to extract useful information from a database. Sometimes
that information might just be one answer but often the output is a data table. SQLite will, as a default,
use the LIST mode for outputs in which each row of a query result is written on one line of output and
each column within that row is separated by a specific separator string. The default separator is a pipe
symbol (“|”). List mode is especially useful when you are going to send the output of a query to another
program.
There are 14 output modes available: ascii, box, csv, column, html, insert, json, line, list, markdown,
quote, table, tabs, tcl. You are likely to be mainly using list or csv. The second of these formats the
query results as comma-separated values, ready to be written to a csv file.
• .mode on its own will tell you which output mode is being used.
• .mode csv will change the output mode to csv (and similar for other available output modes).
For example, to output the results of a query on the Supplier table to a csv file called dataout.csv stored
in the downloads folder on the C drive, you would type the following.
.headers on
.mode csv
.once c:/users/username/downloads/dataout.csv
SELECT * FROM Supplier;
Note that the SQL code in this example is the “SELECT * FROM Supplier;”. Check what the “.headers
on” command does in this example. The computers in some of the university teaching space will try to
open csv files in Minitab. It is actually easier to view them in Notepad or Excel. To do this, right click on
the file in File Explorer and click Open With, then choose the program to use to view the results.
Check that you have understood this by downloading both of the example databases from
Blackboard and saving them on your machine, then try to open them in SQLite. As you go
through the remainder of the worksheet, run the SQL queries on the Fruit and Veg database
and check the output matches that given in the worksheet, before working on the exercises.
1.3 Examples
Example: Fruit and Veg
A warehouse wishes to set up a database to record the movement of fruit and vegetables. These arrive at
the warehouse from farms and are then sent out to supermarkets or other customers. Currently, everything
is recorded manually but a database is needed to allow the end customers to trace the origin of their food.
The standard form attached to each consignment is given below and a database is provided on Blackboard
for you to download. We do not cover structuring of databases as part of the syllabus but you may want
to think about how you would split the information given here into tables within a database to avoid
repeating information.
Product Information
Description   Apples                              Picked          27/09/08
Arrived       29/09/08                            Quality         High
Origin        Cider Farm                          Checked in by   Bob
Contact       Mrs G. Smith                        Telephone       01938 340928
Address       Cider Farm, Gloucester              Postcode        GL1 2NH
Destination   Waitrose, Cirencester               Lorry           B78
Contact       Mr J. Lewis                         Telephone       01234 567890
Address       Waitrose, High Street, Cirencester  Postcode        CL1 2GH
Example: Library
You have been asked to work with a database that can keep track of the availability of books in a small
lending library and who currently has these books on loan. The library would like to be able to contact
customers who have overdue books and customers would like to be able to search for books on different
subjects. The dataset was created via simulation and a copy is available to download from Blackboard.
Table 1: A selection of the datatypes that you are most likely to use in SQL. Note that some formats may
vary from one package to another.
Look at the output from this query - it does not look tidy. This can be improved by adding a space in
between the output of the two fields. See if you can do this for yourself (Hint, you will need to use quotes
“ ”).
(Note if you get fed up with seeing a huge number of results add WHERE DispatchID = 1 before the
semi-colon and it will just return the output for the first entry in the Dispatch table.)
SQL has a standard date format yyyy-mm-dd and the DATE function will not work if dates are included
in the database in an incorrect format.
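As a quick check of the format rule, the following minimal sketch can be run in any SQLite session. It uses literal strings rather than a field from either workshop database, so the dates are purely illustrative.

```sql
-- A date written in the standard yyyy-mm-dd format is recognised by DATE().
SELECT DATE('2008-09-27');             -- 2008-09-27

-- The same date written as dd/mm/yy is not recognised, so DATE() returns NULL
-- (displayed as an empty value in list mode).
SELECT DATE('27/09/08');

-- Modifiers shift a valid date, here forward by two days.
SELECT DATE('2008-09-27', '+2 days');  -- 2008-09-29
```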
There is also a DATETIME data type which we use in the library database to record the date and time of
each transaction. To find the date of a DATETIME field you can use the DATE() function. For example,
to find the date on which a transaction took place in the library database we would use DATE(true_date).
Similarly we may want to find the time that a transaction took place and we can then use TIME(true_date)
to return just the time part of the true_date field.
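The sketch below shows the two functions splitting a date-and-time value. The value is a made-up literal chosen for illustration; in the library database it would be replaced by the date field of the transactions table.

```sql
-- DATE() keeps only the date part of a date-and-time value.
SELECT DATE('2023-10-05 14:30:00');  -- 2023-10-05

-- TIME() keeps only the time part.
SELECT TIME('2023-10-05 14:30:00');  -- 14:30:00
```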
You may also want to extract either the month or the year from a date variable. To do this, you can
use the function strftime(). See https://www.w3resource.com/sqlite/sqlite-strftime.php for full
details of how to use this function.
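As a small illustration, again using a made-up literal date rather than a field from the workshop databases, the format codes %Y and %m pick out the year and the month.

```sql
-- %Y returns the four-digit year as text.
SELECT strftime('%Y', '2008-09-27');  -- 2008

-- %m returns the two-digit month as text, including any leading zero.
SELECT strftime('%m', '2008-09-27');  -- 09

-- strftime() returns text, so cast the result when a number is needed.
SELECT CAST(strftime('%m', '2008-09-27') AS INTEGER);  -- 9
```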
You may wish to use a more complicated predicate and in this case, you will need to make use of the
logical operators AND, OR and NOT. For example, to run the date query only for the rows where DispatchID
is equal to 1 or 2, you would add a WHERE clause that combines the two conditions with OR.
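A minimal sketch of such an OR condition follows. It runs against a small made-up temporary table rather than the real Dispatch table: the name DemoDispatch, the DispatchDate column and all of the rows are assumptions for illustration.

```sql
-- Hypothetical stand-in for the Dispatch table. DispatchID and Lorry are names
-- used in the worksheet; DispatchDate and the data are assumed.
CREATE TEMP TABLE DemoDispatch (
    DispatchID   INTEGER NOT NULL,
    Lorry        VARCHAR(10),
    DispatchDate DATE
);
INSERT INTO DemoDispatch VALUES
    (1, 'L4', '2008-09-29'),
    (2, 'L2', '2008-09-30'),
    (3, 'L4', '2008-10-01');

-- OR keeps a row when either condition holds, so rows 1 and 2 are returned.
SELECT DispatchID, DATE(DispatchDate)
FROM DemoDispatch
WHERE DispatchID = 1 OR DispatchID = 2;
```

Because the table is temporary, it disappears when the SQLite session is closed and leaves the workshop databases unchanged.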
If you wish to view the products that are both high quality and grapes, then you would use
the following query.
Remember that ‘SELECT *’ means that the query will return all of the fields of the data table.
Test this out and variations of this query to be sure that you understand how it works.
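One way to experiment with the logical operators without touching the workshop database is the sketch below. The temporary table DemoProduct, its column names and its rows are all made up for illustration and are not taken from the Fruit and Veg database.

```sql
-- Hypothetical product table; every name and value here is assumed.
CREATE TEMP TABLE DemoProduct (
    ProductID   INTEGER NOT NULL,
    Description VARCHAR(20),
    Quality     VARCHAR(10)
);
INSERT INTO DemoProduct VALUES
    (1, 'Grapes', 'High'),
    (2, 'Grapes', 'Low'),
    (3, 'Apples', 'High');

-- AND keeps a row only when both conditions hold, so only row 1 is returned.
SELECT *
FROM DemoProduct
WHERE Description = 'Grapes' AND Quality = 'High';

-- NOT reverses a condition, so only the product that is not grapes (row 3) is returned.
SELECT *
FROM DemoProduct
WHERE NOT (Description = 'Grapes');
```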
For example, in the Fruit and Veg database, we may want to return the number of times that a particular
lorry is used to dispatch an order:
SELECT COUNT(DispatchID)
FROM Dispatch
WHERE Lorry = 'L4';
In combination with these aggregate functions, we may wish to return the top few entries in an ordered list.
We can do this by first ordering by the quantity of interest and then using the LIMIT clause to specify
how many entries to output. For example, the following returns the two lorries that have recorded the
most trips in the Dispatch table.
The keyword DESC is needed to ensure that we are sorting in descending order, while the GROUP BY
clause ensures that we are counting the number of trips per lorry. We will see GROUP BY again in
workshop 2.
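A sketch of how GROUP BY, ORDER BY ... DESC and LIMIT fit together is given below. It uses a made-up temporary table, DemoTrips, with invented rows; only the field names DispatchID and Lorry are taken from the worksheet, and the alias Trips is an assumption.

```sql
-- Hypothetical stand-in for the Dispatch table, holding six invented trips.
CREATE TEMP TABLE DemoTrips (
    DispatchID INTEGER NOT NULL,
    Lorry      VARCHAR(10)
);
INSERT INTO DemoTrips VALUES
    (1, 'L4'), (2, 'L2'), (3, 'L4'), (4, 'L1'), (5, 'L4'), (6, 'L2');

-- Count the trips per lorry, sort the counts from largest to smallest,
-- and keep only the first two rows.
SELECT Lorry, COUNT(DispatchID) AS Trips
FROM DemoTrips
GROUP BY Lorry
ORDER BY Trips DESC
LIMIT 2;
-- Returns L4 with 3 trips, then L2 with 2 trips.
```

Leaving out DESC would sort the counts in ascending order, so the query would instead return the two least-used lorries.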
3 Exercise
In carrying out this exercise, you will need to know the names of the tables and fields for the library
dataset. These are listed below.
• transactions: transaction_id, true_date, book_id, user_id, trans_type.
There is a way of finding out field names and details in SQLite, which may be useful when working with
databases that you do not have the full details of (or if you have forgotten the details). After opening the
database, type the following to find out details about a table called table_name.
PRAGMA table_info(table_name);
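To see what this PRAGMA returns, the sketch below applies it to a small temporary table. DemoInfo and its two fields are made up for illustration and do not appear in either workshop database.

```sql
-- Hypothetical two-field table used only to show the shape of the output.
CREATE TEMP TABLE DemoInfo (
    id    INTEGER NOT NULL,
    label VARCHAR(20)
);

-- One row is returned per field, with the columns
-- cid, name, type, notnull, dflt_value and pk.
PRAGMA table_info(DemoInfo);
-- In list mode this prints:
-- 0|id|INTEGER|1||0
-- 1|label|VARCHAR(20)|0||0
```

The notnull column is 1 for id because that field was declared NOT NULL, and pk is 0 for both fields because no primary key was declared.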
Alternatively, to view information about all of the tables and their fields, use .schema. For the library
database, this outputs the following.
CREATE TABLE books(book_ID INTEGER NOT NULL, title VARCHAR(50), author1 VARCHAR(20),
author1_intial VARCHAR(10), author2 VARCHAR(20), author2_initial VARCHAR(10),
year INTEGER, topic VARCHAR(20), num_copies INTEGER);
CREATE TABLE transactions(transaction_id INTEGER NOT NULL, true_date DATE, book_id INTEGER,
user_id INTEGER, trans_type VARCHAR(10));
CREATE TABLE users(user_id INTEGER NOT NULL, first_name VARCHAR(30),
surname VARCHAR(30), email VARCHAR(50));
Use SQL to obtain the following information from the library database:
1. Output a list of library users who have the first name Michael, including their full name and their
e-mail addresses, where their first and second names are included in the first column and their e-mail
addresses in the second column.
2. Assuming that library books are allowed on loan for 3 weeks, output a list of library books that
have been loaned out, detailing the date that they were loaned out and the date they were due to be
returned. To reduce the amount of text that you output, do this only for the first 10 transactions in
the database by placing a condition on transaction_id.
3. The number of times that Studies in Optimisation has been withdrawn from the library.
4. The most recent transaction in the library.
5. List all of the loans that user 7 has made.
6. (Extension) List all of the transactions that user 7 has made during October.
Command   Description
SELECT    An important command that is used to select a set of data that fulfils
          the criteria provided in the SQL query.
DATE      Use to convert text to a date or to manipulate a date, e.g. add or
          subtract a number of days or years.
WHERE     Allows the user to specify conditions on the data to be used in the query.
CREATE    Use to create a table. In Access, CREATE can only be used to create a
          table but in other implementations of SQL it allows you to create other
          objects.
DROP      Use to delete anything that has been created by a CREATE statement.
ALTER     Use to alter tables or columns.
DELETE    Use to delete rows.
INSERT    Adds new rows to a table.
Table 2: A selection of key SQL Commands Used in this Workshop. The final few are used in the following
optional section.
CREATE TABLE
For example, if we return to the fruit and vegetable example, we would create the Supplier table by typing
the following into SQLite.
CREATE TABLE Supplier (
Each attribute has its data type specified and any constraints that you wish to place on it, e.g. the primary
key (ContactID) has been specified as being NOT NULL.
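For readers who want to try a complete statement, here is a minimal sketch of a CREATE TABLE command of this kind, followed by INSERT, DELETE and DROP. It is not the worksheet's definition of the Supplier table: the table name DemoSupplier and every column other than ContactID are assumptions, with the sample values taken from the consignment form shown earlier.

```sql
-- Hypothetical supplier table. ContactID is the key named in the worksheet;
-- the remaining columns and their sizes are assumed for illustration.
CREATE TEMP TABLE DemoSupplier (
    ContactID INTEGER NOT NULL PRIMARY KEY,
    Name      VARCHAR(30),
    Telephone VARCHAR(15),
    Postcode  VARCHAR(8)
);

-- INSERT adds a new row to the table.
INSERT INTO DemoSupplier VALUES (1, 'Cider Farm', '01938 340928', 'GL1 2NH');
SELECT * FROM DemoSupplier;

-- DELETE removes the rows that match the condition.
DELETE FROM DemoSupplier WHERE ContactID = 1;

-- DROP removes the table itself.
DROP TABLE DemoSupplier;
```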
If you need to change the structure of a table or remove it from the database, use the ALTER TABLE
or DROP commands. For example, to add a yes/no column named Testing to the Supplier table in the
Fruit and Veg database, write the following.
ALTER TABLE Supplier
ADD COLUMN Testing BIT;
If you now want to delete the column, write the following.
ALTER TABLE Supplier
DROP COLUMN Testing;
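The two ALTER TABLE statements can be tried safely on a temporary table, as in the sketch below. DemoAlter and its Name column are made up for illustration. Note that DROP COLUMN is only available in SQLite version 3.35.0 or later, so the final statement will fail on older installations.

```sql
-- Hypothetical table to alter; it stands in for the Supplier table.
CREATE TEMP TABLE DemoAlter (
    ContactID INTEGER NOT NULL,
    Name      VARCHAR(30)
);

-- Check which version of SQLite is running.
SELECT sqlite_version();

-- Add the new column, then list the fields to confirm that Testing is present.
ALTER TABLE DemoAlter ADD COLUMN Testing BIT;
PRAGMA table_info(DemoAlter);

-- Remove the column again (requires SQLite 3.35.0 or later).
ALTER TABLE DemoAlter DROP COLUMN Testing;
```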