
Introduction to SQL

CS1703 Data and Information


Week 4 (15/10/15)
Dr Alan Serrano
Dept. of Computer Science, Brunel University London

Data
Examples: degrees, modules, students, grades, friends, your postings, £££ accounts, transactions
How do they store data?
Relational Database
Tables

Relational data structure
Relational
Database

Structured Query Language (SQL)
Main language for relational DBMSs.
Main characteristics:
Relatively easy to learn
Declarative: you specify what information you require, rather than how to get it (vs. imperative/procedural languages like Java/Python)
Essentially free-format
Consists of standard English words like SELECT, INSERT, and UPDATE
Can be used by a range of users
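
A minimal sketch of the declarative vs. imperative contrast, assuming Python's built-in sqlite3 module stands in for the DBMS. The three Staff rows and the column types are made up for illustration and are not from the lecture.

```python
import sqlite3

# Hypothetical sample data: (staffNo, name, salary). Not from the lecture.
staff = [("S1", "Ann", 52000), ("S2", "Bob", 38000), ("S3", "Cy", 45000)]

# Imperative style: spell out HOW to find the rows, step by step.
imperative = []
for staff_no, name, salary in staff:
    if salary > 40000:
        imperative.append(name)

# Declarative style: state WHAT is wanted and let the DBMS decide how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO Staff VALUES (?, ?, ?)", staff)
declarative = [row[0] for row in conn.execute(
    "SELECT name FROM Staff WHERE salary > 40000 ORDER BY staffNo")]

print(imperative, declarative)
```

Both approaches select the same staff; the SQL version simply never says how to loop over the rows.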

Objectives of SQL
Ideally, a database language should let the user:
- create database and table structures
- perform basic tasks like adding (insert), amending (update), and removing (delete) data
- perform both simple and complex queries
It must perform these tasks with minimal user effort.
It must be portable: it must conform to a recognised standard (e.g. an International Standard from ISO)

SQL has two main components:

- Data Definition Language (DDL) for defining DB structure and controlling access to data
- Data Manipulation Language (DML) for retrieving and updating data

In this lecture we will be focusing on DML

Writing SQL Commands
An SQL statement consists of reserved words and user-defined words
Reserved words (e.g. SELECT, INSERT) are a fixed part of SQL; they must be spelled exactly as required and cannot be split across lines
User-defined words are made up by the user and represent the names of database objects such as tables, columns, views

Literals
Literals are constants used in SQL statements.
All non-numeric literals must be enclosed in single quotes (e.g. 'New York')
Numeric literals are not enclosed in quotes (e.g. 650.00)
Example:
INSERT INTO BOOK (bTitle, bPrice)
VALUES ('Hocus Pocus', 5.99);
bTitle is a string field, so 'Hocus Pocus' is a string literal; bPrice is a numeric field, so 5.99 is a numeric literal.

Stayhome Online Rentals example
Example of a DB for a DVD rental company
Convention for representing a relational DB is to give each table name followed by its column names in brackets
Underlined column names are primary keys:
Table (Relation)    Columns (Fields, Attributes)
DistributionCenter  (dCenterNo, dStreet, dCity, dState, dZipCode, mgrStaffNo)
Staff               (staffNo, name, position, salary, eMail, dCenterNo)
DVD                 (catalogNo, title, genre, rating)
Actor               (actorNo, actorName)
DVDActor            (actorNo, catalogNo, character)
:

Syntax for SELECT statement
SELECT [DISTINCT | ALL]
{* | [columnExprn [AS newName]] [,...] }
FROM TableName [alias] [, ...]
[WHERE condition]
[GROUP BY columnList] [HAVING condition]
[ORDER BY columnList]

Notation:
Vertical bar | indicates a choice among alternatives
Curly brackets {...} indicate a required element
Square brackets [...] indicate an optional element or optional repetition [, ...]

columnExprn represents a column name or an expression; columnList represents one or more column names
newName represents a new name to serve as a heading in the result table
TableName represents the name of an existing table
alias is an optional abbreviation for TableName
condition is some logical expression (e.g. salary > 40000)

Description of clauses
SELECT      Specifies which columns are to appear in output.
FROM        Specifies table(s) to be used.
WHERE       Filters rows subject to some condition.
GROUP BY    Forms groups of rows with same column value.
HAVING      Filters groups subject to some condition.
ORDER BY    Specifies the order of the output.
Order of the clauses cannot be changed.
Only SELECT and FROM are mandatory.

A very simple query
Using just SELECT and FROM
Query: “List the full details of all DVDs held”
SELECT catalogNo, title, genre, rating
FROM DVD;
The SELECT list names the columns to retrieve; the statement is terminated with a semi-colon.

Or just (as we are specifying all columns)

SELECT *
FROM DVD;
The asterisk indicates all columns.

Retrieve specific columns, all rows
Query: “List the catalog number, title, and genre of all DVDs held”

SELECT catalogNo, title, genre
FROM DVD;

DISTINCT keyword
SELECT does not eliminate duplicate values by default
To eliminate them, replace SELECT with SELECT DISTINCT

SELECT genre
FROM DVD;

vs.

SELECT DISTINCT genre
FROM DVD;
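
A short sketch of the difference, assuming Python's built-in sqlite3 module as the DBMS. The three DVD rows and the column types are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DVD (catalogNo TEXT PRIMARY KEY, title TEXT, genre TEXT, rating TEXT)")
# Hypothetical rows: two DVDs share the genre 'Action'.
conn.executemany("INSERT INTO DVD VALUES (?, ?, ?, ?)", [
    ("100001", "Film A", "Action", "PG"),
    ("100002", "Film B", "Sci-Fi", "PG-13"),
    ("100003", "Film C", "Action", "R"),
])

# Plain SELECT keeps one output row per table row, duplicates included.
all_genres = [r[0] for r in conn.execute("SELECT genre FROM DVD ORDER BY genre")]

# SELECT DISTINCT collapses repeated values into one row each.
distinct_genres = [r[0] for r in conn.execute(
    "SELECT DISTINCT genre FROM DVD ORDER BY genre")]

print(all_genres)
print(distinct_genres)
```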

Calculated fields
Allow you to transform existing fields according to some arithmetic operation
For example in the Staff table, the salary column holds annual salary values
To compute monthly salary we just divide salary by 12:

SELECT staffNo, name, position, salary / 12
FROM Staff;

Calculated fields: AS clause
In the result of the last query, the calculated column has no meaningful heading
To give the column a meaningful name (here monthlySalary), we use the AS clause:

SELECT staffNo, name, position, salary / 12 AS monthlySalary
FROM Staff;

WHERE clause
Follows the FROM clause
Only retrieve rows that satisfy some condition
Five basic search conditions (or predicates):
- Comparison: compare the value of one expression to the value of another
- Range: test whether a value falls within a specified range
- Set membership: test whether the value of an expression equals one of a set of values
- Pattern match: test whether a string matches a specified pattern
- Null: test whether a column has an unknown (null) value

Comparison search
Query: “List all staff with a salary greater than $40,000”

SELECT staffNo, name, position, salary
FROM Staff
WHERE salary > 40000;

Common operators
=   equals
<>  is not equal to
>   is greater than
<   is less than
<=  is less than or equal to
>=  is greater than or equal to

Logical operators and brackets can be used to express more complex conditions
AND – both conditions must be satisfied
OR – one or both conditions must be satisfied
NOT – the condition must be false

Order of evaluation:
Subexpressions (in brackets) are evaluated first
NOTs are evaluated before ANDs and ORs
ANDs are evaluated before ORs
Otherwise, an expression is evaluated from left to right
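
The effect of these precedence rules can be sketched as follows, assuming Python's built-in sqlite3 module as the DBMS. The four Staff rows, the 'Assistant' position and the helper staff_numbers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, position TEXT, salary INTEGER)")
# Hypothetical rows for illustration only.
conn.executemany("INSERT INTO Staff VALUES (?, ?, ?)", [
    ("S1", "Manager", 35000),
    ("S2", "Assistant", 45000),
    ("S3", "Assistant", 30000),
    ("S4", "Manager", 50000),
])

def staff_numbers(where_clause):
    """Return the staffNo values matching the given WHERE condition."""
    sql = "SELECT staffNo FROM Staff WHERE " + where_clause + " ORDER BY staffNo"
    return [row[0] for row in conn.execute(sql)]

# AND binds tighter than OR: every manager, plus assistants earning over 40000.
without_brackets = staff_numbers(
    "position = 'Manager' OR position = 'Assistant' AND salary > 40000")

# Brackets force the OR first: managers or assistants, all earning over 40000.
with_brackets = staff_numbers(
    "(position = 'Manager' OR position = 'Assistant') AND salary > 40000")

print(without_brackets)
print(with_brackets)
```

The two conditions differ only in their brackets, yet the first also returns the manager on 35000.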

Range search condition
Query: “List all staff with a salary between $45,000 and $50,000”

SELECT staffNo, name, position, salary
FROM Staff
WHERE salary >= 45000 AND salary <= 50000;

Q: How would you express the condition that salaries should be either below $25,000 or above $50,000?
WHERE salary < 25000 OR salary > 50000

Alternative way of specifying a range?
Check text book (p.55) for use of BETWEEN test
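
As a rough sketch of the BETWEEN test, assuming Python's built-in sqlite3 module as the DBMS and five made-up salaries: BETWEEN includes both end points, so it returns the same rows as the >= ... AND ... <= form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, salary INTEGER)")
# Hypothetical salaries, chosen to sit on and around the range boundaries.
conn.executemany("INSERT INTO Staff VALUES (?, ?)", [
    ("S1", 44000), ("S2", 45000), ("S3", 47500), ("S4", 50000), ("S5", 51000),
])

def staff_numbers(where_clause):
    """Return the staffNo values matching the given WHERE condition."""
    sql = "SELECT staffNo FROM Staff WHERE " + where_clause + " ORDER BY staffNo"
    return [row[0] for row in conn.execute(sql)]

with_and = staff_numbers("salary >= 45000 AND salary <= 50000")
with_between = staff_numbers("salary BETWEEN 45000 AND 50000")

# NOT BETWEEN selects everything outside the range.
outside = staff_numbers("salary NOT BETWEEN 45000 AND 50000")

print(with_and, with_between, outside)
```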

Set membership condition
Query: “List all DVDs in the Sci-Fi or Children genre”

SELECT catalogNo, title, genre
FROM DVD
WHERE genre = 'Sci-Fi' OR genre = 'Children';
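
Set membership is more commonly written with the IN keyword, which the slide does not show. A minimal sketch, assuming Python's built-in sqlite3 module as the DBMS and four invented DVD rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DVD (catalogNo TEXT PRIMARY KEY, title TEXT, genre TEXT)")
# Hypothetical rows for illustration only.
conn.executemany("INSERT INTO DVD VALUES (?, ?, ?)", [
    ("100", "Film A", "Sci-Fi"),
    ("200", "Film B", "Children"),
    ("300", "Film C", "Action"),
    ("400", "Film D", "Sci-Fi"),
])

# The form used on the slide: one comparison per value, joined with OR.
with_or = [r[0] for r in conn.execute(
    "SELECT catalogNo FROM DVD "
    "WHERE genre = 'Sci-Fi' OR genre = 'Children' ORDER BY catalogNo")]

# The equivalent set-membership test using IN.
with_in = [r[0] for r in conn.execute(
    "SELECT catalogNo FROM DVD "
    "WHERE genre IN ('Sci-Fi', 'Children') ORDER BY catalogNo")]

print(with_or, with_in)
```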

Pattern matching condition
- Sometimes you need to find records that contain some approximate string value
  - E.g. all names beginning with “B”

- SQL has two pattern-matching symbols:
  %  any sequence of zero or more characters
  _  any single character

- Use the LIKE keyword:
  - LIKE 'S%'      any string starting with the letter “S”
  - LIKE 'S____'   any five-letter string starting with “S”
  - LIKE '%S'      any string ending with the letter “S”

- On the other hand, if you want to filter a pattern out:
  - NOT LIKE 'S%'
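
The patterns above can be tried out with a small sketch, assuming Python's built-in sqlite3 module as the DBMS. The five names and the helper names_where are hypothetical. Note that SQLite's LIKE ignores case for ASCII letters by default; other DBMSs may be case-sensitive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, name TEXT)")
# Hypothetical names for illustration only.
conn.executemany("INSERT INTO Staff VALUES (?, ?)", [
    ("S1", "Sally Adams"), ("S2", "Sam"), ("S3", "Simon"),
    ("S4", "Boris"), ("S5", "Tess"),
])

def names_where(condition):
    """Return the names matching the given WHERE condition, sorted by name."""
    sql = "SELECT name FROM Staff WHERE " + condition + " ORDER BY name"
    return [row[0] for row in conn.execute(sql)]

starts_with_s = names_where("name LIKE 'S%'")
five_letters = names_where("name LIKE 'S____'")  # S plus exactly four characters
ends_with_s = names_where("name LIKE '%s'")
not_s = names_where("name NOT LIKE 'S%'")

print(starts_with_s, five_letters, ends_with_s, not_s)
```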

Pattern matching example
Query: “Find all staff whose first name is Sally”

SELECT staffNo, name, position, salary
FROM Staff
WHERE name LIKE 'Sally%';

NULL search condition
Sometimes rows will have a null value (rather like missing values in SPSS)
You cannot just specify an empty string or a zero in the WHERE clause
Use IS NULL keyword:
Query: “List rentals that have no return date specified”

SELECT deliveryNo, DVDNo
FROM DVDRental
WHERE dateReturn IS NULL;
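
A minimal sketch of why IS NULL is needed, assuming Python's built-in sqlite3 module as the DBMS. The DVDRental column types and the three rows are assumptions made for illustration; Python's None is stored as a SQL null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DVDRental (deliveryNo TEXT, DVDNo TEXT, dateReturn TEXT)")
# Hypothetical rows: one returned, one with a null date, one with an empty string.
conn.executemany("INSERT INTO DVDRental VALUES (?, ?, ?)", [
    ("R1", "D1", "2015-10-01"),
    ("R2", "D2", None),
    ("R3", "D3", ""),
])

def deliveries(where_clause):
    """Return the deliveryNo values matching the given WHERE condition."""
    sql = "SELECT deliveryNo FROM DVDRental WHERE " + where_clause + " ORDER BY deliveryNo"
    return [row[0] for row in conn.execute(sql)]

is_null = deliveries("dateReturn IS NULL")
equals_null = deliveries("dateReturn = NULL")   # comparing with null is never true
empty_string = deliveries("dateReturn = ''")    # an empty string is not a null
is_not_null = deliveries("dateReturn IS NOT NULL")

print(is_null, equals_null, empty_string, is_not_null)
```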

Sorting the result table
Usually, the rows of a result table are presented in no particular order (or at most by primary key)
ORDER BY clause allows rows to be sorted by one or more specified columns
Rows are sorted by the columns in the order in which they are listed
Each column name must be separated by a comma
Use ASC or DESC keyword to specify ascending or descending order

Sorting example
Query: “List all DVDs sorted in descending order of genre”

SELECT *
FROM DVD
ORDER BY genre DESC;

Sorting by two columns
We can add a minor ordering clause to sort the same
genres on catalogNo:

ORDER BY genre DESC, catalogNo ASC;
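
A quick sketch of major and minor ordering, assuming Python's built-in sqlite3 module as the DBMS. The four DVD rows are invented and inserted in a deliberately jumbled order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DVD (catalogNo TEXT PRIMARY KEY, genre TEXT)")
# Hypothetical rows for illustration only.
conn.executemany("INSERT INTO DVD VALUES (?, ?)", [
    ("300", "Action"), ("100", "Sci-Fi"), ("200", "Action"), ("050", "Sci-Fi"),
])

# genre is the major sort key (descending); catalogNo breaks ties (ascending).
ordered = conn.execute(
    "SELECT genre, catalogNo FROM DVD ORDER BY genre DESC, catalogNo ASC").fetchall()

for row in ordered:
    print(row)
```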

Using the SQL Aggregate Functions
ISO SQL standard defines five aggregate functions:
COUNT - Returns number of values in specified column.
SUM - Returns sum of values in specified column.
AVG - Returns average of values in specified column.
MIN - Returns lowest value in specified column.
MAX - Returns highest value in specified column.

Each operates on a single column of a table and returns a single value.
COUNT, MIN, and MAX apply to numeric and non-numeric fields, but SUM and AVG apply only to numeric fields.
Apart from COUNT(*), each function eliminates nulls first and operates only on the remaining non-null values.

COUNT(*) counts all rows of a table, regardless of whether nulls or duplicate values occur.
Can use DISTINCT before column name to eliminate duplicates.
DISTINCT has no effect with MIN/MAX, but may have with SUM/AVG.
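
How nulls and DISTINCT affect the aggregate functions can be sketched as follows, assuming Python's built-in sqlite3 module as the DBMS. The four Staff rows are made up: two share a salary and one has a null salary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, salary INTEGER)")
# Hypothetical rows: a duplicate salary and a null (None) salary.
conn.executemany("INSERT INTO Staff VALUES (?, ?)", [
    ("S1", 30000), ("S2", 30000), ("S3", None), ("S4", 60000),
])

(count_rows, count_salary, count_distinct,
 sum_all, sum_distinct, avg_salary, min_salary, max_salary) = conn.execute("""
    SELECT COUNT(*),                 -- all rows, nulls included
           COUNT(salary),            -- non-null salaries only
           COUNT(DISTINCT salary),   -- non-null salaries, duplicates removed
           SUM(salary),
           SUM(DISTINCT salary),     -- DISTINCT changes the total here
           AVG(salary),              -- the null row is left out of the average
           MIN(salary),
           MAX(salary)
    FROM Staff
""").fetchone()

print(count_rows, count_salary, count_distinct)
print(sum_all, sum_distinct, avg_salary, min_salary, max_salary)
```

The average is taken over the three non-null salaries, not over all four rows.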

Use of COUNT and SUM
Query: “List total number of staff with salary greater than $40,000 and the sum of their salaries.”
SELECT COUNT(staffNo) AS totalStaff, SUM(salary) AS totalSalary
FROM Staff
WHERE salary > 40000;

Use of MIN, MAX, AVG
Query: “List the minimum, maximum, and average staff salary.”
SELECT MIN(salary) AS minSalary,
MAX(salary) AS maxSalary,
AVG(salary) AS avgSalary
FROM Staff;

Grouping Results
Use GROUP BY clause to get sub-totals
SELECT and GROUP BY are closely integrated: each item in SELECT list must be single-valued per group, and SELECT clause may only contain:
- column names
- aggregate functions
- constants
- expressions combining the above

All column names in SELECT list must appear in GROUP BY clause unless used only in an aggregate function.
If used, WHERE is applied first, then groups are formed from remaining rows satisfying predicate.
ISO considers two nulls to be equal for purposes of GROUP BY.

Use of GROUP BY
Query: “Find number of staff working in each distribution center and the sum of their salaries.”

SELECT dCenterNo, COUNT(staffNo) AS totalStaff, SUM(salary) AS totalSalary
FROM Staff
GROUP BY dCenterNo
ORDER BY dCenterNo;

Restricting returned groups
HAVING clause designed for use with GROUP BY to restrict groups that appear in final result table
Similar to WHERE, but WHERE filters individual rows whereas HAVING filters groups
Column names in HAVING clause must also appear in the GROUP BY list or be contained within an aggregate function

Use of HAVING
Query: “For each distribution center with more than 1 member of staff, find number of staff in each center and sum of their salaries.”
SELECT dCenterNo, COUNT(staffNo) AS totalStaff, SUM(salary) AS totalSalary
FROM Staff
GROUP BY dCenterNo
HAVING COUNT(staffNo) > 1
ORDER BY dCenterNo;
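
A sketch of GROUP BY with and without HAVING, assuming Python's built-in sqlite3 module as the DBMS. The six Staff rows and the center numbers D001 to D003 are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, salary INTEGER, dCenterNo TEXT)")
# Hypothetical rows: D001 has two staff, D002 one, D003 three.
conn.executemany("INSERT INTO Staff VALUES (?, ?, ?)", [
    ("S1", 40000, "D001"), ("S2", 50000, "D001"),
    ("S3", 30000, "D002"),
    ("S4", 20000, "D003"), ("S5", 25000, "D003"), ("S6", 30000, "D003"),
])

# One output row per distribution center.
per_center = conn.execute("""
    SELECT dCenterNo, COUNT(staffNo) AS totalStaff, SUM(salary) AS totalSalary
    FROM Staff
    GROUP BY dCenterNo
    ORDER BY dCenterNo
""").fetchall()

# HAVING then discards whole groups, here centers with a single member of staff.
busy_centers = conn.execute("""
    SELECT dCenterNo, COUNT(staffNo) AS totalStaff, SUM(salary) AS totalSalary
    FROM Staff
    GROUP BY dCenterNo
    HAVING COUNT(staffNo) > 1
    ORDER BY dCenterNo
""").fetchall()

print(per_center)
print(busy_centers)
```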

Multi-Table Queries
If result columns come from more than one table, you must use a join.
To perform a join, include more than one table in the FROM clause.
Use a comma as the separator, typically with a WHERE clause to specify the join column(s).
It is also possible to use an alias for a table named in the FROM clause.
The alias is separated from the table name with a space.
An alias can be used to qualify column names when there is ambiguity.

Simple Join
Query: “List all actors and the characters they have played in DVDs.”

SELECT a.actorNo, actorName, character
FROM Actor a, DVDActor da
WHERE a.actorNo = da.actorNo;
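
A minimal sketch of such a join, assuming Python's built-in sqlite3 module as the DBMS and the Actor and DVDActor tables from the Stayhome schema. The rows are invented, and the alias da is an arbitrary choice. The column name character is quoted because CHARACTER is a reserved word in standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Actor (actorNo TEXT PRIMARY KEY, actorName TEXT)")
conn.execute("""
    CREATE TABLE DVDActor (
        actorNo     TEXT,
        catalogNo   TEXT,
        "character" TEXT,
        PRIMARY KEY (actorNo, catalogNo)
    )""")
# Hypothetical rows: actor A3 has no roles, so A3 cannot appear in the join.
conn.executemany("INSERT INTO Actor VALUES (?, ?)", [
    ("A1", "Actor One"), ("A2", "Actor Two"), ("A3", "Actor Three"),
])
conn.executemany("INSERT INTO DVDActor VALUES (?, ?, ?)", [
    ("A1", "100", "Hero"), ("A2", "100", "Villain"), ("A1", "200", "Detective"),
])

# Comma-separated tables, with the join column given in WHERE.
comma_join = conn.execute("""
    SELECT a.actorNo, actorName, "character"
    FROM Actor a, DVDActor da
    WHERE a.actorNo = da.actorNo
    ORDER BY a.actorNo, da.catalogNo
""").fetchall()

# The same join written with the JOIN ... ON syntax.
explicit_join = conn.execute("""
    SELECT a.actorNo, actorName, "character"
    FROM Actor a JOIN DVDActor da ON a.actorNo = da.actorNo
    ORDER BY a.actorNo, da.catalogNo
""").fetchall()

print(comma_join)
```

actorNo must be qualified with an alias because it exists in both tables; actorName and character are unambiguous and need no qualifier.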

Statements for modifying data
SQL is not just for retrieving data; there are commands for modifying the data as well
INSERT – adds new rows of data to an existing table
UPDATE – modifies existing data in a table
DELETE – removes rows of data from a table

Syntax for INSERT statement
To add new row(s) to table:

INSERT INTO TableName [ (columnList) ]
VALUES (dataValueList)

columnList is optional; if omitted, SQL assumes a list of all columns in their original CREATE TABLE order
Any columns omitted must have been declared to allow NULL or have had a DEFAULT value specified when the table was created
dataValueList must match columnList as follows:
- number of items in each list must be same;
- must be direct correspondence in position of items in two lists;
- data type of each item in dataValueList must be compatible with data type of corresponding column
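
These rules can be illustrated with a small sketch, assuming Python's built-in sqlite3 module as the DBMS. The column types, the NOT NULL constraint, the default rating 'NR' and the rows 'Film A' and 'Film B' are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# genre allows NULL; rating has a made-up DEFAULT so it can be omitted.
conn.execute("""
    CREATE TABLE DVD (
        catalogNo TEXT PRIMARY KEY,
        title     TEXT NOT NULL,
        genre     TEXT,
        rating    TEXT DEFAULT 'NR'
    )""")

# No columnList: one value per column, in CREATE TABLE order.
conn.execute("INSERT INTO DVD VALUES ('207132', 'Casino Royale', 'Action', 'PG-13')")

# With a columnList: the omitted columns take NULL or their DEFAULT.
conn.execute("INSERT INTO DVD (catalogNo, title) VALUES ('100001', 'Film A')")

rows = conn.execute(
    "SELECT catalogNo, title, genre, rating FROM DVD ORDER BY catalogNo").fetchall()

# Three values for a four-column table, with no columnList, is rejected.
try:
    conn.execute("INSERT INTO DVD VALUES ('100002', 'Film B', 'Action')")
    mismatch_rejected = False
except sqlite3.Error:
    mismatch_rejected = True

print(rows, mismatch_rejected)
```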

Use of INSERT
Insert a row into the DVD table:

DVD (catalogNo, title, genre, rating)

INSERT INTO DVD
VALUES ('207132', 'Casino Royale', 'Action', 'PG-13');

UPDATE existing record in table
The syntax of the UPDATE statement is:
UPDATE TableName
SET columnName1 = dataValue1
[, columnName2 = dataValue2...]
[WHERE searchCondition]
SET clause specifies names of one or more columns that are to be updated.

UPDATE existing data in table
WHERE clause is optional:
- if omitted, named columns are updated for all rows in table;
- if specified, only those rows that satisfy searchCondition are updated.
New dataValue(s) must be compatible with data type for corresponding column.

Use of UPDATE
Robert Chin (S3250) has been promoted to manager:
UPDATE Staff
SET position = 'Manager', supervisorStaffNo = NULL
WHERE staffNo = 'S3250';

Use of UPDATE
Update the salary for all Managers with a 2% bonus:
UPDATE Staff
SET salary = salary * 1.02
WHERE position = 'Manager';
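
A short sketch of this update, assuming Python's built-in sqlite3 module as the DBMS and three made-up Staff rows. String comparison in WHERE is exact, so a stray space inside the quoted literal matches no rows and the statement silently updates nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, position TEXT, salary REAL)")
# Hypothetical rows for illustration only.
conn.executemany("INSERT INTO Staff VALUES (?, ?, ?)", [
    ("S1", "Manager", 50000), ("S2", "Assistant", 30000), ("S3", "Manager", 60000),
])

# A leading space in the literal: no position equals ' Manager'.
cursor = conn.execute(
    "UPDATE Staff SET salary = salary * 1.02 WHERE position = ' Manager'")
rows_changed_with_space = cursor.rowcount

# The intended condition: only the two managers are updated.
cursor = conn.execute(
    "UPDATE Staff SET salary = salary * 1.02 WHERE position = 'Manager'")
rows_changed = cursor.rowcount

salaries = dict(conn.execute("SELECT staffNo, salary FROM Staff"))
print(rows_changed_with_space, rows_changed, salaries)
```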

DELETE rows of data from a table
DELETE FROM TableName
[WHERE searchCondition]
searchCondition is optional; if omitted, all rows are deleted from the table. This does not delete the table itself. If searchCondition is specified, only those rows that satisfy the condition are deleted.

Query 3.22 Delete rows in a table
Delete rental DVDs for catalog number 634817:

DELETE FROM DVDcopy
WHERE catalogNo = '634817';
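
A minimal sketch of DELETE with and without a WHERE clause, assuming Python's built-in sqlite3 module as the DBMS. The DVDNo column and the three rows are assumptions; only the table name DVDcopy and the catalogNo column come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DVDcopy (DVDNo TEXT PRIMARY KEY, catalogNo TEXT)")
# Hypothetical rows: two copies of catalog number 634817, one of another title.
conn.executemany("INSERT INTO DVDcopy VALUES (?, ?)", [
    ("D1", "634817"), ("D2", "634817"), ("D3", "207132"),
])

def row_count():
    """Return the number of rows currently in DVDcopy."""
    return conn.execute("SELECT COUNT(*) FROM DVDcopy").fetchone()[0]

# With WHERE: only the matching rows are removed.
conn.execute("DELETE FROM DVDcopy WHERE catalogNo = '634817'")
after_filtered_delete = row_count()

# Without WHERE: every row is removed, but the table itself remains,
# so it can still be queried.
conn.execute("DELETE FROM DVDcopy")
after_full_delete = row_count()

print(after_filtered_delete, after_full_delete)
```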

Review this lecture
Connolly, T., Begg, C. & Holowczak, R. (2008) Business Database Systems
- Chapter 3

SQL tutorial already available
Download SQL Lab Interface from Resources to practice commands
More advanced tutorial (covering joins etc.) available next week
