Week 1 686 F2022


MIS 686: Enterprise Database Management
Fall 2022 | Section: 1 | Schedule: 22637
Tuesday, 4:00 – 6:40 PM | GMCS 305
DM Goldberg

1
Today
• Course admin/policy
• Syllabus information
• Canvas
• Course materials

• Introduction to data management

2
Masks
• SDSU policy is that masks are required in class until
September 15th

• Please keep a mask on and over both your nose
and mouth while in the classroom. This is to
protect everyone during the pandemic

• I will carry a box of masks. If you ever need one,
you can have one for free – no questions asked

3
Instructor
• David Goldberg

• Email: dgoldberg@sdsu.edu

• Office phone: (619)-594-0341

4
Office hours over Zoom
• My office is small… Office hours are over Zoom for now

• Always use https://dgoldberg.sdsu.edu/zoom

• Tuesday, 12:00 – 1:30 PM (open); Thursday, 12:00 –
1:30 PM (by appointment); or by request

5
This course
• Database management course

• No formal database/programming experience
required

• Why databases?
• One of the most important industry skills for
business/data analytics
• Versatile and useful for many applications

6
Plan for this semester
• Part 1: database design
• Before any programming, setting up a database
efficiently to avoid problems in the future

• Part 2: database programming
• Using Structured Query Language (SQL) to create and
interact with databases

• Part 3: enterprise (web) database applications
• Building dynamic websites with PHP that interact with
the databases we created
7
Grading
Grading weights
Homeworks  50%
Quizzes    50%

Final grade cutoffs
A   93%
A-  90%
B+  87%
B   83%
B-  80%
C+  77%
C   73%
C-  70%
D+  67%
D   63%
D-  60%
F   0%

8
Homework
• 6 homeworks assigned throughout the semester
• At least one week to complete each homework
• Submit assignment through the course website (Canvas)
• Each homework is due before midnight on the deadline
(11:59 PM)

• Late work: 1.5% penalty for each hour late

9
Quizzes
• 6 quizzes given in-class throughout the semester

• Quiz dates are announced beforehand (see the
syllabus or course website; I’ll also remind you the
week before)

• Make-ups only for legitimate and documented
absences. Please notify me beforehand if you know
you’ll miss a quiz

10
Honor code
• Everything in this class is an individual assignment
unless explicitly stated otherwise

• It’s OK to talk to one another about different
approaches, but any work you submit should be
your own

• If you have any questions about the honor code,
please ask!

11
Course materials
• You don’t need a textbook!

• If you think a textbook would be helpful to you,
there are some good free textbooks online

12
Installation – important for later
• Not yet, but later this semester, we will need a
database development environment to create and
work with our databases

• This semester, we will use XAMPP. This is free
industry-standard software and is compatible with
Windows, Mac, and Linux

13
Canvas
• Our course website this semester is on Canvas

• Our course website:
https://sdsu.instructure.com/courses/106798

14
History of data management
• Computerized file-based systems
• Composed of large “flat” files – think large Excel
spreadsheets
• Each business unit maintains its own files
• Dedicated staff (“data processing specialist”) in each
business unit to manage data

15
History of data management
• Example library “flat” file:
ISBN           Title          AuID  AuName       PubID  PubName      PubPhone      Price
1-1111-1111-1  C++            4     Roman        1      Big House    123-456-7890  $29.95
0-99-999999-9  Emma           1     Austen       1      Big House    123-456-7890  $20.00
0-91-045678-5  Hamlet         5     Shakespeare  2      Alpha Press  999-999-9999  $20.00
0-11-345678-9  Moby Dick      2     Melville     3      Small House  714-000-0000  $49.00
0-91-335678-7  Fairie Queene  3     Spencer      1      Big House    123-456-7890  $15.00

16
Some terminology
• Table or “relation”: matrix of intersecting rows and
columns, or the entire flat file

• Entity: the category of object or concept about
which data is collected and stored
• In our library example, we’re really thinking about three
entities: books, authors, and publishers
• An entity refers to the category, not a specific
object. For example, we mean books in general,
not Moby Dick specifically

17
Some terminology
• Field or “attribute”: a characteristic of an entity
(column in the table)
• For example, ISBN, Title, etc.

• Record or “tuple”: the data about one specific
instance of an entity (row in the table)
• For example, the row of data on Moby Dick, the row of
data on Hamlet, etc.
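
To make the terminology concrete, here is a minimal Python sketch of the library flat file. It is only an illustration: the names FIELDS, ROWS, table, titles, and moby_dick are hypothetical, and prices are stored as plain numbers for simplicity.

```python
# Hypothetical in-memory version of the library flat file from the slides.
FIELDS = ["ISBN", "Title", "AuID", "AuName", "PubID", "PubName", "PubPhone", "Price"]

ROWS = [
    ("1-1111-1111-1", "C++", 4, "Roman", 1, "Big House", "123-456-7890", 29.95),
    ("0-99-999999-9", "Emma", 1, "Austen", 1, "Big House", "123-456-7890", 20.00),
    ("0-91-045678-5", "Hamlet", 5, "Shakespeare", 2, "Alpha Press", "999-999-9999", 20.00),
    ("0-11-345678-9", "Moby Dick", 2, "Melville", 3, "Small House", "714-000-0000", 49.00),
    ("0-91-335678-7", "Fairie Queene", 3, "Spencer", 1, "Big House", "123-456-7890", 15.00),
]

# Table ("relation"): the whole collection of rows and columns.
table = [dict(zip(FIELDS, row)) for row in ROWS]

# Field ("attribute"): one column, e.g. the Title of every book.
titles = [record["Title"] for record in table]

# Record ("tuple"): one row, e.g. the data on Moby Dick.
moby_dick = next(record for record in table if record["Title"] == "Moby Dick")

print(titles)
print(moby_dick)
```

Note that the single table mixes three entities (books, authors, and publishers), which is what causes the problems discussed next.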

18
Problems with file-based systems
• Data isolation: sharing of information between
departments is challenging

• Security: each file must be secured separately
• It’s usually easier to ensure the security of one system
than many different systems

• Data anomalies: due to the structure of data, we
may see more redundancy or mistakes in the data
• (More on the next few slides)
23
Data anomalies
• Update anomaly: due to duplication in the data
structure, a single update may need to be made
multiple times throughout the flat file
• If we forget to make one update, then we don’t know
the correct values going forward

• For example: in our library flat file, we have the
same publisher’s information multiple times

24
Update anomaly
• What if we need to update Big House’s phone number?
ISBN           Title          AuID  AuName       PubID  PubName      PubPhone      Price
1-1111-1111-1  C++            4     Roman        1      Big House    123-456-7890  $29.95
0-99-999999-9  Emma           1     Austen       1      Big House    123-456-7890  $20.00
0-91-045678-5  Hamlet         5     Shakespeare  2      Alpha Press  999-999-9999  $20.00
0-11-345678-9  Moby Dick      2     Melville     3      Small House  714-000-0000  $49.00
0-91-335678-7  Fairie Queene  3     Spencer      1      Big House    123-456-7890  $15.00
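
A rough Python sketch of how this can go wrong. Only the columns needed for the point are kept, and the new phone number is made up for illustration. The loop deliberately stops after the first match to imitate someone forgetting the other copies.

```python
# Trimmed-down flat file: only the columns that matter for this example.
flat_file = [
    {"Title": "C++", "PubName": "Big House", "PubPhone": "123-456-7890"},
    {"Title": "Emma", "PubName": "Big House", "PubPhone": "123-456-7890"},
    {"Title": "Hamlet", "PubName": "Alpha Press", "PubPhone": "999-999-9999"},
    {"Title": "Moby Dick", "PubName": "Small House", "PubPhone": "714-000-0000"},
    {"Title": "Fairie Queene", "PubName": "Big House", "PubPhone": "123-456-7890"},
]

NEW_PHONE = "555-000-1111"  # hypothetical new number for Big House

# An incomplete update: only the first Big House record is changed.
for record in flat_file:
    if record["PubName"] == "Big House":
        record["PubPhone"] = NEW_PHONE
        break

# Big House now has two different phone numbers in the same file.
phones = {r["PubPhone"] for r in flat_file if r["PubName"] == "Big House"}
print(phones)
```

After the partial update, nothing in the file tells us which of the two numbers is correct.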

25
Data anomalies
• Insertion anomaly: due to extra fields in the data
structure, we may not be able to insert a new
record until we can fill in every field

• For example: in our library flat file, we cannot enter
information for a new publisher until they have
published a book

26
Insertion anomaly
• What if we need to add information for a new publisher,
“Beta Press”, with a phone number of 444-444-4444?

ISBN           Title          AuID  AuName       PubID  PubName      PubPhone      Price
1-1111-1111-1  C++            4     Roman        1      Big House    123-456-7890  $29.95
0-99-999999-9  Emma           1     Austen       1      Big House    123-456-7890  $20.00
0-91-045678-5  Hamlet         5     Shakespeare  2      Alpha Press  999-999-9999  $20.00
0-11-345678-9  Moby Dick      2     Melville     3      Small House  714-000-0000  $49.00
0-91-335678-7  Fairie Queene  3     Spencer      1      Big House    123-456-7890  $15.00
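
One way to see the problem in code, as a sketch only. It assumes the flat file is stored as a single table whose records are identified by a required ISBN; that rule is an assumption made for this example, not something stated on the slide. SQLite (through Python's built-in sqlite3 module) is used purely as a convenient stand-in for the database software used later in the course, and the table name library is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumption: every record in the flat file must have an ISBN.
conn.execute("""
    CREATE TABLE library (
        ISBN     TEXT PRIMARY KEY NOT NULL,
        Title    TEXT,
        AuID     INTEGER,
        AuName   TEXT,
        PubID    INTEGER,
        PubName  TEXT,
        PubPhone TEXT,
        Price    REAL
    )
""")

# Beta Press has not published a book yet, so there is no ISBN to supply.
try:
    conn.execute(
        "INSERT INTO library (PubID, PubName, PubPhone) VALUES (?, ?, ?)",
        (4, "Beta Press", "444-444-4444"),
    )
    inserted = True
except sqlite3.IntegrityError as error:
    inserted = False
    print("Insert rejected:", error)

row_count = conn.execute("SELECT COUNT(*) FROM library").fetchone()[0]
print("Rows stored:", row_count)
```

Under that assumption, the publisher cannot be recorded at all until it has a book to attach it to.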
27
Data anomalies
• Deletion anomaly: due to extra fields in the data
structure, if we delete one record, we may lose
more information than desired

• For example: in our library flat file, if we delete a
book no longer being published, then we might
delete the publisher’s and author’s information too

28
Deletion anomaly
• What happens to the data for the publisher “Small
House” if we want to delete Moby Dick?

ISBN           Title          AuID  AuName       PubID  PubName      PubPhone      Price
1-1111-1111-1  C++            4     Roman        1      Big House    123-456-7890  $29.95
0-99-999999-9  Emma           1     Austen       1      Big House    123-456-7890  $20.00
0-91-045678-5  Hamlet         5     Shakespeare  2      Alpha Press  999-999-9999  $20.00
0-11-345678-9  Moby Dick      2     Melville     3      Small House  714-000-0000  $49.00
0-91-335678-7  Fairie Queene  3     Spencer      1      Big House    123-456-7890  $15.00
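
A short Python sketch of the answer, using a trimmed-down copy of the flat file (the variable names are hypothetical):

```python
# Trimmed-down flat file: only the columns that matter for this example.
flat_file = [
    {"Title": "C++", "AuName": "Roman", "PubName": "Big House"},
    {"Title": "Emma", "AuName": "Austen", "PubName": "Big House"},
    {"Title": "Hamlet", "AuName": "Shakespeare", "PubName": "Alpha Press"},
    {"Title": "Moby Dick", "AuName": "Melville", "PubName": "Small House"},
    {"Title": "Fairie Queene", "AuName": "Spencer", "PubName": "Big House"},
]

# Delete the one record for Moby Dick.
remaining = [record for record in flat_file if record["Title"] != "Moby Dick"]

# Check which publishers and authors the file still knows about.
publishers_left = {record["PubName"] for record in remaining}
authors_left = {record["AuName"] for record in remaining}

print("Small House" in publishers_left)
print("Melville" in authors_left)
```

Because Moby Dick was the only record mentioning Small House and Melville, deleting the book removes them from the file as well.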
29
One solution: RDBMS
• Relational Database Management Systems (RDBMS) are
a popular solution to the problems with flat files. The
basic idea is to break the entities into different tables.
We can reconcile the data in the different tables later

• The relational model behind RDBMS was proposed in 1970
by Edgar Codd, who was working at IBM at the time

• IBM didn’t really see the point, but others did, and
Oracle started to popularize RDBMS around 1980
30
Example relational structure
Authors
AuID  AuName
4     Roman
1     Austen
5     Shakespeare
2     Melville
3     Spencer

Books
ISBN           Title          AuID  PubID  Price
1-1111-1111-1  C++            4     1      $29.95
0-99-999999-9  Emma           1     1      $20.00
0-91-045678-5  Hamlet         5     2      $20.00
0-11-345678-9  Moby Dick      2     3      $49.00
0-91-335678-7  Fairie Queene  3     1      $15.00

Publishers
PubID  PubName      PubPhone
1      Big House    123-456-7890
2      Alpha Press  999-999-9999
3      Small House  714-000-0000
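
As a rough sketch of what this structure could look like in practice, the three tables can be created in SQLite through Python's built-in sqlite3 module and then reconciled with a JOIN. SQLite is only a stand-in here (the course plans to use XAMPP), and the column types are assumptions; the slides do not specify them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table per entity; Books points at the other two through AuID and PubID.
conn.executescript("""
    CREATE TABLE Authors (AuID INTEGER PRIMARY KEY, AuName TEXT);
    CREATE TABLE Publishers (PubID INTEGER PRIMARY KEY, PubName TEXT, PubPhone TEXT);
    CREATE TABLE Books (
        ISBN  TEXT PRIMARY KEY,
        Title TEXT,
        AuID  INTEGER REFERENCES Authors(AuID),
        PubID INTEGER REFERENCES Publishers(PubID),
        Price REAL
    );
""")

conn.executemany("INSERT INTO Authors VALUES (?, ?)", [
    (4, "Roman"), (1, "Austen"), (5, "Shakespeare"), (2, "Melville"), (3, "Spencer"),
])
conn.executemany("INSERT INTO Publishers VALUES (?, ?, ?)", [
    (1, "Big House", "123-456-7890"),
    (2, "Alpha Press", "999-999-9999"),
    (3, "Small House", "714-000-0000"),
])
conn.executemany("INSERT INTO Books VALUES (?, ?, ?, ?, ?)", [
    ("1-1111-1111-1", "C++", 4, 1, 29.95),
    ("0-99-999999-9", "Emma", 1, 1, 20.00),
    ("0-91-045678-5", "Hamlet", 5, 2, 20.00),
    ("0-11-345678-9", "Moby Dick", 2, 3, 49.00),
    ("0-91-335678-7", "Fairie Queene", 3, 1, 15.00),
])

# Reconcile the tables: rebuild a flat-file-style view by matching IDs.
rows = conn.execute("""
    SELECT b.Title, a.AuName, p.PubName, p.PubPhone
    FROM Books AS b
    JOIN Authors AS a ON a.AuID = b.AuID
    JOIN Publishers AS p ON p.PubID = b.PubID
    ORDER BY b.Title
""").fetchall()

for row in rows:
    print(row)
```

Each publisher's phone number is stored once, yet the query can still show it next to every book that publisher produced.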

31
Example relational structure
• Addressing the data anomalies:

• Update anomaly: since publishers are stored in their
own table, we no longer have duplication. Whenever we
update publisher information, we can just update it in
one place
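
A minimal sketch of the single update, again using SQLite through Python's sqlite3 module as a stand-in. The new phone number is made up, and the Books table is cut down to the columns needed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Publishers (PubID INTEGER PRIMARY KEY, PubName TEXT, PubPhone TEXT);
    CREATE TABLE Books (
        ISBN  TEXT PRIMARY KEY,
        Title TEXT,
        PubID INTEGER REFERENCES Publishers(PubID)
    );
    INSERT INTO Publishers VALUES
        (1, 'Big House', '123-456-7890'),
        (2, 'Alpha Press', '999-999-9999'),
        (3, 'Small House', '714-000-0000');
    INSERT INTO Books VALUES
        ('1-1111-1111-1', 'C++', 1),
        ('0-99-999999-9', 'Emma', 1),
        ('0-91-045678-5', 'Hamlet', 2),
        ('0-11-345678-9', 'Moby Dick', 3),
        ('0-91-335678-7', 'Fairie Queene', 1);
""")

NEW_PHONE = "555-000-1111"  # hypothetical new number for Big House

# Big House appears in exactly one Publishers row, so one row changes.
cursor = conn.execute(
    "UPDATE Publishers SET PubPhone = ? WHERE PubName = ?",
    (NEW_PHONE, "Big House"),
)
rows_changed = cursor.rowcount

# Every Big House book now sees the same, updated number.
phones = {
    row[0]
    for row in conn.execute("""
        SELECT p.PubPhone
        FROM Books AS b
        JOIN Publishers AS p ON p.PubID = b.PubID
        WHERE p.PubName = 'Big House'
    """)
}
print(rows_changed, phones)
```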

32
Example relational structure
Authors
AuID  AuName
4     Roman
1     Austen
5     Shakespeare
2     Melville
3     Spencer

Books
ISBN           Title          AuID  PubID  Price
1-1111-1111-1  C++            4     1      $29.95
0-99-999999-9  Emma           1     1      $20.00
0-91-045678-5  Hamlet         5     2      $20.00
0-11-345678-9  Moby Dick      2     3      $49.00
0-91-335678-7  Fairie Queene  3     1      $15.00

Publishers
PubID  PubName      PubPhone
1      Big House    123-456-7890
2      Alpha Press  999-999-9999
3      Small House  714-000-0000

33
Example relational structure
• Addressing the data anomalies:

• Insertion anomaly: if we want to insert information for a
new publisher, we can do so now even if they haven’t
published a book yet

34
Example relational structure
Authors
AuID  AuName
4     Roman
1     Austen
5     Shakespeare
2     Melville
3     Spencer

Books
ISBN           Title          AuID  PubID  Price
1-1111-1111-1  C++            4     1      $29.95
0-99-999999-9  Emma           1     1      $20.00
0-91-045678-5  Hamlet         5     2      $20.00
0-11-345678-9  Moby Dick      2     3      $49.00
0-91-335678-7  Fairie Queene  3     1      $15.00

Publishers
PubID  PubName      PubPhone
1      Big House    123-456-7890
2      Alpha Press  999-999-9999
3      Small House  714-000-0000
4      Beta Press   444-444-4444
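
The Beta Press row could be added along these lines. This is a sketch with SQLite (via Python's sqlite3 module) as a stand-in, and a Books table cut down to the columns needed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Publishers (PubID INTEGER PRIMARY KEY, PubName TEXT, PubPhone TEXT);
    CREATE TABLE Books (
        ISBN  TEXT PRIMARY KEY,
        Title TEXT,
        PubID INTEGER REFERENCES Publishers(PubID)
    );
    INSERT INTO Publishers VALUES
        (1, 'Big House', '123-456-7890'),
        (2, 'Alpha Press', '999-999-9999'),
        (3, 'Small House', '714-000-0000');
""")

# The new publisher goes into Publishers; no book (and no ISBN) is needed.
conn.execute(
    "INSERT INTO Publishers VALUES (?, ?, ?)",
    (4, "Beta Press", "444-444-4444"),
)

publisher_names = [
    row[0] for row in conn.execute("SELECT PubName FROM Publishers ORDER BY PubID")
]
beta_books = conn.execute(
    "SELECT COUNT(*) FROM Books WHERE PubID = ?", (4,)
).fetchone()[0]

print(publisher_names)
print("Books by Beta Press:", beta_books)
```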
35
Example relational structure
• Addressing the data anomalies:

• Deletion anomaly: if we delete a book, the publisher
and author information is safe in other tables
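
A sketch of that deletion, with SQLite (via Python's sqlite3 module) as a stand-in and only the rows needed for the Moby Dick example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Authors (AuID INTEGER PRIMARY KEY, AuName TEXT);
    CREATE TABLE Publishers (PubID INTEGER PRIMARY KEY, PubName TEXT, PubPhone TEXT);
    CREATE TABLE Books (
        ISBN  TEXT PRIMARY KEY,
        Title TEXT,
        AuID  INTEGER REFERENCES Authors(AuID),
        PubID INTEGER REFERENCES Publishers(PubID)
    );
    INSERT INTO Authors VALUES (2, 'Melville'), (5, 'Shakespeare');
    INSERT INTO Publishers VALUES
        (2, 'Alpha Press', '999-999-9999'),
        (3, 'Small House', '714-000-0000');
    INSERT INTO Books VALUES
        ('0-91-045678-5', 'Hamlet', 5, 2),
        ('0-11-345678-9', 'Moby Dick', 2, 3);
""")

# Delete only the book; the Authors and Publishers tables are untouched.
conn.execute("DELETE FROM Books WHERE Title = ?", ("Moby Dick",))

books_left = [row[0] for row in conn.execute("SELECT Title FROM Books")]
small_house = conn.execute(
    "SELECT PubName, PubPhone FROM Publishers WHERE PubID = 3"
).fetchone()
melville = conn.execute("SELECT AuName FROM Authors WHERE AuID = 2").fetchone()

print(books_left)
print(small_house)
print(melville)
```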

36
Example relational structure
Authors
AuID  AuName
4     Roman
1     Austen
5     Shakespeare
2     Melville
3     Spencer

Books
ISBN           Title          AuID  PubID  Price
1-1111-1111-1  C++            4     1      $29.95
0-99-999999-9  Emma           1     1      $20.00
0-91-045678-5  Hamlet         5     2      $20.00
0-11-345678-9  Moby Dick      2     3      $49.00
0-91-335678-7  Fairie Queene  3     1      $15.00

Publishers
PubID  PubName      PubPhone
1      Big House    123-456-7890
2      Alpha Press  999-999-9999
3      Small House  714-000-0000

37
