Professional Documents
Culture Documents
Fundamentals of Information Systems, Ninth Edition: Database System and Big Data
Fundamentals of Information Systems, Ninth Edition: Database System and Big Data
Chapter 3
Database System and Big Data
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain
1
product or service or otherwise on a password-protected website for classroom use.
Objectives
After completing this chapter, you will be able to:
Identify and briefly describe the members of the hierarchy of data
Identify the advantages of the database approach to data
management
Identify the key factors that must be considered when designing a
database
Identify the various types of data models and explain how they
are useful in planning a database
Describe the rational database model
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service
or otherwise on a password-protected website for classroom use.
Objectives
After completing this chapter, you will be able to (cont’d):
Define the role of the database schema, data definition language,
and data manipulation language
Discuss the role of a database administrator and data
administrator
Identify the common functions performed by all database
management systems
Define the term big data
Explain why big data represents a challenge and an opportunity
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service
or otherwise on a password-protected website for classroom use.
Objectives
After completing this chapter, you will be able to (cont’d):
Define the term data management
Define the terms data warehouse, data mart, and data lakes and
explain how they are different
Outline the extract, transform, load process
Explain how a NoSQL database is different from an SQL database
Discuss the whole Hadoop computing environment
Define the term in-memory database and explain its advantages in
processing big data
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service
or otherwise on a password-protected website for classroom use.
Introduction
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 5
or otherwise on a password-protected website for classroom use.
Data Fundamentals
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 6
or otherwise on a password-protected website for classroom use.
Hierarchy of Data
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 7
or otherwise on a password-protected website for classroom use.
Hierarchy of Data
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 8
or otherwise on a password-protected website for classroom use.
Data Entities, Attributes, and Keys
• Entity: a person, place, or thing for which data is collected, stored, and
maintained
• Attribute: a characteristic of an entity
• Data item: the specific value of an attribute
• Primary key: a field or set of fields that uniquely identifies the record
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 9
or otherwise on a password-protected website for classroom use.
Data Entities, Attributes, and Keys
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 10
or otherwise on a password-protected website for classroom use.
The Database Approach
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 11
or otherwise on a password-protected website for classroom use.
The Database Approach
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 12
or otherwise on a password-protected website for classroom use.
Data Modeling and Database Characteristics
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 13
or otherwise on a password-protected website for classroom use.
Data Modeling
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 14
or otherwise on a password-protected website for classroom use.
Data Modeling
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 15
or otherwise on a password-protected website for classroom use.
Data Modeling
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 16
or otherwise on a password-protected website for classroom use.
Relational Database Model
• Relational model: a simple but highly useful way to organize data into
collections of two-dimensional tables called relations
• Each row in the table represents an entity
• Each column represents an attribute of that entity
• Domain: range of allowable values for a data attribute
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 17
or otherwise on a password-protected website for classroom use.
Relational Database Model
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 18
or otherwise on a password-protected website for classroom use.
Manipulating Data
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 19
or otherwise on a password-protected website for classroom use.
Manipulating Data
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 20
or otherwise on a password-protected website for classroom use.
Manipulating Data
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 21
or otherwise on a password-protected website for classroom use.
Data Cleansing
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 22
or otherwise on a password-protected website for classroom use.
Data Cleansing
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 23
or otherwise on a password-protected website for classroom use.
Relational Database Management Systems (DBMSs)
• Creating and implementing the right database system ensures that the
database will support both business activities and goals
• Capabilities and types of database systems vary considerably
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 24
or otherwise on a password-protected website for classroom use.
SQL Databases
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 25
or otherwise on a password-protected website for classroom use.
SQL Databases
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 26
or otherwise on a password-protected website for classroom use.
SQL Databases
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 27
or otherwise on a password-protected website for classroom use.
Database Activities
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 28
or otherwise on a password-protected website for classroom use.
Providing a User View
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 29
or otherwise on a password-protected website for classroom use.
Creating and Modifying the Database
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 30
or otherwise on a password-protected website for classroom use.
Creating and Modifying the Database
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 31
or otherwise on a password-protected website for classroom use.
Storing and Retrieving Data
• When an application program needs data, it requests the data through the
DBMS
• Concurrency control deals with the situation in which two or more users or
applications need to access the same record at the same time
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 32
or otherwise on a password-protected website for classroom use.
Manipulating Data and Generating Reports
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 33
or otherwise on a password-protected website for classroom use.
Manipulating Data and Generating Reports
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 34
or otherwise on a password-protected website for classroom use.
Database Administration
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 35
or otherwise on a password-protected website for classroom use.
Database Administration
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 36
or otherwise on a password-protected website for classroom use.
Popular Database Management Systems
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 37
or otherwise on a password-protected website for classroom use.
Popular Database Management Systems
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 38
or otherwise on a password-protected website for classroom use.
Using Databases with Other Software
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 39
or otherwise on a password-protected website for classroom use.
Big Data
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 40
or otherwise on a password-protected website for classroom use.
Sources of Big Data
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 41
or otherwise on a password-protected website for classroom use.
Big Data Uses
• Examples:
• Retail organizations monitor social networks to engage brand advocates, identify
brand adversaries
• Advertising and marketing agencies track comments on social media
• Hospitals analyze medical data and patient records
• Consumer product companies monitor social networks to gain insight into consumer
behavior
• Financial service organizations use data to identify customers who are likely to be
attracted to increasingly targeted and sophisticated offers
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 42
or otherwise on a password-protected website for classroom use.
Challenges of Big Data
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 43
or otherwise on a password-protected website for classroom use.
Data Management
• Data management
• An integrated set of functions that defines the processes by which data is obtained,
certified fit for use, stored, secured, and processed in such a way as to ensure that the
accessibility, reliability, and timeliness of the data meet the needs of the data users
within an organization
• Data governance
• Defines the roles, responsibilities, and processes for ensuring that data can be trusted
and used by an entire organization
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 44
or otherwise on a password-protected website for classroom use.
Data Management
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 45
or otherwise on a password-protected website for classroom use.
Data Management
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 46
or otherwise on a password-protected website for classroom use.
Data Management
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 47
or otherwise on a password-protected website for classroom use.
Data Warehouses, Data Marts, and Data Lakes
• Data warehouse: a large database that collects business information from many
sources in the enterprise in support of management decision making
• ETL process
• Extract
• Transform
• Load
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 48
or otherwise on a password-protected website for classroom use.
Data Warehouses, Data Marts, and Data Lakes
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 49
or otherwise on a password-protected website for classroom use.
Data Warehouses, Data Marts, and Data Lakes
• Data mart: a subset of a data warehouse that is used by small- and medium-
sized businesses and departments within large companies to support decision
making
• A specific area in the data mart might contain greater detailed data than the
data warehouse
• Data lake: takes a “store everything” approach to big data, saving all the data in
its raw and unaltered form
• Also called an enterprise data hub
• Raw data is available when users decide just how they want to use the data
• Only when the data is accessed for a specific analysis is it extracted from the data lake
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 50
or otherwise on a password-protected website for classroom use.
NoSQL Databases
• NoSQL database
• Provides a means to store and retrieve data that is modeled using some means other
than the simple two-dimensional tabular relations used in relational databases
• Advantages:
• Ability to spread data over multiple servers so that each server contains only a subset
of the total data
• Do not require a predefined schema
• Data structures are more flexible and can provide improved access speed and
redundancy
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 51
or otherwise on a password-protected website for classroom use.
NoSQL Databases
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 52
or otherwise on a password-protected website for classroom use.
Hadoop
• Hadoop
• An open-source software framework that includes several software modules that
provide a means for storing and processing extremely large data sets
• Has two primary components:
• A data processing component (MapReduce)
• A distributed file system (Hadoop Distributed File System, HDFS)
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 53
or otherwise on a password-protected website for classroom use.
Hadoop
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 54
or otherwise on a password-protected website for classroom use.
In-Memory Databases
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 55
or otherwise on a password-protected website for classroom use.
In-Memory Databases
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 56
or otherwise on a password-protected website for classroom use.
Summary
© 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service 57
or otherwise on a password-protected website for classroom use.