Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 28

Chapter 11 Database

Data
Data refers to raw facts, figures, or information that can be in various forms, such as
numbers, text, images, or multimedia. Data is the foundation of information, and it
becomes meaningful and useful when processed and organized. There are two primary
types of data:
• Structured Data: Highly organized and formatted data that fits into predefined
categories, often found in databases and spreadsheets. Examples include tables of
numbers, dates, or categorical information.
• Unstructured Data: Information that lacks a specific format and is not easily
organized, such as text documents, emails, images, and videos. Unstructured data
requires advanced processing techniques, like natural language processing or image
recognition, to extract meaningful insights.
• Logical view (meaning, content and context of data) and physical view(actual format
and location of data)
Data organization
Data organization
• Character: A character is the most basic logical data element. It is a single letter,
number, or special character, such as a punctuation mark, or a symbol, such as $.
• Field: The next higher level is a field, or group of related characters. In our
example, Brown is in the data field for the Last Name of an employee. It consists
of the individual letters (characters) that make up the last name. A data field
represents an attribute (description or characteristic) of some entity (person,
place, thing, or object). For example, an employee is an entity with many
attributes, including his or her last name.
• Record: A record is a collection of related fields. A record represents a collection
of attributes that describe an entity. In our example, the payroll record for an
employee consists of the data fields describing the attributes for one employee.
These attributes are First Name, Last Name, Employee ID, and Salary.
Data organization
• Table: A table is a collection of related records. For example, the
Payroll Table would include payroll information (records) for the
employees (entities).
• Database: A database is an integrated collection of logically related
tables. For Page 268 example, the Personnel Database would include
all related employee tables, including the Payroll Table and the
Benefits Table.
Key feild
Each record in a table has at least one distinctive field, called the key
field. Also known as the primary key, this field uniquely identifies the
record. Tables can be related or connected to other tables by common
key fields.
For most employee databases, a key field is an employee identification
number. Key fields in different tables can be used to integrate the data
in a database. For example, in the Personnel Database, both the Payroll
and the Benefits tables include the field Employee ID. Data from the
two tables could be related by combining all records with the same key
field (Employée ID).
Batch versus Real-Time Processing
• Traditionally, data is processed in one of two ways. These are batch processing
(later), and real-time processing(now). These two methods have been used to
handle common record-keeping activities such as payroll and sales orders.
• Batch processing: In batch processing, data is collected over several hours,
days, or even weeks. It is then processed all at once as a "batch." If you have a
credit card, your bill probably reflects batch processing. That is, during the
month, you buy things and charge them to your credit card. Each time you
charge something, an electronic copy of the transaction is sent to the credit
card company. At some point in the month, the company's data processing
department puts all those transactions (and those of many other customers)
together and processes them at one time. The company then sends you a
single bill totaling the amount you owe.
Batch versus Real-Time Processing
Batch versus Real-Time Processing
• Real-time processing: Real-time processing, also known as online
processing, occurs when data is processed at the same time the
transaction occurs.
• For example, whenever you request funds at an ATM, real-time
processing occurs. After you have provided account information and
requested a specific withdrawal, the bank's computer verifies that you
have sufficient funds in your account. If you do, then the funds are
dispensed to you, and the bank immediately updates the balance of
your account.
Batch versus Real-Time Processing
Databases
• Many organizations have multiple files on the same subject or person. For example,
a customer's name and address could appear in different files within the sales
department, billing department, and credit department. This is called data
redundancy. If the customer moves, then the address in each file must be updated.
If one or more files are overlooked, problems will likely result.
• For example, a product ordered might be sent to the new address, but the bill might
be sent to the old address. This situation results from a lack of data integrity.
Moreover, data spread around in different files is not as useful. The marketing
department, for instance, might want to offer special promotions to customers who
order large quantities of merchandise. To identify these customers, the marketing
department would need to obtain permission and access to files in the billing
department. It would be much more efficient if all data were in a common
database. A database can make the needed information available.
Need for Databases
For an organization, there are many advantages to having databases:
• Sharing: In organizations, information from one department can be readily shared with others.
Billing could let marketing know which customers ordered large quantities of merchandise.
• Security: Users are given passwords or access only to the kind of information they need. Thus,
the payroll department may have access to employees' pay rates, but other departments would
not.
• Less data redundancy: Without a common database, individual departments have to create and
maintain their own data, and data redundancy results. For example, an employee's home
address would likely appear in several files. Redundant data causes inefficient use of storage
space and data maintenance problems.
• Data integrity: When there are multiple sources of data, each source may have variations. A
customer's address may be listed as "Main Street" in one system and "Main St." in another. With
discrepancies like these, it is probable that the customer would be treated as two different
people.
Database Management
• In order to create, modify, and gain access to a database, special software is required. This
software is called a database management system, which is commonly abbreviated DBMS.
Some DBMSs, such as Microsoft Access, are designed specifically for personal computers.
Other DBMSs are designed for specialized database servers. DBMS software is made up of
five parts or subsystems: DBMS engine, data definition, data manipulation, application
generation, and data administration.
• The DBMS engine provides a bridge between the logical view of the data and the physical
view of the data. When users request data (logical perspective), the DBMS engine handles
the details of actually locating the data (physical perspective).
• The data definition subsystem defines the logical structure of the database by using a data
dictionary or schema. This dictionary contains a description of the structure of data in the
database. For a particular item of data, it defines the names used for a particular field. It
defines the type of data for each field (text, numeric, time, graphic, audio, and video). An
example of an Access data dictionary form is presented in
Database management
Database management
• The data manipulation subsystem provides tools for maintaining and
analyzing data. Maintaining data is known as data maintenance. It
involves adding new data, deleting old data, and editing existing data.
Analysis tools support viewing all or selected parts of the data,
querying the database, and generating reports. Specific tools include
query-by-example and a specialized programming language called
structured query language (SQL).
• The application generation subsystem provides tools to create data
entry forms and specialized programming languages that interface or
work with common and widely used programming languages such as
C++.
Database management
• The data administration subsystem helps to manage the overall
database, including maintaining security, providing disaster recovery
support, and monitoring the overall performance of database
operations. Larger organizations typically employ highly trained
computer specialists, called database administrators (DBAs), to
interact with the data administration subsystem. Additional duties of
database administrators include determining processing rights or
determining which people have access to what kinds of data in the
database.
Database structure
DBMS programs are designed to work with data that is logically structured or arranged in a particular way.
This arrangement is known as the database model. These models define rules and standards for all the data
in a database. For example, Microsoft Access is designed to work with databases using the relational data
model.
Five common database models are hierarchical, network, relational, multidimensional, and object-oriented.
• Hierarchical Database
At one time, nearly every DBMS designed for mainframes used the hierarchical data model.In a hierarchical
database, fields or records are structured in nodes. Nodes are points connected like the branches of an
upside-down tree. Each entry has one parent node, although a parent may have several child nodes. This is
sometimes described as a one-to-many relationship. To find a particular field, you have to start at the top
with a parent and trace down the tree to a child.
An example of a hierarchical database is a system to organize music files. The parent node is the music
library for a particular user. This parent has four children, labeled "artist." Coldplay, one of the children, has
three children of its own. They are labeled "album." The Greatest Hits album has three children, labeled
"song."
Database structure(cont)

The problem with this


database is that if one
parent node is
deleted, so are all
subordinate child
nodes.
Database structure(cont)
• Network database
Responding to the limitations of the hierarchical data model, network models were
developed. A network database also has a hierarchical arrangement of nodes.
However, each child node may have more than one parent node. This is sometimes
described as a many-to-many relationship. There are additional connections called
pointers-between parent nodes and child nodes. Thus, a node may be reached
through more than one path. It may be traced down through different branches.
• For example, a university could use this type of organization to record students
taking classes. If you trace through the logic of this organization, you can see that
each student can have more than one teacher. Each teacher also can teach more
than one course. Students may take more than a single course. This demonstrates
how the network arrangement is more flexible and, in many cases, more efficient
than the hierarchical arrangement.
Database structure(cont)
Database structure(cont)
• Relational Database
The most common type of organization is the relational database. In this structure,
there are no access paths down a hierarchy. Rather, the data elements are stored in
different tables, each of which consists of rows and columns. A table and its data are
called a relation.
• An example of a relational database is shown in Figure. The Vehicle Owner Table
contains license numbers, names, and addresses for all registered drivers. Within
the table, a row is a record containing information about one driver. Each column is
a field. The fields are License Number, Last Name, First Name, Street, City, State,
and Zip. All related tables must have a common data item, or shared key field, in
both tables, enabling information stored in one table to be linked with information
stored in another. In this case, the three tables are related by the License Number
field.
Database structure(cont)
Database structure(cont)
• Multidimensional Database
The multidimensional data model is an extension of the relational data
model. Whereas relational databases use tables consisting of rows and
columns, multidimensional databases extend this two-dimensional data
model to include additional or multiple dimensions, sometimes called a
data cube. Data can be viewed as a cube having three or more sides
and consisting of cells. Each side of the cube is considered a dimension
of the data. In this way, complex relationships between data can be
represented and efficiently analyzed. Two of the most significant
advantages are
Database structure(cont)
• Object-Oriented Database
The other data structures are primarily designed to access and summarize
structured data such as names, addresses, pay rates, and so on. Object-oriented
databases are more flexible and store data as well as instructions to manipulate the
data. Additionally, this structure is ideally designed to provide input for object-
oriented software development. databases organize data using classes, objects,
attributes, and methods.
• Classes are general definitions.
• Objects are specific instances of a class that can contain both data and instructions
to manipulate the data.
• Attributes are the data fields an object possesses.
• Methods are instructions for retrieving or manipulating attribute values.
Database structure(cont)

A health club might


use an object-oriented
employment database.
The database uses a
class, Employee, to
define employee
objects that are stored
in the database. This
definition includes the
attributes First name,
Last name, Address,
and method pay.
Types of database
• Individual: The individual database is also called a personal computer database. It is a collection of integrated files
primarily used by just one person. Typically, the data and the DBMS are under the direct control of the user. They are
stored either on the user's hard-disk drive or on a LAN file server.
• Company: Companies, of course, create databases for their own use. The company database may be stored on a
central database server and managed by a database administrator. Users throughout the company have access to
the database through their personal computers linked to local or wide area networks. A company databases are the
foundation for management information systems. For instance, a department store can record all sales transactions
in the database. A sales manager can use this information to see which salespeople are selling the most products.
The manager can then determine year-end sales bonuses. Or the store's buyer can learn which products are selling
well or not selling and make adjustments when reordering.
• Distributed: Many times the data in a company is stored not in just one location but in several locations. It is made
accessible through a variety of communications networks. The database, then, is a distributed database. That is, not
all the data in a database is physically located in one place. Typically, database servers on a client/server network
provide the link between the data. For instance, some database information can be at regional offices. Some can be
at company headquarters, and some even overseas.
• Commercial: A commercial database is generally an enormous database that an organization develops to cover
particular subjects. It offers access to this database to the public or selected outside individuals for a fee. Sometimes
commercial databases also are called information utilities or data banks. An example is LexisNexis, which offers a
variety of information-gathering and reporting services
Database uses and issues
• Database Uses:
• Efficient Data Storage: Databases store and organize large volumes of structured data for easy retrieval and management.
• Data Retrieval and Analysis: Facilitates quick and efficient retrieval of specific information, supporting data analysis and
decision-making.
• Data Integrity: Ensures data accuracy and consistency through features like constraints, normalization, and relational structure.
• Scalability: Adaptable to evolving data needs, allowing for seamless scaling as data volumes and user demands increase.
• Security Issues:
• Unauthorized Access: Risks of unauthorized users gaining access to sensitive data, emphasizing the need for robust
authentication and access controls.
• Data Breaches: Potential for data leaks due to vulnerabilities, highlighting the importance of encryption and secure coding
practices.
• SQL Injection: Malicious injection of SQL code, posing a threat to database integrity; prevention involves parameterized
queries and input validation.
• Insider Threats: Risks from individuals within the organization misusing data; monitoring and access auditing help mitigate
such threats.
• Lack of Regular Audits: Failing to regularly audit and update security measures may leave databases vulnerable to evolving
cyber threats.
Chapter ended 

You might also like