INFO-SHEET # 3 The Nature of Data
ASSESSMENT METHOD/S:
Quiz, Oral Recitation, Peer Learning
REFERENCES:
Watt, A. (2014, October 24). Chapter 8 The Entity Relationship Data Model –
Database Design – 2nd Edition. Pressbooks.
https://opentextbc.ca/dbdesign01/chapter/chapter-8-entity-relationship-model/
Schmidt, C. (2021, March 16). The subtle 6: Types of metadata you need to
know. Canto. https://www.canto.com/blog/types-of-metadata/
Learning Objectives:
After reading this INFORMATION SHEET, YOU MUST be able to:
1. Be familiar with and discuss the concepts of the levels of data, and how to
use them in creating a database.
Internal Level/Schema
The internal schema defines the physical storage structure of the database. The
internal schema is a very low-level representation of the entire database. It
contains multiple occurrences of multiple types of internal records. In ANSI
terminology, an internal record is also called a "stored record".
Usability, the size of memory, and the number of times the records are accessed
are factors that we need to know while designing the database.
Suppose we need to store the details of an employee. The blocks of storage and
the amount of memory used for this purpose are kept hidden from the user.
Conceptual Schema/Level
The conceptual schema describes the structure of the whole database
for the community of users. This schema hides information about the physical
storage structures and focuses on describing data types, entities, relationships,
etc.
This level comprises the information that is actually stored in the database in
the form of tables. It also stores the relationship among the data entities in
relatively simple structures. At this level, the information available to the user
at the view level is unknown.
For example, we can store the various attributes of an employee, and we can
also store relationships, such as the one between an employee and a manager.
This logical level sits between the user level and the physical storage view.
There is only a single conceptual view of a single database.
External Schema/Level
An external schema describes the part of the database that a specific user is
interested in. It hides the unrelated details of the database from the user.
There may be any number (n) of external views for each database.
This is the highest level of abstraction. Users view only a part of the actual
database. This level exists to make the database easier for an individual user
to access. Users view data in the form of rows and columns.
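The relationship between the conceptual level and an external view can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table name employee, the view name staff_directory, the column names, and the sample rows are all made up for this sketch. The table stands for the conceptual schema, and the view stands for one external schema that hides the salary column from a particular group of users.

```python
import sqlite3

# In-memory database; how SQLite lays out pages on disk (internal level) stays hidden.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual level: the full employee table, including the manager relationship.
    CREATE TABLE employee (
        eid        INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        salary     REAL NOT NULL,
        manager_id INTEGER REFERENCES employee(eid)
    );
    INSERT INTO employee VALUES (1, 'Ana', 5000.0, NULL);
    INSERT INTO employee VALUES (2, 'Ben', 3000.0, 1);

    -- External level: one user view that exposes names only and hides salaries.
    CREATE VIEW staff_directory AS
        SELECT e.name AS name, m.name AS manager
        FROM employee AS e
        LEFT JOIN employee AS m ON m.eid = e.manager_id;
""")

cursor = conn.execute("SELECT name, manager FROM staff_directory ORDER BY name")
view_columns = [column[0] for column in cursor.description]
directory_rows = cursor.fetchall()

print(view_columns)    # ['name', 'manager']
print(directory_rows)  # [('Ana', None), ('Ben', 'Ana')]
```

Several such views can be defined over the same table, one for each group of users, while the table itself stays the single conceptual description.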
Data in Reality
The real world will be referred to as reality. Data collected about people, places,
or events in reality will eventually be stored in a file or database. To
understand the form and structure of the data, information about the data
itself is required. The information that describes data is referred to as
metadata.
ENTITIES
Any object or event about which someone chooses to collect data is an entity.
An entity may be a person, place, or thing (for example, a salesperson, a city,
or a product). An entity can also be an event or a unit of time, such as a
machine breakdown, a sale, or a month or year.
An entity is an object in the real world with an independent existence that can
be differentiated from other objects. An entity might be:
An object with physical existence (e.g., a lecturer, a student, a car)
An object with conceptual existence (e.g., a course, a job, a position)
Characteristic entities
Characteristic entities provide more information about another table. These
entities have the following characteristics:
They represent multivalued attributes.
They describe other entities.
They typically have a one-to-many relationship.
The foreign key is used to further identify the characterized table.
Options for a primary key are as follows:
1. Use a composite of the foreign key plus a qualifying column.
2. Create a new simple primary key.
In the COMPANY database, these might include:
Employee (EID, Name, Address, Age, Salary) – EID is the simple primary
key.
EmployeePhone (EID, Phone) – EID is part of a composite primary key.
Here, EID is also a foreign key.
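The Employee and EmployeePhone tables above can be sketched with Python's built-in sqlite3 module. The column types, the salary figure, and the phone numbers are assumptions made for illustration; only the table and column names come from the example. The sketch shows option 1: the primary key of EmployeePhone is the composite (EID, Phone), and EID is also a foreign key to Employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Employee (
        EID     INTEGER PRIMARY KEY,      -- simple primary key
        Name    TEXT,
        Address TEXT,
        Age     INTEGER,
        Salary  REAL
    );
    CREATE TABLE EmployeePhone (
        EID   INTEGER NOT NULL REFERENCES Employee(EID),  -- foreign key
        Phone TEXT NOT NULL,                              -- qualifying column
        PRIMARY KEY (EID, Phone)                          -- composite primary key
    );
""")

# Illustrative data: one employee with two phone numbers (a multivalued attribute).
conn.execute(
    "INSERT INTO Employee VALUES (1, 'John', '59 Meek Street, Kingsford', 23, 4000.0)"
)
conn.executemany(
    "INSERT INTO EmployeePhone VALUES (?, ?)",
    [(1, "555-0100"), (1, "555-0101")],
)


def try_insert(eid, phone):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO EmployeePhone VALUES (?, ?)", (eid, phone))
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_ok = try_insert(1, "555-0100")  # same (EID, Phone) pair again
orphan_ok = try_insert(99, "555-0199")    # no Employee row with EID 99
phones = conn.execute(
    "SELECT Phone FROM EmployeePhone WHERE EID = 1 ORDER BY Phone"
).fetchall()

print(duplicate_ok, orphan_ok)  # False False
print(phones)                   # [('555-0100',), ('555-0101',)]
```

The composite key rejects the repeated pair, and the foreign key rejects a phone number for an employee that does not exist.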
Types of Attributes
There are a few types of attributes you need to be familiar with. Some of these
are to be left as is, but some need to be adjusted to facilitate representation in
the relational model. This first section will discuss the types of attributes. Later
on, we will discuss fixing the attributes to fit correctly into the relational model.
1. Simple attributes
Simple attributes are those drawn from the atomic value domains; they are also
called single-valued attributes. In the COMPANY database, an example of this
would be: Name = {John} ; Age = {23}
2. Composite attributes
Composite attributes are those that consist of a hierarchy of attributes. Using
our database example, Address may consist of Number, Street and Suburb. So
this would be written as → Address = {59 + ‘Meek Street’ + ‘Kingsford’}
3. Multivalued attributes
Multivalued attributes are attributes that have a set of values for each entity. An
example of a multivalued attribute from the COMPANY database, as seen in
Figure 8.4, is the set of degrees held by an employee: BSc, MIT, PhD.
4. Derived attributes
Derived attributes are attributes that contain values calculated from other
attributes. An example is Age, which can be derived from the attribute Birthdate.
In this situation, Birthdate is called a stored attribute, which is physically saved
to the database.
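The four attribute types can be pictured together in a small Python sketch. The class names, field names, and the birthdate are assumptions chosen for illustration; the name, address, and degrees reuse the values from the examples above. Age is not stored: it is calculated from the stored Birthdate each time it is needed.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Address:
    """Composite attribute: made up of Number, Street and Suburb."""
    number: int
    street: str
    suburb: str


@dataclass
class Employee:
    name: str            # simple attribute (one atomic value)
    address: Address     # composite attribute
    birthdate: date      # stored attribute
    degrees: set[str] = field(default_factory=set)  # multivalued attribute

    def age(self, today: date) -> int:
        """Derived attribute: whole years between birthdate and today."""
        had_birthday = (today.month, today.day) >= (
            self.birthdate.month,
            self.birthdate.day,
        )
        return today.year - self.birthdate.year - (0 if had_birthday else 1)


john = Employee(
    name="John",
    address=Address(59, "Meek Street", "Kingsford"),
    birthdate=date(2000, 6, 15),  # hypothetical birthdate
    degrees={"BSc", "MIT", "PhD"},
)

age_before_birthday = john.age(date(2024, 6, 14))
age_on_birthday = john.age(date(2024, 6, 15))
print(age_before_birthday, age_on_birthday)  # 23 24
```

Storing Birthdate and deriving Age avoids having to update every record each time a birthday passes.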
RECORDS
A record is a collection of data items that have something in common with the
entity described. Consider the record for an order placed with a mail-order
company: the ORDER-#, LAST NAME, INITIAL, STREET ADDRESS, CITY, STATE, and
CREDIT CARD are all attributes. Most records are of fixed length, so there is no
need to determine the length of the record each time.
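A fixed-length record can be sketched with Python's standard struct module. The field widths below (20 bytes for LAST NAME, 2 for STATE, and so on), the function names, and the sample orders are assumptions made for this illustration, not values from the text. Because every record occupies the same number of bytes, the position of any record can be computed directly from its index.

```python
import struct

# Assumed layout: ORDER-# as a 4-byte integer, then fixed-width text fields for
# LAST NAME (20), INITIAL (1), STREET ADDRESS (30), CITY (20), STATE (2), CREDIT CARD (16).
ORDER_RECORD = struct.Struct("<I20s1s30s20s2s16s")


def pack_order(order_no, last_name, initial, street, city, state, credit_card):
    """Encode one order as a fixed-length byte string (short text is null-padded)."""
    text_fields = (last_name, initial, street, city, state, credit_card)
    return ORDER_RECORD.pack(order_no, *(f.encode("ascii") for f in text_fields))


def unpack_order(raw):
    """Decode a fixed-length byte string back into a tuple of field values."""
    order_no, *text_fields = ORDER_RECORD.unpack(raw)
    return (order_no, *(f.rstrip(b"\x00").decode("ascii") for f in text_fields))


# Two made-up orders stored back to back, as they would be in a file.
first = pack_order(1001, "Smith", "J", "12 Main St", "Springfield", "IL", "VISA")
second = pack_order(1002, "Jones", "K", "9 Oak Ave", "Portland", "OR", "MC")
file_bytes = first + second

# Jump straight to record number 1 (the second record) using the fixed size.
record_index = 1
start = record_index * ORDER_RECORD.size
second_again = unpack_order(file_bytes[start:start + ORDER_RECORD.size])

print(ORDER_RECORD.size)  # 93
print(second_again)       # (1002, 'Jones', 'K', '9 Oak Ave', 'Portland', 'OR', 'MC')
```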
KEYS
A key is one of the data items in a record that is used to identify a record. When
a key uniquely identifies a record, it is called a primary key. For example,
ORDER-# can be a primary key because only one number is assigned to each
customer order. In this way, the primary key identifies the real-world entity
(customer order).
Special care must be taken when designing the primary key. Often it is a
sequential number or a sequential number with a self-checking number (called
a check digit) at the end of the digits. At times there is some meaning built into
the primary key, but defining a primary key based on an attribute is considered
a risk. If the attribute changes, the primary key will also change, creating a
dependency between the primary key and the data.
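A sequential number with a check digit can be sketched as follows. The text does not say which check-digit scheme is meant, so this sketch assumes the widely used Luhn algorithm purely as an example; the function names are hypothetical.

```python
def luhn_check_digit(payload: str) -> int:
    """Compute the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk the digits from right to left, doubling every second digit
    # starting with the rightmost one.
    for position, char in enumerate(reversed(payload)):
        digit = int(char)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return (10 - total % 10) % 10


def make_key(sequence_number: int) -> str:
    """Build a primary key: the sequential number followed by its check digit."""
    payload = str(sequence_number)
    return payload + str(luhn_check_digit(payload))


def is_valid_key(key: str) -> bool:
    """Recompute the check digit and compare it with the last digit of the key."""
    return luhn_check_digit(key[:-1]) == int(key[-1])


order_key = make_key(1001)
print(order_key)                # 10017
print(is_valid_key("10017"))    # True
print(is_valid_key("10071"))    # False (two digits swapped by mistake)
```

The check digit carries no meaning about the order itself, so the key does not depend on any attribute that might change; it only helps catch keying errors.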
An example of a primary key based on data is using a state abbreviation for the
state name or an airline luggage code for an airport name. An attribute or a
collection of attributes that can serve as a primary key is called a candidate
key. A primary key should also be minimal: it should contain no more attributes
than are necessary to identify a record.
When it is not possible to identify a record uniquely by using one of the data
items found in a record, a key can be constructed by choosing two or more
data items and combining them. This key is called a concatenated, or
composite, key. When a data item is used as a key in a record, the description
is underlined. Therefore, in the ORDER record (ORDER-#, LAST NAME,
INITIAL, STREET ADDRESS, CITY, STATE, CREDIT CARD), the key is ORDER-#.
If an attribute is a key in another file, it should be underlined with a dashed
line.
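The ideas of candidate keys, minimality, and composite keys can be sketched in Python. The ORDER-LINE records and their field names below are hypothetical and were made up for this illustration. Note the limitation: the check only shows that a set of attributes is unique in the sample data, whereas a real key must be unique for all data the file could ever hold.

```python
from itertools import combinations


def is_key(records, attributes):
    """Return True if the given attributes identify every record uniquely."""
    seen = set()
    for record in records:
        value = tuple(record[a] for a in attributes)
        if value in seen:
            return False
        seen.add(value)
    return True


def minimal_keys(records, attributes):
    """List the smallest attribute combinations that are unique in the sample."""
    found = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            # A superset of a key already found is not minimal, so skip it.
            if any(set(key) <= set(combo) for key in found):
                continue
            if is_key(records, combo):
                found.append(combo)
    return found


# Hypothetical order-line records: no single data item is unique on its own.
order_lines = [
    {"order_no": 1001, "item_no": "A1", "quantity": 2},
    {"order_no": 1001, "item_no": "B7", "quantity": 2},
    {"order_no": 1002, "item_no": "A1", "quantity": 2},
]

keys = minimal_keys(order_lines, ["order_no", "item_no", "quantity"])
print(keys)  # [('order_no', 'item_no')]
```

Here the concatenated key (order_no, item_no) identifies each record, and adding quantity to it would break minimality.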
Descriptive metadata
Descriptive metadata is, in its most simplified version, the identification of
specific data. This often refers to elements like titles, dates, and keywords. When
a user downloads a video file, for example, the runtime of the film would be
descriptive metadata.
Structural metadata
Structural metadata gives information concerning a specific object or resource.
This often relates to a piece of digital media. Here’s an illustrative example: A
film in a DVD format has numerous sections. Each section has a certain length
of film running time, and those sections fit together into the format in a certain
order.
In a broader sense, structural metadata records information about how a
particular object or resource might be sorted.
Preservation metadata
Preservation metadata offers information that can strengthen the entire
procedure of maintaining a certain digital object/file. This information may
include vital details required for a system to communicate or interact with a
specific file.
Preservation metadata upholds the integrity of a digital object or file from its
start to finish, or until it’s no longer in use or necessary.
Administrative metadata
Administrative metadata informs users what types of instructions, rules, and
restrictions are placed on a file. This type of data helps administrators limit
access to files based on user qualifications. Administrative metadata is
comprehensive: it gives information about certain data as a whole, from start to
finish. This gives users a chance to administer a wide variety of data files.
Administrative metadata is like a basic version of a piece of data. Even if a
particular data set is extremely complex, its metadata will be much simpler.
Thus, administrative metadata is about control: managing these complex pieces
and simplifying them for clarity.
Relational databases (the most common type of database) store and provide
access to not only data but also metadata, in a structure called a data
dictionary or system catalog. It holds information about:
tables,
columns,
data types,
constraints,
table relationships,