Files and Databases
Database management systems (DBMS) are collections of tools used to manage databases.
Four basic functions performed by all DBMS are:
Create, modify, and delete data structures, e.g. tables
Add, modify, and delete data – Data Manipulation
Retrieve data selectively
Generate reports based on data
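The four functions above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the customer table, its columns, and the sample values are assumptions, not taken from any particular system.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")

# 1. Create a data structure (a table).
conn.execute(
    "CREATE TABLE customer (customer_no INTEGER PRIMARY KEY, city TEXT, balance REAL)"
)

# 2. Add, modify, and delete data (data manipulation).
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Kingston", 120.0), (2, "Montego Bay", 0.0), (3, "Kingston", 45.5)],
)
conn.execute("UPDATE customer SET balance = 60.0 WHERE customer_no = 3")
conn.execute("DELETE FROM customer WHERE customer_no = 2")

# 3. Retrieve data selectively.
kingston = conn.execute(
    "SELECT customer_no, balance FROM customer WHERE city = ? ORDER BY customer_no",
    ("Kingston",),
).fetchall()

# 4. Generate a simple report (count and total balance per city).
report = conn.execute(
    "SELECT city, COUNT(*), SUM(balance) FROM customer GROUP BY city"
).fetchall()

print(kingston)
print(report)
```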
Field – a single piece of information. It is an area within a record reserved for a specific
piece of data. Examples: customer number, street address, city, current balance etc.
Record – a collection of values for all the fields pertaining to one entity, i.e. anything that
data will be collected on, e.g. person, product, company, transaction etc.
Table/File – A collection of related records. E.g. employee table, product table, customer
table, student table, flight table etc. In a table, records are represented by rows and fields are
represented as columns. In a relational database a table may be referred to as a relation and a
row may be referred to as a tuple.
Database – A collection of related tables. It can also include other objects, such as queries,
forms and reports. The structure of a database is the relationships between its tables.
Entity – a person, place or thing on which data will be collected e.g. student, lecturer,
product, store.
Attribute – a characteristic or property of an entity e.g. First Name, ID No, Product Code,
branch name
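As a rough illustration of how these terms fit together, the sketch below models a customer entity in Python. The class name, field names, and sample values are assumptions made up for the example.

```python
from dataclasses import dataclass, fields


@dataclass
class Customer:
    # Each attribute of the entity becomes a field (a column in the table).
    customer_number: int
    street_address: str
    city: str
    current_balance: float


# A record: one value for every field, describing a single entity.
record = Customer(1001, "12 Hope Road", "Kingston", 250.0)

# A table/file: a collection of related records (the rows).
customer_table = [
    record,
    Customer(1002, "3 Main Street", "Mandeville", 0.0),
]

field_names = [f.name for f in fields(Customer)]  # the column names
cities = [row.city for row in customer_table]     # the values in one column
```

A database would then be a collection of such tables together with the relationships between them.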
Objective 2: Explain how files and databases are used in organizations
Organize: Databases hold information that is useful to an organization and arrange it in a
way that improves how efficiently the organization can respond to requests for data.
Search & Retrieve: Databases allow organizations to locate and retrieve information
quickly using given criteria, such as specific key terms.
Data Mining:
Generally, data mining is the process of analyzing data from different perspectives and
summarizing it into useful information: information that can be used to increase revenue, cut
costs, or both. For example, companies often use Search Engine Optimization (SEO) techniques
to learn about the interests of potential customers. Through data mining, these businesses can
then retrieve and store meaningful information that helps them gain or persuade potential
customers.
Data Mart: This is a small-scale version of a data warehouse. It is better suited to supporting
small organizations with very few departments. A data mart is often built and controlled by a
single department within an organization, e.g. technology, sales, finance, marketing, etc.
Objective 3: Explain how Data Storage & Retrieval have changed over time.
Types of Files
Master Files
This is a long-lived, permanent file. It holds the data that is to be processed: descriptive data and the
updated data after a transaction is completed.
Transaction Files
As its name suggests, this file holds the transactions that are to be carried out on the master file. The data
on this file is used to make changes to the master file. It is temporary rather than permanent.
Extra: Grandfather-Father-Son Relationship
According to pcmag.com, this is a method for storing previous generations of master file
data that are continuously updated. The son is the current file, the father is a copy of the file from the
previous cycle, and the grandfather is a copy of the file from the cycle before that one.
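The rotation of generations can be sketched as follows. This is a minimal illustration, assuming the master file is a simple mapping of account key to balance; the function names and sample data are hypothetical.

```python
def apply_transactions(master, transactions):
    """Return a new master file (key -> balance) with the transactions applied."""
    updated = dict(master)  # the old generation is left untouched
    for key, amount in transactions:
        updated[key] = updated.get(key, 0) + amount
    return updated


def run_cycle(generations, transactions):
    """generations is [son, father, grandfather]; return the list after one cycle."""
    new_son = apply_transactions(generations[0], transactions)
    # The old son becomes the father, the old father becomes the grandfather,
    # and the old grandfather is discarded.
    return [new_son] + generations[:2]


generations = [{"A1": 100}]  # only a son exists before the first cycle
generations = run_cycle(generations, [("A1", -30)])
generations = run_cycle(generations, [("A1", 50), ("B2", 10)])
generations = run_cycle(generations, [("B2", 5)])

son, father, grandfather = generations
```

If the son is lost or corrupted, it could be rebuilt from the father plus the transaction file for the latest cycle, which is the reason for keeping the older generations.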
File Organization
Serial
collection of records
no particular sequence
cannot be used as master file
used as temporary transaction file
records stored in order received
Advantages:
simple file design
can be stored on inexpensive devices
Disadvantages:
entire file must be processed even if only a single record must be accessed
overall processing slow
Sequential
A collection of records
stored in key sequence
adding/deleting record requires making new file
used as master file
Advantages:
simple file design
very efficient when most of records must be processed (like in payroll)
very efficient when the data has a natural order
can be stored on inexpensive devices like magnetic tape
Disadvantages:
entire file must be processed even if only a single record is being searched for
transactions have to be sorted before processing
overall processing slow
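A sequential master file update can be sketched as below: the transactions are sorted into key sequence, every master record is read once, and a new master file is written. The record layout (key, balance) and the sample data are assumptions for illustration.

```python
def sequential_update(master, transactions):
    """Merge a key-ordered master file with a transaction file.

    master: list of (key, balance) tuples already in key sequence.
    transactions: list of (key, amount) tuples in the order received.
    Returns (new_master, unmatched_transactions).
    """
    # Transactions have to be sorted into key sequence before processing.
    transactions = sorted(transactions, key=lambda tr: tr[0])
    new_master, unmatched = [], []
    t = 0
    for key, balance in master:  # the whole master file is read
        # Transactions whose key is not on the master file are set aside.
        while t < len(transactions) and transactions[t][0] < key:
            unmatched.append(transactions[t])
            t += 1
        # Apply every transaction that matches the current master record.
        while t < len(transactions) and transactions[t][0] == key:
            balance += transactions[t][1]
            t += 1
        new_master.append((key, balance))  # written to the new file
    unmatched.extend(transactions[t:])
    return new_master, unmatched


master = [(101, 500.0), (102, 300.0), (104, 50.0)]
transactions = [(104, 25.0), (101, -100.0), (103, 10.0), (101, 20.0)]
new_master, unmatched = sequential_update(master, transactions)
```

Because both files are in the same key order, one pass is enough, which is why this approach suits jobs like payroll where most records are processed.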
Direct/Random
Records are read directly from or written directly to the file
the records are stored at known address
address is calculated by applying a mathematical function to key field
stored on a direct access backing storage medium (example: magnetic disk, CD, DVD)
used in any information retrieval system (example: train timetable system)
Advantages:
Any record can be directly accessed
speed of record processing is very fast
up-to-date file because of on-line updating
concurrent processing is possible
Disadvantages:
more complex than sequential
does not make full use of the available storage locations
more security and backup problems
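The idea of calculating an address from the key field can be sketched as follows. The file size, the modulo hash function, and the use of linear probing to handle two keys that map to the same address are all assumptions chosen for the example; real systems may use other functions and collision strategies.

```python
NUM_SLOTS = 7                # hypothetical number of record slots in the file
slots = [None] * NUM_SLOTS   # each slot holds (key, data) or None if unused


def address(key):
    """Mathematical function applied to the key field to get a slot address."""
    return key % NUM_SLOTS


def write_record(key, data):
    """Store a record at its calculated address, probing forward on a collision."""
    start = address(key)
    for step in range(NUM_SLOTS):
        probe = (start + step) % NUM_SLOTS
        if slots[probe] is None or slots[probe][0] == key:
            slots[probe] = (key, data)
            return probe
    raise RuntimeError("file is full")


def read_record(key):
    """Go straight to the calculated address instead of scanning the file."""
    start = address(key)
    for step in range(NUM_SLOTS):
        probe = (start + step) % NUM_SLOTS
        if slots[probe] is None:
            return None
        if slots[probe][0] == key:
            return slots[probe][1]
    return None


pos_a = write_record(1001, "08:00 departure")
pos_b = write_record(1008, "09:30 departure")   # same calculated address as 1001
pos_c = write_record(1005, "11:15 departure")
unused = slots.count(None)
```

The unused slots left between records illustrate why this organization does not make full use of the storage it reserves.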
Indexed Sequential
each record of a file has a key field which uniquely identifies that record
has an index which consists of keys and addresses
an indexed sequential file is a sequential file which has an index
a full index to a file is one in which there is an index entry for every record
used in applications where data needs to be accessed in either of two ways: randomly and sequentially
random access uses the index
file can be stored on a random access device (example: magnetic disk, CD, DVD)
Advantages:
provides flexibility for users who need both types of access with the same file
faster than sequential
Disadvantages:
extra storage space required for index
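The two kinds of access can be sketched as below. This is a simplified illustration: list positions stand in for disk addresses, and the keys and names are made-up sample data.

```python
import bisect

# Records stored in key sequence; the list position stands in for a disk address.
records = [
    (1003, "Brown"),
    (1010, "Campbell"),
    (1021, "Reid"),
    (1034, "Williams"),
]

# The index: keys and their addresses, one entry per record in this sketch.
index_keys = [key for key, _ in records]
index_addresses = list(range(len(records)))


def read_random(key):
    """Random access: look the key up in the index, then go to that address."""
    i = bisect.bisect_left(index_keys, key)
    if i < len(index_keys) and index_keys[i] == key:
        return records[index_addresses[i]]
    return None


def read_sequential():
    """Sequential access: read every record in key order, ignoring the index."""
    return [name for _, name in records]


found = read_random(1021)
missing = read_random(1022)
names = read_sequential()
```

The separate index lists are the extra storage noted as the disadvantage above.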
Database Types
Personal Database
generally used by one person at a time
can store information for an entire family
things it can store: pictures, music, games etc.
Work Group
shares information across a network
has security and data integrity checks
used by several people (2-25)
greater capacity than personal
provides backup
must be maintained on regular basis
Departmental Database
Work group on a bigger scale
used by 25 - 100 people
even greater capacity than work group
provides backup
maintenance important
Enterprise Database
manages scope of whole organization
backup and maintenance important
used by over 100 people
provides greatest storage capacity
must have fast retrieval speed
Database Organization
Hierarchical
Advantages:
more efficient than flat file model because it reduces redundancy
Disadvantages:
requires expert knowledge to operate
only caters for one-to-many relationship (between parent and child)
changes in structure affect data access
not very flexible
Network
allows for each parent to have multiple children and each child to have multiple parents; hence
facilitating many-to-many relationships
an improvement on hierarchical
Example of Network Database
Advantages:
facilitates many-to-many relationships
more flexible than hierarchical
Disadvantage:
changes in structure affect data access
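The many-to-many links of the network model can be sketched with two lookup tables, one from parent to children and one from child to parents. The lecturer and course names are hypothetical sample data, and this is not how any particular network DBMS stores its links.

```python
from collections import defaultdict

children = defaultdict(set)  # parent record -> its child records
parents = defaultdict(set)   # child record -> its parent records


def link(parent, child):
    """Record a parent/child link so it can be followed in either direction."""
    children[parent].add(child)
    parents[child].add(parent)


link("Lecturer A", "Course IT101")
link("Lecturer A", "Course IT205")
link("Lecturer B", "Course IT101")

# One parent with multiple children.
courses_of_a = sorted(children["Lecturer A"])
# One child with multiple parents, which the hierarchical model does not cater for.
lecturers_of_it101 = sorted(parents["Course IT101"])
```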
Relational
In this model, data is stored in two or more linked tables that are used to track
information.
data represented in terms of tuples grouped into relations
applies relations between tables
facilitates all cardinalities (many-to-many etc.)
has primary key that facilitates relationships
table has name that is distinct from all other tables in the database
no duplicate rows; all rows are distinct
entries in fields (columns) are atomic (no repeating of groups or multi-valued attributes)
each field has distinct name
examples: MS SQL Server, Oracle, MySQL, MS Access
Example of Relational Model
Advantages:
avoid redundancies of information
conceptual simplicity
Structural Independence: Changes in structure do not affect the data access
Design Implementation: Achieves both data independence and structural independence
flexible: data can be manipulated by operators
relations between tables ensure no ambiguity
more efficient than the previous two
can point to specific piece of data directly without going through another piece of data
consistency is achieved by declaring constraints in database design
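A small sketch of linked tables, primary keys, and declared constraints, using Python's built-in sqlite3 module. The student and enrolment tables and their contents are assumptions for illustration; note that SQLite only enforces foreign keys once the PRAGMA shown is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Each table has a distinct name and a primary key; the enrolment table
# links students to courses, which supports a many-to-many relationship.
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, first_name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE enrolment (
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        course_code TEXT NOT NULL,
        PRIMARY KEY (student_id, course_code)
    )
    """
)
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Ann"), (2, "Ben")])
conn.executemany(
    "INSERT INTO enrolment VALUES (?, ?)",
    [(1, "IT101"), (1, "MA102"), (2, "IT101")],
)

# The relationship between the tables is followed with a join on the key.
rows = conn.execute(
    """
    SELECT s.first_name, e.course_code
    FROM student AS s JOIN enrolment AS e ON e.student_id = s.student_id
    ORDER BY s.first_name, e.course_code
    """
).fetchall()


def rejected(sql):
    """Return True if the statement violates a declared constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


duplicate_rejected = rejected("INSERT INTO student VALUES (1, 'Cara')")
orphan_rejected = rejected("INSERT INTO enrolment VALUES (99, 'IT101')")
```

The two rejected inserts show consistency being enforced by the declared constraints rather than by the program that uses the data.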
Object Oriented
Advantages:
Disadvantages: