Files and Databases

UNIT 2- MODULE 1 - INFORMATION MANAGEMENT TERMS
Objective 1: Differentiate among terms used in Information Management
 Database management systems (DBMS) are collections of tools used to manage databases.
Four basic functions performed by all DBMS are:
 Create, modify, and delete data structures, e.g. tables
 Add, modify, and delete data – Data Manipulation
 Retrieve data selectively
 Generate reports based on data
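The four functions above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the customer table, its columns and the sample rows are assumptions made for the example, not part of the syllabus.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")

# 1. Create a data structure (a table).
conn.execute(
    "CREATE TABLE customer ("
    "customer_number INTEGER PRIMARY KEY, city TEXT, current_balance REAL)"
)

# 2. Add, modify, and delete data (data manipulation).
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Kingston", 120.0), (2, "Bridgetown", 0.0), (3, "Kingston", 45.5)],
)
conn.execute("UPDATE customer SET current_balance = 60.0 WHERE customer_number = 3")
conn.execute("DELETE FROM customer WHERE customer_number = 2")

# 3. Retrieve data selectively.
kingston = conn.execute(
    "SELECT customer_number FROM customer WHERE city = ? ORDER BY customer_number",
    ("Kingston",),
).fetchall()

# 4. Generate a simple report based on the data.
report = conn.execute(
    "SELECT city, COUNT(*), SUM(current_balance) FROM customer GROUP BY city"
).fetchall()
```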
 Field – a single piece of information. It is an area within a record reserved for a specific
piece of data. Examples: customer number, street address, city, current balance etc.
 Record – a collection of values for all the fields pertaining to one entity, i.e. anything on which
data will be collected, e.g. a person, product, company or transaction.
 Table/File – A collection of related records. E.g. employee table, product table, customer
table, student table, flight table etc. In a table, records are represented by rows and fields are
represented as columns. In a relational database a table may be referred to as a relation and a
row may be referred to as a tuple.
 Database – A collection of related tables. It can also include other objects, such as queries,
forms and reports. The structure of a database is the relationships between its tables.
 Entity – a person, place or thing on which data will be collected e.g. student, lecturer,
product, store.
 Attribute – a characteristic or property of an entity e.g. First Name, ID No, Product Code,
branch name
Objective 2: Explain how files and databases are used in organizations
 Organize: Databases hold information that is useful to an organization and arrange it in a
way that improves how efficiently the organization can respond to requests for data.
 Store: A computerized database is used to store data in tables.
 Search & Retrieve: Databases allow organizations to locate and retrieve information
quickly through the use of given criteria, such as specific key terms.
 Eliminate Redundancies: Databases help to eliminate redundancies, thus removing
repetition of data.
Data Mining:
Generally, data mining is the process of analyzing data from different perspectives and
summarizing it into useful information: information that can be used to increase revenue, cut
costs, or both. For example, companies often use Search Engine Optimization (SEO) techniques
to learn about the interests of potential customers. Through data mining, these businesses are
then able to retrieve and store meaningful information that helps them to gain or persuade
potential customers.
Data warehousing: The electronic storage of a large amount of information by a business.
Data warehousing is more suitable for supporting an entire corporate environment. Warehoused
data must be stored in a manner that is secure, reliable, easy to retrieve and easy to manage. The
purpose of a data warehouse is to analyze historical data.
Data Mart: This is a smaller-scale version of a data warehouse. It is more suitable for supporting
small organizations with very few departments. A data mart is often built and controlled by a
single department within an organization, e.g. technology, sales, finance or marketing.
Objective 3: Explain how Data Storage & Retrieval have changed over time.
| STORAGE DEVICE | EST. TIME OF INVENTION | GENERAL FACTS | METHOD OF ACCESS | RELATIVE SPEED OF ACCESS | TYPES OF DATA STORED |
|---|---|---|---|---|---|
| PUNCHED CARDS | Around 1725 | Primary method of storage in early 1900's; used for controlling textile looms | Sequential | Ranged from 49-262 cards per minute for some models and 91-355 for other models | Multi-character (words, numbers); text based |
| PUNCH TAPE | Around 1845 | In 1946 used to send telegrams | Sequential | Faster than punched cards; up to 120 characters per second | Binary; multi-character; text-based |
| MAGNETIC TAPE | 1928 | Changed computing landscape by making long-term storage of vast amounts of data possible; portable; cheap | Serial/sequential | 128 characters per inch of tape | Text, sound, video; multimedia |
| AUDIO CASSETTE | 1963 | Originally intended for use in dictation machines; portable; cheap | Serial | 1 7/8 inches per second | Sound files |
| HARD DRIVE | 1950's (1952) | More than 1 TB of storage; there is a fixed hard disk and a removable hard disk | Direct/random | 12 ms | Multimedia |
| CD/DVD (OPTICAL DISC) | 1980 (CD); 1995 (DVD) | CD stores up to 700 MB; DVD stores 4.7 GB up to 17 GB; uses red laser; CD holds up to 90 minutes of audio | Direct/random | 300 ms | Multimedia; mainly video/sound files |
| BLU-RAY DISC | Between 2003 | Next generation optical disc; for HD video; holds about 50 GB; uses blue laser instead of red | Direct/random | 120 ms | Multimedia; mainly for video/audio |
| HOLOGRAPHIC VIDEO DISC | Between 2004 - 2006 | Optical disc that stores several terabytes of data; uses green and red laser beams; currently still in development stage | Direct/random | Faster than Blu-ray and CD/DVD | Multimedia |
| CLOUD STORAGE | 2000's | Stores extremely large volumes of data; relatively inexpensive; virtually unlimited storage; accessible via internet; no physical presence; limited by bandwidth | Direct/random | 10 to 100 seconds | Multimedia |
Objective 4: Explain the advantages of using a database approach compared to using
traditional file processing
| CHARACTERISTIC | FILE | DATABASE |
|---|---|---|
| SPEED | Access speed is slower due to fixed path of storage | Access speed is faster due to central storage location |
| EFFICIENCY | Not a very efficient approach when speed, data quality, data handling and processing are considered | Much more efficient approach than traditional file approach |
| COST | Cost of developing and maintaining is higher; initial cost is lower | Cost of developing and maintaining is lower; initial cost is higher |
| DATA QUALITY | General quality encompasses completeness, validity, consistency, timeliness and accuracy; if these are good, quality is good; general quality is lower than with the database approach | Quality is better than with the traditional file approach |
| COMPLETENESS | No way to ensure data is complete | Validations can be written to ensure that data is complete; fields that are primary keys must be present |
| VALIDITY | No validation checks | Has validation checks; validations can be written |
| CONSISTENCY | Has inconsistent and redundant data | Inconsistency can be avoided because mechanisms are in place to ensure data integrity (referential integrity), control redundancies and keep data consistent; once redundancies are reduced, data is more consistent |
| TIMELINESS | Data may not necessarily be timely; it is not easily updated, so updated data would not be available in a timely manner | Timeliness of data is better than in a file system |
| ACCURACY | Stores anything/everything; lacks accuracy | Has validation checks; data integrity (referential integrity); data can be normalized |
| RETRIEVAL | Fixed order of retrieving data; functionality not offered | Multiple ways of accessing data; central storage location makes retrieval easy; faster access than file system |
| DATA HANDLING | Not many features in place for data handling; limited/little control for management of data | Queries can be written to get information on data; data can be accessed through forms and reports; data is generally stored in tables and well organized |
| DATA PROCESSING | Not many features in place for data processing; limited/little control for management of data | Reports help with data processing; built-in functions; forms help input data; easily updated |
| SECURITY | Usernames; passwords; encryption | Same as files, though generally with better security; macros can be written to further ensure different levels of access to data |
| CONCURRENT ACCESS | Generally does not allow concurrent access | Allows multiple users to log on to the system |
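The completeness, validity and consistency rows of the comparison can be illustrated with a small sketch using Python's sqlite3 module. The department and employee tables, their columns and the try_insert helper are hypothetical names chosen for this example; a real DBMS would offer many more kinds of validation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,                                   -- completeness
        salary  REAL CHECK (salary >= 0),                        -- validity
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)  -- referential integrity
    )
""")
conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'Ann', 2500.0, 1)")

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

results = {
    "missing name": try_insert((2, None, 1000.0, 1)),
    "negative salary": try_insert((3, "Bob", -5.0, 1)),
    "unknown department": try_insert((4, "Cy", 900.0, 99)),
    "valid row": try_insert((5, "Dee", 1800.0, 1)),
}
```

Only the last row is stored; the other three are rejected by the database itself, which a plain file cannot do without extra program code.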
Objective 5: Describe the different types and organization of files and databases
Types of Files
Master Files
This is a long-lived, permanent file. It holds the data that is to be processed: descriptive data and
the updated data after a transaction is completed.
Transaction Files
As its name suggests, this file holds the transactions that are to be carried out on the master file.
The data in this file makes changes to the master file. It is temporary rather than permanent.
Extra: Grandfather-Father-Son Relationship
According to pcmag.com, this is a method for storing previous generations of master file
data that are continuously updated. The son is the current file, the father is a copy of the file from the
previous cycle, and the grandfather is a copy of the file from the cycle before that one.
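A minimal sketch of the grandfather-father-son idea, assuming the master file can be modelled as a dictionary of account balances and each transaction file as a list of (account, amount) pairs. The function names and sample data are hypothetical.

```python
def apply_transactions(master, transactions):
    """Return a new master file after applying a transaction file to the old one."""
    updated = dict(master)  # copy, so the previous generation is left untouched
    for account, amount in transactions:
        updated[account] = updated.get(account, 0) + amount
    return updated

def rotate(generations, new_master):
    """Keep at most three generations, ordered [son, father, grandfather]."""
    return ([new_master] + generations)[:3]

generations = [{"A": 100, "B": 50}]  # initial master file
for batch in ([("A", -20)], [("B", 30), ("C", 10)], [("A", 5)]):
    newest = apply_transactions(generations[0], batch)
    generations = rotate(generations, newest)

son, father, grandfather = generations
```

If the son is damaged, it can be rebuilt from the father and the most recent transaction file, which is why the older generations are kept.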
File Organization
Serial
 collection of records
 no particular sequence
 cannot be used as master file
 used as temporary transaction file
 records stored in order received
Advantages:
 simple file design
 can be stored on inexpensive devices
Disadvantages:
 entire file must be processed even if single record must be accessed
 overall processing slow
Sequential
 A collection of records
 stored in key sequence
 adding/deleting record requires making new file
 used as master file
Advantages:
 simple file design
 very efficient when most of records must be processed (like in payroll)
 very efficient; data has natural order
 can be stored on inexpensive devices like magnetic tape
Disadvantages:
 entire file must be processed even if single record is to be searched
 transactions have to be sorted before processing
 overall processing slow
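The points about sorting transactions and making a new file can be sketched as a merge of two key-ordered lists. This is an illustration only: the record layout and the action names "add", "change" and "delete" are assumptions made for the example.

```python
def sequential_update(master, transactions):
    """Merge a key-ordered master file with transactions into a new master file.

    master: list of (key, value) in ascending key order.
    transactions: list of (key, action, value); action is "add", "change"
    or "delete" (hypothetical action names).
    """
    new_master = []
    pending = sorted(transactions, key=lambda t: t[0])  # transactions must be sorted first
    i = 0
    for key, value in master:
        # Write any new records whose keys come before the current master record.
        while i < len(pending) and pending[i][0] < key:
            if pending[i][1] == "add":
                new_master.append((pending[i][0], pending[i][2]))
            i += 1
        keep = True
        # Apply transactions that match the current master record.
        while i < len(pending) and pending[i][0] == key:
            if pending[i][1] == "change":
                value = pending[i][2]
            elif pending[i][1] == "delete":
                keep = False
            i += 1
        if keep:
            new_master.append((key, value))
    # Remaining additions belong after the last master record.
    new_master.extend((k, v) for k, a, v in pending[i:] if a == "add")
    return new_master

old = [(101, "Ann"), (103, "Bob"), (107, "Cy")]
txns = [(107, "delete", None), (102, "add", "Dee"),
        (103, "change", "Rob"), (110, "add", "Eve")]
new = sequential_update(old, txns)
```

Every master record is read once and copied to the new file, which is efficient when most records change (as in payroll) but wasteful when only one does.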
Direct/Random
 Records are read directly from or written directly to the file
 the records are stored at known address
 address is calculated by applying a mathematical function to key field
 stored on a direct access backing storage medium (example: magnetic disk, CD,DVD)
 used in any information retrieval system (example: train timetable system)
Advantages:
 Any record can be directly accessed
 speed of record processing is very fast
 up-to-date file because of on-line updating
 concurrent processing is possible
Disadvantages:
 more complex than sequential
 does not fully use memory location
 more security and backup problems
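A rough sketch of direct access, assuming fixed-length records and a simple "key modulo number of slots" hash as the mathematical function. The record layout, the slot count and the function names are assumptions for the example, and collisions (two keys hashing to the same address) are deliberately not handled.

```python
import io
import struct

RECORD = struct.Struct("<i16s")  # key (4 bytes) + name (16 bytes): fixed-length record
SLOTS = 11                       # number of addresses in the file (assumed)

def address(key):
    """Hash function: map a key field to a slot number."""
    return key % SLOTS

def write_record(f, key, name):
    f.seek(address(key) * RECORD.size)  # jump straight to the calculated address
    f.write(RECORD.pack(key, name.encode()))

def read_record(f, key):
    f.seek(address(key) * RECORD.size)
    stored_key, name = RECORD.unpack(f.read(RECORD.size))
    return name.rstrip(b"\0").decode() if stored_key == key else None

f = io.BytesIO(bytes(RECORD.size * SLOTS))  # stands in for a file on disk
write_record(f, 1042, "Ann")
write_record(f, 2077, "Bob")
```

Only two of the eleven slots are occupied here, which illustrates the disadvantage that storage locations are not fully used.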
Indexed Sequential
 each record of a file has a key field which uniquely identifies that record
 has an index which consists of keys and addresses
 an index sequential file is a sequential file which has an index
 suited to applications where data needs to be accessed in either of two ways: sequentially,
or randomly using the index
 file can be stored on a random access device (example: magnetic disks, CD, DVD)
Advantages:
 provides flexibility for users who need both types of access with the same file
 faster than sequential
Disadvantages:
 extra storage space required for index
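A small sketch of the two kinds of access, assuming the records are held in key order in a list and the index stores each key against its position in that list. The names and sample records are hypothetical.

```python
import bisect

# Records kept in key sequence (the sequential part).
records = [(101, "Ann"), (104, "Bob"), (109, "Cy"), (115, "Dee")]

# The index: keys in order; a key's position here is its address in `records`.
index_keys = [key for key, _ in records]

def random_read(key):
    """Use the index to go directly to one record."""
    pos = bisect.bisect_left(index_keys, key)
    if pos < len(index_keys) and index_keys[pos] == key:
        return records[pos]
    return None

def sequential_read(start_key):
    """Use the index to find a starting point, then read on in key order."""
    pos = bisect.bisect_left(index_keys, start_key)
    return records[pos:]
```

The index_keys list is the extra storage mentioned under disadvantages: it duplicates every key so that lookups do not have to scan the records themselves.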
Database Types
Personal Database
 generally used by one person at a time
 can store information for an entire family
 things it can store: pictures, music, games etc.
Work Group
 shares information across a network
 has security and data integrity checks
 used by several people (2-25)
 greater capacity than personal
 provides backup
 must be maintained on regular basis
Departmental Database
 Work group on a bigger scale
 used by 25 - 100 people
 even greater capacity than work group
 provides backup
 maintenance important
Enterprise Database
 manages scope of whole organization
 backup and maintenance important
 used by over 100 people
 provides greatest storage capacity
 must have fast retrieval speed
Database Organization
Hierarchical
 This model defines hierarchically arranged data
 Systems that use this model: Windows-based directory management systems (like Windows
Explorer)
 Relationships are thought of in terms of children and parents, where a child has only one parent
and a parent can have multiple children
 Children and parents are related by links called pointers. Pointers are physical addresses inside
file systems.
 Parent has list of pointers to each of their children
Advantages:
 more efficient than flat file model because of reduced redundancy
Disadvantage:
 must be an expert to operate
 only caters for one-to-many relationship (between parent and child)
 changes in structure affect data access
 not very flexible
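The parent and child links can be sketched with a small class, using a folder tree as the example since directory systems are named above. The Node class and the folder names are hypothetical; Python object references stand in for the physical pointers.

```python
class Node:
    """A record in a hierarchy: at most one parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent   # a child has only one parent
        self.children = []     # the parent keeps a list of pointers to its children
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Walk up the parent links to build the full path from the root."""
        node, parts = self, []
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))

root = Node("C:")
docs = Node("Documents", root)
pics = Node("Pictures", root)
report = Node("report.txt", docs)
```

Reaching report.txt always means going through Documents, which is why a change in structure affects how data is accessed.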
Network
 allows for each parent to have multiple children and each child to have multiple parents; hence
facilitating many-to-many relationships
 an improvement on hierarchical
Example of Network Database
Advantage:
 facilitates many-to-many relationships
 more flexible than hierarchical
Disadvantage:
 changes in structure affect data access
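A minimal sketch of the many-to-many idea, assuming links are stored in two dictionaries so they can be followed in either direction. The lecturer and student names are invented for the example.

```python
from collections import defaultdict

children = defaultdict(set)  # parent -> set of its children
parents = defaultdict(set)   # child  -> set of its parents

def link(parent, child):
    """Record a link so it can be followed from either end."""
    children[parent].add(child)
    parents[child].add(parent)

link("Lecturer Brown", "Student Ann")
link("Lecturer Brown", "Student Bob")
link("Lecturer Green", "Student Ann")
```

Here Student Ann has two parents, something the hierarchical model does not allow.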
Relational
 This model is defined as a model in which two or more linked tables are used to track
information. Data is stored in tables (more than one table).
 data represented in terms of tuples grouped into relations
 applies relations between tables
 facilitates all cardinalities (many-to-many etc.)
 has primary key that facilitates relationships
 table has name that is distinct from all other tables in the database
 no duplicate rows; all rows are distinct
 entries in fields (columns) are atomic (no repeating of groups or multi-valued attributes)
 each field has distinct name
 examples: MS SQL Server, Oracle, MySQL, MS Access
Example of Relational Model
Advantages:
 avoid redundancies of information
 conceptual simplicity
 Structural Independence: Changes in structure do not affect the data access
 Design Implementation: Achieves both data independence and structural independence
 flexible: data can be manipulated by operators
 relations between tables ensure no ambiguity
 more efficient than previous two
 can point to specific piece of data directly without going through another piece of data
 consistency is achieved by declaring constraints in database design
Object Oriented
 data modelling centres around objects and classes
 a class is an entity that has a well-defined role in the application domain, as well as state,
behaviour and identity; example: tangible person, conceptual department
 an object is a particular instance of a class; an object consists of attributes and methods
 involves inheritance: an object or class is based on another object or class and
inherits the characteristics of the original object or class
 an object automatically inherits the data attributes and characteristics of the class from which it is formed
 allows for reusability
 facilitates all cardinalities
Advantages:
 able to tackle challenging problems
 improved communication between users, analysts, designers and programmers
 increased consistency in analysis, design and programming
 reduced maintenance cost due to reusability
 improved reliability
 improved flexibility
Disadvantages:
 added cost of training; not just anyone can use it
 inability to work with different systems
 inadequate for concurrent problems
 data and operations are separated
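Classes, objects and inheritance can be sketched in a few lines of Python. The Person and Lecturer classes and their attributes are illustrative assumptions, loosely based on the entities used as examples earlier in these notes.

```python
class Person:
    """A class: attributes (state) plus methods (behaviour)."""

    def __init__(self, id_no, first_name):
        self.id_no = id_no
        self.first_name = first_name

    def describe(self):
        return f"{self.first_name} ({self.id_no})"


class Lecturer(Person):
    """Inherits the attributes and methods of Person and adds its own."""

    def __init__(self, id_no, first_name, department):
        super().__init__(id_no, first_name)  # reuse the parent's attributes
        self.department = department

    def describe(self):
        return f"{super().describe()}, {self.department}"


student = Person(1, "Ann")                  # an object is an instance of a class
lecturer = Lecturer(2, "Bob", "Computing")
```

Lecturer reuses everything defined in Person, which is the reusability that reduces maintenance cost.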
Key
Table name: PAINTER
Primary key: Painter_Num
Foreign Key: None
| Painter_Num | Painter_Lname | Painter_Fname | Painter_Initial |
|---|---|---|---|
| 123 | Ross | Georgette | P |
| 126 | Itero | Julio | G |
Table name: PAINTING
Primary key: Painting_Num
Foreign Key: Painter_Num
| Painting_Num | Painting_Title | Painter_Num |
|---|---|---|
| 1338 | Dawn Thunder | 123 |
| 1339 | Vanilla Roses Nowhere | 123 |
| 1340 | Tired Flounders | 126 |
| 1341 | Hasty Exit | 123 |
| 1342 | Plastic Paradise | 126 |
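The way the foreign key Painter_Num links PAINTING back to PAINTER can be sketched with Python's sqlite3 module. The table names, column names and rows are taken from the two tables above; the column types and the query itself are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute(
    "CREATE TABLE PAINTER (Painter_Num INTEGER PRIMARY KEY, "
    "Painter_Lname TEXT, Painter_Fname TEXT, Painter_Initial TEXT)"
)
conn.execute(
    "CREATE TABLE PAINTING (Painting_Num INTEGER PRIMARY KEY, Painting_Title TEXT, "
    "Painter_Num INTEGER REFERENCES PAINTER(Painter_Num))"
)
conn.executemany(
    "INSERT INTO PAINTER VALUES (?, ?, ?, ?)",
    [(123, "Ross", "Georgette", "P"), (126, "Itero", "Julio", "G")],
)
conn.executemany(
    "INSERT INTO PAINTING VALUES (?, ?, ?)",
    [(1338, "Dawn Thunder", 123), (1339, "Vanilla Roses Nowhere", 123),
     (1340, "Tired Flounders", 126), (1341, "Hasty Exit", 123),
     (1342, "Plastic Paradise", 126)],
)

# Join the two tables on the shared key to list one painter's paintings.
titles_by_itero = [
    row[0]
    for row in conn.execute(
        "SELECT Painting_Title FROM PAINTING JOIN PAINTER USING (Painter_Num) "
        "WHERE Painter_Lname = ? ORDER BY Painting_Num",
        ("Itero",),
    )
]
```

The painter's name is stored once in PAINTER and only the number is repeated in PAINTING, which is how the relational model avoids redundant information.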
