Assignment No 2: Name:: Zaheer Atta Reg No:: 16-Arid-5023


Assignment No 2

Name :: Zaheer Atta


Reg No :: 16-Arid-5023
Domain:

 National Health Information System


The National Health Information System is our domain. A data warehouse will be
developed for it that holds data about hospitals, clinics, health institutes and
the health of the general public. Health is an essential need of life, and people
currently suffer from various diseases such as COVID, so they should be able to
find information about a disease and its treatment centers or procedures. The
system can show disease rates and health issues from the past, and future
precautions can be based on that history. Data will be collected from hospitals,
clinics and health centers. The collected data will be useful in the future
because the system will hold information about a particular region or a
particular disease, which makes it easier to plan precautions against a disease
or health issue. Further details follow.
Duration:
The historical data for the national health information system will cover 10
years. Data from the past 10 years will be collected from various health systems,
hospitals, clinics and health institutes. Because the collection is limited to
the past ten years, the data will not be outdated and will remain useful.
Factors:
The following factors lead us to collect historical data for the past ten
years:
 Health is a basic need for every human.
 Hospitals, clinics and health centers exist in large numbers.
 Data about human health is of huge volume because many people suffer from
health issues.
 Hospitals and clinics keep daily records, which add up to a huge amount of
data, so collecting the past ten years of data gives the data warehouse a
complete basis for analysing the health of people living in a certain
region.
 Because health is a basic issue of human life, it is important to collect
data in large volume in order to develop the best possible health system.
Assignment No 1
Summary:-
Not all very large databases are data warehouses (DWs), but all data warehouses
are fairly large databases. Nowadays a warehouse is considered to start at around
a TB and go up to several PB. It spans several servers and needs an impressive
amount of computing power. More specifically, it is a collective data repository
that contains snapshots of the operational data, is obtained through data
cleansing, and is useful for analytics.
Compared to other solutions it...
Ralph Kimball: "a copy of transaction data specifically structured for query and
analysis."
Bill Inmon: "A data warehouse is a subject-oriented, integrated, non-volatile,
time-variant collection of data in support of management's decisions."
Subject oriented
The data in the DW is organized in such a way that all the data elements relating
to the same real-world event or object are linked together.
Integrated
The DW contains data from most or all of the organization's operational systems,
and this data is made consistent, e.g. gender, measurement, conflicting keys,
consistency, ...
Non-volatile
Data in the DW is hardly ever overwritten or deleted. Once committed, the data is
static, read-only, and retained for future reporting. Data is loaded, but not
updated. When subsequent changes occur, a new version or snapshot record is
written.
Time-varying
The changes to the data in the DW are tracked and recorded so that reports show
changes over time. Different environments have different time horizons associated
with them: while a 60-to-90 day time horizon is normal for operational systems,
DWs have a 5-to-10 year horizon.
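The non-volatile and time-varying properties can be illustrated with a small sketch. It is not taken from the assignment: the `Snapshot` and `SnapshotTable` classes, the field names and the sample cities are all hypothetical. It only shows the idea that a change is stored as a new snapshot row, so that earlier states stay available for reporting.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Snapshot:
    """One immutable version of a record (all field names are hypothetical)."""
    patient_id: int
    city: str
    valid_from: date


class SnapshotTable:
    """Append-only table: rows are added, never updated or deleted."""

    def __init__(self):
        self._rows = []

    def load(self, snapshot):
        # A change in the source system becomes a NEW row, not an overwrite.
        self._rows.append(snapshot)

    def as_of(self, patient_id, day):
        """Return the version that was valid on the given day, or None."""
        candidates = [
            r for r in self._rows
            if r.patient_id == patient_id and r.valid_from <= day
        ]
        return max(candidates, key=lambda r: r.valid_from, default=None)

    def history(self, patient_id):
        """Return every stored version of the record, oldest first."""
        return sorted(
            (r for r in self._rows if r.patient_id == patient_id),
            key=lambda r: r.valid_from,
        )


table = SnapshotTable()
table.load(Snapshot(1, "Rawalpindi", date(2015, 3, 1)))
table.load(Snapshot(1, "Lahore", date(2019, 7, 15)))  # patient moved

print(table.as_of(1, date(2016, 1, 1)).city)  # Rawalpindi
print(table.as_of(1, date(2020, 1, 1)).city)  # Lahore
print(len(table.history(1)))                  # 2
```

Both versions remain in the table, so a report can be run against either point in time.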
Represents front-end analytics based on a DW repository. Used for reporting, and
it is decision oriented.
Properties of Operational Data Stores and DWs
A DW is the base repository for front-end analytics: OLAP, knowledge discovery in
databases (KDD) and data mining.
The results are used for data visualization and reporting.
As a form of information processing, OLAP needs to provide timely, accurate and
understandable information. 'Timely' is however a relative term...
In OLTP we expect a query or update to go through in a matter of seconds. In OLAP
answering a query can take minutes, hours or even longer.
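The difference between the two workloads can be sketched with Python's built-in `sqlite3` module. The `visits` table, its columns and its rows are invented for illustration only. The OLTP-style statement touches one row by key, while the OLAP-style query scans and aggregates many rows, which is why its running time grows with the size of the warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical fact table; the schema is an assumption for this sketch.
conn.execute(
    "CREATE TABLE visits (id INTEGER PRIMARY KEY, region TEXT, "
    "year INTEGER, disease TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO visits (region, year, disease, status) VALUES (?, ?, ?, ?)",
    [
        ("North", 2019, "flu", "closed"),
        ("North", 2020, "COVID", "open"),
        ("South", 2020, "COVID", "closed"),
        ("South", 2020, "flu", "closed"),
    ],
)

# OLTP style: a short update of a single row identified by its key.
conn.execute("UPDATE visits SET status = 'closed' WHERE id = ?", (2,))

# OLAP style: read and aggregate many rows to answer an analytical question.
cases_per_region_year = conn.execute(
    "SELECT region, year, COUNT(*) FROM visits "
    "GROUP BY region, year ORDER BY region, year"
).fetchall()
print(cases_per_region_year)
# [('North', 2019, 1), ('North', 2020, 1), ('South', 2020, 2)]
```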
KDD and data mining construct models of the data in question. Models can be seen
as high-level summaries of the underlying data, e.g. "customers older than 35
having at least 1 child and driving a minivan usually spend more than 100 for
grocery shopping".
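A rule like the one quoted above can be checked against data by measuring how often it holds among the customers it applies to. The sketch below does this on a handful of made-up records. The field names and values are assumptions, and a real data mining tool would discover such rules itself rather than test one that is given.

```python
# Made-up customer records; every value here is illustrative only.
customers = [
    {"age": 42, "children": 2, "car": "minivan", "grocery_spend": 140},
    {"age": 38, "children": 1, "car": "minivan", "grocery_spend": 120},
    {"age": 51, "children": 3, "car": "minivan", "grocery_spend": 90},
    {"age": 29, "children": 0, "car": "sedan", "grocery_spend": 60},
    {"age": 45, "children": 0, "car": "sedan", "grocery_spend": 150},
]


def matches_condition(customer):
    """Condition part of the rule: older than 35, at least 1 child, minivan."""
    return (
        customer["age"] > 35
        and customer["children"] >= 1
        and customer["car"] == "minivan"
    )


def rule_confidence(rows):
    """Share of matching customers who spend more than 100 on groceries."""
    covered = [c for c in rows if matches_condition(c)]
    if not covered:
        return None  # the rule applies to nobody, so it cannot be rated
    hits = sum(1 for c in covered if c["grocery_spend"] > 100)
    return hits / len(covered)


confidence = rule_confidence(customers)
print(f"{confidence:.2f}")  # 0.67
```

Here the rule covers three customers and holds for two of them, which matches the word "usually" in the quoted model.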
Users of a DW are called decision support system (DSS) analysts and usually have
a business background. Their primary job is to define and discover information
used in corporate decision-making.
They think along the lines of "Show me what I say I want... and then I can tell
you what I really want", and they work in an explorative manner. A typical
explorative line of work is "When I see what the possibilities are, I can tell
what I really need to see."
Although it's a common OLTP query, it might be too complex for the operational
environment to handle.
OLAP is the class of analytical operations that run on the DW.
Data warehousing overview: simplified, a data warehouse is a collective data
repository built for analytical tasks. Data is extracted from the operational
environment, transformed and finally loaded into the DW. Typical usage scenarios
of a DW are budgeting, resource planning and marketing.
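The extract, transform and load steps described above can be sketched as three small functions. Everything in the sketch is assumed for illustration: the two source systems, the field names, the gender code table and the sample rows. The transform step makes gender codes and weight units consistent, which is the kind of integration work the summary mentions.

```python
# Hypothetical mapping from source-specific gender codes to one DW code.
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}
KG_PER_LB = 0.45359237


def extract():
    """Stand-in for reading from two operational systems (made-up rows)."""
    hospital_a = [{"patient": "A-1", "gender": "male", "weight": "70 kg"}]
    clinic_b = [{"patient": "B-7", "gender": "F", "weight": "154 lb"}]
    return hospital_a + clinic_b


def transform(row):
    """Bring one source row into the warehouse's consistent format."""
    value, unit = row["weight"].split()
    kilograms = float(value) * (KG_PER_LB if unit == "lb" else 1.0)
    return {
        "patient": row["patient"],
        "gender": GENDER_CODES[row["gender"].strip().lower()],
        "weight_kg": round(kilograms, 1),
    }


def load(warehouse, rows):
    """Append the cleaned rows; existing warehouse rows are left untouched."""
    warehouse.extend(rows)


warehouse = []
load(warehouse, [transform(r) for r in extract()])
for row in warehouse:
    print(row)
```

After the run both rows use the same gender code set and the same unit, so they can be analysed together.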

Comparisons Explanation:-

ODS vs DW:
An ODS mostly performs updates, while a DW mostly reads data.
An ODS handles many small transactions through which data is gathered, while a DW
handles few but complex queries.
Data size ranges from MB to TB in an ODS, while in a DW it ranges from GB to PB.
An ODS contains raw data, while a DW holds only summarized data: the data is
refined and only the useful part is kept.
An ODS is managed by clerical staff and holds up-to-date data, while a DW is used
by decision makers or managers, and the data in a DW may be outdated or old.
Classical SDLC vs. DW SDLC:-
Classical SDLC: classical system development life cycle.
It first gathers the requirements for the system so that all necessary
requirements are available and known. Next, the gathered requirements are
analysed so that the system can be designed on the basis of those requirements
and needs. The system is then programmed according to the design and analysis.
After programming is completed, the system is tested to find anything that needs
to be updated or validated. Next the system is integrated, and finally it is
implemented so that all requirements are fulfilled.
DW SDLC: It is almost the opposite of the classical SDLC. The warehouse is
implemented at the very first stage, and then the data is integrated and
coordinated. Next the warehouse is tested for biased data and for performance,
and programs are then written against the data, which means that programming
depends on the data and not on requirements or analysis. After that the design
is finalized, so the warehouse is designed according to the data. The results
are analysed and checked so that improvements can be made easily. At the end the
data warehouse requirements are understood, so that any update or improvement
can be made.
