Information-Management-Reviewer

Information Management Reviewer

Module 4: Database Application Development (skipping ahead to XML)

Extensible Markup Language (XML)


• XML addresses the issue of representing data in a structure and format that can both be
exchanged over the Internet and be interpreted by different components
• XML does not replace Hypertext Markup Language (HTML), but it works with HTML to facilitate
the transfer, exchange, and manipulation of data
• XML uses tags, short descriptions enclosed in angle brackets (< >), to characterize data. The use
of angle brackets in XML is like their use for HTML tags. The difference is that HTML tags describe
the appearance of content, while XML tags describe the content, or data, itself.
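As a rough illustration of tags describing the data itself, the sketch below parses a small XML document with Python's standard `xml.etree.ElementTree` module. The document and its tag names (`customers`, `customer`, `name`, `city`) are made up for this example and are not from the module.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: the tag names are invented for illustration.
# Each tag says what the data IS (a name, a city), not how it should look.
XML_TEXT = """
<customers>
  <customer id="101">
    <name>Ana Cruz</name>
    <city>Cebu</city>
  </customer>
  <customer id="102">
    <name>Ben Reyes</name>
    <city>Davao</city>
  </customer>
</customers>
"""


def read_customers(text):
    """Parse the XML text and return one dictionary per <customer> element."""
    root = ET.fromstring(text.strip())
    rows = []
    for cust in root.findall("customer"):
        rows.append({
            "id": cust.get("id"),            # attribute on the tag
            "name": cust.findtext("name"),   # text inside the <name> child
            "city": cust.findtext("city"),   # text inside the <city> child
        })
    return rows


customers = read_customers(XML_TEXT)
for row in customers:
    print(row["id"], row["name"], row["city"])
```

Because the tags name the data, any program that knows the tag names can pull out the values without caring how they will be displayed.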

Three main techniques are used to validate that an XML document is structured correctly:

1. Document structure declarations (DSDs)
2. XML Schema Definition (XSD)
3. RELAX NG

All of these are alternatives to document type declarations (DTDs).
DTDs were included in the first version of XML but have some limitations.
They cannot specify data types and are written in their own language, not in XML.
In addition, DTDs do not support some newer features of XML, such as namespaces.

Module 5: Data Warehousing

A data warehouse is described as a collection of data designed to support management
decision-making. It’s organized around key subjects like customers or products, integrated from
various sources, time-variant for trend analysis, and non-updateable by end users.

Need for Data Warehousing: The necessity arises from the need for a company-wide view of high-
quality information and the separation of informational from operational systems to enhance
performance in managing company data.
• An operational system is a system that is used to run a business in real time, based on current
data.
• Informational systems are designed to support decision making based on historical point-in-time
and prediction data.
Data Warehouse Architectures: Common architectures include independent data marts, dependent data
marts with operational data stores, and real-time data warehouses. They differ in their structures
and purposes.

A star schema is a simple database design (particularly suited to ad hoc queries) in which dimensional
data (describing how data are commonly aggregated for reporting) are separated from fact or event
data (describing business activity). A star schema is one version of a dimensional model.

• Components: It consists of a central fact table and one or more dimension tables.
• Fact Table: Holds quantitative data like units sold or orders booked, which are numerical and
additive.
• Dimension Tables: Contain descriptive data that provide context for facts, often used in reports
and queries.
• Data Mart Usage: A data mart may have multiple star schemas, sharing dimension tables but
with distinct fact tables.
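The star schema components above can be sketched with Python's built-in `sqlite3` module. This is only a minimal sketch: the table names (`fact_sales`, `dim_product`, `dim_period`), the columns, and the sample rows are all assumptions made up for illustration.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
star_conn = sqlite3.connect(":memory:")

# Hypothetical star schema: two dimension tables around one central fact table.
star_conn.executescript("""
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
CREATE TABLE dim_period (
    period_id INTEGER PRIMARY KEY,
    year      INTEGER,
    quarter   INTEGER
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    period_id  INTEGER REFERENCES dim_period(period_id),
    units_sold INTEGER,              -- numerical, additive fact
    PRIMARY KEY (product_id, period_id)
);
INSERT INTO dim_product VALUES
    (1, 'Desk', 'Furniture'), (2, 'Chair', 'Furniture'), (3, 'Lamp', 'Lighting');
INSERT INTO dim_period VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO fact_sales VALUES (1, 1, 10), (2, 1, 25), (3, 1, 5), (1, 2, 12), (3, 2, 8);
""")


def units_by_category(connection):
    """Ad hoc query: add up the fact, grouped by a dimension attribute."""
    return connection.execute("""
        SELECT p.category, SUM(f.units_sold)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()


totals = units_by_category(star_conn)
print(totals)  # [('Furniture', 47), ('Lighting', 13)]
```

The fact table holds only keys and the additive measure, while the descriptive text lives in the dimension tables, so a report can group the same facts by any dimension attribute.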

Steps in Designing a Data Mart

1. Designing
• Covers all the tasks from initiating the request for a data mart to gathering
information about the requirements. The phase ends with creating the logical and physical
data mart design.
2. Constructing
• This is the second phase of implementation. It involves creating the physical database
and the logical structures.
3. Populating
• In the third phase, the data mart is populated with data.
4. Accessing
• The fourth step involves putting the data to use: querying the data, creating reports
and charts, and publishing them. End users submit queries to the database
and display the results of the queries.
5. Managing
• This is the last step of the data mart implementation process. This step covers
management tasks such as:
▪ Ongoing user access management
▪ System optimization and fine-tuning to achieve better performance
▪ Adding and managing fresh data in the data mart
▪ Planning recovery scenarios and ensuring system availability in case
the system fails
Exercise:
Briefly define the following:

1. Data Warehouse: A centralized, integrated collection of data that supports management
decision-making and business intelligence. It is subject-oriented,
integrated, time-variant, and non-updateable.

2. Data Mart: A subset of a data warehouse, customized for the decision-making needs
of a specific user group.

3. Star Schema: A database design that separates dimensional data (descriptive) from fact
data (quantitative), making it suitable for ad hoc queries.

4. Fact Tables: Tables in a star schema that contain quantitative data about a business,
such as sales or transactions, linked to dimension tables with descriptive
data.

Module 6: Data Quality and Integration

Data Governance
• Purpose – ensures data within an organization is managed with a focus on availability, integrity
and regulatory compliance.
• System – it is also a system that defines authority and usage of data assets involving people,
processes, and technologies.
• Protection – aims to manage and safeguard data assets effectively.
• Goal – provide transparency, both within the organization and externally to regulators, and
increase the value of the data maintained by the organization.
Most commonly involved stakeholders in Data Governance:
• Data Owners
o Decision-makers responsible for data at an entity or attribute level, ensuring data is
managed as an asset.
• Data Stewards
o Subject matter experts who ensure daily adherence to data policies and standards,
responsible for the care of data assets.
• Data Custodians
o Handle the technical and business processes for maintaining and updating data assets
throughout their lifecycle.
• Data Governance Committee
o A group that approves data policies and standards and addresses escalated data
governance issues.

In a typical enterprise, here are some folks who might make up a Data Governance Team:
• Manager, Master Data Governance
o Leads the design, implementation and continued maintenance of Master Data Control
and governance across the corporation.
• Solution and Data Governance Architect
o Provides oversight for solution designs and implementations.
• Data Analyst
o Uses analytics to determine trends and review information.
• Data Strategist
o Develops and executes trend-pattern analytics plans.
• Compliance Specialist
o Ensures adherence to required standards (legal, defense, medical, privacy).

Goals of Data Governance


• Minimize risks
• Establish internal rules for data use
• Implement compliance requirements
• Improve internal and external communication
• Increase the value of data
• Facilitate the administration of the above
• Reduce costs
• Help to ensure the continued existence of the company through risk management and
optimization

Data Governance vs. Data Management

• Data Governance: involves a set of processes and procedures focused on managing data with
objectives like availability, integrity, and compliance.
• Data Management: typically refers to the technical and operational aspects of handling data,
such as data storage, retrieval, and maintenance.
That's enough, this is really long… going straight to the exercise, haha

Exercise:
Define the following:

1. Data transformation: Modifying data values or formats to meet the requirements of a system or
application. It’s a crucial step in building a data mart, ensuring that data
from various sources is standardized and consolidated for analysis.

2. Data owners: Typically, senior management who have authority over specific data assets
within an organization. They are responsible for the availability, integrity,
and compliance of the data.

3. Data stewards: Individuals or groups responsible for managing data according to the
policies and guidelines set by the data governance program. They ensure
the quality and proper usage of data across different business units.

4. ETL (Extract, Transform, Load): A process used in databases and data warehousing
to extract data from various sources, transform it to fit operational needs,
and load it into a target database or data mart. It’s essential for integrating
and refining data, which is a key part of data governance.
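The ETL definition above can be made concrete with a small sketch using only Python's standard library. Everything specific here is assumed for illustration: the CSV text standing in for a source system, the `orders` target table, and the cleaning rules (trim and title-case names, drop rows with no customer).

```python
import csv
import io
import sqlite3

# Hypothetical source extract: inconsistent spacing and case, one bad row.
SOURCE_CSV = """customer,amount,order_date
 ana cruz ,100.50,2024-01-15
BEN REYES,200,2024-01-16
,50,2024-01-17
"""


def extract(text):
    """Extract: read the raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: standardize names, convert amounts, skip incomplete rows."""
    clean = []
    for row in rows:
        name = row["customer"].strip().title()
        if not name:
            continue  # no customer name, so the row cannot be used
        clean.append((name, float(row["amount"]), row["order_date"]))
    return clean


def load(rows, connection):
    """Load: write the cleaned rows into the target table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(customer TEXT, amount REAL, order_date TEXT)"
    )
    connection.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    connection.commit()


etl_conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), etl_conn)
loaded = etl_conn.execute(
    "SELECT customer, amount FROM orders ORDER BY customer"
).fetchall()
print(loaded)  # [('Ana Cruz', 100.5), ('Ben Reyes', 200.0)]
```

Keeping the three stages as separate functions mirrors the definition: each stage can be checked on its own before the data reaches the target.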

Module 7: Data and Database Administration

• Traditional Data Administration - is a high-level function that is responsible for the overall
management of data resources in an organization, including maintaining corporate-wide data
definitions and standards.
o Roles of traditional data administration:
▪ Data policies, procedures, and standards
▪ Planning (a key administration function)
▪ Data conflict resolution
▪ Managing the information repository
▪ Internal marketing
• Traditional Database Administration - is a technical function responsible for logical and
physical database design and for dealing with technical issues, such as security enforcement,
database performance, backup and recovery, and database availability.
o Roles assumed by database administration:
▪ Analyzing and designing the database
▪ Selecting DBMS and related software tools
▪ Installing and upgrading the DBMS
▪ Tuning database performance
▪ Improving database query processing performance
▪ Managing data security, privacy, and integrity
▪ Performing data backup and recovery

• Data Warehouse Administration - A data warehouse administrator (DWA) plays many of the same
roles as DAs and DBAs do, but for the data warehouse and data mart databases that support
decision-making applications. The role of a DWA emphasizes integration and coordination of
metadata and data.
o DWA performs the following functions:
▪ Build and administer an environment supportive of decision support
applications.
▪ Build a stable architecture for the data warehouse.
▪ Develop service-level agreements with suppliers and consumers of data for the
data warehouse.

• Sarbanes-Oxley (SOX) and Databases - SOX and other similar global regulations were designed
to ensure the integrity of public companies’ financial statements.
o Key focus of SOX audits:
▪ IT change management
▪ Logical access to data
▪ IT operations

o IT Change Management
▪ refers to the process by which changes to operational systems and databases are
authorized. Typically, any change to a production system or database must be
approved by a change control board that is made up of representatives from the
business and IT organizations.

o Logical Access to Data
▪ is essentially about the security procedures in place to prevent unauthorized
access to the data. From a SOX perspective, the two key questions to ask are:
Who has access to what? and Who has access to too much?
▪ Two types of security policies and procedures:
• personnel controls - controls over the personnel who work with the data.
• physical access controls - limiting access to particular areas within a building.
o IT Operations

Module 7 is tiring to read because it is so long, so I'm going straight to the exercise with answers:
Module 8: Overview: Distributed Database, Object Oriented Data
Distributed DBMS
• To have a distributed database, there must be a database management system that coordinates
the access to data at the various nodes.

A distributed DBMS will perform the following functions:

1. Data Location Tracking: Maintains a distributed data dictionary to track data locations.
2. Data Retrieval and Processing: Determines where to retrieve and process parts of a query.
3. Request Translation: Translates requests between nodes with different DBMS and data models.
4. Data Management: Manages security, concurrency, optimization, and recovery functions.
5. Data Consistency: Ensures consistency among data copies across remote sites.
6. Logical Database Presentation: Presents a single logical database that is physically distributed.
7. Scalability: Allows the database to dynamically adapt to changing business needs.
8. Procedure Replication: Distributes stored procedures across nodes, like data.
9. Performance Improvement: Utilizes residual computing power to enhance database processing.
10. DBMS Diversity: Supports different DBMSs at various nodes through middleware.
11. Application Code Versions: Allows different software versions across the distributed database
nodes.
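A toy sketch of functions 1, 2, and 6 from the list above (location tracking, deciding where to retrieve data, and presenting one logical database) is shown below. It is not how any real distributed DBMS is implemented: the site names, the tables, and the "prefer the local copy" rule are all assumptions chosen to keep the example small.

```python
# Hypothetical distributed data dictionary: which sites hold each table.
DATA_DICTIONARY = {
    "customer": ["manila", "cebu"],  # replicated at two sites
    "orders": ["cebu"],
    "inventory": ["davao"],
}

# Stand-ins for the local databases at each node.
NODES = {
    "manila": {"customer": [{"id": 1, "name": "Ana"}]},
    "cebu": {
        "customer": [{"id": 1, "name": "Ana"}],
        "orders": [{"order_id": 10, "customer_id": 1}],
    },
    "davao": {"inventory": [{"item": "Desk", "qty": 4}]},
}


def locate(table, local_site):
    """Pick the site to read from: the local copy if one exists, else the first listed site."""
    sites = DATA_DICTIONARY[table]
    return local_site if local_site in sites else sites[0]


def read_table(table, local_site):
    """Return (site used, rows). The caller never has to name a site for the data."""
    site = locate(table, local_site)
    return site, NODES[site][table]


# A user at the Manila site asks for two tables by name only.
customer_site, customer_rows = read_table("customer", "manila")
orders_site, order_rows = read_table("orders", "manila")
print(customer_site, orders_site)  # manila cebu
```

The caller asks for `orders` by name and gets rows back without knowing they came from another site, which is the idea behind presenting a single logical database.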

Query Optimization
• With distributed databases, the response to a query may require a DBMS to assemble data from
several different sites (although with location transparency, the user is unaware of this need).
• A major decision for the DBMS is how to process a query, which is affected by both the way a
user formulates a query and the intelligence of the distributed DBMS to develop a sensible plan
for processing.

Unified Modeling Language (UML)


• UML is a collection of graphical notations that help in business modeling and software system
design.
• It represents various system perspectives through different diagram types like use-case, class,
and sequence diagrams.
• Class Diagrams: Focus on the structural characteristics of a system, capturing the
responsibilities of classes without detailing behaviors.
• System Implementation: Enables the analysis, design, and implementation of a system based on
a consistent conceptual model.
Object-Oriented Data Modeling (OODM)
• OODM revolves around the concept of classes, which are blueprints for objects.
• A class defines the role, state, behavior, and identity of entity types within an application
domain.
• It can represent tangible entities, concepts, events, or design artifacts.
• Objects are instances of classes, embodying data and behavior relevant to the entity.
• The state of an object includes its attributes and relationships, while its behavior is influenced by
its state and the operations it performs.
• Operations are services that objects provide in response to requests, encapsulating the object’s
functionality and enabling interaction between objects.
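The class and object ideas above can be sketched in a few lines of Python. The `Student` class, its attributes, and its operations are hypothetical names invented for this illustration.

```python
class Student:
    """A class: the blueprint shared by every student object."""

    def __init__(self, student_id, name):
        # State: attribute values plus relationships to other things.
        self.student_id = student_id  # identity within the application domain
        self.name = name
        self.courses = []             # relationship: courses this student takes

    def register_for(self, course):
        """Operation: a service the object provides; it changes the object's state."""
        if course not in self.courses:  # behavior depends on current state
            self.courses.append(course)

    def course_load(self):
        """Operation: reports on the state without changing it."""
        return len(self.courses)


# Objects are instances of the class, each with its own state.
ana = Student(1, "Ana")
ben = Student(2, "Ben")
ana.register_for("Database")
ana.register_for("Database")  # already registered, so the state does not change
ana.register_for("Networks")
print(ana.course_load(), ben.course_load())  # 2 0
```

Other code interacts with `ana` only through its operations, which is what is meant by operations encapsulating the object's functionality.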
