
DATA WAREHOUSING & MINING
National Security Agency
Case Study
Data warehouse implementation strategy for the National Security Agency (NSA).
Objective: because of the increase in cybercrime as well as physical crime,
the NSA wants to keep the details of all activities of criminals once they
are registered. You are required to design a warehouse that will help the
agency find the records of these criminals.
What is Data Warehousing?
 Collecting, storing, and analysing large volumes of structured and
unstructured data to support decision-making processes.
 Centralized repository.
 The Process: Extraction, Transformation, and Loading (ETL)
 Star schema.
Data Collection
 Identify the sources of data
 Electronic Surveillance
 Law Enforcement and Intelligence Agencies
 Determine the frequency of data collection
 Establish data quality standards
 Define data extraction and transformation processes
 Ensure data security
 Incorporate metadata
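The extraction, transformation, and quality-check steps listed above can be sketched as a small ETL pass. This is only an illustration under stated assumptions: the feed contents, the column names, the accepted date formats, and the `staging_arrest` table are all hypothetical and are not part of the case study.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Hypothetical raw feed; column names and values are illustrative only.
RAW_FEED = """criminal_name,offense_code,arrest_date,city
 john doe ,THEFT,2023-01-15,Springfield
Jane Roe,CYBER,15/02/2023,Shelbyville
,ASSAULT,2023-03-01,Springfield
"""


def extract(text):
    """Extract: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def parse_date(value):
    """Return an ISO date string for any accepted format, else None."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def transform(rows):
    """Transform: normalise names and dates; reject rows that fail quality checks."""
    clean, rejected = [], []
    for row in rows:
        name = row["criminal_name"].strip().title()
        date = parse_date(row["arrest_date"])
        if not name or date is None:
            # Assumed quality standard: a name and a valid date are mandatory.
            rejected.append(row)
            continue
        clean.append((name, row["offense_code"].strip().upper(), date, row["city"].strip()))
    return clean, rejected


def load(conn, records):
    """Load: write the cleaned records into a staging table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staging_arrest "
        "(criminal_name TEXT, offense_code TEXT, arrest_date TEXT, city TEXT)"
    )
    conn.executemany("INSERT INTO staging_arrest VALUES (?, ?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
clean_rows, rejected_rows = transform(extract(RAW_FEED))
load(conn, clean_rows)
```

Rejected rows are kept separately rather than dropped, so that data quality problems can be reported back to the source system.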
Attributes of Data Model
 Criminal Information Attributes
 Time Attributes
 Location Attributes
Data Warehouse
Architecture
 Designing a Data Warehouse Architecture
for the NSA to store and manage criminal
activity records requires careful
consideration of several components.
 The types of data warehouse architectures
include single-tier, two-tier, three-tier,
centralized, distributed, and hybrid
architecture, the last being a
combination of different architectures.
Data Model
 The data model for the data warehouse should be a star schema with the fact table being the criminal
activity table.
 The dimension tables should include the criminal table, offense table, arrest table, and location table.
The criminal table should have a primary key that will be used to link to other tables.
 The offense table should contain a list of possible offenses, including the offense code and
description. The arrest table should include details of the arrest, such as the arresting officer's name
and the date of the arrest.
 The location table should contain details of the crime location, arrest location, and sentencing
location.
 By incorporating these attributes and data model, the data warehouse for the NSA will enable quick
and efficient retrieval of criminal records.
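As a rough sketch of the star schema described above, the fact table and the four dimension tables could be declared as follows. SQLite is used only to keep the example self-contained; the table names, column names, and sample rows are assumptions made for illustration, not a prescribed design.

```python
import sqlite3

# Hypothetical star schema: one fact table referencing four dimension tables.
SCHEMA = """
CREATE TABLE dim_criminal (criminal_id INTEGER PRIMARY KEY, full_name TEXT NOT NULL);
CREATE TABLE dim_offense  (offense_code TEXT PRIMARY KEY, description TEXT NOT NULL);
CREATE TABLE dim_arrest   (arrest_id INTEGER PRIMARY KEY, officer_name TEXT, arrest_date TEXT);
CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, city TEXT, location_role TEXT);
CREATE TABLE fact_criminal_activity (
    activity_id   INTEGER PRIMARY KEY,
    criminal_id   INTEGER NOT NULL REFERENCES dim_criminal(criminal_id),
    offense_code  TEXT    NOT NULL REFERENCES dim_offense(offense_code),
    arrest_id     INTEGER REFERENCES dim_arrest(arrest_id),
    location_id   INTEGER REFERENCES dim_location(location_id),
    activity_date TEXT    NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(SCHEMA)

# Illustrative sample rows.
conn.execute("INSERT INTO dim_criminal VALUES (1, 'John Doe')")
conn.execute("INSERT INTO dim_offense VALUES ('THEFT', 'Theft of property')")
conn.execute("INSERT INTO dim_arrest VALUES (1, 'Officer Smith', '2023-01-16')")
conn.execute("INSERT INTO dim_location VALUES (1, 'Springfield', 'crime')")
conn.execute("INSERT INTO fact_criminal_activity VALUES (1, 1, 'THEFT', 1, 1, '2023-01-15')")


def records_for(conn, full_name):
    """Return the activity records of one criminal by joining fact and dimensions."""
    query = """
        SELECT c.full_name, o.description, a.officer_name, l.city, f.activity_date
        FROM fact_criminal_activity AS f
        JOIN dim_criminal AS c ON c.criminal_id = f.criminal_id
        JOIN dim_offense  AS o ON o.offense_code = f.offense_code
        LEFT JOIN dim_arrest   AS a ON a.arrest_id = f.arrest_id
        LEFT JOIN dim_location AS l ON l.location_id = f.location_id
        WHERE c.full_name = ?
        ORDER BY f.activity_date
    """
    return conn.execute(query, (full_name,)).fetchall()


rows = records_for(conn, "John Doe")
```

The primary key of the criminal table is what links each fact row back to a person, so a lookup by name needs only one join per dimension.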
Conceptual Schema Design
 A conceptual schema design typically consists of a high-level diagram that shows the
entities and relationships involved in the data warehouse. This can be a useful tool for
communicating the design to stakeholders and ensuring that everyone is on the same
page.
 A common approach to data modelling for data warehouses is the star schema, which
consists of a central fact table surrounded by dimension tables.
Fact table: The central table in the data warehouse will be the fact table, which will store
all the details of criminal activity, such as date, time, location, type of crime, and details
of the offender. Each record in the fact table will be uniquely identified by a crime ID.
Dimension tables: Several dimension tables will be used to provide additional
context and detail to the fact table. These may include tables for the following
dimensions:
 Offenders: This table will store details about the offenders, such as name, date of birth, gender, address, and any aliases or
known associates.
 Victims: This table will store details about the victims of crimes, such as name, age, gender, and any relationship to the offender.
 Law enforcement agencies: This table will store details about the agencies involved in investigating and prosecuting the crimes,
such as name, location, and contact information.
 Crime types: This table will store details about the different types of crimes, such as theft, assault, and cybercrime.
 Aggregations: Aggregations can be used to pre-calculate and store summary data for faster query performance. For example, the
data warehouse could store the number of crimes committed by each offender or the total number of crimes of each type.
 Security and access controls: Given the sensitivity of the data, the data warehouse should have robust security measures in place
to prevent unauthorized access or data breaches. Access controls should be implemented to ensure that only authorized
personnel can view or modify the data.
 Data integration: The data warehouse should be able to integrate with other data sources, such as criminal databases or social
media platforms, to provide a more comprehensive picture of criminal activity.
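The aggregation idea mentioned above (crimes per offender and crimes per type) can be illustrated with pre-computed summary tables. This is a minimal sketch with invented sample data; the names `fact_crime`, `agg_crimes_by_offender`, and `agg_crimes_by_type` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified fact table with invented sample rows.
conn.execute(
    "CREATE TABLE fact_crime (crime_id INTEGER PRIMARY KEY, offender TEXT, crime_type TEXT)"
)
conn.executemany(
    "INSERT INTO fact_crime (offender, crime_type) VALUES (?, ?)",
    [
        ("John Doe", "theft"),
        ("John Doe", "assault"),
        ("Jane Roe", "cybercrime"),
        ("John Doe", "theft"),
    ],
)

# Pre-calculate the summaries once so that reports read the small
# aggregate tables instead of scanning the full fact table.
conn.execute(
    "CREATE TABLE agg_crimes_by_offender AS "
    "SELECT offender, COUNT(*) AS crime_count FROM fact_crime GROUP BY offender"
)
conn.execute(
    "CREATE TABLE agg_crimes_by_type AS "
    "SELECT crime_type, COUNT(*) AS crime_count FROM fact_crime GROUP BY crime_type"
)

by_offender = dict(conn.execute("SELECT offender, crime_count FROM agg_crimes_by_offender"))
by_type = dict(conn.execute("SELECT crime_type, crime_count FROM agg_crimes_by_type"))
```

In a real warehouse such summary tables would need to be refreshed after each load, otherwise they drift out of step with the fact table.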
Hybrid Architecture
 For the NSA, a hybrid architecture could be an option if there is a need for both scalability
and flexibility. For example, the agency could use:
 a centralized architecture for storing and managing the core data,
 while using a distributed architecture for serving the specialized data marts for different
departments or business units.
 This approach would provide better scalability, performance, and data consistency, while allowing
for flexibility and agility.
 Source Systems: Criminal activity data can come from various sources, such as law
enforcement agencies, courts, prisons, and other government organizations. The data
may be structured, semi-structured, or unstructured. The source systems must be
identified and integrated to ensure consistent data quality.
Analysis and Reporting
Roll Up of Dataset
