Architecting a Modern Data Warehouse for Large Enterprises: Build Multi-cloud Modern Distributed Data Warehouses with Azure and AWS, 1st Edition
Anjani Kumar, Abhishek Mishra and Sanjeev Kumar
Abhishek Mishra
Thane West, Maharashtra, India
Sanjeev Kumar
Gurgaon, Haryana, India
This work is subject to copyright. All rights are solely and exclusively
licensed by the Publisher, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in
any other physical way, and transmission or information storage and
retrieval, electronic adaptation, computer software, or by similar or
dissimilar methodology now known or hereafter developed.
The publisher, the authors, and the editors are safe to assume that the
advice and information in this book are believed to be true and accurate
at the date of publication. Neither the publisher nor the authors or the
editors give a warranty, expressed or implied, with respect to the
material contained herein or for any errors or omissions that may have
been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
1. Introduction
Anjani Kumar (1), Abhishek Mishra (2) and Sanjeev Kumar (3)
(1) Gurgaon, India
(2) Thane West, Maharashtra, India
(3) Gurgaon, Haryana, India
Data Mart
A data mart is a subset of a larger data warehouse and is designed to
serve the needs of a particular business unit or department. Data marts
are used to provide targeted, specific information to end users, allowing
them to make better, data-driven decisions.
A data mart is typically designed to store data that is relevant to a
specific business area or function, such as sales, marketing, or finance.
Data marts can be created using data from the larger data warehouse,
or they can be created as standalone systems that are populated with
data from various sources.
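As a rough illustration of a data mart carved out of a larger warehouse, the sketch below uses Python's built-in sqlite3 module. The table `warehouse_orders`, the view `sales_mart`, and all rows are hypothetical names and data invented for this example; a real data mart would more likely be a separate schema or database than a view.

```python
import sqlite3

# Hypothetical warehouse table; all names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_orders ("
    "order_id INTEGER PRIMARY KEY, department TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_orders VALUES (?, ?, ?, ?)",
    [
        (1, "sales", "north", 120.0),
        (2, "sales", "south", 80.0),
        (3, "finance", "north", 300.0),
        (4, "marketing", "south", 45.0),
    ],
)

# The "data mart" here is a view exposing only the sales department's rows.
conn.execute(
    "CREATE VIEW sales_mart AS "
    "SELECT order_id, region, amount FROM warehouse_orders "
    "WHERE department = 'sales'"
)

sales_rows = conn.execute(
    "SELECT order_id, region, amount FROM sales_mart ORDER BY order_id"
).fetchall()
print(sales_rows)
```

Users of the sales mart see only the two sales rows, while the finance and marketing rows stay in the warehouse.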
Data Modeling
In the field of data warehousing, there are two main approaches to
modeling data: tabular modeling and dimensional modeling. Both
approaches have their strengths and weaknesses, and choosing the
right one for your specific needs is crucial to building an effective data
warehouse.
Tabular Modeling
Tabular modeling is a relational approach to data modeling, which
means it organizes data into tables with rows and columns. This
approach is well suited to handling large volumes of transactional data
and is often used in OLTP (online transaction processing) systems. In a
tabular model, data is organized into a normalized schema, where each
entity is stored in a separate table, and the relationships between the
tables are established through primary and foreign keys.
The advantages of tabular modeling include its simplicity, ease of use,
and flexibility. Because data is organized into a normalized schema, it is
easier to add or modify data fields, and it supports complex queries and
reporting. However, tabular models become harder to query and maintain
as the number of tables and relationships increases, and queries on
large datasets can be slower because they require more joins.
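The normalized layout described above can be sketched with Python's built-in sqlite3 module. The `customers` and `orders` tables and their rows are hypothetical; the point is only to show one table per entity, a primary/foreign key relationship, and the join needed to answer a question that spans both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    amount REAL NOT NULL
);
INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 25.0), (12, 2, 40.0);
""")

# Answering "total order amount per customer" requires a join.
totals = conn.execute(
    "SELECT c.name, SUM(o.amount) FROM orders o "
    "JOIN customers c ON c.customer_id = o.customer_id "
    "GROUP BY c.name ORDER BY c.name"
).fetchall()
print(totals)

# An order pointing at a non-existent customer violates the foreign key.
try:
    conn.execute("INSERT INTO orders VALUES (13, 99, 5.0)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print(fk_rejected)
```

Each additional entity adds another table and another join, which is the query-complexity cost the paragraph above refers to.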
Dimensional Modeling
Dimensional modeling is a more specialized approach that is optimized
for OLAP (online analytical processing) systems. Dimensional models
organize data into a star or snowflake schema, with a fact table at the
center and several dimension tables surrounding it. The fact table
contains the measures (i.e., numerical data) that are being analyzed,
while the dimension tables provide the context (i.e., descriptive data)
for the measures.
Also, dimensional modeling is optimized for query performance,
making it well suited for OLAP and especially reporting systems.
Because data is organized into a star or snowflake schema, it is easier to
perform aggregations and analyses, and it is faster to query large
datasets. Dimensional models are also easier to understand and
maintain, making them more accessible to business users. However,
dimensional models can be less flexible and more complex to set up,
and they may not perform as well with transactional data.
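A minimal star schema along the lines described above might look like the following sketch, again using sqlite3 from the Python standard library. The names `fact_sales`, `dim_product`, and `dim_date`, the key format, and the data are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables hold descriptive context.
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
-- The fact table holds measures plus foreign keys to each dimension.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    quantity INTEGER,
    sales_amount REAL
);
INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
INSERT INTO dim_date VALUES (20240101, 2024, 1), (20240201, 2024, 2);
INSERT INTO fact_sales VALUES
    (1, 20240101, 2, 30.0),
    (1, 20240201, 1, 15.0),
    (2, 20240101, 3, 90.0),
    (2, 20240201, 1, 30.0);
""")

# Aggregate the measure, grouped by attributes taken from the dimensions.
sales_by_category = conn.execute(
    "SELECT p.category, d.month, SUM(f.sales_amount) "
    "FROM fact_sales f "
    "JOIN dim_product p ON p.product_key = f.product_key "
    "JOIN dim_date d ON d.date_key = f.date_key "
    "GROUP BY p.category, d.month ORDER BY p.category, d.month"
).fetchall()
print(sales_by_category)
```

Every analytical query follows the same shape: join the fact table to the dimensions it needs, then group by dimension attributes.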
In conclusion, both tabular and dimensional modeling have their
places in data warehousing, and the choice between them depends on
the specific needs of your organization. Tabular modeling is more
suited to handling large volumes of transactional data, while
dimensional modeling is optimized for OLAP systems and faster query
performance.
The fact table contains the numerical values, and it is linked to the
dimension tables through foreign key relationships. The fact table is
typically narrow, with relatively few columns, and has far more rows
than the dimension tables.
Best practices for designing facts include the following:
Choose appropriate measures: It is important to choose measures
that are meaningful and appropriate for the business.
Normalize the data: Normalizing the data in the fact table can help
to reduce redundancy and improve performance.
Use additive measures: Additive measures can be aggregated across
all dimensions, while semi-additive measures can be summed across
only some dimensions, and non-additive measures, such as ratios,
cannot be summed across any dimension.
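To see why additivity matters when choosing measures, consider the small sketch below. The fact rows are hypothetical. Units and revenue can be summed safely, whereas a ratio such as unit price gives a misleading result if the per-row values are simply averaged; it has to be recomputed from the additive totals.

```python
# Hypothetical fact rows: (region, units_sold, revenue)
fact_rows = [
    ("north", 10, 100.0),
    ("north", 30, 240.0),
    ("south", 20, 300.0),
]

# Additive measures: plain sums are meaningful at any level of grouping.
total_units = sum(units for _, units, _ in fact_rows)
total_revenue = sum(revenue for _, _, revenue in fact_rows)

# Unit price is a ratio, so it does not aggregate by simple averaging.
row_prices = [revenue / units for _, units, revenue in fact_rows]
naive_avg_price = sum(row_prices) / len(row_prices)

# Recomputing the ratio from the additive totals weights each row correctly.
correct_avg_price = total_revenue / total_units

print(total_units, total_revenue)
print(naive_avg_price, correct_avg_price)
```

A common design consequence is to store the additive components (here, units and revenue) in the fact table and derive ratios at query time.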
Measures
Measures are the values that we use to aggregate and analyze the facts.
Measures are the result of applying mathematical functions to the
numerical data in the fact table.
For example, measures could include average sales, total sales, and
maximum sales. Measures can be simple or complex, and they can be
derived from one or more facts. Measures can be pre-calculated and
stored in the fact table, or they can be calculated on the fly when the
user queries the data warehouse.
Best practices for designing measures include the following:
Choose appropriate functions: It is important to choose functions
that are appropriate for the business and the data being analyzed.
Use consistent units of measurement: Measures should use
consistent units of measurement to avoid confusion.
Avoid calculations in the query: Pre-calculate measures that are
frequently used to improve performance.
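The trade-off between pre-calculated and on-the-fly measures can be sketched as follows with sqlite3. The table `agg_sales_by_store` stands in for a summary table that some batch job would refresh; that job, the table names, and the data are all assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_sales (store TEXT, sale_date TEXT, amount REAL);
INSERT INTO fact_sales VALUES
    ('A', '2024-01-01', 10.0),
    ('A', '2024-01-02', 20.0),
    ('B', '2024-01-01', 5.0);
-- Pre-calculated summary table (would be refreshed by a hypothetical batch job).
CREATE TABLE agg_sales_by_store AS
    SELECT store,
           SUM(amount) AS total_sales,
           AVG(amount) AS avg_sale,
           MAX(amount) AS max_sale
    FROM fact_sales
    GROUP BY store;
""")

# On the fly: the aggregation runs against the fact table for every query.
on_the_fly = conn.execute(
    "SELECT store, SUM(amount), AVG(amount), MAX(amount) "
    "FROM fact_sales GROUP BY store ORDER BY store"
).fetchall()

# Pre-calculated: the query only reads the stored results.
precomputed = conn.execute(
    "SELECT store, total_sales, avg_sale, max_sale "
    "FROM agg_sales_by_store ORDER BY store"
).fetchall()

print(on_the_fly)
print(precomputed)
```

Both queries return the same values here. The pre-calculated table trades storage and refresh effort for cheaper reads, and it goes stale if the fact table changes without a refresh.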
OLAP
OLAP, or online analytical processing, is a critical component of data
warehousing that facilitates complex and interactive data analysis. It is
a technology used for organizing and structuring data in a way that
enables users to perform multidimensional analysis efficiently. OLAP
systems are designed to support complex queries and reporting
requirements, making them invaluable for decision-makers and
analysts in business intelligence and data analytics. In the next section
we will try to understand OLAP better.
Cubes provide a fast and efficient way to analyze data. They can
aggregate data across multiple dimensions, allowing users to drill down
and explore data at different levels of granularity. This makes it easier
to identify trends, patterns, and anomalies in the data.
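The idea of aggregating across multiple dimensions and drilling down can be illustrated with a toy cube in plain Python. This is a simplified sketch, not how any particular OLAP engine stores cubes: it pre-aggregates a hypothetical `sales` measure for every combination of three assumed dimensions (`year`, `region`, `product`).

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("year", "region", "product")

# Hypothetical fact rows.
rows = [
    {"year": 2023, "region": "north", "product": "books", "sales": 100.0},
    {"year": 2023, "region": "south", "product": "books", "sales": 50.0},
    {"year": 2024, "region": "north", "product": "games", "sales": 70.0},
    {"year": 2024, "region": "north", "product": "books", "sales": 30.0},
]


def build_cube(fact_rows):
    """Pre-aggregate sales for every subset of the dimensions."""
    cube = defaultdict(float)
    for row in fact_rows:
        for size in range(len(DIMENSIONS) + 1):
            for dims in combinations(DIMENSIONS, size):
                # Key is a tuple of (dimension, value) pairs in a fixed order.
                key = tuple((d, row[d]) for d in dims)
                cube[key] += row["sales"]
    return dict(cube)


cube = build_cube(rows)

# Drill down from the grand total to progressively finer granularity.
grand_total = cube[()]
sales_2024 = cube[(("year", 2024),)]
sales_2024_north = cube[(("year", 2024), ("region", "north"))]
sales_2024_north_games = cube[
    (("year", 2024), ("region", "north"), ("product", "games"))
]
print(grand_total, sales_2024, sales_2024_north, sales_2024_north_games)
```

Each drill-down step is a dictionary lookup rather than a fresh scan of the rows, which is the intuition behind the fast response times of pre-aggregated cubes. The number of stored combinations grows quickly with the number of dimensions, so real systems are selective about what they pre-aggregate.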
Categorization of OLAP
OLAP can be categorized into different types based on its architecture
and the way it processes data. Here are some of the different types of
OLAP and their examples and query techniques:
ROLAP (Relational OLAP): This type of OLAP works directly with
relational databases and SQL queries. The data is stored in a
relational database, and queries are executed against the database to
generate reports. ROLAP is useful for handling large volumes of data
and complex queries.
Example: Microsoft SQL Server Analysis Services
Query Technique: SQL queries
MOLAP (Multidimensional OLAP): This type of OLAP stores data in
a multidimensional cube structure, which allows for faster
processing of data and complex calculations. MOLAP cubes can be
pre-aggregated, which improves query performance.
Example: IBM Cognos TM1
Query Technique: MDX (multidimensional expressions)
HOLAP (Hybrid OLAP): This type of OLAP combines the features of
both ROLAP and MOLAP. It stores the data in a relational database
and uses a multidimensional cube structure for querying.
Example: Oracle Essbase
Query Technique: SQL and MDX
DOLAP (Desktop OLAP): This type of OLAP is designed for small-
scale data analysis and runs on a desktop computer. It can be used to
perform ad-hoc queries and analysis.
Example: Microsoft Excel
Query Technique: Point-and-click interface or formula-based
calculations.
WOLAP (Web-based OLAP): This type of OLAP runs on a web
browser and allows users to access data remotely. It is useful for
collaborative data analysis and reporting.
Example: MicroStrategy Web
Query Technique: Point-and-click interface or SQL-like queries.
Querying Technique
While there are multiple programmable ways to query data from an
OLAP cube, such as SQL and other programming languages, MDX has
been one of the most prominent options to query cube data.
For almost two decades, MDX has been a primary and powerful
way to query and analyze multidimensional data, allowing users to gain
insights from complex datasets that would be difficult to analyze using
3. Adenocarcinoma.—Sutton has also described an adenocarcinoma
of the peculiar sebaceous glands named after Tyson. These are
found particularly at the base of the prepuce, this form of tumor
being rare. Adenomas arising from the mucous glands, which are
usually transformed into cysts, are also known, as well as other
gland tumors springing from the glands of Bartholin, Cowper, etc.
(See Plate XXIII, Fig. 2, and Plate XXIV.)
Pituitary adenomas are either analogous to struma or belong to
the mixed tumors of dermoid or teratomatous type.
Prostatic adenoma is in large degree fibromyoma of that body,
with more or less hypertrophy of its glandular structures. Minute
cystic alterations may occur also, as well as growth resembling
intracanalicular fibro-adenoma.
Adenoma is occasionally observed in the salivary glands, where it
is usually encapsulated, and may undergo cystic changes. It has
been observed in the liver and pancreas. In the former its pseudo-
ducts often contain inspissated material of bile-green tint.
The lesions of the kidney referred to as cystadenoma are now
grouped among the teratomas, and are described under that
heading. They present interesting examples of mixed tumors.
In the testis, as in the ovary, epithelial tumors frequently present
themselves, but they partake less often of the type of pure adenoma,
and incline rather to that already described under Ovarian Cystoma.
Even in the paradidymis tumors of this same character are found,
with cystic or even papillary alterations.
In the mucous membrane of the stomach and bowels adenoma
usually presents as an ovoid tumor, attaining such size as to give
rise to mechanical obstruction either by pressure or by traction.
Adenoma of the pyloric region is a repetition in structure of the
pyloric glands. In the rectum it presents usually as a polypoid
outgrowth, often seen in young children. Such tumors are generally
small, and when solitary they often hang by a distinct stalk.
Similar polypoid tumors present in the cervical canal of the uterus,
where are also found sessile and racemose tumors, all of which are
structural repetitions of the glands met with in the cervix uteri.
Adenoma of the uterine cavity is seldom seen; it is also rare in the
Fallopian tube, but occasionally presents as a dendritic outgrowth
from the mucous membrane distending the tube.
Epithelioma.—Epithelioma is common, especially where there is
transition from one kind of epithelium to another, and,
of all other localities, particularly where skin and mucous membrane
meet—e. g., the lips, the vulva, and the anus. Epithelioma differs
from papilloma in that the former is no longer limited by basement
membrane, but passes beyond it into the underlying connective
tissue and presents down—rather than up—growth. Characteristic of
epithelioma are the so-called cell nests or pearly bodies, where there
seems to be a tendency to globular arrangement of cells with such
condensation or alteration that they lose their ability to take stains,
and appear as a more or less lustrous mass, showing off by contrast
among the stained surrounding tissue. On this account they are
often called pearly bodies. Recognition of these is tantamount to
diagnosis of epithelioma. (See Plate XXIII.)
This form of neoplasm is essentially the same, no matter what its
clinical varieties. These comprise a wart-like growth or nodule, which
quickly becomes an ulcer with elevated edges, ulceration being due
to necrosis of cells farthest from the periphery; or, again, the disease
may start as an ulcerated fissure, ulceration and infiltration keeping
pace, in which case there is a sharply defined ulcer with undermined
edges. A third variety, often seen upon the lips, comprises a
projecting mass, with more or less horny surface. In nearly all of
these, however, the characteristic cell nests with their onion-like
arrangements of cells will be found.
Epithelioma, especially when exposed to the air or to surface
irritation, quickly ulcerates and tends to involve all the surrounding
tissues, while occasionally the distinctive cells proliferate so rapidly
as to give the ulcer more or less of a bursal or a cauliflower-like
arrangement. From such a surface there is a constant discharge of
foul-smelling detritus or of sloughs. Even bone cannot resist its
progressive invasion and slowly disintegrates before the advancing
mass. Cartilage is resistant, and usually preserves its integrity. In
other words, the tendency of epithelioma is toward constant
encroachment and infiltration, and toward a fatal termination from
hemorrhage by ulceration, from septic infection, exhaustion, or other
accidents. The wart-like forms run the slowest course of all, but even
here the malignant tendency is most evident.
Lymph-node Infection.—A striking characteristic of epitheliomas is
the invasion of the adjoining lymph nodes, which attain a size
disproportionate and bearing no necessary relation to that of the
primary growth. This constitutes one of the most serious
complications of the condition. This lymphatic invasion partakes of
the malignant character of the disease, and from every focus of this
character infiltration and destruction proceed. Infected nodes also
show an early tendency to central degeneration and to spurious cyst
formation. When the overlying skin becomes involved we have
extensive sloughing and the conversion into large malignant ulcers.
Dissemination to a distance (i. e., metastasis) is rare in epithelioma
—much more so than in carcinoma. (See Plate XXV, Fig. 2.)
About the mouth epithelioma is not common before the thirty-fifth
year, though I have seen it on the lip of a twenty-year-old woman. It
is vastly more common in men than in women, and more frequent on
the lower than the upper lip. In the tongue it seldom occurs before
the fortieth year. It seems to be more common both on the lip and
tongue in men with bad teeth and in confirmed smokers, thus giving
rise to the view often held that it is purely a matter of irritation. It may,
however, be due to contact infection should it be regarded as of
parasitic origin. In one-fifth of the cases of epithelioma of the tongue
there are preceding lesions, usually described as leukoplakia or
ichthyosis of the tongue—conditions characterized by epithelial
reduplication and the formation of dense plaques or scales. These
lesions are usually regarded as precancerous conditions. (See Plate
XXVI.)
[Plate XXIV, Fig. 1; Figs. 89 to 92]