Assignment No 01 Subject Data Ware House Topic Comparison of DWH Tools Group Members Muhammad Haseeb Khan Hashim Shoukat Mir Abdul Wahab Submitted To Proffessor Anwar Ali

You might also like

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 6

Assignment No 01

Subject
Data Ware House
Topic
Comparison Of DWH Tools
Group Members
Muhammad Haseeb Khan
Hashim Shoukat
Mir Abdul Wahab
Submitted To
Proffessor Anwar Ali
What Is a Data Warehouse?
In broad terms, a data warehouse (DWH) is a place that stores current and
historical data - from marketing, sales and other departments - in one
central repository so that it can be analyzed for market research,
analytical reports and decision support.
While traditional operational databases are updated in real-time to
maintain accurate, up-to-the-minute data, data warehouses are updated
less frequently and are designed to give a historical, long-range view of
data. The analytics that are run in a data warehouse, then, are usually
more of a comprehensive view of your company’s history rather than a
snapshot of the current state of your business.
Further, since the data in a data warehouse is already integrated and
transformed, it allows you to easily compare older, historical data and
track marketing and sales trends. These historical comparisons can be
used to track successes and failures and predict how to best proceed with
your business ventures in order to increase profit and long-term ROI.
Specifically, end users can use the information in their data warehouses
to:
 Monitor or modify marketing campaigns
 Manage and improve customer relationships
 Clean and organize company data
 Predict future growth, needs and pain points
 Track, understand and improve company performance
 Consolidate data from multiple sources, etc.
In short, data warehouses are great for long-range, comprehensive
analytics without negatively impacting daily operations, and this kind of
analytics is always necessary if businesses expect to improve their
strategies and make long-term progress.
The Data Warehouse Options
There are four primary types of data warehouse platforms, and many
options within each of these categories. These are:
 Traditional Database Management Systems (DBMS)
These are usually row-based relational platforms. Dominant brands of this
software include IBM DB2, Microsoft SQL Server, Oracle and SAP
2. Specialized Analytics DBMS
These are usually columnar data stores designed specifically for managing
and running analytics. Dominant brands of this software include Teradata,
HPE Vertica and Greenplum
3. Out-of-the-box data warehouse appliances
These typically combine software and hardware with a DBMS pre-
installed. In theory, all the customer needs to do is plug it in and turn it
on. Dominant brands of this solution include IBM Netezza, Oracle Exadata
and SAP Hana
4. Cloud-hosted data warehouse tools
Here, the DBMS is not installed - it is accessed over the internet. This
essentially offers data warehouses as a service without requiring any
hardware. Dominant brands here include Amazon Redshift, Google Big
Query and Microsoft Azure SQL Data Warehouse
Each of these are good options - making a choice for your business is all
about figuring out what will work best for your company, your budget,
your employees and your overall structure. Let’s get into what you should
look out for and how to choose.
Criteria to Keep in Mind When Choosing a Data Warehouse Platform
1. First thing’s first: cloud VS on-premise (usually DBMS or specialized
DBMS)
The first decision that one has to make when choosing a data warehouse
is between cloud and on-premise data warehouse systems, and there are
certainly advantages and disadvantages to both.
In a nutshell, cloud services offer:
 Scalability: with no on-premise software or hardware, it’s easy, cost-
effective, simple and flexible to scale with cloud services
 Low entry cost: with no servers, hardware, IT work or operational
costs, cloud services cost substantially less up-front.
 Connectivity: With cloud services, it is much easier to connect to other
cloud services, which means that it’s easier to digest, store and file
data.
 Security: Typical cloud providers stay hyper up-to-date with security
patches and protocols to keep their host of customers safe and happy
 Availability/Reliability: If you have a strong provider, cloud solutions
can have a very high up-time, which makes them reliable no matter
what.
Conversely, on-premise solutions offer:
 Speed: Cloud solutions usually rely on far-away servers, and the time
that it takes for data to bounce through those servers and reach the
end user from these servers can be unacceptable for many businesses.
On-premise solutions don’t have to deal with this problem - with local
servers, speed and latency can be better managed, at least for
businesses based in one geographical location.
 Security: on-premise warehouses allow organizations to have full
control over their security and access, which is important for
businesses in which this is a big priority.
 Availability/Reliability: If you have a good staff and exceptional
hardware, on-premise solutions can be highly available and very
reliable.
SEE ALSO: CLOUD VS ON-PREM DATA WAREHOUSE
2. Vendor selection
Once you choose between cloud and on-premise systems, there are
various vendors to choose from. For on-premise systems, some options
include:
 Oracle Database
 Microsoft SQL Server
 IBM DB2
 MySQL
 PostgreSQL
While there are many fantastic options here, we recommend PostgreSQL
for its proven track record of reliability, compatibility and open source
roots. Of course, in any case, it is important to consider your company’s
data types and your existing ecosystem, since you’ll have to choose a data
warehouse that’s compatible.
For cloud systems, top solutions include:
 Amazon Redshift
 Microsoft Azure SQL Data Warehouse
 Google Big Query
Here, Amazon Redshift and Google Big Query are certainly the biggest
players. That said, most recent data shows that Amazon Redshift is the
highest-performing cloud solution overall.
In the big picture, it’s important to choose a data warehouse solution that
will fit in with your business model, your budget and your existing
systems. If you’re looking for speed, security and control, then an on-
premise solution may be the best choice for you. Conversely, if you’re
looking for low entry cost and easily scalability, you may want to go with a
cloud service. Either way, making the right decision for your company will
help you run the effective analytics and BI that you will need for long-term
stability and success.

Xplenty’s platform allows organizations to integrate, process, and prepare


data for analytics on the cloud. With Xplenty, every company can have
immediate connectivity to a variety of data stores and a rich set of out-of-
the-box data transformation components.

You might also like