TASK 2A
Data Analytics vs. Data Science
The terms data analytics and data science are sometimes used interchangeably by
those who don’t work with big data. While there is overlap in places, there are core
features unique to each field and differences in the skills expected of the practitioners
within each field. Looking at job descriptions (and salaries!) for each role also shows
that the two areas have quite different specialties.
This article will highlight the primary attributes of both data analytics and data science,
explain what data analysts and data scientists do and the skills they should have, clarify
when to use each process, and explain how to choose which role to hire for your team.
What is data analytics?

Data analytics aims to discover insights about specific areas of a business and uses
basic statistics to find solutions to the questions that data scientists raise. Data analysts
then communicate these solutions to their stakeholders so they can be implemented by
the business. Data analytics follows a process of creating regular reports and
predictions for the business, instead of just providing one-off insights. Data analysts do
this by using an automated pipeline for consuming and monitoring data, which gets
designed and created by data engineers. This pipeline follows all the steps of the data
analytics lifecycle.
Sensitivity: Internal document for Union Bank
What is data science?

Data science is a field that uses rigorous experiments, computer algorithms, and
statistics to find patterns in both structured and unstructured data, leading to useful
business insights. It is an umbrella term that includes some parts of data analytics, as
well as a combination of other disciplines such as machine learning and data mining.
The goal of data science is to apply scientific methods and predictions to business
goals and discover new and unique questions to drive the business forward. Some
useful predictions that data science can help with include working out how many
supplies should be purchased based on expected sales volume, or answering a
question like “if we raise prices by X%, what is the predicted impact on sales and
revenue?”
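The pricing question above can be made concrete with a small calculation. The sketch below assumes a constant price elasticity of demand; the elasticity of -1.5, the baseline price, and the baseline volume are hypothetical values chosen for illustration, and in practice the elasticity would have to be estimated from historical data.

```python
def predict_price_change(price, units, pct_increase, elasticity):
    """Estimate price, units sold, and revenue after a price change.

    Assumes constant price elasticity: a 1% price rise changes
    demand by `elasticity` percent (a hypothetical simplification).
    """
    new_price = price * (1 + pct_increase / 100)
    new_units = units * (1 + elasticity * pct_increase / 100)
    return new_price, new_units, new_price * new_units


# Hypothetical baseline: 1,000 units at 20.00 each, elasticity of -1.5.
baseline_revenue = 20.0 * 1000
new_price, new_units, new_revenue = predict_price_change(20.0, 1000, 10, -1.5)
print(f"price {new_price:.2f}, units {new_units:.0f}, "
      f"revenue change {new_revenue - baseline_revenue:+.2f}")
```

Under these assumed numbers a 10% price rise lowers revenue, because the assumed demand is elastic; with an elasticity between 0 and -1 the same rise would increase revenue.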
The difference between data science and data analytics
Both data science and data analytics techniques can be applied to big data. They both
involve collecting, preparing, and analyzing data. But beyond these similarities, the
two fields are quite different. The main differences between data analytics and data
science are listed below.
There are four main types of data analytics: descriptive, diagnostic, predictive, and
prescriptive analytics. Descriptive and diagnostic analytics are done by data analysts, but
predictive and prescriptive analytics fall under the realm of data science. This is the
main difference between the two fields: data analytics looks backward and focuses on
past data, aiming to identify trends (by describing the past and diagnosing why certain
events happened). Data science looks forward and focuses on the future (by predicting
it or prescribing what should happen).
Data science involves coming up with and answering key questions that are game-
changers for driving businesses forward. Data analytics focuses on asking specific
questions that are on more of a micro-scale or are specific to a particular team. Despite
the smaller scale of the questions, data analytics answers very useful questions that
tend to be asked on a regular basis, which is why a key part of data analytics is
operationalizing (procedurally automating) analytics reports.
Both data analytics and data science make use of statistics; however, the types of
statistics used in data analytics tend to be more rudimentary than those used by data
science. Data analytics tends to use aggregation methods such as averages,
percentiles, sums, and counts in spreadsheets, analytics tools (such
as Mixpanel, Amplitude, or PostHog), or relational databases and data warehouses. Data
scientists, on the other hand, use more advanced statistical methods such as
regression or cluster analysis. Data scientists also commonly use machine learning
models, whereas data analysts are much less likely to do so.
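To illustrate the contrast, the sketch below computes the kind of aggregates an analyst might report and then fits a simple linear regression of the kind a data scientist might use for prediction. The order counts are hypothetical sample data, and `fit_line` is an illustrative helper written for this example.

```python
import statistics

daily_orders = [12, 15, 11, 18, 20, 17, 22]  # hypothetical sample data

# Analytics-style aggregation: counts, sums, averages, percentiles.
summary = {
    "count": len(daily_orders),
    "sum": sum(daily_orders),
    "mean": statistics.mean(daily_orders),
    "median": statistics.median(daily_orders),
    "quartiles": statistics.quantiles(daily_orders, n=4),
}


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Science-style modeling: fit a trend and predict the next day.
days = list(range(len(daily_orders)))
slope, intercept = fit_line(days, daily_orders)
forecast_next_day = slope * len(daily_orders) + intercept
```

The aggregates describe what has already happened, while the fitted line is used to estimate a value that has not been observed yet.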
Data analysts will always be provided with a question that needs answering and will
usually have access to structured data to help them with their analysis. Structured data
is data that is highly organized in its structure: for example, data that is stored in a
spreadsheet or relational database. Data scientists, by contrast, often have to wade
through large amounts of unstructured data (for example, image data, social media
posts, or large amounts of free text) and use data mining techniques to find useful
insights from it. They may also have to come up with their own questions, and they
must be able to justify why answering these questions adds value to the business.
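The difference between the two kinds of data shows up in how much work is needed before any analysis can start. The following sketch uses made-up sales rows and review text: the structured rows can be aggregated directly by column name, while the free text has to be tokenized before anything can be counted.

```python
import csv
import io
import re
from collections import Counter

# Structured data: rows and named columns, ready to aggregate directly.
structured = io.StringIO("region,sales\nnorth,120\nsouth,95\nnorth,80\n")
sales_by_region = Counter()
for row in csv.DictReader(structured):
    sales_by_region[row["region"]] += int(row["sales"])

# Unstructured data: free text has no schema, so structure must be extracted first.
reviews = [
    "Delivery was late but support was helpful",
    "Late delivery again, very disappointed",
]
word_counts = Counter(
    word for text in reviews for word in re.findall(r"[a-z]+", text.lower())
)
```

Word counting is only the simplest possible extraction step; real text mining would typically go much further.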
One of the areas of confusion in comparing data analytics and data science is that
predictive and prescriptive analytics are sometimes viewed as part of data analytics
(because they are two of the four main types of data analytics), but they are also viewed
as part of data science because they tend to be done by data scientists. This Venn
diagram shows which activities are considered part of data science or data analytics
(some things are done in both fields), as well as using a color-coded key for which of
these tasks are done by data scientists, data analysts, or both.
REF: Data Analytics vs. Data Science (rudderstack.com)
Overview: Data science vs data analytics
Think of data science as the overarching umbrella that covers a wide range of tasks performed to find
patterns in large datasets, structure data for use, train machine learning models and develop artificial
intelligence (AI) applications. Data analytics is a task that resides under the data science umbrella and is
done to query, interpret and visualize datasets. Data scientists will often perform data analysis tasks to
understand a dataset or evaluate outcomes.
Business users will also perform data analytics within business intelligence (BI) platforms for insight into
current market conditions or probable decision-making outcomes. Many functions of data analytics—such
as making predictions—are built on machine learning algorithms and models that are developed by data
scientists. In other words, while the two concepts are not the same, they are heavily intertwined.
Data science: An area of expertise

As an area of expertise, data science is much larger in scope than the task of conducting data analytics and is
considered its own career path. Those who work in the field of data science are known as data scientists.
These professionals build statistical models, develop algorithms, train machine learning models and create
frameworks to:
• Forecast short- and long-term outcomes
• Solve business problems
• Identify opportunities
• Support business strategy
• Automate tasks and processes
• Power BI platforms
In the world of information technology, data science jobs are currently in demand for many organizations
and industries. To pursue a data science career, you need a deep understanding and expansive knowledge of
machine learning and AI. Your skill set should include the ability to write in the programming languages
Python, SAS, R and Scala. And you should have experience working with big data platforms such as
Hadoop or Apache Spark. Additionally, data science requires experience in SQL database coding and an
ability to work with unstructured data of various types, such as video, audio, pictures and text.
Data scientists will typically perform data analytics when collecting, cleaning and evaluating data. By
analyzing datasets, data scientists can better understand their potential use in an algorithm or machine
learning model. Data scientists also work closely with data engineers, who are responsible for building the
data pipelines that provide the scientists with the data their models need, as well as the pipelines that models
rely on for use in large-scale production.
The data science lifecycle

Data science is iterative, meaning data scientists form hypotheses and experiment to see if a desired outcome
can be achieved using available data. This iterative process is known as the data science lifecycle, which
usually follows seven phases:
1. Identifying an opportunity or problem
2. Data mining (extracting relevant data from large datasets)
3. Data cleaning (removing duplicates, correcting errors, etc.)
4. Data exploration (analyzing and understanding the data)
5. Feature engineering (using domain knowledge to extract details from the data)
6. Predictive modeling (using the data to predict future outcomes and behaviors)
7. Data visualization (representing data points with graphical tools such as charts or animations)
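Phases 3 to 6 of the lifecycle above can be sketched in a few lines. This is a toy illustration under stated assumptions: the order records are invented, data mining is assumed to be already done, and the "model" is a deliberately naive average rather than a real predictive algorithm.

```python
from statistics import mean

# Phase 2 (data mining) is assumed done: hypothetical raw order records.
raw = [
    {"order_id": 1, "items": 2, "total": 40.0},
    {"order_id": 1, "items": 2, "total": 40.0},   # duplicate
    {"order_id": 2, "items": 1, "total": None},   # missing value
    {"order_id": 3, "items": 4, "total": 88.0},
    {"order_id": 4, "items": 3, "total": 63.0},
]

# Phase 3: data cleaning (drop duplicates and incomplete rows).
seen, clean = set(), []
for row in raw:
    if row["order_id"] in seen or row["total"] is None:
        continue
    seen.add(row["order_id"])
    clean.append(dict(row))

# Phase 4: data exploration (a basic summary of the cleaned data).
average_total = mean(row["total"] for row in clean)

# Phase 5: feature engineering (derive price per item from existing columns).
for row in clean:
    row["price_per_item"] = row["total"] / row["items"]

# Phase 6: predictive modeling (naive model: average price per item times item count).
avg_price_per_item = mean(row["price_per_item"] for row in clean)


def predict_total(items):
    """Predict an order total from the number of items."""
    return avg_price_per_item * items
```

Because the process is iterative, a poor prediction at phase 6 would normally send the work back to cleaning or feature engineering rather than straight on to visualization.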
REF: Data science vs data analytics: Unpacking the differences - IBM Blog
Data Analytics vs. Data Science

While data analysts and data scientists both work with data, the main difference lies in what they do with it.
Data analysts examine large data sets to identify trends, develop charts, and create visual presentations to
help businesses make more strategic decisions.
Data scientists, on the other hand, design and construct new processes for data modeling and production
using prototypes, algorithms, predictive models, and custom analysis.
Working in Data Analytics

The responsibility of data analysts can vary across industries and companies, but fundamentally, data
analysts utilize data to draw meaningful insights and solve problems. They analyze well-defined sets of
data using an arsenal of different tools to answer tangible business needs: e.g. why sales dropped in a certain
quarter, why a marketing campaign fared better in certain regions, how internal attrition affects revenue,
etc.
Data analysts have a range of fields and titles, including (but not limited to) database analyst, business
analyst, market research analyst, sales analyst, financial analyst, marketing analyst, advertising analyst,
customer success analyst, operations analyst, pricing analyst, and international strategy analyst. The best
data analysts have both technical expertise and the ability to communicate quantitative findings to non-
technical colleagues or clients.
Characteristics of Data Analysts

Data analysts can have a background in mathematics and statistics, or they can supplement a non-
quantitative background by learning the tools needed to make decisions with numbers. Some data analysts
choose to pursue an advanced degree, such as a master’s in analytics, in order to advance their careers.
Working professionals that are considering changing careers could benefit if they have experience in
mathematical or statistical fields. Adding the pursuit of an advanced degree in the data industry will greatly
impact their job opportunities and make for a smooth transition into a data analysis position.
Skills and Tools

Top data analyst skills include data mining/data warehouse, data modeling, R or SAS, SQL, statistical
analysis, database management & reporting, and data analysis.
Roles and Responsibilities
Data analysts are often responsible for designing and maintaining data systems and databases, using
statistical tools to interpret data sets, and preparing reports that effectively communicate trends, patterns, and
predictions based on relevant findings.
Working in Data Science

Data scientists, on the other hand, estimate the unknown by asking questions, writing algorithms, and
building statistical models. The main difference between a data analyst and a data scientist is heavy coding.
Data scientists can arrange undefined sets of data using multiple tools at the same time, and build their own
automation systems and frameworks.
Characteristics of Data Scientists

Drew Conway, data science expert and founder of Alluvium, describes a data scientist as someone who has
mathematical and statistical knowledge, hacking skills, and substantive expertise. As such, many data
scientists hold degrees such as a master’s in data science.
Skills and Tools

Top data scientist skills include machine learning, software development, Hadoop, Java, data mining/data
warehouse, data analysis, Python, and object-oriented programming.
Roles and Responsibilities

Data scientists are typically tasked with designing data modeling processes, as well as creating algorithms
and predictive models to extract the information needed by an organization to solve complex problems.
REF: Data Analytics vs. Data Science: A Breakdown (northeastern.edu)
Data Science vs. Data Analytics: Understanding the Differences

August 2, 2021
As more organizations recognize the need to understand and manage the data they produce,
demand for data scientists and data analysts continues to grow. Students who are interested in a
career that makes use of data modeling, statistics, programming, and other analytical skills have
likely seen online data science bachelor’s degree programs and job listings that focus on data science or data
analytics. However, while the data science and data analytics fields both involve working with and
manipulating data, they are not interchangeable.
When it comes to data science vs. data analytics, what are the differences, and how can a student
choose the right one?
Data science vs. data analytics

How do organizations use data science and data analytics to inform decisions and increase
efficiency and profitability?
Data science
Data scientists use programming, math, and statistics to gain insights and drive organizational
strategy. Data scientists are highly adept at machine learning, data modeling, and the use of
algorithms to automate processes. Since meaningful data is field-specific, data scientists also
must have domain expertise, the understanding of their industry or company, to provide context for
the data they work with. For example, data science research in healthcare can drive diagnoses,
help prevent disease, or teach computers to read X-rays or MRIs.
Data scientists work closely with sales and marketing, product development, information
technology, finance, and business leaders to help identify trends, spot issues, understand
consumer behavior, and present solutions that support strategic decision-making.
Data analytics
Data analytics professionals are responsible for data collection, organization, and maintenance,
as well as for using statistics, programming, and other techniques to gain insights from data. The
role of a data analyst is to spot trends and help solve problems. Examples of data analytics in
retail include order tracking, recommendation features, and identification of store locations.
Data analysts tend to respond to requests from decision-makers rather than drive the decision-
making process.
Similarities between data science and data analytics

The fields of data science and data analytics are similar in many ways. Both use data to help
understand an organization’s operations, which in turn supports decision-making. Both fields are
heavily STEM-focused, and both are in high demand across many industries. Here are some of
the ways in which the two fields overlap.
Massive quantities of data
Professionals in both data science and data analytics manipulate huge data sets with millions of
data points. These massive databases may have low-quality data that must be wrangled
(cleaned), maintained, and organized so that any analysis is accurate.
Technical skills
Both fields require programming skills (such as in R, Python, Tableau, and SQL), as well as
statistics, Excel, and data visualization and modeling proficiency. Professionals in both fields must
be highly analytical and have a methodical approach to problem-solving and project management.
Communication skills
Data scientists and data analysts work with colleagues across departments, many of whom may
not have a tech background. Professionals in both fields are responsible for presenting their
findings in a clear and effective manner.
Differences between data science and data analytics
The major difference between data science and data analytics is scope. A data scientist’s role is
far broader than that of a data analyst, even though the two work with the same data sets. For that
reason, a data scientist often starts their career as a data analyst.
Here are some of the ways these two roles differ.
Responsibilities
Data scientists model data to make predictions, identify opportunities, and support strategy. They
use data to understand the future. The role of the data analyst is to solve problems and spot
trends. They work with the data as a snapshot of what exists now.
Database manipulation and management
Data scientists use algorithms and machine learning to improve the ways that data supports
business goals. Data analysts collect, store, and maintain data and analyze results.
Ref: Data Science vs. Data Analytics: What’s the Difference? | Maryville Online
What Does a Data Analyst Do?

When a company wants to make sense of data, whether it has been collected in-house or elsewhere, it
often relies on data analysts to interpret the information. Data analysts may be responsible for
cleaning and formatting data before identifying trends that can help business leaders make strategic
decisions.
Conducting data analysis involves a variety of tools, skills and computing languages to perform statistical
analyses and answer questions to solve organizational challenges. A data analyst may use a query language
like SQL, programming languages like R and SAS, and visualization tools like Power BI and Tableau in the
course of their work. This often involves figuring out how to deal with missing data.
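Two common ways of dealing with missing data are dropping the incomplete values or imputing a substitute. The sketch below shows both on a hypothetical list of monthly revenue figures; which option is appropriate depends on why the values are missing, which this example does not attempt to model.

```python
from statistics import median

# Hypothetical monthly revenue figures; None marks a missing value.
revenue = [1200.0, None, 1350.0, 1100.0, None, 1280.0]

# Option 1: drop missing values (simple, but shrinks the sample).
observed = [value for value in revenue if value is not None]

# Option 2: impute missing values with the median of the observed ones.
fill = median(observed)
imputed = [fill if value is None else value for value in revenue]
```

The median is used here instead of the mean because it is less sensitive to outliers, but that is a design choice rather than a rule.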
Strong communication skills are also useful in data analysis. Data analysts are often required to convey their
findings to outside teams or stakeholders, explaining their reasoning and research to justify their
conclusions.
What Does a Data Scientist Do?

Data scientists’ work is focused on creating the algorithms and predictive models
that data analysts use to collect, sort, and analyze information. They help to develop tools
and methods to extract information, create automation systems to eliminate routine work, and build data
frameworks tailored to their organization.
While data scientists often perform different tasks from data analysts, these roles can overlap. As a more
senior role, a data scientist often has a background in data analysis. This allows them to understand how
analysts approach their work and build solutions that generate relevant insights.
Soft skills, such as business intuition, critical thinking and innovative problem solving, are also important in
this advanced position. If you can stay one step ahead of your organization’s challenges, you can prove to be
a highly valuable asset and stay competitive as a professional.
Differences and Similarities Between Data Analysts and Data Scientists
Data analysts and data scientists serve important yet distinct roles in an organization. Here are a few ways
they can contribute to the same data set or project:
• A data analyst makes sense out of existing data through routine analysis and writing reports. A data
scientist works on new ways to capture, store, manipulate and analyze that data.
• A data analyst works toward answering business-related questions. A data scientist works to develop new
ways to ask and answer those questions.
• A data analyst relies on database software, business intelligence programs and statistical software. A data
scientist uses Python, Java and machine learning to manipulate and analyze data.
REF: Data Analyst vs Data Scientist | Master's in Data Science (mastersindatascience.org)
TASK 2B
Categories of Data
1. Qualitative Data
Qualitative data is used to represent non-numerical information. This data type is used to represent the
qualities and characteristics of the given information, such as colour, gender, symbols, text, taste, etc. It
cannot be presented in numerical form. These data are obtained from interviews, meetings, surveys, etc.
They are also known as Categorical data. There are two main types of qualitative data: Nominal data and
Ordinal data. Let us learn about them in detail.
1. Nominal Data
Nominal data is a type of qualitative data that is used to label data according to different
categories. The labels do not have any specific order or numerical significance. Let us understand it better with a
few real-world examples.
• Colours (red, blue, green, orange, etc.)
• Fruits (apples, bananas, grapes, strawberries)
• Gender (male, female, other)
• Marital status (single, married, divorced, widowed)
• Blood type (A, AB, O, B)
• Days of the week (Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday)
2. Ordinal Data
This is also a type of qualitative data where only non-numerical data is considered. It is similar to
nominal data, with one major difference: ordinal data are arranged in a meaningful order,
unlike nominal data, which does not follow any specific order.
Let us understand ordinal data with some examples.
• Reviews (excellent, good, fair, poor)
• Educational qualification (high school, undergraduate, postgraduate)
• Grades in an exam (A, B, C, D)
• Economic background (below poverty, middle class, rich)
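The difference between nominal and ordinal data can be shown with Python's `enum` module. In this sketch the class names and the rank numbers 1 to 4 are assumptions made for illustration; for ordinal data only the order of the ranks matters, not the numeric gaps between them.

```python
from enum import Enum, IntEnum


class BloodType(Enum):
    """Nominal: labels only, no meaningful order."""
    A = "A"
    B = "B"
    AB = "AB"
    O = "O"


class Review(IntEnum):
    """Ordinal: labels with a meaningful rank (the numbers only encode order)."""
    POOR = 1
    FAIR = 2
    GOOD = 3
    EXCELLENT = 4


reviews = [Review.GOOD, Review.POOR, Review.EXCELLENT]
best = max(reviews)        # ordering is meaningful for ordinal data
ranked = sorted(reviews)

try:
    BloodType.A < BloodType.B   # ordering is not defined for nominal data
    comparable = True
except TypeError:
    comparable = False
```

Sorting or taking a maximum makes sense for the reviews, while the same operations on blood types are rejected because no order exists.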
2. Quantitative Data
Quantitative data is a type of data that represents numerical information that we can count and measure.
It is also known as Numerical data. It generally gives answers to “how many”, “how much”, etc. This
data can be represented in graphical and chart forms
such as bar graphs, histograms, pie charts, etc. Let us
understand quantitative data with some examples.
• Marks in a test
• Temperature
• Weight
• Sales figures
These are some common examples of numerical data, which always represents information in numerical
form. There are two major types of quantitative data: discrete and continuous. Let us look at them in
detail.
1. Discrete Data
Discrete data is used to represent distinct or separate numerical values. They are discrete because they
are presented as whole numbers or integers, which cannot be divided into smaller parts.
Discrete data can be counted and takes a finite number of values within any given range. It can be easily
represented by various graphs and charts, such as bar graphs, number lines, etc. Let us understand with a
few examples given below.
• Total number of students in a college
• Number of cars in a parking area
• Number of members in a family
• Number of wheels on a car
2. Continuous Data
Continuous data is a data type that deals with an infinite range of numerical values. It is generally defined
within a specific range and can take any value within that range. It can be easily divided into smaller fractional or
decimal values, unlike discrete data, which uses only whole numbers
or integers.
The main difference between continuous data and discrete data is that discrete data cannot be presented in
decimal or fractional form, while continuous data can. Let us understand it
with some common examples.
• Height of a person
• Temperature in Celsius or Fahrenheit
• Weight in pounds or kilograms
• Distance in meters or kilometers
• Share price in the market
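One practical consequence of the distinction is how each kind of data is summarized. In the sketch below, with invented sample values, discrete counts are tallied by exact value, while continuous measurements are grouped into ranges (bins) first, since exact values rarely repeat. The 10 cm bin width is an arbitrary choice for illustration.

```python
from collections import Counter

# Discrete data: whole-number counts, summarized by exact value.
family_sizes = [3, 4, 4, 2, 5, 4, 3]
size_frequency = Counter(family_sizes)

# Continuous data: measurements, summarized by binning into ranges.
heights_cm = [158.2, 171.5, 164.9, 180.3, 176.0, 169.4]
bin_width = 10
height_bins = Counter(
    int(height // bin_width) * bin_width for height in heights_cm
)
```

Each key in `height_bins` is the lower edge of a bin, so the key 170 covers heights from 170 up to (but not including) 180.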
REF: 4 Types Of Data- Nominal, Ordinal, Discrete And Continuous (pwskills.com)
TASK 2C
DATA VIRTUALIZATION
What is Data Virtualization?
Data Virtualization is a data integration approach that allows an application to retrieve and
manipulate data without requiring technical details, such as how it is formatted or where it is
physically located. It provides a single, unified, and consistent business view of data across
various, disparate data sources, making it easier for business users to access data.
Functionality and Features

Data virtualization offers several key features:
• Real-time access to data: It provides business users with real-time access to data
regardless of its location.
• Data abstraction: It hides the complexities of data, such as its source, format, location,
and storage technology, from end-users.
• Data federation: It aggregates data from multiple sources and delivers a unified,
consolidated view of it.
• Cache: To improve performance, it saves recent or frequent data requests in the cache.
• Data transformation: It transforms data into business-friendly formats.
Architecture
The architecture of data virtualization comprises three primary components: the data
consumers (applications, BI tools, etc.), the data virtualization layer (which abstracts and provides
a unified view of the data), and the data providers (databases, web services, flat files, etc.).
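The three-part architecture can be pictured with a toy model. The sketch below is a conceptual illustration only, not how any particular product is built: `VirtualizationLayer`, its method names, and the two provider functions are hypothetical, with plain Python functions standing in for real databases and files.

```python
class VirtualizationLayer:
    """Toy data virtualization layer: one query interface over many providers."""

    def __init__(self):
        self._providers = {}   # name -> callable returning a list of row dicts
        self._cache = {}       # name -> rows from the most recent fetch

    def register(self, name, fetch):
        """Attach a data provider under a logical name."""
        self._providers[name] = fetch

    def query(self, name, use_cache=True):
        """Return rows for one provider, hiding where and how they are stored."""
        if use_cache and name in self._cache:
            return self._cache[name]
        rows = self._providers[name]()   # data stays in the source until requested
        self._cache[name] = rows
        return rows

    def federate(self, *names):
        """Return a single combined view, tagging each row with its source."""
        return [
            {**row, "source": name}
            for name in names
            for row in self.query(name)
        ]


# Hypothetical providers standing in for a database and a flat file.
def crm_database():
    return [{"customer": "Ada", "balance": 120}]


def csv_export():
    return [{"customer": "Lin", "balance": 75}]


# The data consumer only talks to the layer, never to the providers directly.
layer = VirtualizationLayer()
layer.register("crm", crm_database)
layer.register("export", csv_export)
unified_view = layer.federate("crm", "export")
```

The consumer sees one list of rows and never needs to know that they came from two differently shaped sources.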
Benefits and Use Cases

Among its numerous benefits, data virtualization:
• Reduces data replication and storage costs
• Enhances agility due to its capacity for real-time data delivery
• Supports a diverse range of data formats and types
• Improves data quality by providing a consistent view of data
• Simplifies data management and governance
Challenges and Limitations

Despite its advantages, data virtualization also has a few challenges:
• Latency and performance issues can occur if data is being accessed from multiple,
geographically-dispersed sources.
• Security control implementation can be complex due to diverse data sources.
• As it depends on source systems for data, any changes in those systems can impact
the virtualization layer.
Integration with Data Lakehouse
Implementing Data Virtualization in a data lakehouse environment can simplify data management
and enhance accessibility. A lakehouse merges the features of data lakes and data warehouses.
Thus, data virtualization becomes a key capability in a lakehouse architecture to provide a unified
view of data, regardless of its format or location.
Security Aspects
Data Virtualization employs data security measures like data masking, encryption, and role-
based access control to ensure data privacy and compliance with regulations.
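As a rough illustration of how data masking and role-based access control can work together, the sketch below masks every column a role is not permitted to see. The role names, the permission table, and the choice to keep the last four characters visible are all hypothetical and do not describe any specific product's behavior.

```python
# Hypothetical roles and the columns each role may see unmasked.
ROLE_PERMISSIONS = {
    "auditor": {"name", "account_number"},
    "analyst": {"name"},
}


def mask(value, visible=4):
    """Replace all but the last `visible` characters with asterisks."""
    text = str(value)
    return "*" * max(len(text) - visible, 0) + text[-visible:]


def view_row(row, role):
    """Return the row as the given role is allowed to see it."""
    allowed = ROLE_PERMISSIONS.get(role, set())
    return {
        column: value if column in allowed else mask(value)
        for column, value in row.items()
    }


row = {"name": "Ada Obi", "account_number": "0123456789"}
analyst_view = view_row(row, "analyst")
auditor_view = view_row(row, "auditor")
```

An unknown role gets an empty permission set, so every column is masked by default.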
Performance
While Data Virtualization facilitates real-time access to data, its performance can be influenced by
factors such as network latency, the performance of source systems, and hardware limitations.
REF: Data Virtualization | Dremio
What is Data Virtualization?
Data virtualization decouples the database layer that sits between the storage and application layers
in the application stack. Just like a hypervisor sits between the server and the OS to create a virtual
server, database virtualization software sits between the database and the OS to abstract/virtualize
the data store resources.
Because database resources are virtualized, they require a much smaller storage footprint than the
source database. Instead of making and moving new blocks of data, virtual data (virtual data copies)
use pointers to data blocks, providing high-performance access to data already in place.
Data virtualization provides the ability to securely manage and distribute policy-governed virtual
copies of production-quality datasets. No matter the underlying database management system
(DBMS) or source database location, data virtualization technology creates block-mapped virtual
copies of the database for rapid and controlled distribution all while leaving a minimal storage
footprint no matter how many copies are used.
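The idea of virtual copies that point at existing data blocks instead of duplicating them can be sketched as a copy-on-write structure. This is a conceptual toy, not a description of how any vendor implements it: `VirtualCopy` and the block names are hypothetical.

```python
class VirtualCopy:
    """Toy copy-on-write view over a shared list of data blocks."""

    def __init__(self, source_blocks):
        self._source = source_blocks   # shared, never modified by the copy
        self._changed = {}             # block index -> privately written block

    def read(self, index):
        """Read a block, preferring this copy's own version if it has one."""
        return self._changed.get(index, self._source[index])

    def write(self, index, block):
        """Store a changed block privately; only changed blocks use new storage."""
        self._changed[index] = block

    def private_block_count(self):
        return len(self._changed)


source = ["block-0", "block-1", "block-2"]   # hypothetical source database blocks
dev_copy = VirtualCopy(source)
test_copy = VirtualCopy(source)

dev_copy.write(1, "block-1-modified")
```

Both copies start with zero private blocks, and a write in one copy affects neither the source nor the other copy, which is why many copies can share one small storage footprint.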
Why Virtualize Data?
The speed of innovation and ability to adapt to rapidly changing market trends rests on the agility of
your release cycle and the ability to quickly diagnose, triage, and fix errors. Data virtualization is the
critical lever used by forward-thinking enterprises to provision production-quality data to dev and
test environments on demand or via APIs.
Virtual data copies are fully readable/writeable, and can be provisioned or torn down in just
minutes, eliminating development’s reliance on slow serial ticketing systems and DBA involvement
for initial data delivery as well as data refreshes after destructive testing.
Data virtualization technology facilitates data delivery across all phases of application development,
including testing, release, and production fix. Traditionally, IT organizations rely on a request-fulfill
model, in which developers and testers often find their requests queuing behind others. Because it
takes significant time and effort to create a copy of test data, it can take days, or even weeks to
provision or refresh data for a test environment. This creates massive wait states in the software
delivery life cycle, slowing the pace of application delivery.
To keep pace with a faster release cadence, dev and test teams are forced to work with a stale copy
of data because refreshing test data takes too long. This can result in missed test cases and
ultimately data-related defects escaping into production.
Common Use Cases & Systems Used With Data Virtualization Technology
• DevOps: For teams that need to transform app-driven customer experiences, oftentimes
everything is automated except for the data. Data virtualization enables teams to deliver
production-quality data to enterprise stakeholders for all phases of application development.
• ERP Upgrades: Over half of all ERP projects run past schedule and budget. The main
reason? Standing up and refreshing project environments is slow and complex. Data
virtualization can cut complexity, lower TCO, and accelerate projects by delivering virtual
data copies to ERP teams more efficiently than legacy processes.
• Cloud Migration: Data virtualization technology can provide a secure and efficient
mechanism to replicate TB-size datasets from on-premise to the cloud, before spinning up
space-efficient data environments needed for testing and cutover rehearsal.
• Analytics and Reporting: Virtual data copies can provide a sandbox for destructive query
and report design and facilitate on-demand data access across sources for BI projects that
require data integration (MDM, M&A, global financial close, etc.)
• Backup and Production Support: In the event of a production issue, the ability to
provision complete virtual data environments can help teams identify root cause and validate
that any change does not cause unanticipated regressions.
Data Virtualization Capabilities
By virtualizing data, software teams have:
• Enterprise Grade Distribution: Provision lightweight virtual database copies in minutes
(depending on the types and size of files) via UI or API that scale with your agile
development goals.
• Built for Scale: Replicate data from production to non-production environments at scale,
either on-premises or in the cloud for multiple instances. Teams can provision virtual
databases as necessary without taxing storage.
• Data Governance: Put your InfoSec department at ease with data controls that govern who
can do what, where, and when over specific datasets. When combining best-in-class security,
consistent data-masking policies, and robust auditing, data virtualization becomes a security
asset.
• Cost Savings: Maximize testing throughput while minimizing storage use. Virtual dataset
provisioning, destruction, refresh and rewind all provide new tools for application testers to
maximize testing throughput with virtually no additional storage cost.
REF: What is Data Virtualization? | Delphix
Data virtualization is a special kind of data integration technology that provides access to data in real time,
seamlessly, all in one place. Think of it like a television guide, which lists the shows on a variety of
channels without requiring you to be on a given channel to see what it offers. With data virtualization, customers can
access and manipulate each datum regardless of its physical location or format; instead of visiting each source
separately, it is one-stop shopping. “Data virtualization solutions, also, create integrated views of the data, across the multiple
sources, without moving the data to a new location.” Data virtualization can typically access a wide variety
of enterprise data architectures, including those on premises and in the cloud, and adapts agilely
to structural changes without impacting the business.
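The idea of an integrated view that leaves the data where it is can be sketched in a few lines. The example below is only an illustration under stated assumptions: the two sources (a CSV export and a JSON payload), their field names, and the `customer_totals` function are all hypothetical, and a real data virtualization platform would query live systems rather than in-memory strings.

```python
import csv
import io
import json

# Two hypothetical sources in different formats. In practice these would be
# separate systems, such as a CRM export and a billing service.
CRM_CSV = "customer_id,name\n1,Acme\n2,Globex\n"
BILLING_JSON = (
    '[{"customer_id": 1, "amount": 120.0},'
    ' {"customer_id": 1, "amount": 30.0},'
    ' {"customer_id": 2, "amount": 75.5}]'
)


def read_customers():
    """Read customer rows from the CSV source at query time."""
    return list(csv.DictReader(io.StringIO(CRM_CSV)))


def read_invoices():
    """Read invoice rows from the JSON source at query time."""
    return json.loads(BILLING_JSON)


def customer_totals():
    """Virtual view: joins both sources on demand and stores nothing."""
    totals = {}
    for invoice in read_invoices():
        cid = invoice["customer_id"]
        totals[cid] = totals.get(cid, 0.0) + invoice["amount"]
    return [
        {"name": row["name"],
         "total_billed": totals.get(int(row["customer_id"]), 0.0)}
        for row in read_customers()
    ]


view_rows = customer_totals()
```

The caller works with `customer_totals()` as if it were a single table, without needing to know that the names come from one source and the amounts from another, and no combined copy of the data is ever written to a new location.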
Other Definitions of Data Virtualization Include:
 A single database view (or views) allowing access to distributed databases and multiple heterogeneous data stores.
(DAMA DMBoK2)
 “A technology that delivers information from various data sources, including big data sources such as Hadoop
and distributed data stores in real-time and near-real time.” (Boston University)
 Abstraction of IT data resources “that masks the physical nature and boundaries of those resources from
resource users.” (Gartner)
 A technical “approach by which data access can be easily centralized, standardized, and secured across the
enterprise, no matter the location, design or platform of the data source.” (Indiana University)
 “The process of aggregating data from different sources of information to develop a single, logical and virtual
view of information so that it can be accessed by front-end solutions such as applications, dashboards and
portals without having to know the data’s exact storage location.” (Techopedia)
Data Virtualization Use Case Examples Include:
 Creating “a world class process and strategy to automate the data forensics and resolve regulatory
requirements across the organization”
 Complying with the European General Data Protection Regulation (GDPR)
 Aiding in the development of blockchain and machine learning projects within an organization
Businesses Need Data Virtualization To:
 Spend 40 percent less on building and managing data integration
 Connect distributed data assets
 Limit data silos
 Drive new innovations
 Streamline operations
REF: What Is Data Virtualization? - DATAVERSITY
