Unit 1 PPT


Raj Kumar Goel Institute of Technology

Ghaziabad
Big Data (KCS061)
Session 2023-24
Course : B.Tech CSE VI sem

Unit-1: Introduction to Big Data

Dr K. P. Jayant
Department of Computer Science & Engineering
1 BigData KCS061 Dr KP Jayant, CSE Dept.
Big Data(KCS061)

Introduction to Big Data:


• Types of digital data,
• history of Big Data innovation,
• introduction to Big Data platform,
• drivers for Big Data,
• Big Data architecture and characteristics,
• 5 Vs of Big Data, Big Data technology components,
• Big Data importance and applications,
• Big Data features – security, compliance, auditing and
protection, Big Data privacy and ethics, Big Data
Analytics,
• Challenges of conventional systems,
• intelligent data analysis,
• nature of data,
• analytic processes and tools,
• analysis vs reporting,
• modern data analytic tools.

What is Data?
Raw facts & figures.


What is Digital Data?


Digital data is the electronic representation of information in a
format or language that machines can read and understand.
In more technical terms, digital data is information that has been
converted into binary form (0s and 1s) so that machines can store
and process it.
The power of digital data is that any input, from very simple text
documents to genome sequencing results, can be represented with
the binary system.
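
As a small illustration of the point above, the following sketch shows how a piece of text can be turned into binary digits and back. The function names `to_bits` and `from_bits` are hypothetical helpers chosen for this example, and UTF-8 is assumed as the text encoding.

```python
def to_bits(text: str) -> str:
    """Encode text as UTF-8 and render each byte as 8 binary digits."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def from_bits(bits: str) -> str:
    """Reverse to_bits: parse each 8-bit group back into a byte."""
    data = bytes(int(group, 2) for group in bits.split())
    return data.decode("utf-8")


encoded = to_bits("Hi")
print(encoded)             # 01001000 01101001
print(from_bits(encoded))  # Hi
```

The same idea applies to images, audio, and sensor readings: each is stored as a (much longer) sequence of bytes.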


What is big data?

Big data is a term used to describe data of great variety that
arrives in huge volumes and at high velocity.

Apart from its significant volume, big data is also so complex that
conventional data management tools cannot store or process it
effectively.

The data can be structured, semi-structured, or unstructured.


Examples of big data include:

• Mobile phone details
• Social media content
• Health records
• Transactional data
• Web searches
• Financial documents
• Weather information

Big data can be generated by users (emails, images, transactional
data, etc.) or by machines (IoT devices, ML algorithms, etc.).


Classification of Data
Data Classification:
The process of organizing data into relevant categories so that it can
be used or applied more efficiently.
Classifying data makes it easy for the user to retrieve it.
Data classification is important for data security and compliance,
and also for meeting different business or personal objectives.
It is also a major requirement because data must be easily
retrievable within a specific period of time.


Types of Data Classification:


Data can be broadly classified into 3 types:

1. Structured Data

2. Semi-structured Data

3. Unstructured Data


Types of Data Classification:


1. Structured Data:
Structured data is created using a fixed schema and is maintained
in tabular format.

The elements in structured data are addressable, which makes
analysis effective.

It covers all data that can be stored in an SQL database in a
tabular format.

It is the simplest kind of data to manage, and much of the data
that organizations handle today is created and processed in this
form.


In other words

Structured data is highly organized and can be easily processed
using traditional data processing tools.

This type of data is typically stored in a relational database and
includes data that can be represented in a tabular format.


Examples –
• DW (Data Warehouse),
• DM (Data Mart),
• OLTP (Online Transaction Processing),
• ODS (Operational Data Store),
• APIs (Application Programming Interfaces),
• ERP (Enterprise Resource Planning),
• CRM (Customer Relationship Management),
• MIS (Management Information System),
• etc.

Relational data, geo-location, credit card numbers, addresses, etc.

Consider an example of relational data:
a university has to maintain a record for each student, including the
student's name, ID, address, and email.
The following relational schema and table can be used to store
these records.

S_ID    S_Name    S_Address    S_Email

1001    A         Delhi        A@gmail.com

1002    B         Mumbai       B@gmail.com
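
The student table above can be sketched with Python's built-in `sqlite3` module. This is only an illustration: the table name `student`, the column types, and the helper `students_in` are assumptions made for the example, not part of the slides.

```python
import sqlite3

# In-memory database so the example leaves no files behind.
conn = sqlite3.connect(":memory:")

# Fixed schema: every row must fit these four columns.
conn.execute(
    "CREATE TABLE student ("
    " S_ID INTEGER PRIMARY KEY,"
    " S_Name TEXT NOT NULL,"
    " S_Address TEXT,"
    " S_Email TEXT)"
)

rows = [
    (1001, "A", "Delhi", "A@gmail.com"),
    (1002, "B", "Mumbai", "B@gmail.com"),
]
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?)", rows)


def students_in(city: str) -> list:
    """Return (S_ID, S_Name) pairs for students whose address is `city`."""
    cur = conn.execute(
        "SELECT S_ID, S_Name FROM student WHERE S_Address = ? ORDER BY S_ID",
        (city,),
    )
    return cur.fetchall()


print(students_in("Delhi"))  # [(1001, 'A')]
```

Because every element is addressable by row and column, a query such as "all students in Delhi" needs no preprocessing.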


2. Semi-Structured Data:
Semi-structured data is information that does not reside in a
relational database but has some organizational properties that
make it easier to analyze.
With some processing it can be stored in a relational database,
although for some kinds of semi-structured data this is very hard.


In other words

Semi-structured data is partially organized: it carries some
structure (such as tags or keys) but does not follow a fixed schema.
It can be processed using some traditional data processing tools
and techniques, but it may require some preprocessing before it
can be analyzed.


Examples –
• HTML,
• XML data,
• NoSQL databases,
• emails,
• CSV files,
• JSON (JavaScript Object Notation) files,
• log files,
• Excel files,
• etc.
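
To make the idea of "some preprocessing" concrete, here is a minimal sketch using JSON, one of the formats listed above. The records and the helper `flatten` are invented for illustration; the point is that each record describes itself with keys, but different records carry different fields.

```python
import json

# Three records with different, self-describing fields (made-up sample data).
raw = """
[
  {"id": 1, "name": "A", "email": "A@gmail.com"},
  {"id": 2, "name": "B", "phones": ["111", "222"]},
  {"id": 3, "name": "C", "address": {"city": "Delhi"}}
]
"""

records = json.loads(raw)


def flatten(record: dict) -> dict:
    """Map one semi-structured record onto a fixed set of columns."""
    return {
        "id": record["id"],
        "name": record["name"],
        "email": record.get("email"),                       # may be absent
        "city": record.get("address", {}).get("city"),      # nested, optional
        "phone_count": len(record.get("phones", [])),       # list, optional
    }


table = [flatten(r) for r in records]
for row in table:
    print(row)
```

After flattening, every row has the same columns, so the data could be loaded into a relational table; missing values simply become `None`.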


3. Unstructured Data:
Unstructured data is data that does not follow a pre-defined
standard or any organized format.
This kind of data does not fit a relational database, because a
relational database expects data in a pre-defined, organized form.
Unstructured data is very important in the big data domain, and
there are many platforms for managing and storing it, such as
NoSQL databases.


In other words
Unstructured data is not organized and does not have a
specific structure.
This type of data is often generated by humans or machines
and includes data such as text, images, audio, and video.
Analyzing unstructured data requires advanced data processing
techniques such as natural language processing, image processing,
and machine learning.


Examples –
• Word documents,
• PDF files,
• plain text,
• media logs,
• audio and video,
• web pages (WWW),
• geo-location data,
• social media content from Twitter, Facebook, Instagram, etc.,
• data generated by mobile phones, smart watches, and Wi-Fi devices,
• etc.


Area/Uses
• Transportation
• Advertising and Marketing
• Banking, Securities, and Financial Services
• Government
• Communications, Media, and Entertainment
• Meteorology
• Healthcare
• Cyber security
• Insurance
• Retail and Wholesale trade
• etc.


Applications of Big Data


Evolution of Technology
1990s - Emergence of Data Warehousing
2000s - Rise of NoSQL Databases
2006 - Introduction of Hadoop (building on Google's file system and MapReduce papers of 2003 and 2004)
2008 - Growth of Cloud Computing
2010s - Expansion of the Big Data Ecosystem
2010s - Emergence of Data Lakes
2010s - Advanced Analytics and Machine Learning
2010s - Real-time Processing
Big data technologies continue to evolve rapidly, with a focus on improving
performance, scalability, and ease of use. The adoption of containerization and
orchestration tools, such as Docker and Kubernetes, has also played a role in
streamlining big data deployments.
What is Big Data Architecture?
The term "Big Data architecture" refers to the systems and software used to
manage Big Data. A Big Data architecture must be able to handle the scale,
complexity, and variety of Big Data. It must also be able to support the needs of
different users, who may want to access and analyze the data differently.

Big Data Architecture Layers
A Big Data architecture has four main layers:
1. Data Ingestion
2. Data Processing
3. Data Storage
4. Data Visualization

1. Data Ingestion
This layer is responsible for collecting data from various sources.

In Big Data, data ingestion is the process of extracting data from various
sources and loading it into a data repository.
Data ingestion is a key component of a Big Data architecture because it
determines how data will be ingested, transformed, and stored.

2. Data Processing
Data processing is the second layer, responsible for cleaning and preparing
the collected data for analysis.
This layer is critical for ensuring that the data is of high quality and ready
for later use.
3. Data Storage
Data storage is the third layer, responsible for storing the data in a format that can
be easily accessed and analyzed.
This layer is essential for ensuring that the data is accessible and available to the
other layers.
4. Data Visualization
Data visualization is the fourth layer and is responsible for creating visualizations
of the data that humans can easily understand. This layer is important for making
the data accessible.
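
The four layers can be sketched as four small functions chained together. This is a toy, single-machine illustration under stated assumptions: the sample feed `RAW_FEED`, the function names, and the use of an in-memory dictionary as the "storage" layer are all hypothetical stand-ins for real ingestion tools, processing engines, and data stores.

```python
import csv
import io
from collections import Counter

# Made-up raw input with typical problems: inconsistent case, stray
# whitespace, and a missing value.
RAW_FEED = """city,amount
Delhi,120
Mumbai,80
delhi ,30
Mumbai,
"""


def ingest(source: str) -> list:
    """Layer 1: read raw records from a source (here an in-memory CSV)."""
    return list(csv.DictReader(io.StringIO(source)))


def process(records: list) -> list:
    """Layer 2: clean records and drop those that cannot be used."""
    cleaned = []
    for rec in records:
        amount = (rec.get("amount") or "").strip()
        if not amount:
            continue  # skip rows with a missing amount
        cleaned.append({"city": rec["city"].strip().title(), "amount": int(amount)})
    return cleaned


def store(records: list) -> dict:
    """Layer 3: keep data in a form that is easy to query (totals per city)."""
    totals = Counter()
    for rec in records:
        totals[rec["city"]] += rec["amount"]
    return dict(totals)


def visualize(totals: dict) -> str:
    """Layer 4: render a text bar chart, one '#' per 10 units."""
    lines = [
        f"{city:<8}{'#' * (total // 10)} {total}"
        for city, total in sorted(totals.items())
    ]
    return "\n".join(lines)


totals = store(process(ingest(RAW_FEED)))
print(visualize(totals))
```

In a real deployment each function would be replaced by a dedicated component, but the direction of data flow between the layers stays the same.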
The characteristics of big data are described by the 5 Vs

1. Volume:

Big data architecture deals with massive volumes of data, ranging
from terabytes to petabytes.

Volume describes both the size and quantity of the data.

Data from the internet, social media, and the Internet of Things
(IoT) is being generated at an unprecedented rate.

Traditional data storage and processing techniques are often
insufficient to handle such a massive amount of data.


2. Variety:
Variety describes the diversity of data types and their
heterogeneous sources.

Big data draws on a vast number of sources and includes:
• Structured data
• Semi-structured data
• Unstructured data

Data comes in different forms and formats such as text, audio,
video, and images.


3. Velocity:
Velocity describes how rapidly the data is generated and
how quickly it moves.

This data flow comes from sources such as mobile phones, social
media, networks, servers, etc.

Big data architecture deals with data streams that arrive at a high
rate, such as social media feeds, sensor data, and web traffic.

Social media platforms generate a continuous flood of posts, likes,
and comments every second, while large IoT sensor deployments
can generate data at a rate of several terabytes per hour.
4. Veracity:
Veracity describes the data’s accuracy and quality.

Since the data is pulled from diverse sources, the information can have
uncertainties, errors, redundancies, gaps, and inconsistencies.

It's bad enough when an analyst gets one set of data that has accuracy
issues; imagine getting tens of thousands of such datasets, or maybe
even millions.

Big data architecture deals with data of varying quality, completeness,
and accuracy.
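
A simple veracity check can be sketched as follows. The sample `readings` and the helper `quality_report` are hypothetical; the sketch only counts two of the problems named above (gaps and redundancies) and is not a complete data quality tool.

```python
def quality_report(records: list, required: tuple) -> dict:
    """Count common veracity problems: gaps and exact duplicates."""
    seen = set()
    missing = 0
    duplicates = 0
    for rec in records:
        # A gap: any required field is absent, None, or empty.
        if any(rec.get(field) in (None, "") for field in required):
            missing += 1
        # A redundancy: the same record has been seen before.
        key = tuple(sorted(rec.items()))
        if key in seen:
            duplicates += 1
        seen.add(key)
    return {"rows": len(records), "missing": missing, "duplicates": duplicates}


# Made-up sensor readings with one gap in each of two rows and one repeat.
readings = [
    {"sensor": "s1", "temp": 21.5},
    {"sensor": "s2", "temp": None},
    {"sensor": "s1", "temp": 21.5},
    {"sensor": "", "temp": 19.0},
]
report = quality_report(readings, required=("sensor", "temp"))
print(report)  # {'rows': 4, 'missing': 2, 'duplicates': 1}
```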

5. Value:
The ultimate goal of big data architecture is to extract insights and value
from the data that can help organizations make better decisions.


Big Data importance and applications

Big data has become increasingly important in today's digital world, where
massive amounts of data are generated every second by individuals,
businesses, and machines. Here are some of the key reasons why big data is
important:
1. Improved decision-making:
By analyzing large and complex datasets, organizations can make more
informed and data-driven decisions, leading to better outcomes.
2. Cost savings:
Big data analytics can help identify inefficiencies in business operations,
leading to cost savings.


3. Improved customer experience:


Big data analytics can help companies understand customer behavior and
preferences, enabling them to provide personalized and relevant
experiences.
4. Innovation:
Big data can help companies identify emerging trends and opportunities,
leading to new products, services, and business models.
5. Competitive advantage:
Companies that effectively leverage big data can gain a competitive
advantage by understanding customer needs, identifying new markets, and
optimizing business processes.


Solutions for Big Data


Apache Hadoop
Apache Spark
Apache Storm
Hydra
Google BigQuery
etc


Big Data Analytics

What Is Big Data Analytics?

Big data analytics is the process of uncovering trends, patterns,
and correlations in large amounts of raw data and using them to
make data-driven decisions.


Big data analytics is the use of advanced analytic techniques
against very large, diverse data sets that include structured,
semi-structured, and unstructured data, from different sources,
and in different sizes from terabytes to zettabytes.


Types of Big Data Analytics

Big data analytics involves processing and analyzing large and
complex datasets to extract valuable insights, patterns, and trends.

There are several types of big data analytics, each serving
different purposes based on the goals and requirements of the
analysis.


There are four main types of big data analytics:

• Descriptive Analytics “What has happened?”
• Diagnostic Analytics “Why did it happen?”
• Predictive Analytics “What will happen?”
• Prescriptive Analytics “What action should be taken?”


Descriptive Analytics
Through this type of analytics, we use the insight gained to answer the question
“What has happened?”

Purpose: It focuses on summarizing and describing historical
data to gain an understanding of what has happened.
Examples: Reporting, dashboards, data visualization, and basic
statistical analysis.

Benefits
• It helps companies make sense of the large amounts of raw
data they gather by focusing on the more critical areas.
• It helps them understand their current business situation better in
comparison to the past.
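
A minimal sketch of basic statistical analysis, using Python's standard `statistics` module. The `daily_sales` figures are made up for illustration.

```python
import statistics

# Hypothetical sales figures for one week.
daily_sales = [120, 135, 128, 150, 110, 160, 145]

# Descriptive analytics: summarize what has already happened.
summary = {
    "count": len(daily_sales),
    "total": sum(daily_sales),
    "mean": statistics.mean(daily_sales),
    "median": statistics.median(daily_sales),
    "min": min(daily_sales),
    "max": max(daily_sales),
}

for name, value in summary.items():
    print(f"{name:<7}{value}")
```

A dashboard or report is essentially a set of such summaries, recomputed regularly and presented visually.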

Diagnostic Analytics
Through this type of analytics, we use the insight gained to answer the
question
“Why did it happen?”

Purpose: It aims to identify the reasons behind past events or
trends. It involves analyzing historical data to understand the
causes of specific outcomes.
Examples: Root cause analysis, trend analysis, and correlation
analysis.
Benefits
• It gives a better understanding of our data and of the various ways to
find answers to company questions.
• It enables businesses to understand their customers by using tools
for searching, filtering, and comparing the data produced by
individuals.
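
Correlation analysis, one of the examples above, can be sketched with the Pearson correlation coefficient. The function `pearson` and the `ad_spend` and `sales` figures are assumptions made for this illustration.

```python
import math


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical monthly figures.
ad_spend = [10, 20, 30, 40, 50]
sales = [12, 24, 33, 46, 55]

r = pearson(ad_spend, sales)
print(f"correlation between ad spend and sales: {r:.3f}")
```

A value close to 1 suggests the two series move together. Correlation alone does not prove that one causes the other, so it is a starting point for root cause analysis rather than its conclusion.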

Predictive Analytics

“What might happen in the future?”

Purpose: Predictive analytics involves using historical data and
statistical algorithms to predict future outcomes or trends.

Examples: Regression analysis, machine learning models, and
forecasting.

Benefits
• More reliable and accurate forecasts of the future.
• Companies can find ways to save and earn money, manage shipping
schedules, and stay on top of inventory requirements.
• It can help organizations attract new customers and retain existing
ones.
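
Regression analysis can be sketched with an ordinary least-squares line fitted to past values and extended one step ahead. The helper `fit_line` and the `months` and `demand` data are hypothetical; real forecasting would use far more data and validate the model.

```python
def fit_line(xs: list, ys: list) -> tuple:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: demand observed in months 1 to 5.
months = [1, 2, 3, 4, 5]
demand = [100, 110, 120, 130, 140]

slope, intercept = fit_line(months, demand)
forecast = slope * 6 + intercept  # predicted demand for month 6
print(f"forecast for month 6: {forecast:.1f}")  # forecast for month 6: 150.0
```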


Prescriptive Analytics
“What action should be taken?”

Purpose: Prescriptive analytics goes beyond predicting future
outcomes and recommends actions to achieve desired results. It
provides insights on what actions to take to optimize a particular
outcome.

Examples: Decision support systems, optimization algorithms,
and simulation models.
Benefits
• Improving processes, campaigns, strategies, production, and
customer service.
• Helps manufacturers better understand the market and anticipate its
condition in the future.
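
A very small optimization example: given predicted demand scenarios, pick the stock level with the highest expected profit. The scenarios, prices, and the function `expected_profit` are invented for illustration; real prescriptive systems use dedicated optimization or simulation tools.

```python
def expected_profit(stock: int, scenarios: list, price: float, cost: float) -> float:
    """Average profit over demand scenarios, weighted by probability."""
    total = 0.0
    for demand, probability in scenarios:
        sold = min(stock, demand)  # cannot sell more than is stocked
        total += probability * (sold * price - stock * cost)
    return total


# Hypothetical output of a predictive model: (demand, probability) pairs.
scenarios = [(80, 0.3), (100, 0.5), (120, 0.2)]
candidates = [80, 100, 120]

# Prescriptive step: recommend the action with the best expected outcome.
best = max(candidates, key=lambda s: expected_profit(s, scenarios, price=10, cost=6))
print(f"recommended stock level: {best}")  # recommended stock level: 100
```

The prediction says what demand might be; the prescriptive step turns that into a recommended action.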

Other types of analytics
Text Analytics (or Text Mining)

Purpose: Text analytics involves analyzing unstructured text data to
extract insights, sentiment, and patterns from documents, social media,
and other textual sources.
Examples: Sentiment analysis, natural language processing
(NLP), and text clustering.
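
Sentiment analysis can be sketched with a tiny word-list approach. The word lists `POSITIVE` and `NEGATIVE` and the helper functions are assumptions made for this example; production systems rely on trained NLP models rather than fixed word lists.

```python
import re
from collections import Counter

# Tiny hand-made lexicons (illustrative only).
POSITIVE = {"good", "great", "love", "fast"}
NEGATIVE = {"bad", "slow", "hate", "broken"}


def tokenize(text: str) -> list:
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment(text: str) -> str:
    """Label text by comparing counts of positive and negative words."""
    counts = Counter(tokenize(text))
    score = sum(counts[w] for w in POSITIVE) - sum(counts[w] for w in NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment("Great phone, love the fast delivery!"))  # positive
print(sentiment("Slow and broken."))                      # negative
print(sentiment("It arrived on Monday."))                 # neutral
```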

Spatial Analytics
Purpose: Spatial analytics involves analyzing geographic or location-
based data to understand patterns, relationships, and trends associated with
specific locations.
Examples: Geographic Information System (GIS) analysis,
location-based recommendation systems, and mapping.
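
A basic building block of location-based analysis is the distance between two points on the Earth. The sketch below uses the haversine formula; the function name `haversine_km`, the mean Earth radius of 6371 km, and the approximate city coordinates are assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (approximation)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate coordinates, used only as sample input.
delhi = (28.6139, 77.2090)
mumbai = (19.0760, 72.8777)

distance = haversine_km(*delhi, *mumbai)
print(f"Delhi to Mumbai: about {distance:.0f} km")
```

A location-based recommendation system could use such a function to rank places by how near they are to the user.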


Streaming Analytics

Purpose: Streaming analytics involves analyzing real-time data
streams to gain insights and make decisions as events unfold.

Examples: Real-time monitoring, fraud detection, and IoT
data analysis.
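
The idea of deciding "as events unfold" can be sketched with a sliding window over a stream. The generator `spikes`, the window size, the threshold factor, and the sample `transactions` are all hypothetical; a production system would typically run this kind of logic inside a streaming engine.

```python
from collections import deque


def spikes(stream, size: int = 3, factor: float = 2.0):
    """Yield (index, value) for values far above the recent moving average.

    Each value is judged as it arrives, using only the previous `size` values.
    """
    window = deque(maxlen=size)
    for index, value in enumerate(stream):
        if len(window) == size and value > factor * (sum(window) / size):
            yield index, value
        window.append(value)


# Made-up stream of transaction amounts with one unusual value.
transactions = [20, 22, 19, 21, 95, 20, 23]
alerts = list(spikes(transactions))
print(alerts)  # [(4, 95)]
```

Because only a small window is kept in memory, the same approach works on a stream that never ends.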

Preservation Analytics
Purpose: Preservation analytics focuses on maintaining and
ensuring the quality, reliability, and integrity of data over time.

Examples: Data quality monitoring, data governance, and
data lifecycle management.

Social Media Analytics

Purpose: Social media analytics involves analyzing data from
social media platforms to understand user behavior, sentiments,
and trends.
Examples: Social network analysis, social media
monitoring, and trend analysis on social platforms.

Video Analytics

Purpose: Video analytics involves analyzing video data to extract
insights, patterns, and information from visual content.
Examples: Video surveillance analysis, facial
recognition, and object detection.


How does big data analytics work?

Big data analytics works by:
• collecting,
• processing,
• cleaning,
• and analyzing large datasets

to help organizations operationalize their big data.

Tools Used in Big Data Analytics
Hadoop
A framework that stores and processes big data in a distributed,
parallel fashion
MongoDB
Deals with large amounts of unstructured data
Talend
Open-source software and services for data integration, data
management, and data storage
Cassandra
Manages large amounts of data with real-time processing
Spark
Used for data processing
Storm
Real-time data processing
Kafka
Distributed event streaming

Big Data technology

Intelligent Data Analysis (IDA)

Intelligent Data Analysis (IDA) refers to the process of
extracting useful information and knowledge from large and
complex datasets using advanced techniques and technologies.

It involves the application of artificial intelligence (AI),
machine learning (ML), and other computational methods
to analyze data and make informed decisions.


Key Components of Intelligent Data Analysis (IDA)

• Data Collection and Preprocessing
• Feature Extraction
• Modelling and Analysis
• Pattern Recognition
• Decision Making
• Visualization


Data Collection and Preprocessing

Gathering relevant data from various sources and preparing it for
analysis.
This step involves cleaning, transforming, and organizing the
data to ensure its quality and suitability for analysis.

Feature Extraction

Identifying and selecting relevant features (variables) from the
dataset that are essential for the analysis.
This step helps reduce dimensionality and focuses on the most
important aspects of the data.

Modeling and Analysis

Employing advanced algorithms and statistical techniques to
build models that can uncover patterns, relationships, and trends
within the data.
This often involves the use of machine learning techniques such as
clustering, classification, regression, and others.

Pattern Recognition
Identifying meaningful patterns and structures in the data that
can provide useful information.
This can include the detection of anomalies, trends, correlations,
and other important relationships.

Decision Making

Using the generated insights to make informed decisions and
predictions.
Intelligent Data Analysis can be applied in various domains, such
as business, healthcare, finance, and more, to support decision-
making processes.

Visualization
Presenting the results in a visual and interpretable format,
making it easier for stakeholders to understand and act upon the
findings.
Visualization tools help in conveying complex information in a
more accessible manner.

Nature of Data and Nature of Big data

The nature of data refers to the inherent characteristics and
properties of information that is collected, processed, and
analyzed within various contexts.
Understanding the nature of data is essential for effectively
managing, interpreting, and utilizing information.

The nature of big data, by contrast, refers to the defining
characteristics and properties that distinguish big data from
traditional data. Big data is characterized by the five Vs: Volume,
Value, Variety, Velocity, and Veracity.

Big Data Privacy
Protecting and maintaining the privacy of individuals' data during:
• Data Collection
• Data Storage
• Data Sharing

Big Data Ethics
Big data ethics means ensuring that data is used responsibly in the
context of big data analytics.
• Informed Consent
• Transparency
• Fairness and Bias
• Accountability
• Legal Compliance
• Continuous Monitoring and Auditing
1. Privacy Concerns:
Data Collection:
Big data often involves the collection of extensive and diverse
datasets. Privacy concerns arise when personally identifiable
information (PII) is included without adequate consent.

Data Storage
Safeguarding data during storage is crucial to prevent
unauthorized access and data breaches. Encryption and access
controls are common measures.

Data Sharing
Sharing data among organizations or third parties for collaborative
projects can pose privacy risks. Organizations must ensure proper
agreements and safeguards are in place.

2. Ethical Considerations:
Informed Consent
Ethical data practices involve obtaining informed consent from individuals
before collecting and using their data. Users should be aware of how
their data will be utilized.

Transparency
Organizations should be transparent about their data practices, providing
clear information on data collection, storage, and usage policies.

Fairness and Bias


Addressing biases in data and algorithms is crucial to ensure fair and
unbiased outcomes. Biases in data can lead to unfair treatment of certain
groups.

Accountability
Organizations should be accountable for the consequences of their data
practices. This includes taking responsibility for any negative impacts on
individuals or groups.

3. Legal Compliance:

Data Protection Regulations

Adhering to data protection laws and regulations, such as
the General Data Protection Regulation (GDPR), is
essential. These regulations outline rules for the collection,
processing, and storage of personal data.

Cross-Border Data Flows

Big data initiatives often involve international data transfers.
Organizations need to comply with regulations governing
cross-border data flows and ensure data sovereignty.


4. Anonymization and De-identification:

Anonymization

Removing or encrypting personally identifiable information to
protect individuals' identities. However, achieving true
anonymity can be challenging.

De-identification

Transforming data to make it less identifying while
maintaining its utility. This involves techniques like
pseudonymization.
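
Pseudonymization can be sketched by replacing an identifier with a keyed hash, using Python's standard `hmac` and `hashlib` modules. The key `SECRET_KEY`, the 16-character truncation, and the helpers `pseudonymize` and `deidentify` are assumptions made for illustration; the field names follow the student table used earlier in this unit.

```python
import hashlib
import hmac

# Hypothetical key for the example; a real key must be stored securely
# and never placed in source code.
SECRET_KEY = b"replace-with-a-secret-key"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a stable keyed hash (HMAC-SHA256)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability in this example


def deidentify(record: dict) -> dict:
    """Drop direct identifiers and keep only what the analysis needs."""
    return {
        "student": pseudonymize(record["S_Email"]),
        "city": record["S_Address"],
    }


record = {"S_ID": 1001, "S_Name": "A", "S_Address": "Delhi", "S_Email": "A@gmail.com"}
print(deidentify(record))
```

The same email always maps to the same pseudonym, so records can still be linked for analysis, but the mapping cannot be recomputed without the key. As noted above, this is de-identification rather than true anonymity: remaining fields such as the city may still help re-identify a person.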


5. Data Governance and Security:

Data Governance

Implementing strong data governance practices to ensure
responsible and ethical data management throughout its
lifecycle.

Security Measures

Employing robust security measures to protect data from
unauthorized access, breaches, and cyber threats.


6. Continuous Monitoring and Auditing

Monitoring

Regularly monitoring data practices and security measures
to identify and address potential risks.

Auditing

Conducting audits to assess compliance with privacy
policies, ethical standards, and legal requirements.


The big challenges of Big Data


Big data brings big benefits, but it also brings big challenges such
as new privacy and security concerns, accessibility for business
users, and choosing the right solutions for your business needs.

Making big data accessible.

Collecting and processing data becomes more difficult as the
amount of data grows.

Organizations must make data easy and convenient for data
owners of all skill levels to use.


Maintaining quality data.

With so much data to maintain, organizations are spending more
time than ever before scrubbing for duplicates, errors, absences,
conflicts, and inconsistencies.

Keeping data secure.

As the amount of data grows, so do privacy and security concerns.

Organizations will need to strive for compliance and put tight
data processes in place before they take advantage of big data.


Finding the right tools and platforms.

New technologies for processing and analyzing big data are
developed all the time.

Organizations must find the right technology to work within their
established ecosystems and address their particular needs.

Often, the right solution is also a flexible solution that can
accommodate future infrastructure changes.

Analysis vs Report

Analysis: Examination of large and complex datasets to extract
meaningful patterns, trends, and insights.
Report: Summarizing and presenting the results of big data
analysis in a coherent and accessible manner.

Analysis: Involves tasks such as data preprocessing, exploratory data
analysis, and employing distributed computing frameworks.
Report: Involves organizing the analytical results and creating
visualizations.

Analysis: Conducted by data scientists, analysts, and professionals.
Report: Aimed at non-technical stakeholders, executives, and
decision-makers.

Analysis: Uses big data technologies and frameworks, such as
Apache Hadoop and Spark.
Report: Utilizes visualization tools (e.g., Tableau, Power BI),
presentation software, and reporting tools.


Test -1
