
DC Lab Assignment 8

Preetika Sastry
PB 14
Batch: B1

01. Introduction
02. What is Big Data
03. What is Cloud Computing
04. Why apply Cloud Computing to BDA?
05. Advantages of Combining Cloud Computing to BDA
06. Potential Challenges
Introduction to Big Data and Cloud
Big data and cloud computing are each valuable technologies in their own right. Many
businesses aim to combine the two to reap more business benefits.
Both technologies aim to increase a company's revenue while reducing
investment cost. While Cloud manages the local software, Big data helps in business
What is Big Data?
Big Data is high-volume, high-velocity and/or high-variety
information assets that demand cost-effective, innovative forms
of information processing that enable enhanced insight,
decision making, and process automation. The concept of big
data and what it encompasses can be better understood with
four Vs:

● Volume: The amount of data accumulated by private companies, public agencies, and other organizations on a
daily basis is extremely large. This makes volume the defining characteristic for big data.
● Velocity: It’s a given that data can and will pile up really fast. But what matters is the speed with which you can
process and examine this data so that it becomes useful information.
● Variety: The types of data that get collected can be very diverse. Structured data contained in databases, and
unstructured data such as tweets, emails, images, videos, and more, need to be consumed and processed all the
● Veracity: Because of its scale and diversity, big data can contain a lot of noise. Veracity thus refers to the
certainty of the data and how well your big data tools and analysis strategies can separate the poor-quality data from
the data that really matters to your business.
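
As a small illustration of the veracity point, the sketch below separates poor-quality records from usable ones. The record fields (`user_id`, `amount`) and the validity rules are hypothetical, chosen only for this example; real data-quality checks depend on the dataset.

```python
def split_by_quality(records):
    """Split records into (clean, noisy) using simple, hypothetical rules."""
    clean, noisy = [], []
    for record in records:
        amount = record.get("amount")
        has_user = bool(record.get("user_id"))
        # Accept only non-negative numbers; bool is excluded because it
        # is a subclass of int in Python.
        valid_amount = (
            isinstance(amount, (int, float))
            and not isinstance(amount, bool)
            and amount >= 0
        )
        if has_user and valid_amount:
            clean.append(record)
        else:
            noisy.append(record)
    return clean, noisy


sample = [
    {"user_id": "u1", "amount": 25.0},
    {"user_id": "", "amount": 10.0},    # missing user id
    {"user_id": "u3", "amount": -4.0},  # negative amount
    {"user_id": "u4", "amount": None},  # missing amount
]
clean, noisy = split_by_quality(sample)
print(len(clean), len(noisy))
```

Only the first sample record passes both checks; the other three are set aside as noise rather than silently dropped, so they can still be inspected later.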
What is Cloud Computing?
Cloud computing is the delivery of on-demand computing resources, including servers, software, storage,
databases, networks, analysis, and intelligence, over the Internet on a pay-for-use basis to offer faster
innovation, flexible resources, and economies of scale. You only pay for the services you use, which helps you
reduce your operating costs, run your infrastructure more efficiently, and scale as your business needs change.

Cloud computing offers services to users in a pay-per-use model. Three main services offered by Cloud
providers are detailed below:

● Infrastructure as a Service (IaaS): It provides companies with an instant
computing infrastructure, provisioned and managed over the internet.
● Platform as a Service (PaaS): It is a cloud-based environment with the resources that allow you to deliver
everything from simple cloud-based applications to sophisticated and cloud-enabled business
● Software as a Service (SaaS): This service delivers complete software applications over the internet, with the
provider managing the underlying platform and infrastructure.
Why should we apply Cloud Computing to Big Data Analytics?
The cloud can help you process and analyze your big data faster, leading to insights that can improve your
products and business. Merging big data with cloud computing is a powerful combination that can transform
your organization. When big data computing takes place in the cloud, it is known as “Big Data Clouds”. Their
purpose is to build an integrated infrastructure that is suitable for quick analytics and for deploying an
elastically scalable infrastructure. Cloud technology is used to derive the quantum-leap advantages inherent in big
data [10]. From the above description, we can see that Cloud enables the “As-a-Service” pattern by
abstracting the challenges and complexity through a scalable and elastic self-service application.
Big Data and the cloud computing relationship can be classified according to the types of service:

● IaaS in Public Cloud: By using IaaS in the cloud, Big Data services give you access to virtually unlimited
storage and computing power. It’s a very cost-effective solution for companies, since the cloud provider
assumes all the underlying hardware management costs.
● PaaS in Private Cloud: By integrating Big Data technologies with the services offered by cloud
infrastructure, PaaS reduces the complexity of managing software and hardware elements, which is a real
concern when dealing with big data.
● SaaS in Hybrid Cloud: Analyzing social network data is now an essential input to companies’
business analysis. SaaS providers provide a platform to perform the
A Forrester Research survey in 2017 revealed that big data solutions
via cloud subscriptions would grow about 7.5 times faster than
on-premises options.
Advantages of Applying Cloud Computing to Big Data Analytics
Improved analysis

With the growth of Cloud technology, big data analysis has improved and yields better results.
Therefore, companies prefer to run big data analysis in the cloud. In addition, Cloud helps integrate data from
various sources.

Simplified Infrastructure

Big Data analysis places tremendous demands on infrastructure, since the data arrives in huge volumes, at
variable speeds, and in varied types that traditional infrastructures generally cannot keep up with. Cloud computing
provides a flexible infrastructure that can be scaled according to the needs at the time, so it becomes easy to manage

Lowers the cost

Big data technology and Cloud technology both provide value to organizations by lowering the cost of ownership. The
Cloud pay-per-use model converts CAPEX to OPEX. On the other hand, open-source Apache software reduces the cost of
Big Data licensing, which could otherwise cost millions to build or buy. The cloud allows customers to process big data
without owning large-scale big data resources. Therefore, both Big Data and cloud technology reduce costs for
businesses and add value to the company.
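
The CAPEX-to-OPEX point can be made concrete with a little arithmetic. The sketch below compares an upfront purchase with a pay-per-use bill over one year. Every figure in it (purchase price, upkeep, hourly rate, usage hours) is a made-up assumption for illustration, not real provider pricing.

```python
def on_premises_cost(upfront_capex, monthly_upkeep, months):
    """Total cost of buying hardware up front and maintaining it."""
    return upfront_capex + monthly_upkeep * months


def pay_per_use_cost(hourly_rate, hours_per_month, months):
    """Total cost when you pay only for the hours actually used."""
    return hourly_rate * hours_per_month * months


# Hypothetical figures, chosen only to show the calculation.
capex_total = on_premises_cost(upfront_capex=100_000, monthly_upkeep=1_000, months=12)
opex_total = pay_per_use_cost(hourly_rate=2.5, hours_per_month=200, months=12)

print(f"On-premises, 12 months: {capex_total}")
print(f"Pay-per-use, 12 months: {opex_total}")
```

With these assumed numbers the pay-per-use total is far lower, but the outcome depends entirely on usage: a cluster that runs around the clock for years can cost more in the cloud than the equivalent hardware.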
Security and Privacy

Privacy and data security are the two main concerns when it comes to business data. When a cloud
platform is used to host an application, security becomes a primary concern because of the platform’s open
environment and limited user control. Also, being open-source, Big Data solutions such as Hadoop use many
third-party services and infrastructure. Therefore, system integrators today offer private cloud solutions that
are elastic and scalable and that also take advantage of scalable distributed processing. In many
organizations, big data analysis is also used to detect and prevent threats from malicious hackers.


Infrastructure plays an essential role in supporting any application, and virtualization technology is the ideal
platform for big data. Virtualized big data applications such as Hadoop provide multiple benefits that are not
available on physical infrastructure, and they simplify big data management. Therefore, Big Data and Cloud
Computing projects rely heavily on virtualization.
Potential Challenges
Less control over security

These large datasets often contain sensitive information such as individuals’ addresses, credit card details, social
security numbers, and other personal information. Ensuring that this data is kept protected is of paramount importance.
Data breaches could mean serious penalties under various regulations and a tarnished company brand, which can lead
to a loss of customers and revenue. While security should not be a hindrance to migrating to the cloud, you will have less
direct control over your data, which can be a big organizational change and may cause some discomfort.

Less control over compliance

Compliance is another concern that you’ll have to think about when moving data to the cloud. Cloud service providers
maintain a certain level of compliance with various regulations such as HIPAA, PCI, and many more. But similar to
security, you no longer have full control over your data’s compliance requirements.

Network dependency and latency issues

The flip side of having easy connectivity to data in the cloud is that availability of the data is highly reliant on the
network connection. This dependence on the internet means that the system could be prone to service interruptions.
In addition, latency in the cloud environment could well come into play given the volume of data that’s being
transferred, analyzed, and processed at any given time.
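
To give a rough sense of why data volume matters here, the sketch below estimates how long a transfer takes from its size, the link bandwidth, and the per-request round-trip latency. The function name, the simple additive model, and all the numbers are assumptions for illustration; real transfers also depend on protocol overhead, congestion, and parallelism.

```python
def transfer_seconds(size_gb, bandwidth_mbps, round_trip_ms=0.0, requests=1):
    """Estimate transfer time: payload time plus latency paid once per request."""
    size_megabits = size_gb * 8 * 1000  # decimal units: 1 GB = 8000 megabits
    payload_time = size_megabits / bandwidth_mbps
    latency_time = requests * round_trip_ms / 1000
    return payload_time + latency_time


# Hypothetical case: 500 GB over a 1000 Mbps link, fetched in 100 requests
# with a 50 ms round trip each.
estimate = transfer_seconds(size_gb=500, bandwidth_mbps=1000,
                            round_trip_ms=50, requests=100)
print(f"Estimated transfer time: {estimate / 60:.1f} minutes")
```

Under these assumptions the payload alone takes over an hour, which is why large datasets are often processed where they are stored rather than moved across the network.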
Thank You
