Fundamentals of Big Data and Business Analytics


1. General Electric – State and explain how the generated data is leveraged to enable
growth in the manufacturing sector for GE. Provide examples of two business questions using
the data and a potential analytics approach.

Answer:

Big data helps manufacturing companies stay competitive in a crowded manufacturing
market. It is used to achieve productivity and efficiency gains and to uncover new
insights that drive innovation. Using data analytics, manufacturers can discover new
information and identify patterns that enable them to improve processes, increase supply
chain efficiency and identify variables that affect production quality. Data analytics also
helps manufacturers uncover defects in processes, speed up product development and improve
product quality. GE has leveraged the data it generates to grow in the manufacturing
sector in the following ways:

 Cost reduction
 GE has leveraged Hadoop, a big data technology, to reduce the cost of storing
large volumes of data and to run the business more efficiently.
 Optimize production processes to improve output and efficiency
 Streamline supply chains by discovering inefficiencies
 Reduce warranty and service costs through proactive problem solving

 Informed and quick decision making
 GE has used in-memory analytics to analyse new data sources and make informed
decisions based on the resulting insights.

 Increase Revenue
 Better forecast demand to ensure the right product mix
 Use customer insights to improve design, operations and marketing
 Innovate with new, intelligent products & transportation services

 Avoid Risks
 Improve Quality Assurance and minimize liability exposure with early detection
 Reduce production risk by analysing safety, quality and security issues

 Reduced machine downtime
 Big data is used to perform predictive and preventive maintenance (see the sketch
after this list).
 It helps to recognize patterns that precede machine failure.

 Enterprise growth
 Compare performance differences, and the reasons for them, across different sites
 Ability to analyse production plans
 Ability to apply predictive models
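To make the predictive-maintenance point concrete, the sketch below fits a simple anomaly
detector to historical sensor readings and flags machines whose latest readings look
unusual. It is a minimal sketch only: the input file sensor_readings.csv and the columns
(timestamp, machine_id, temperature, vibration, pressure) are hypothetical placeholders,
and scikit-learn's IsolationForest is just one of many models that could be used.

    # Minimal predictive-maintenance sketch: flag anomalous sensor readings
    # that often precede machine failure. File and column names are
    # hypothetical placeholders, not GE's actual data.
    import pandas as pd
    from sklearn.ensemble import IsolationForest

    # Historical readings, one row per machine per hour
    # (columns: timestamp, machine_id, temperature, vibration, pressure).
    history = pd.read_csv("sensor_readings.csv")
    features = ["temperature", "vibration", "pressure"]

    # Fit an unsupervised anomaly detector on historical readings
    # (assumed to be mostly normal operation).
    model = IsolationForest(contamination=0.01, random_state=42)
    model.fit(history[features])

    # Score the most recent reading for each machine; -1 means "anomalous".
    latest = history.sort_values("timestamp").groupby("machine_id").tail(1).copy()
    latest["flag"] = model.predict(latest[features])
    at_risk = latest[latest["flag"] == -1]
    print("Machines to schedule for inspection:", at_risk["machine_id"].tolist())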

Some other ways GE uses big data and data analytics:

 Reduce investment in working capital
 Improve the accuracy of cash flow forecasting
 Embed a culture of data-driven decision making
 Predict customer product preferences
 Optimize logistics and distribution

a. What is the end-to-end big data architecture required in this context? Show it,
preferably using a diagram/flowchart. You can explain possible tools which can be
leveraged in the life cycle and the rationale for each tool.

Answer:
A big data architecture is designed to handle the analysis and processing of data that is
too large or complex for traditional database systems.

 Big data solutions typically involve one or more of the following workload types:
 Batch processing.
 Real-time processing.
 Interactive big data analysis.
 Predictive analytics and machine learning.

 Big data architecture components

Most big data architectures include the following components:

 Data sources. All big data solutions start with one or more data sources. Examples:
o Application data stores, such as relational databases.
o Static files produced by applications, such as web server log files.
o Real-time data sources, such as IoT devices.
 Data storage.
o Data is stored in a distributed file store. Example: Azure Data Lake Store or blob
containers in Azure Storage.
 Batch processing.
o Processing of large data files through long-running batch jobs (see the sketch after
this component list).
o Example: U-SQL jobs in Azure Data Lake Analytics.

 Real-time message ingestion.
o The architecture includes a way to capture and store real-time messages for stream
processing.
o Many solutions need a message ingestion store to act as a buffer for incoming
messages.
o Example: Azure Event Hubs, Azure IoT Hub, and Kafka.
 Stream processing.
o Processes the captured real-time data streams.
o Processed data is written to an output sink.
o Example: Storm and Spark Streaming in an HDInsight cluster.
 Analytical data store.
o Provides a managed service for large-scale, cloud-based data warehousing.
o Example: HBase and Spark SQL.
 Analysis and reporting.
o Provides insights into the data.
o To empower users to analyze the data, the architecture may include a data modelling
layer.
o Example: Tabular data model in Azure Analysis Services.
 Orchestration.
o Repeated data processing operations are encapsulated in workflows that transform
source data, move data between multiple sources and sinks, and load the processed
data into an analytical data store.
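To make the batch-processing component concrete, the sketch below shows a minimal PySpark
batch job that reads raw machine logs from a distributed file store, aggregates them, and
writes the result where the analytical data store can pick it up. The storage paths and
column names are hypothetical placeholders; the same pattern applies to U-SQL, Hive or any
other batch engine mentioned above.

    # Minimal PySpark batch job: read raw logs from distributed storage,
    # aggregate, and write the result for the analytical data store.
    # Paths and column names are hypothetical placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("daily-machine-log-batch").getOrCreate()

    # Raw log files landed by the ingestion layer (e.g. a data lake folder).
    logs = spark.read.json("s3a://example-datalake/raw/machine-logs/2024-01-01/")

    # Aggregate per machine: event counts and average sensor temperature.
    daily = (logs.groupBy("machine_id")
                 .agg(F.count("*").alias("events"),
                      F.avg("temperature").alias("avg_temperature")))

    # Write to a curated location that the analytical store / BI layer can query.
    daily.write.mode("overwrite").parquet("s3a://example-datalake/curated/machine-daily/")

    spark.stop()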

Some tools which can be leveraged in the life cycle are:

 Hadoop - helps in storing and analysing large volumes of data.
 Cost reduction: Hadoop reduces the cost of storing large volumes of data through
its efficient, distributed storage model.

 MongoDB - a document database used for datasets that change frequently
 Data integration: data management tools are also available as managed services,
such as Amazon Elastic MapReduce (EMR), which runs frameworks like Apache Hive,
Pig, Spark, MapReduce and Hadoop alongside NoSQL stores such as Couchbase and
MongoDB.
 Talend - used for data integration and management
 Cassandra - a distributed database used to handle large volumes of data
 Spark - used for real-time processing and analysing large amounts of data (see the
streaming sketch after this list)
 Storm - an open-source real-time computation system
 Kafka - a distributed streaming platform used for fault-tolerant storage of message
streams
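The streaming tools above (Kafka, Spark, Storm) are often combined. As a hedged
illustration, the sketch below uses Spark Structured Streaming to consume machine events
from a Kafka topic and maintain rolling per-machine counts. The broker address, topic name
and message schema are assumptions for illustration, and running it requires the Spark-Kafka
connector package on the classpath.

    # Minimal Spark Structured Streaming sketch: consume events from Kafka,
    # parse them, and maintain per-machine counts over 5-minute windows.
    # Broker, topic and schema are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

    spark = SparkSession.builder.appName("machine-event-stream").getOrCreate()

    schema = StructType([
        StructField("machine_id", StringType()),
        StructField("temperature", DoubleType()),
        StructField("event_time", TimestampType()),
    ])

    # Read the raw Kafka stream (message ingestion layer).
    raw = (spark.readStream.format("kafka")
           .option("kafka.bootstrap.servers", "broker:9092")
           .option("subscribe", "machine-events")
           .load())

    # Parse the JSON payload into typed columns.
    events = (raw.select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
                 .select("e.*"))

    # Windowed aggregation with a watermark for late data.
    counts = (events
              .withWatermark("event_time", "10 minutes")
              .groupBy(F.window("event_time", "5 minutes"), "machine_id")
              .count())

    # Write to the console here; in practice the sink would be the analytical data store.
    query = counts.writeStream.outputMode("update").format("console").start()
    query.awaitTermination()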

2. Explain how the cloud has increasingly transformed the adoption of big data in
companies. You must mention 3 examples of business cases which it could transform.
What choices does one need to make to improve cost optimization while using a
cloud-based platform?

Answer: The cloud has increasingly transformed the adoption of big data in companies
because:

 Business Continuity Planning:
o New business models can be implemented, managed and delivered on the cloud
seamlessly and cost-effectively.
o Seamless continuity of operations across geographies.
 Distributed workforce:
o Cloud has erased the geographical and technological barriers that formerly
hindered seamless collaboration and cross-border exchange.
 On-demand scalability:
o Helps businesses scale their infrastructure up or down based on demand.
o Businesses can expand smoothly without investing up front in new servers and
hardware.
 Improve customer experience:
o Helps enterprises speed up time to market and improve customer responsiveness.
o The cloud experience combines consumer data, digital experience, and
personalization to provide an efficient and modern way to communicate with
customers.

 Increase productivity:
o Highly flexible working with fewer distractions
o Better work-life balance for remote workers
o Improved job satisfaction scores
o Agile, scalable, and flexible infrastructure

 The cloud has transformed the adoption of big data in companies in the following
ways:

o Cloud computing focuses on scalable, elastic, on-demand and pay-per-use
self-service models.
o Cloud computing seamlessly provides elastic, on-demand, integrated compute
resources and the storage and computing capacity required to analyse big data.
o Distributed processing for scalability and expansion through virtual machines to
meet the requirements of exponential data growth.
o Expansion of analytical platforms to meet the demands of users, especially large
data-driven companies, by providing contextual, analysed data from all the stored
information.
o A cost-efficient way to capture data and add analytics that offer proactive and
contextual experiences.
o The cloud includes different kinds of software and hardware offered as pay-per-use
or subscription-based services, delivered over the internet and in real time. Data
is collected using big data tools and is later stored and processed in the cloud.
o Uninterrupted data management.
o The most common cloud service models for big data analytics are
 Software as a Service (SaaS),
 Platform as a Service (PaaS), and
 Infrastructure as a Service (IaaS).
o Cloud computing infrastructure can serve as an effective platform to address the data
storage required to perform big data analysis.

 SwiftKey
o A language technology company.
o Aids touchscreen typing by providing personalized predictions and corrections.
o Collects and analyses terabytes of data to create language models.
o The company needs a highly scalable, multi-layered model system that can keep pace
with steadily increasing demand and that has a powerful processing engine for the
artificial intelligence technology used in prediction generation.
o The company uses Apache Hadoop running on Amazon Simple Storage Service (S3) and
Amazon Elastic Compute Cloud (EC2) to manage the processing of multiple terabytes
of data.

 RedBus
o An online travel agency.
o Unifies multiple bus schedules into a single booking operation.
o The company analyses inventory and booking data across its network of hundreds of
bus operators serving more than 10,000 routes.
o Uses clusters of Hadoop servers to process the data.
o Implemented Google BigQuery to analyse large datasets using the Google data
processing infrastructure.
o BigQuery helps to improve customer service and reduce lost sales (a query sketch
follows below).
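As a hedged illustration of this kind of analysis, the sketch below runs an aggregate query
with the google-cloud-bigquery Python client. The project, dataset and table names
(example-project.travel.bookings) and the columns used are hypothetical placeholders, not
redBus's actual schema.

    # Hypothetical BigQuery query: lost-sale rate per route over the last 7 days.
    # Project, dataset, table and column names are placeholders.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses application-default credentials

    sql = """
        SELECT route_id,
               COUNTIF(status = 'FAILED') / COUNT(*) AS lost_sale_rate,
               COUNT(*) AS booking_attempts
        FROM `example-project.travel.bookings`
        WHERE booking_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
        GROUP BY route_id
        ORDER BY lost_sale_rate DESC
        LIMIT 20
    """

    # Print the routes losing the most sales, worst first.
    for row in client.query(sql).result():
        print(row.route_id, round(row.lost_sale_rate, 3), row.booking_attempts)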

 Nokia
o A mobile communications company.
o Nokia gathers and analyses large amounts of data from mobile phones.
o In order to support its extensive use of big data, Nokia relies on a technology
ecosystem that includes a Teradata Enterprise Data Warehouse, numerous Oracle
and MySQL data marts, visualization technologies, and Hadoop.
o Nokia has over 100 terabytes of structured data on Teradata and petabytes of
multi-structured data on the Hadoop Distributed File System (HDFS).
o The HDFS data warehouse allows the storage of all semi-structured and
multi-structured data and offers data processing at the petabyte scale.

 Cloud Cost Optimization Techniques
o Stop running instances that are not in use:
 Unnecessary instances should be shut down.
 Instances that are only needed during working hours should be stopped overnight
and at weekends (see the sketch after this list).
o Choose the right type of instance for your requirement:
 Avoid using the wrong instance type for the application.
 Compare the various instance types offered by the cloud service provider before
deploying services.
o Use discounts and free storage whenever possible:
 Reserved instances, spot instances or other discount programs can save cloud
users a lot of money.
 Buying in bulk generally secures a lower cost than buying in bits and pieces.
 Look for potential price reductions and discounts.
o Centralized storage management:
 Easy to manage and monitor cloud usage
 Policy enforcement reduces the risks.
o Use serverless services:
 The developer writes code for the application while the cloud service handles
all the details of infrastructure deployment.
 Saves a lot of time.
 Reduces operating expenses.
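As a sketch of the first technique (stopping instances that are not in use), the script
below uses the AWS SDK for Python (boto3) to stop any running EC2 instance tagged as
non-production whose CPU utilisation over the past day stayed below a threshold. The tag
name, threshold and region are assumptions for illustration; other cloud providers offer
equivalent APIs and schedulers.

    # Sketch: stop idle, non-production EC2 instances to cut cloud costs.
    # Tag name, CPU threshold and region are illustrative assumptions.
    from datetime import datetime, timedelta, timezone
    import boto3

    REGION = "us-east-1"
    ec2 = boto3.client("ec2", region_name=REGION)
    cloudwatch = boto3.client("cloudwatch", region_name=REGION)

    # Find running instances tagged as non-production.
    reservations = ec2.describe_instances(Filters=[
        {"Name": "instance-state-name", "Values": ["running"]},
        {"Name": "tag:environment", "Values": ["dev", "test"]},
    ])["Reservations"]

    idle = []
    end = datetime.now(timezone.utc)
    start = end - timedelta(days=1)
    for r in reservations:
        for inst in r["Instances"]:
            # Hourly average CPU over the last 24 hours from CloudWatch.
            stats = cloudwatch.get_metric_statistics(
                Namespace="AWS/EC2", MetricName="CPUUtilization",
                Dimensions=[{"Name": "InstanceId", "Value": inst["InstanceId"]}],
                StartTime=start, EndTime=end, Period=3600, Statistics=["Average"])
            points = [p["Average"] for p in stats["Datapoints"]]
            if points and max(points) < 5.0:  # effectively idle all day
                idle.append(inst["InstanceId"])

    if idle:
        ec2.stop_instances(InstanceIds=idle)
        print("Stopped idle instances:", idle)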
