DBIS Lecture 4 - Slides (AI and Big Data)
UMMDF7-15-M
Presentation by
• Big data refers to data that is too big to fit on a single server,
too unstructured to fit into a row-and-column database, or
too continuously flowing to fit into a static data warehouse.
Big Data is the
Volume refers to the vast amount of data generated and collected by organizations from various sources.
Velocity refers to the speed at which data is generated, collected, processed, and analyzed.
Variety refers to the diverse types and formats of data that are generated and collected by organizations.
Unlike traditional structured data found in relational databases, big data encompasses a wide range of data
types, including structured, semi-structured, and unstructured data.
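As a rough illustration of the three kinds of data named above, the sketch below reads one tiny sample of each using only the Python standard library. The sample values (customer 42, the review text) are invented for illustration and are not from the lecture.

```python
import csv
import io
import json
import re

# Structured: fixed columns known in advance (hypothetical sample row).
structured_src = "customer_id,amount\n42,19.99\n"
rows = list(csv.DictReader(io.StringIO(structured_src)))

# Semi-structured: self-describing; fields and nesting may vary per record.
semi_src = '{"customer_id": 42, "tags": ["vip"], "address": {"city": "Bristol"}}'
record = json.loads(semi_src)

# Unstructured: free text; any structure has to be extracted afterwards.
text = "Loved the service, but delivery to Bristol took 9 days!"
numbers = re.findall(r"\d+", text)

print(rows[0]["amount"], record["address"]["city"], numbers)
```

The structured row can be queried by column name straight away, the JSON record needs navigation through its nesting, and the free text yields values only after a pattern is applied to it.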
Data Volume
Exponential increase in
collected/generated data
Reference: Ruoming Jin http://www.cs.kent.edu/~jin/BigData/index.html
Key technologies and approaches used to address
volume in big data include:
1. Distributed file systems: These systems, such as Hadoop Distributed File System (HDFS), allow data to be
distributed across multiple nodes in a cluster, enabling scalable storage of massive datasets.
2. NoSQL databases: Unlike traditional relational databases, NoSQL databases are designed to handle large volumes
of unstructured or semi-structured data efficiently. Examples include MongoDB, Cassandra, and Apache CouchDB.
3. Data compression and deduplication: Techniques like data compression and deduplication help reduce the storage
footprint of large datasets without sacrificing data integrity or accessibility.
4. Data warehousing: Data warehousing solutions provide centralized repositories for storing and managing large
volumes of structured and semi-structured data, enabling organizations to perform complex analytics and
generate insights.
5. Cloud storage and computing: Cloud platforms offer scalable storage and computing resources on-demand,
allowing organizations to efficiently manage and analyze large volumes of data without investing in costly
infrastructure.
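Item 3 in the list above (compression and deduplication) can be sketched in a few lines. This is a minimal, single-machine illustration that assumes fixed data blocks held in memory; the function names `store_blocks` and `restore_blocks` are hypothetical, and real storage systems work at much larger scale with more elaborate chunking.

```python
import hashlib
import zlib

def store_blocks(blocks):
    """Deduplicate blocks by content hash, then compress each unique block."""
    store = {}    # digest -> compressed bytes (one copy per distinct block)
    layout = []   # ordered digests, enough to rebuild the original sequence
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            store[digest] = zlib.compress(block)
        layout.append(digest)
    return store, layout

def restore_blocks(store, layout):
    """Rebuild the original block sequence without any loss of data."""
    return [zlib.decompress(store[d]) for d in layout]

# Hypothetical input: three blocks, two of them identical.
blocks = [b"A" * 1000, b"B" * 1000, b"A" * 1000]
store, layout = store_blocks(blocks)
stored_bytes = sum(len(v) for v in store.values())
print(len(store), "unique blocks,", stored_bytes, "bytes stored")
```

The duplicate block is stored once, and the restored sequence is byte-for-byte identical to the input, which is the "without sacrificing data integrity" part of the claim.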
Exponential growth of data in today's digital age
• Billions of RFID tags (1.3B in 2005)
• 4.6 billion camera phones worldwide
• 12+ TBs of tweet data every day
• 25+ TBs of log data every day
• ? TBs of data every day
• 100s of millions of GPS-enabled devices sold annually
• 2+ billion people on the Web by end 2011
• 76 million smart meters in 2009… How many in 2020?
Reference: Ruoming Jin http://www.cs.kent.edu/~jin/BigData/index.html
Speed (Velocity)
• Examples
• E-Promotions: Based on your current location, your purchase
history, and what you like ➔ send promotions right now for the store
next to you
• Structured data: This type of data is organized in a tabular format with a predefined schema,
where each data element is assigned to a specific field or column. Examples of structured
data include transaction records, customer information, and financial data.
• Unstructured data: Unstructured data does not have a predefined schema or format and can
exist in various forms, such as text documents, emails, social media posts, videos, images, and
audio recordings. Analyzing unstructured data requires advanced natural language processing
(NLP), text mining, and image recognition techniques.
• Streaming Data
• You can only scan the data once
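The single-scan constraint on streaming data means statistics have to be updated incrementally rather than computed over a stored dataset. A minimal sketch, assuming a numeric stream and using Welford's online algorithm (a standard technique, not one named in the slides); `sensor_readings` is a hypothetical stand-in for a live feed.

```python
def running_stats(stream):
    """Return (count, mean, population variance) in a single pass."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in stream:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    variance = m2 / count if count else 0.0
    return count, mean, variance

def sensor_readings():
    """Hypothetical stream: each value is produced once and then gone."""
    for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
        yield value

count, mean, variance = running_stats(sensor_readings())
print(count, mean, variance)
```

Only three numbers are kept in memory regardless of how long the stream runs, which is what makes the approach usable when the data cannot be stored and rescanned.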
Big Data: 3V’s (Volume, Velocity, Variety)
• Natural language processing (NLP): Machine learning algorithms are used to analyze and
understand human language, enabling applications such as sentiment analysis, language
translation, and chatbots.
• Computer vision: Machine learning models can analyze and interpret visual data, allowing
computers to recognize objects, detect anomalies, and understand scenes in images or
videos.
• Healthcare: Machine learning is used in medical imaging for tasks such as diagnosing
diseases from medical images (e.g., X-rays, MRIs) and predicting patient outcomes based
on electronic health records.
• Finance: Machine learning algorithms are applied in financial services for tasks such as
fraud detection, credit scoring, and algorithmic trading.
AI and Machine Learning
Big Data
Social Media and Internet of Things
Data growth is exponential
Main contributors to data growth
Increase in volume, velocity, and variety
Data Warehouses and the Cloud
In computing, a data warehouse is a central
location and permanent storage area for data
and information held around your business in
different systems.
• A data warehouse gives you control of your data and your business
processes in one place.
• You can report across many data sets and join this information together.
• For example, you can merge a CRM customer record with its corresponding
finance record and report across all of your customers using both these data sets.
• Other data sources can be uploaded into a data warehouse, including ERP,
e-Commerce shops, public portals, stock management, amongst others. This means
you can automate business processes using data held across all your systems.
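The CRM-plus-finance example above amounts to a join on a shared customer key. The sketch below uses an in-memory SQLite database as a stand-in for a warehouse; the table names, column names and rows are all hypothetical.

```python
import sqlite3

# In-memory database standing in for the warehouse (illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE crm_customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE finance_invoices (customer_id INTEGER, amount REAL);
    INSERT INTO crm_customers VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO finance_invoices VALUES (1, 120.0), (1, 80.0), (2, 50.0);
""")

# Merge each CRM record with its finance records and report per customer.
report = conn.execute("""
    SELECT c.name, SUM(f.amount) AS total_invoiced
    FROM crm_customers AS c
    JOIN finance_invoices AS f ON f.customer_id = c.customer_id
    GROUP BY c.customer_id, c.name
    ORDER BY c.name
""").fetchall()

print(report)
```

Once both sources sit in one place, a single query answers a question that neither the CRM nor the finance system could answer alone.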
What are the advantages of a Data Warehouse?
• E.g. Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP),…
Key components of cloud computing
1. Infrastructure as a Service (IaaS): IaaS provides virtualized computing resources over the
internet, including virtual machines, storage, and networking infrastructure. Businesses
can rent these resources on-demand, allowing them to scale their infrastructure up or
down as needed without the need for physical hardware.
2. Platform as a Service (PaaS): PaaS provides a platform for developing, testing, and
deploying applications over the internet, without the need to manage the underlying
infrastructure. PaaS offerings typically include development tools, databases, and
middleware, allowing developers to focus on building and deploying applications rather
than managing infrastructure.
3. Software as a Service (SaaS): SaaS delivers software applications over the internet on a
subscription basis, allowing users to access and use the software from any device with an
internet connection. Examples of SaaS applications include email, customer relationship
management (CRM), and productivity tools like Microsoft Office 365 and Google
Workspace.
Big Data for business strategy
Big Data for customer interactions
Ten Practical Big Data
Benefits
Reference: http://datascienceseries.com/stories/ten-practical-big-data-benefits
Dialogue with Consumers
• Big Data allows you to profile today's increasingly vocal and demanding
consumers in depth so that you can engage in an almost one-on-one,
real-time conversation with them.
• This is not a luxury. If you don't treat them the way they want to be treated,
they will leave you in the blink of an eye.
Dialogue with Consumers
Example
• When any customer enters a bank, Big Data tools allow the clerk to check
his/her profile in real-time and learn which relevant products or services (s)he
might advise.
• Big Data will also have a key role to play in uniting the digital and physical
shopping spheres: a retailer could suggest an offer on a mobile carrier, on the
basis of a consumer indicating a certain need on social media.
Re-develop your products
• Big Data can also help you understand how others perceive your products so
that you can adapt them, or your marketing, if need be.
• On top of that, Big Data lets you test thousands of different variations of
computer-aided designs in the blink of an eye so that you can check how
minor changes in, for instance, material affect costs, lead times and
performance.
• You can then raise the efficiency of the production process accordingly.
Perform risk analysis
• Success not only depends on how you run your company. Social and
economic factors are crucial for your accomplishments as well.
• Predictive analytics, fueled by Big Data, allows you to scan and analyze
newspaper reports or social media feeds so that you permanently keep up to
speed on the latest developments in your industry and its environment.
Keeping your data safe
• You can map the entire data landscape across your company with Big Data
tools, thus allowing you to analyze the threats that you face internally.
• With real-time Big Data analytics you can, for example, flag up any situation
where 16-digit numbers (potentially credit card data) are stored or emailed
out and investigate accordingly.
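The 16-digit check mentioned above can be sketched with a regular expression plus the Luhn checksum that payment card numbers satisfy. The Luhn filter is an added assumption to cut false positives; the slides mention only the 16-digit pattern. The function names and sample text are hypothetical, and 4111 1111 1111 1111 is a widely used test number, not a real card.

```python
import re

# Sixteen digits, optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b(?:\d[ -]?){15}\d\b")

def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_like_numbers(text):
    """Return 16-digit sequences in text that also pass the Luhn check."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits

sample = "Please charge 4111 1111 1111 1111; order ref 1234567812345678."
hits = find_card_like_numbers(sample)
print(hits)
```

A flagged hit is only a prompt for investigation: the check says the number looks like card data, not that it is.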
Create new revenue streams
• The insights that you gain from analyzing your market and its consumers with
Big Data are not just valuable to you. You can sell them.
• One of the more impressive examples comes from Shazam, the song
identification application.
• It helps record labels find out where music sub-cultures are arising by monitoring the
use of its service, including the location data that mobile devices so conveniently
provide.
• The record labels can then find and sign up promising new artists or remarket their
existing ones accordingly.
Customize your website in real time
• Big Data analytics allows you to personalize the content or look and feel of
your website in real time to suit each consumer entering your website,
depending on, for instance, their sex, nationality or how they arrived at
your site.
• The best-known example is probably offering tailored recommendations:
• Amazon’s use of real-time, item-based, collaborative filtering (IBCF) to fuel its
‘Frequently bought together’ and ‘Customers who bought this item also bought’
features
• LinkedIn suggesting ‘People you may know’ or ‘Companies you may want to follow’.
And the approach works: Amazon generates about 20% more revenue via this method.
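To make item-based collaborative filtering concrete, here is a toy sketch that scores item pairs by the cosine similarity of their buyer sets. It illustrates the general technique only and makes no claim about how Amazon's production system works; the baskets, item names and function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical purchase history: user -> set of items bought.
baskets = {
    "u1": {"book", "lamp"},
    "u2": {"book", "lamp", "pen"},
    "u3": {"book", "pen"},
    "u4": {"lamp", "desk"},
}

def item_similarities(baskets):
    """Cosine similarity between items, based on who bought each one."""
    buyers = defaultdict(set)
    for user, items in baskets.items():
        for item in items:
            buyers[item].add(user)
    sims = defaultdict(dict)
    for a in buyers:
        for b in buyers:
            if a != b:
                overlap = len(buyers[a] & buyers[b])
                if overlap:
                    sims[a][b] = overlap / sqrt(len(buyers[a]) * len(buyers[b]))
    return sims

def also_bought(item, sims, k=2):
    """Top-k items most similar to `item` (ties broken alphabetically)."""
    ranked = sorted(sims[item].items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]

sims = item_similarities(baskets)
print(also_bought("book", sims))
```

Because similarities are computed between items rather than between users, they can be precomputed and looked up quickly when a product page is served, which is one reason the item-based variant suits real-time recommendations.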
Reducing maintenance costs
Offering tailored healthcare
Offering enterprise-wide insights
• With Big Data tools, the technical teams can do the groundwork and then build
repeatability into algorithms for faster searches. In other words, they can
develop systems and install interactive and dynamic visualization tools that
allow business users to analyze, view and benefit from the data.
Making our cities smarter
• The city of Oslo in Norway, for instance, reduced street lighting energy
consumption by 62% with a smart solution. Since the Memphis Police
Department started using predictive software in 2006, it has been able to
reduce serious crime by 30%.
Making our cities smarter
• The city of Portland, Oregon, used technology to optimize the timing of its
traffic signals and was able to eliminate more than 157,000 metric tonnes of
CO2 emissions in just six years – the equivalent of taking 30,000 passenger
vehicles off the roads for an entire year.
• The smart city project of Rivas Vaciamadrid in Spain – Ecopolis – has realized
energy savings of 35% and a 50% reduction in ICT spending through a winning
combination of smart grid and energy management, access control, air quality
monitoring, traffic management, IPTV, etc.
Elon Musk’s chilling warning over ‘unfriendly’ AI robots
VIDEO