Download as pdf or txt
Download as pdf or txt
You are on page 1of 19

Web and social media

analytics

A data and technology


perspective
Contacts

Chicago

Raj Parande
Principal
+1-312-578-4675
rajendra.parande@strategyand.pwc.com

Yuri Goryunov
Principal
+1-312-578-4791
yuri.goryunov@strategyand.pwc.com

Florham Park, NJ

Ramesh Nair
Partner
+1-973-410-7673
ramesh.nair@strategyand.pwc.com

Steffen Gnegel also contributed to this report.

This report was originally published by Booz & Company in 2011.

Strategy& 2
What is Web analytics, and why is it so essential but
challenging for today’s businesses?

What is web analytics?

Definition: Web analytics is the measurement, collection, analysis, and reporting of Internet data for the purposes of understanding and
optimizing Web usage. (Source: Web Analytics Association)

Why is it important?

• Today, Web interactions between commercial businesses and their customers take place via e-commerce stores, customer service
sites, interactive real-time chat, e-mail, and social media streams. These interactions are as important, if not more so, for a business’s
growth as customer touches through traditional voice and bricks-and-mortar channels
• Web data integrated with other channels provides a better picture of the customer–business relationship and helps in identifying
customer trends
• It is also useful in assessing the effectiveness of marketing campaigns and optimizing marketing spend
• And it improves the customer experience through faster service, thereby driving business growth and enhancing reputation

What are the challenges?

• Data volume growth is accelerating, making it cumbersome to capture and analyze Web data
• Unstructured social media data growth compounds the challenge, particularly as it must be integrated with enterprise structured data
• Multiple Web interaction platforms (PC, smartphone, tablet) further add to data capture and integration challenges
• Location and other smartphone sensor-based feeds also increase the complexity of continuous/real-time data capture
• There is no single tool available to capture and analyze all types of data

Strategy& 3
The number of Internet, social media, and mobile users
tripled over the past decade, reaching a third of the world’s
population
World Internet users, Dec. 1994–Mar. 2011
Observations
(in millions of people)
2,072 • In 2000, there were 390 million Internet users in the
1,802 world
1,319
1,018 • By March 2011, there were 2.1 billion Internet users
587
248 • The number includes 78% of North Americans and
1 billion people in Asia and the Middle East
combined
1994 1996 1998 2000 2002 2004 2006 2008 2010 2012
• More than 800 million people use Facebook, with
Mobile data consumed, June 2009–Mar. 2011 Americans spending over 53 billion minutes a month
(in millions of megabytes)
on the site

• About 350 million currently active Facebook users


954
access it from their mobile devices
700
535 • Twitter users are also generating more than 1 billion
400 440 tweets each week
320
230
170 • At 140 characters per message, Twitter users
alone generate nearly 500 gigabytes of information,
the equivalent of 500 Encyclopaedia Britannicas,
04/2009 07/2009 10/2009 01/2010 04/2010 07/2010 10/2010 01/2011 04/2011 every month

Source: Internet Usage Statistics (www.internetworldstats.com/stats.htm); Facebook Press Room (www.facebook.com/press/info.php?statistics); Twitter Blog
(blog.twitter.com/2011/03/numbers.html); Opera Software State of the Mobile Web (media.opera.com/media/smw/2011/pdf/smw042011.pdf); Strategy& analysis

Strategy& 4
In the consumer space alone, Internet-based social and
commerce markets represent a multibillion-dollar
opportunity
Social gaming Mobile commerce Social commerce
(in US$ billion) (in US$ billion) (in US$ billion)

6.6 7.7 12.0


CAGR CAGR
6.5 CAGR
+45% +39%
+161%
4.5 8.0
4.5
3.2
3.1 5.0
2.2
2.1 3.0
1.5
1.5
1.0
0.1

2011 2012 2013 2014 2015 2011 2012 2013 2014 2015 2016 2010 2011 2012 2013 2014 2015
Players Players Players
§ Playfish § Gilt Groupe § Groupon
§ Zynga § eBay § Yelp
§ Playdom § Amazon § Living Social

Source: Strategy& research and analysis

Strategy& 5
Customers have very high expectations for their end-to-end
online experience
Customer expectations while browsing websites

Get support
Land Learn Shop Buy Receive & use
& interact

• Guide me to the site • Make it easy to • Let me configure • Allow me to use my • Tell me when my • Make it easy for
find what I need my products preferred payment order will arrive me to find product
• Allow me to shop method support online
through multiple ‒ Customized search • Guide my • Allow me to modify
channels results based on shopping decision • Give me flexible or cancel my order ‒ Direct me to
previous purchasing with relevant shipping options appropriate articles
• Remember who I activities advice • Make sure my order or help desk agents
am between visits • Send me arrives on time based on the
• Give me relevant • Give me complete promotions and products I own
‒ Persistent cookies content, let me visibility into coupons • Let me download
even if look and compare availability, delivery related software or • Allow me to
unauthenticated time, and method ‒ Customized based apps from the site connect with other
• Share the “wisdom on my previous customers and
• Give me a of the crowd” with • Give me the right browsing and
personalized enthusiasts
me price purchasing behavior
landing page
‒ Display what people • Provide sales • Let me go straight
‒ Tailored banner ads, “like me” have support (e.g., to checkout
promotions, and/or looked at click-to-chat)
recommendations
• Recommend ‒ Respond to my
products I might browsing
be interested in behavior with
targeted
‒ Behavioral targeting assistance Web analytics enabled
based on site
activity

Technology foundation

Source: Strategy& analysis

Strategy& 6
Customer analytics example: Amazon recommends
targeted products based on crowd user behavior or specific
user profile data

1
1 Customer analytics

Through insights generated …the prospect is provided with


A prospect is shopping on from users’ purchasing product recommendations for
Amazon for a smartphone behaviors… similar phones

2
2

…the customer is provided with


A customer logs onto
recommendations for similar
his/her Amazon account Based on previous products or accessories
purchase history…

Source: Strategy& analysis

Strategy& 7
Web analytics example: A client was able to significantly
increase average order value by leveraging online data for
behavioral targeting

Observe & learn Recognize Target


Average revenue per order
(Before & during behavioral targeting pilot)

$123.54

+23%
$100.20

• Customer interactions on • Individual behavioral patterns • Product recommendations –


a webpage • Customer segment Increased average order
classification value (AOV) Focus
• Site navigation patterns
• Wisdom of the crowd of pilot
• Repeat visitor activity
– “Hot” products and content • Content personalization
• Search context – Increased conversion,
– Product affinities
(customers who viewed/ loyalty
• Purchase/conversion
history bought X also viewed/
bought Y) • Search personalization
• User profile data
– Popular search hits and – Increased conversion,
misses loyalty
Pre-pilot Pilot
Support tools
(performance reporting, multivariate testing, site optimization)

Source: Strategy& analysis

Strategy& 8
Companies have to migrate from a Web analysis tool
infrastructure to an integrated architecture to enable a
customized user experience
Web and social media analytics: architecture options

Real-time integration
Product recommendation tools • Extensive (data warehouse
Benefits

Technical
integrated with CRM
Web analysis tools • Limited (JavaScript tags capabilities
Technical systems, data visualization
on each webpage, needed
• Very limited (JavaScript capabilities software)
Technical application programming
tags on each webpage, needed
capabilities interfaces) • Dynamically customized
familiarity with vendor
needed websites, enabled by real-
software) • Recommendation
time data from multiple
engines can look at user-
• Enable analyses of how Pros sources
level behavior and
people use a website Pros • Very effective targeting due
suggest appropriate
(time on site, pages to integration with other
products or place
Pros visited, etc.) data (e.g., CRM)
targeted ads
• Great for marketing dept.
• Low implementation • High implementation efforts
• No dynamically
• Significant technical
efforts customized websites Cons
expertise needed (typically
Cons because the data used is
• Analysis is time- in-house)
consuming, data not as mainly clickstream data
Cons and not other CRM data • Data warehouse: Teradata,
granular and accurate as Sample
Aster Data
data warehouse Sample vendors
• Certona • Data visualization: Tableau
vendors
Sample • Coremetrics, Google
vendors Analytics, Omniture

Implementation complexity

Source: Strategy& analysis

Strategy& 9
Providing a rich experience requires a robust analytic
capability, integrating disparate sources of structured and
unstructured data
Example analytics Data used

• Targeting promotions and personalizing offers


Customer (e.g., customized mailing, rewards, coupons) • Customer purchasing behavior
analytics • Purchase history
• Product recommendations

Marketing • Optimizing marketing mix and promotions • Marketing response data


analytics • Pricing optimization and demand sensitivity • Pricing sensitivity data

Web • Customer online activity analysis • Web activity data


analytics • Sentiment analysis • Customer social media posts

• Demand and inventory forecasting • Demand data • Distribution data


Operational • Localization • Inventory data • HR data
analytics • Supply chain analysis • Location data
• Workforce optimization (Web usage, smartphone)

• Fraud analytics • Customer interaction data


Fraud & risk
• Purchase returns data
analytics • Shrinkage analysis
• Inventory data

Source: Strategy& analysis

Strategy& 10
But the explosion of unstructured data volumes requires
new approaches to data consolidation and analytics
applications
ILLUSTRATIVE
Structured and Unstructured Data Evolution
Change Drivers

1. Speed: Data access speeds of physical storage


mechanisms have not kept up with
improvements in network speeds
Complex,
unstructured
2. Scale: Traditional data storage techniques like
RDBMS have limited scalability to manage
growing data volumes (clustering beyond a
handful of servers is notoriously difficult)

3. Integration: Today's data processing tasks


increasingly need to access and combine data
from many different unstructured sources, often
Relational,
over a network
structured
4. Volume: Data volumes have grown from tens of
1970 1980 1990 2000 2012 gigabytes in the 1990s to hundreds of terabytes
and often petabytes in recent years
Relational, Structured Data Complex, Unstructured Data
• CRM • Inventory • Documents • SharePoint
• Financials • Sales records • Web feeds • Sensor data
• Logistics • HR records • System logs • Audio
• Data marts • Web profiles • Online forums • Images/video

Source: IDC white paper sponsored by EMC and Cloudera

Strategy& 11
Unstructured data integration and analytics face multiple
challenges, but they can be overcome with some new
innovations ILLUSTRATIVE

Leading Web analytics and industry trends Vendors and products

• Social feed integration with Web and warehouse data for advanced
• Google Analytics, Adobe/Omniture,
Capturing & analyzing customer analytics
IBM/Coremetrics/Unica
multiple streams of data • Task- or page-targets-based unobtrusive, short, highly actionable, quick
• ForeSee
feedback data supplementing site surveys

• Added sources of data and complexity of integration from multiple


• Google Android, Apple iOS, RIM
Multiple platforms for platforms and form factors (smartphones, tablets)
BlackBerry
customers to interact on • Complexity of integrating structured data with unstructured feeds from
• Google TV, Boxee, Apple TV
Web, social media, chat, and Internet-connected televisions

• New analytics, storage, and processing for accelerated integration at


• Google MapReduce, Apache Hadoop,
Increasing volume of lower costs due to exponential growth of “big data” needs
Google Caffeine
data at a faster rate • No single tool to capture and analyze massive Web data, requiring
• Google Analytics, Omniture, SAS
concurrent use of multiple analytic tools

• Targeted offers based on customer location tracking enabled by GPS • Google Latitude
Continuous and cell-based tracking mechanisms in smartphones
• Foursquare
Data streams • Interaction opportunities from tracking customer check-ins at vendor
• Gowalla
locations using new services

• Browser-based features for customer to opt out of tracking


• Upcoming regulations like FTC “Do Not Track” initiatives • Mozilla Firefox, Google Chrome, Opera
Regulations
• Android- and iOS-based app developers self-regulating and asking
customer permission for data collection

Source: Strategy& analysis

Strategy& 12
To meet the challenges and gain the benefits of integrating
Web and enterprise data, multiple technology
enhancements are needed
Enhancements to Web and enterprise analytics

• Redesign and refine websites by optimizing site areas and page types, and rationalizing page tags to track interactions

• Upgrade infrastructure (e.g., Hadoop clusters, tag management systems) and processes to collect data from multiple streams
including Web channel, social media, video, and smartphone apps

• Implement validation process and engines to ensure correct data capture

• Implement multiple Web tools (Google, Omniture, etc.) and enterprise analytics tools (SAS) to fill any gaps in data capture and enhance
analytic capabilities

Integrating multi-stream data with mapreduce/hadoop

Structured smartphone
app data Analytic modeling
Batch/real-time Batch/on-demand
Structured web
form data
Structured • Ad hoc queries
analytic • Model execution
Unstructured • Dashboard feeds
web data warehouse
Mapreduce/hadoop Mapreduce/hadoop • Accelerate nightly batches
• Automatic redundant backups
Social feeds

Source: Strategy& analysis

Strategy& 13
New technologies, such as MapReduce and Hadoop, can be
utilized to quickly process large sets of unstructured data
A A Traditional data integration B B Unstructured data integration using Hadoop

Business intelligence & analytics Business intelligence & analytics Highlights


Behavioral Consumer
Fraud detection §A Traditional technologies are
A
ad targeting analytics
Analytics optimized for processing
Dashboards
reporting Dashboards
Analytics Algorithmic
structured data and
reporting models
presenting results for a
Distributed
cloud servers
narrow range of analytic
for scalability applications. Substantial
ODS EDW ODS EDW MapReduce manipulations are required
to process large volumes of
unstructured data
DM DM DM DM
ODS ODS
§B New distributed technologies
Analytics
Granular, low- Granular, low-
EDW
Warehouse EDW Analytics
“Mapped” and such as MapReduce
Granular, low-level details Warehouse B
level details level details Data feeds
reduced data (developed by Google) and
Data feeds Data feeds sets
Hadoop (open-source
Extract/transform/ Extract/transform/ Create Apache platform) are
load (ETL) Load MapReduce/import created for the purpose of
processing large volumes of
Data sources Data sources unstructured data and
importing the results for use
by a broad range of analytic
EDI LOG Objects
Application
XML
Application applications
Application Application DB DB
XML CSV CSV Text JSON Binary
DB DB
External
Relational Flat Files External feeds Relational Flat files Feeds Unstructured data

Source: Strategy& analysis

Strategy& 14
The MapReduce model does not replace traditional
enterprise RDBMS; it tackles problems that could not be
solved previously
Comparing RDBMS to MapReduce How Hadoop complements RDBMS

RDBMS MapReduce/Hadoop Highlights

Data size Gigabytes Petabytes • Storage of extremely high volumes of enterprise


data
Access Interactive and batch Batch • Accelerating nightly batch business processes
• Improving the scalability of applications
Structure Fixed schema Unstructured schema
• Creating automatic, redundant backups
Procedural
Language SQL • Producing just-in-time feeds for dashboards and
(Java, C++, Ruby, etc.)
business intelligence
Integrity High Low • Use of Java for data processing instead of SQL

Scaling Nonlinear Linear • Turning unstructured data into relational data


• Taking on tasks that require massive parallelism
Updates Read and write Write once, read many times • Moving existing algorithms, code, frameworks,
and components to a highly distributed
Latency Low High computing environment

MapReduce and Hadoop enable execution of analytics on the complete universe of data rather than on a sample set,
as done traditionally in an RDBMS. This provides better analytic output for higher-quality decision making
Source: “10 Ways to Complement the Enterprise RDBMS Using Hadoop,” by Dion Hinchcliffe

Strategy& 15
Successful implementation of MapReduce/Hadoop requires
“heavy lifting” enhancements at every layer of data
architecture
B Areas of consideration in Hadoop Adoption Considerations
Business intelligence & analytics
1.
1 Data readiness of structured, external, and unstructured data sources needs
Behavioral Consumer to be assessed to determine which sources are suitable for the Hadoop
Fraud detection 7
ad targeting analytics platform and what new capabilities will be enabled or what existing
capabilities will be better served
Analytics Algorithmic
Dashboards
reporting models 2.
2 ETL jobs need to be reviewed and possibly rationalized, since some data is
now sourced through Hadoop. Scheduling need to be rationalized while
dependencies integrity is closely monitored and preserved

ODS EDW MapReduce 6 3


3. Define and implement the distributed file system while ensuring consistency
of foreign keys, such as customer or product identifiers. On the infrastructure
side, some nonfunctional requirements may be relaxed (e.g., backups and
5 uptime do not need to be as strict as with conventional infrastructure)
DM DM DM
ODS 4
4. Modify EDW schemas to accept data feeds from Hadoop
EDW 5.
5 Define which data elements will be bi-directionally synchronized between
EDW and Hadoop
2 4
6.
6 MapReduce/Hadoop capabilities need to be joined with SQL so that
Extract/transform/ Create 3 MapReduce routines can be managed and optimized like other SQL queries.
load MapReduce/import This will allow MapReduce programs to react differently depending on the
data and parameters presented at run time, eliminating the need to create
many versions of a program for different situations
Data sources 1
7.
7 Review and rationalize extract programs that shepherd data from the
EDI LOG Objects database to a series of downstream files, such as statistical analysis or data
Application XML Application mining
DB CSV DB Text JSON Binary
Relational Flat files External Unstructured data
feeds

Source: Strategy& analysis

Strategy& 16
Besides data technologies, other dimensions of the solution
stack must also be considered
Challenges

• Now that we better understand browsing patterns, how do we change


Navigation navigation or usability?
patterns • Does the technology stack allow for appropriate and timely information
updates?
Channels
• What additional content metadata can improve interpretation of
Metadata clickstream data?
tagging • What metadata attributes are common across channels?
Traditional New
• Where should those attributes be stored and maintained?

Display Mobile Real-time & • What is the acceptable data lag?


near-real-time • Given the volume of transactions, is our architecture used in the optimal
analytics way and does it provide answers to the most important questions?
Social
Search
Video • What new content needs to be created to address insights delivered by
Content analytics?
Commerce Gaming creation • How do we ensure consistency of content across channels?
• How do we prevent it from becoming “stale”?

§ In regulated industries, how can we shorten the content approval cycle


Governance &
while maintaining compliance?
compliance
§ Do we have optimal workflows and effective governance?

Source: Strategy& analysis

Strategy& 17
Integrating Web, social media, smartphone, and other
unstructured data poses multiple challenges but provides
significant benefits
Integrating Web data poses resource, Integrating Web data provides
infrastructure, and other challenges better insights and other benefits
54 Quality of online marketing 50
Infrastructure/technology
constraints 50
Overall quality of the website 35
42

33 Quality of offline marketing 24


Lack of staff or resources Marketing
necessary to execute 41 Quality of search engine effectiveness
42 placement 24

33 Quality of product merchandising 24


Independent data use and
analysis in business units 25 Quality of internal marketing
21 efforts (house ads) 19

26 Customer satisfaction 52
Don’t want to share the data
across applications 26
Quality of content Customer
26 48
expectations
Ability to meet needs of different
17 segments 30
Too difficult to perform analysis
on large data set 27
18 Performance of the website 47

21 Navigability of the website 46


No solution capable of meeting
integration requirements 11
21 Quality of internal search 26

17 Data analysis cost efficiencies 24


Operational
Don’t see a need to $1B or more efficiencies
integrate data 14
15 $100M to less than $1B Ability to perform ad hoc analysis 22
$50M to less than $100M
4 Faster time-to-market 18
None 6
Decreased dependence on
3 IT resources 12

Source: Forrester: Jupiter Research e-Rewards Executive Survey (2/08), n = 514 (small and medium-sized business decision makers, U.S.); Strategy& analysis

Strategy& 18
Strategy& is a global team These are complex and charting your corporate We are a member of the
of practical strategists high-stakes undertakings strategy, transforming a
committed to helping you — often game-changing function or business unit, or 157 countries with more
seize essential advantage. transformations. We bring building critical capabilities, than 184,000 people
100 years of strategy we’ll help you create the committed to delivering
We do that by working consulting experience value you’re looking for quality in assurance, tax,
alongside you to solve your and the unrivaled industry and advisory services. Tell us
toughest problems and and functional capabilities and impact.
helping you capture your of the PwC network to the out more by visiting us at
greatest opportunities. task. Whether you’re strategyand.pwc.com.

This report was originally published by Booz & Company in 2011.

www.strategyand.pwc.com
© 2011 PwC. All rights reserved. PwC refers to the PwC network and/or one or more of its member firms, each of which is a separate legal entity. Please see www.pwc.com/
structure for further details. Disclaimer: This content is for general information purposes only, and should not be used as a substitute for consultation with professional advisors.

Strategy& 19

You might also like