Module 3
Key Data Sources Used in CPG Industry
Debtanu Dutta
Deepnarayan Debnath
Mayank Aggarwal
Ram Mehta
01 Key Data Sources in CPG Industry
List of Data Sources to Enable Business Decisions
Syndicated Retail Sales, Household Panel, POS, Shopper/Loyalty, Consumer Survey, Brand Tracking
B&M Sales (Online + Offline)
IRI, Kantar
Overall
ePOS, Clickstream, Digital Shelf, Social Media, Macro-economic, Others
eComm & Digital
Real-time logistics, Supply Risk Assessment, Forecasting, Market Indicator, International Trade, Sustainability
Supply Chain
Syndicated Sales Data (1/2)
Data Components:
Various attributes help in a deep-dive analysis of market, category and competitor performance and trends:
Dimension | Details | Key benefits
Products | Typical Product Hierarchy: Category – Subcategory – Brand – Sub Brand – Pack Size – SKU. Category specific Product dimensions: Segment, Flavour, Pack Type, Price tier etc. | Product hierarchy and dimensions help in analysis of various metrics at different levels of granularity and aggregation for trend analysis, and comparative analysis for similar products of competitors. As marketing & sales activities differ by brands, a lot of insights can be derived at different product levels.
Markets | Typical markets available across countries: Country total, State, Key Cities, Sales Channels/Type of stores, Retailers, State*Channel type, etc. | Various metrics are comparable at different market levels for evaluating performance across different markets. Typically, strategic decisions will differ at market level depending on consumer preference and shopper behaviour observed in data.
Periods | Typical period breaks are Week, Period/Month, Last 13 Weeks/Quarter, Year Till Date (YTD) & Full year. | Trend analysis and seasonality analysis play a major role in taking tactical decisions.
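The roll-up across product levels and period breaks described above can be sketched as follows. This is a minimal illustration, not a vendor data format: the row layout, the brand and SKU names, and the helper names `roll_up` and `last_n_weeks` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical SKU-level weekly rows:
# (category, brand, sku, week_number, sales_value)
ROWS = [
    ("Snacks", "BrandA", "A-100g", 1, 120.0),
    ("Snacks", "BrandA", "A-200g", 1, 80.0),
    ("Snacks", "BrandB", "B-100g", 1, 100.0),
    ("Snacks", "BrandA", "A-100g", 2, 130.0),
    ("Snacks", "BrandB", "B-100g", 2, 70.0),
]


def roll_up(rows, level):
    """Sum sales_value at one product level: 'category', 'brand' or 'sku'."""
    column = {"category": 0, "brand": 1, "sku": 2}[level]
    totals = defaultdict(float)
    for row in rows:
        totals[row[column]] += row[4]
    return dict(totals)


def last_n_weeks(rows, n, latest_week):
    """Keep only rows from the n most recent weeks ending at latest_week."""
    return [row for row in rows if latest_week - n < row[3] <= latest_week]


brand_totals = roll_up(ROWS, "brand")
category_totals = roll_up(ROWS, "category")
latest_week_by_brand = roll_up(last_n_weeks(ROWS, 1, latest_week=2), "brand")
```

The same `last_n_weeks` filter with `n=13` would give the "Last 13 Weeks" period break. Additive facts such as sales value can be summed this way; non-additive facts cannot.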
Syndicated Sales Data (2/2)
A variety of metrics in the syndicated data help in getting a holistic view of the brand performance.
Household Panel Data (1/2)
What is it?
Panel data is tied to individual buying households. With panel data, you can look at the consumer dynamics underlying your sales patterns. Panel data
can tell you the demographics of your buyers, how often they buy, how loyal they are, and what other products they purchase.
Data Components:
Various attributes help you examine the consumer dynamics underlying your sales patterns, the demographic/regional profile of buyers, and product portfolio management.
Household Panel Data (2/2)
A variety of non-additive metrics in the HHP data help position a product correctly in the market, based on an understanding of customer behaviour.
Measure | Description | Calculation
Raw_Buyers | Actual number of buyers (before the weighting methodology has been applied). Used to determine robustness of the data. | Actuals
Obuyers | Number of households buying a brand/product or visiting a store at least once within the specified time period. | Actuals
Penetration | Percentage of the population that bought the product, brand or category at least once within the specified time period (= % buying households of the product, brand or category). Population can be a country, a region, or a socio-demographic group such as 65 years+. | (Buyers / Population) * 100
Otrips | Total number of occasions or shopping trips in which the brand or product was purchased. | Actuals
Repeat_Buyers | The percentage of households that bought the brand/product at least twice (≥2) in the specified time period, in relation to all households that bought the brand/product during the same period. | Actuals
OSpend | Total value of the brand or product, expressed in specified currency. | Actuals
Ovolume | Total volume of the brand or product (in kg/L or relevant unit). | Actuals
Opacks | Total volume of the brand or product (in packs). | Actuals
Cat_Trips | Total number of occasions or shopping trips within the category. | Actuals
Cat_Spend | Total value of brands purchased within the category. | Actuals
Cat_Volume | Total volume of brands purchased within the category. | Actuals
Cat_Packs | Total packs of brands purchased within the category. | Actuals
Volume_Secondary | Only used if a second volume measure is required, e.g. bottles. | Actuals
Frequency | Average number of purchase trips for the product, brand or retailer per buying household in the specified time period. | Trips / Buyers
Spend Per Buyer | Average amount spent on the brand/product per buying household within the specified time period (in currency). | Value / Buyers
Volume Per Buyer | Average quantity bought of the brand or product per buying household within the specified time period (in kg/L/packs or relevant unit). | Volume / Buyers
Spend Per Trip | Average value spent on the brand/product per purchase trip for the brand/product within the specified time period (in currency). | Value / Trips
Volume Per Trip | Average volume bought of the brand/product per purchase trip for the brand/product within the specified time period (in kg/L/packs or relevant unit). | Volume / Trips
Average Price (Volume) | Average price of the brand/product paid by the buying households of the brand/product during the specified time period (in currency per kg/L/pack or relevant unit). | Value / Volume
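As a worked illustration of the derived measures, the sketch below computes them from the "Actuals" inputs. The function name and the input figures are hypothetical. Frequency is taken here as trips per buying household, in line with its description as an average number of trips.

```python
def household_panel_metrics(buyers, population, trips, spend, volume):
    """Derive the ratio measures from the actuals (all inputs must be positive)."""
    if min(buyers, population, trips, volume) <= 0:
        raise ValueError("buyers, population, trips and volume must be positive")
    return {
        "penetration_pct": buyers / population * 100,
        "frequency": trips / buyers,          # trips per buying household
        "spend_per_buyer": spend / buyers,
        "volume_per_buyer": volume / buyers,
        "spend_per_trip": spend / trips,
        "volume_per_trip": volume / trips,
        "average_price": spend / volume,      # currency per volume unit
    }


# Hypothetical figures for one brand in one period.
metrics = household_panel_metrics(
    buyers=2000, population=10000, trips=5000, spend=25000.0, volume=12500.0
)
```

These ratios are non-additive: buyers in two periods overlap, so penetration or frequency for a longer period has to be recomputed from that period's actuals rather than summed.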
Point of Sales (POS) Data (1/2)
What is it?
Point of sale data refers to information about product sales that is collected from a retailer across all its stores as well as its online channel. It can also
capture shopper-level transaction and loyalty data.
Key data providers: Walmart (Retail Link & Luminate), Dunnhumby (Tesco), Kroger’s, Partners Online (Target), Carrefour etc.
Data Components:
Dimension | Details | Key benefits
Products | Typical Product Hierarchy: Category – Subcategory – Brand – Sub Brand – Pack Size – SKU. Category specific Product dimensions: Segment, Flavour, Pack Type, Price tier etc. | Product hierarchy and dimensions help in analysis of various metrics at different levels of granularity and aggregations.
Stores/Channel | Store format, types, location, region, store id. Channels – Online, In store, Delivery, Buy Online Pick In Store (BOPIS) | Various metrics are analyzed and compared in online and offline channels at various levels, from individual store to aggregated market levels, which helps in different decisions.
Transaction | Transaction/Bill level details | Helps in understanding visits, per-visit shopper behaviour and basket characteristics
Shopper/Loyalty | Masked customer id level anonymous data (no sensitive information like name, age, sex) with customer segmentation like price sensitive, loyalty, family segmentations | Helps in understanding purchase pattern at customer segment level
Periods | Typical period breaks are Week, Period/Month, Last 13 Weeks/Quarter, Year Till Date (YTD) & Full year. | Trend analysis and seasonality analysis play a major role in taking tactical decisions.
Point of Sales (POS) Data (2/2)
A variety of metrics in the POS data help in getting a holistic view of the brand performance.
Metric Group | Metric | Details/Key benefits
Sales | 1. Sales Value, 2. Sales Volume, 3. Sales Units | Product sales measured in terms of $ value, volume (Kg/Ltr or units) and units (packs).
 | 1. Base Sales | The normal expected sales in the absence of any trade promotion (TPR, Feature, or Display).
 | 2. Incremental Sales | The difference between actual volume and baseline (expected) volume.
Price | 1. Average Selling Price | The dollars spent by consumers on each unit (or equivalized unit) of an item across all stores in each market.
Price | 2. Discount | The average percent of price reduction relative to a product’s baseline price during a trade promotion.
Loyalty | 1. Purchase Frequency | How often a shopper makes a purchase from the retailer
Loyalty | 2. Share of Wallet | What is the sales% of a brand from the total transaction/bill sales
Loyalty | 3. Contribution of channel | What is the channel wise sales% of a particular product/brand
Store level | 1. Same Store Change% | Change in sales over selected period for stores which operated throughout the selected duration
Store level | 2. Store Sales Contribution% | Sales contribution of a store to the selected overall market
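A few of these POS metrics can be expressed directly from their definitions. The sketch below is illustrative only: the function names, store ids and figures are assumptions, and Same Store Change% is computed over stores present in both periods as a simple proxy for "operated throughout the selected duration".

```python
def incremental_sales(actual_volume, base_volume):
    """Actual volume minus baseline (expected) volume."""
    return actual_volume - base_volume


def discount_pct(base_price, promo_price):
    """Price reduction relative to the baseline price, in percent."""
    return (base_price - promo_price) / base_price * 100


def share_of_wallet_pct(brand_sales, bill_total):
    """Brand sales as a percentage of the total transaction/bill sales."""
    return brand_sales / bill_total * 100


def same_store_change_pct(current, previous):
    """Sales change restricted to stores that appear in both periods."""
    common = current.keys() & previous.keys()
    cur = sum(current[store] for store in common)
    prev = sum(previous[store] for store in common)
    return (cur - prev) / prev * 100


# Hypothetical store-level sales; S3 closed and S4 opened between periods.
previous_sales = {"S1": 100.0, "S2": 200.0, "S3": 50.0}
current_sales = {"S1": 110.0, "S2": 220.0, "S4": 500.0}

change = same_store_change_pct(current_sales, previous_sales)
```

Excluding S3 and S4 keeps store openings and closures from distorting the comparison.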
Click Stream Data (1/2)
What is it?
Clickstream data refers to the chronological collection of various interactions that a user has with a particular website, application, or digital platform.
These interactions are typically recorded in the order they occur and include details about the user's actions, such as clicking on links, buttons, or images,
as well as other types of engagement like mouse movements, scrolling, and time spent on different pages.
Key data providers: Adobe Analytics, Google Analytics, Tealium etc.
Data Components:
Dimension | Details | Key benefits
Products | Category, Brand, Product details | Product details help in analysis of various metrics for comparison and site/platform level benchmarking.
Sessions | Session ID, Customer ID, Date & time stamp, Type of device used for access | Analysis of individual sessions, and of multiple sessions over time, helps in knowing consumer preferences and what is leading to conversion vs dissonance
Website/App navigation | Page category/sub-category, Click event details, Page mapping to Funnel types, Landing/Exit page | It gives insights into the page navigation and the ‘Hot Spots’ on the website/app
Customer Journey | End goal of each journey, Search term used, Source of traffic, new vs returning customers, Geography of customer | Helps in understanding customer journey across pages, from where they came to the website/app to where they ended a session, and what engaged them the most
Errors | Page & link where error happened, type of error | Highlights customer pain points caused by unavailability of required content/navigation on the website/app, so the admin can take the necessary corrective action in web/app design/content
Click Stream Data (2/2)
A variety of metrics directly available or derived from the Click stream data help in getting a holistic view of the brand performance.
Metric Group | Metric | Details/Key benefits
Traffic | 1. Page views | The count of pages a user visits within a website or app.
Traffic | 2. Sessions | The number of visits, where each visit groups the interactions a user has with a website within a specific time frame.
Engagement | 1. Bounce Rate | The percentage of visitors who navigate away from a site after viewing only one page.
Engagement | 2. Conversion Rate | The percentage of visitors who take a desired action, such as making a purchase or signing up for a newsletter.
Engagement | 3. Click Through Rate | The percentage of users who click on a specific link or element, often used in email marketing and advertising.
Engagement | 4. Time on Page | The average amount of time users spend on a specific page.
eCommerce | 1. Average Order Value (AOV) | The average amount spent by customers on each order
eCommerce | 2. Cart Abandonment Rate | The percentage of users who add items to their cart but do not complete the checkout process
Advertising Metrics | 1. Cost Per Click (CPC) | The amount paid for each click on an online advertisement
Advertising Metrics | 2. Cost Per Acquisition (CPA) | The cost to acquire a customer through advertising
Advertising Metrics | 3. Return on Ad Spend (ROAS) | The ratio of revenue generated to the cost of advertising
Others | 1. Page Load Time | The time it takes for a specific page to load completely.
Others | 2. Error Rate | How many errors are encountered per session
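To make the derived metrics concrete, here is a minimal sketch that computes four of them from session-level records. The record fields and figures are hypothetical, and sessions are used as a stand-in for visitors, which is a simplifying assumption.

```python
# Hypothetical session summaries derived from raw click events.
SESSIONS = [
    {"session_id": "s1", "page_views": 1, "added_to_cart": False, "order_value": 0.0},
    {"session_id": "s2", "page_views": 5, "added_to_cart": True, "order_value": 40.0},
    {"session_id": "s3", "page_views": 3, "added_to_cart": True, "order_value": 0.0},
    {"session_id": "s4", "page_views": 4, "added_to_cart": True, "order_value": 60.0},
]


def clickstream_kpis(sessions):
    """Bounce rate, conversion rate, AOV and cart abandonment from sessions."""
    total = len(sessions)
    bounces = sum(1 for s in sessions if s["page_views"] == 1)
    orders = [s["order_value"] for s in sessions if s["order_value"] > 0]
    carts = sum(1 for s in sessions if s["added_to_cart"])
    abandoned = sum(
        1 for s in sessions if s["added_to_cart"] and s["order_value"] == 0
    )
    return {
        "bounce_rate_pct": bounces / total * 100,
        "conversion_rate_pct": len(orders) / total * 100,
        "average_order_value": sum(orders) / len(orders) if orders else 0.0,
        "cart_abandonment_pct": abandoned / carts * 100 if carts else 0.0,
    }


kpis = clickstream_kpis(SESSIONS)
```

Here the "desired action" for conversion is a completed purchase; a newsletter sign-up or any other goal would need its own flag on the session record.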
Media Data
Metrics: Actual spend, CTR, Clicks, CPM, Impressions, CPCV, VCR, Views
• Clicks: The number of times an ad was clicked
• CPM: Cost Per Mille, the cost an advertiser pays for one thousand impressions of an advertisement
• Impressions: Number of times an ad appeared to users
• CPCV: Cost Per Completed View, a bidding method for video campaigns where you only pay for instances when the video ad was viewed entirely
• VCR: Video Completion Rate, the number of times an ad was watched completely, out of the total number of times the ad started playing
• Views: Number of times an advertisement was viewed
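The ratio metrics in this list follow from the raw counts. The sketch below shows the arithmetic under stated assumptions: the function name and campaign figures are hypothetical, CTR is taken as clicks divided by impressions, and CPCV is computed as spend per completed view for reporting purposes.

```python
def media_kpis(spend, impressions, clicks, video_starts, video_completes):
    """Compute CTR, CPM, CPCV and VCR from raw campaign counts."""
    return {
        "ctr_pct": clicks / impressions * 100,
        "cpm": spend / impressions * 1000,       # cost per thousand impressions
        "cpcv": spend / video_completes,         # cost per completed view
        "vcr_pct": video_completes / video_starts * 100,
    }


# Hypothetical campaign totals.
campaign = media_kpis(
    spend=500.0,
    impressions=100_000,
    clicks=2_000,
    video_starts=10_000,
    video_completes=2_500,
)
```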
Order and Shipment Data
What is the significance of these data points?
Order and shipment data enhance operational efficiencies through informed decision-making, improved resource allocation, and timely customer response.
These are operational data points that provide visibility into key processes and into opportunities to better meet customer SLAs.
Metric Group | Metric | Details/Key benefits
Order | Order Aging | Tracking order aging provides insights into how long orders are in the system, helping identify bottlenecks and improve order processing speed.
Order | Order Fulfilment Rate | This metric measures how well customer orders are being fulfilled, indicating operational effectiveness and customer satisfaction levels.
Order | Sales Unit Price Variance | Monitoring price variance helps identify pricing discrepancies, aiding in pricing strategy adjustments and profit optimization.
Order | Average Order Volume | Tracking average order volume helps anticipate resource needs and plan inventory levels for efficient operations.
Order | Order Processing Efficiency | This metric highlights how efficiently orders are processed, enabling process improvements and reduced lead times.
Order | Order Accuracy | Measuring order accuracy ensures customers receive the correct products, enhancing customer satisfaction and reducing returns.
Order | QTY Adjustments – Short or Over | Tracking quantity adjustments indicates inventory accuracy and highlights areas for process improvement.
Order | Order Cycle Time | Monitoring cycle time from order placement to delivery helps identify inefficiencies and optimize order processing workflows.
Shipment | On Time in Full | This metric gauges how often orders are delivered as promised, indicating supply chain reliability and customer satisfaction.
Shipment | Transportation Cost by Orders | Tracking transportation costs per order helps manage expenses and optimize logistics for cost-effective operations.
Shipment | Route Efficiency Index | This metric measures how well routes are optimized, contributing to fuel savings, reduced emissions, and improved delivery speed.
Shipment | Delivery Lead Time | Monitoring delivery lead times provides insights into the speed and responsiveness of the supply chain, impacting customer expectations.
Shipment | Net Value per Cubic Meter | Measuring net value per cubic meter aids in optimizing storage and transport space, maximizing revenue per unit volume.
Shipment | Load Completion Efficiency | This metric assesses how well shipments are fully loaded, minimizing transportation waste and increasing efficiency.
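Three of these metrics can be sketched from order-level records. The slides do not give formulas, so the definitions below are assumptions: an order counts as On Time in Full when it arrives on or before the promised date with at least the ordered quantity, fulfilment rate is delivered quantity over ordered quantity, and cycle time runs from order placement to delivery. The order records are hypothetical.

```python
from datetime import date

# Hypothetical order records.
ORDERS = [
    {"ordered": date(2024, 1, 1), "promised": date(2024, 1, 5),
     "delivered": date(2024, 1, 4), "qty_ordered": 100, "qty_delivered": 100},
    {"ordered": date(2024, 1, 2), "promised": date(2024, 1, 6),
     "delivered": date(2024, 1, 8), "qty_ordered": 50, "qty_delivered": 50},
    {"ordered": date(2024, 1, 3), "promised": date(2024, 1, 7),
     "delivered": date(2024, 1, 7), "qty_ordered": 40, "qty_delivered": 30},
    {"ordered": date(2024, 1, 4), "promised": date(2024, 1, 9),
     "delivered": date(2024, 1, 9), "qty_ordered": 10, "qty_delivered": 10},
]


def otif_pct(orders):
    """Share of orders delivered by the promised date and in full."""
    hits = sum(
        1 for o in orders
        if o["delivered"] <= o["promised"] and o["qty_delivered"] >= o["qty_ordered"]
    )
    return hits / len(orders) * 100


def fulfilment_rate_pct(orders):
    """Delivered quantity as a percentage of ordered quantity."""
    delivered = sum(o["qty_delivered"] for o in orders)
    ordered = sum(o["qty_ordered"] for o in orders)
    return delivered / ordered * 100


def average_cycle_time_days(orders):
    """Mean number of days from order placement to delivery."""
    return sum((o["delivered"] - o["ordered"]).days for o in orders) / len(orders)
```

In this sample the second order fails OTIF for being late and the third for being short, which shows why the two conditions are tracked together.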
Sales Force Data (Sales Force Application)
Visit Information
Visit Date, Start time of a Visit, End time of a Visit, Visit Location Information
Store Information
Store ID, Store Name, Store Location, Further details on store visited.
Order Information
Order ID, Order Quantity, Order Gross Value, Order Net Value, Tax amount, Scheme Applicability, Promotions
Product Information
Product ID, Product Price, Product Quantity
Sales Rep. Journey Data
Rep. ID, Rep. Name, Clickstream data
Fact/KPIs
Below are the typical facts available in the data
• Number of visits, Amount of time spent in stores, Amount of time spent in commute.
• Sales Frequency, Total Sales, Total Orders, Unique num. of Products.
• Amount of time rep. spent in specific pages
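The visit facts can be derived from the visit start and end times. The sketch below is one possible reading, with hypothetical records for a single rep on a single day; commute time is approximated as the gap between the end of one visit and the start of the next, which is an assumption rather than a documented definition.

```python
from datetime import datetime

# Hypothetical visits for one sales rep on one day.
VISITS = [
    {"rep_id": "R1", "store_id": "A",
     "start": datetime(2024, 1, 8, 9, 0), "end": datetime(2024, 1, 8, 9, 30)},
    {"rep_id": "R1", "store_id": "B",
     "start": datetime(2024, 1, 8, 10, 0), "end": datetime(2024, 1, 8, 10, 45)},
    {"rep_id": "R1", "store_id": "C",
     "start": datetime(2024, 1, 8, 11, 15), "end": datetime(2024, 1, 8, 11, 35)},
]


def visit_summary(visits):
    """Number of visits, minutes in store, and minutes between visits."""
    ordered = sorted(visits, key=lambda v: v["start"])
    in_store = sum((v["end"] - v["start"]).total_seconds() for v in ordered) / 60
    commute = sum(
        (nxt["start"] - cur["end"]).total_seconds()
        for cur, nxt in zip(ordered, ordered[1:])
    ) / 60
    return {
        "visits": len(ordered),
        "in_store_minutes": in_store,
        "commute_minutes": commute,
    }


summary = visit_summary(VISITS)
```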
03 Typical Challenges to Overcome In Processing External Data Sources
Various Challenges at Each Stage of Data Processing
1 Data Acquisition
Issues while acquiring data from external vendors and internal system
• Data Coverage
• Data Compliance
• Data Delivery Schedule
• Tech glitch in data receipt
2 Data Cleaning & Preparation
Dispersed Structured & Unstructured Data Sources
• Diverse Data Structures
• Data Duplication
• Data Completeness
• Data Consistency
• Invalid Data
3 Data Integration
Different data structure & content across data sources & countries
Syndicated Sales Data with Internal Datasets
• Attribute Nomenclature
• Unit of Measure
• Time Granularity
• Currencies
• Update of Master Data
4 Process Scalability
Seamless onboarding of new data sources and new geography to the existing solution
• Lack of Advanced Planning
• Siloed Solutions
• Industry standard not followed
Our Approach Towards Problem Solving
1 Data Acquisition
• Coverage: Data extrapolation using coverage factor
• Data Compliance: Validation and required correction/ masking as per PepsiCo internal & country’s legal guidelines e.g. GDPR
• Data Delivery Schedule: Alignment with the data partner to share full year data delivery calendar
• Tech glitch in data receipt: Automated checks for timely data receipt in Azure blob storage/landing zone
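Two of the acquisition steps above lend themselves to a short sketch. Both parts are assumptions made for illustration: extrapolation is shown as dividing observed sales by a coverage factor between 0 and 1, and the receipt check compares a hypothetical delivery calendar with the set of files already received. The file names and dates are invented, and no specific storage API is used.

```python
from datetime import date


def extrapolate(observed_sales, coverage_factor):
    """Scale observed sales up to the full market (assumed: observed / coverage)."""
    if not 0 < coverage_factor <= 1:
        raise ValueError("coverage_factor must be in (0, 1]")
    return observed_sales / coverage_factor


def overdue_files(delivery_calendar, received, as_of):
    """Names of files that were due on or before as_of but have not arrived."""
    return sorted(
        name for name, due in delivery_calendar.items()
        if due <= as_of and name not in received
    )


# Hypothetical delivery calendar agreed with the data partner.
CALENDAR = {
    "sales_wk01": date(2024, 1, 10),
    "sales_wk02": date(2024, 1, 17),
    "sales_wk03": date(2024, 1, 24),
}

missing = overdue_files(CALENDAR, received={"sales_wk01"}, as_of=date(2024, 1, 18))
```

In practice the `received` set would come from listing the landing zone, and a non-empty result would raise an alert.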
3 Data Integration
Syndicated Sales Data with Internal Datasets
• Attribute Nomenclature: 1) Product key (UPC/EAN) based product mapping, 2) ML based NLP approaches to map attribute names with the master data
• UOM, Currency, Time Duration Alignment: Country*Category level rulebook for required transformation for harmonization
• Update of Master Data: Automated process for identification of new additions in the master data
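The integration steps can be sketched as a key-based lookup with a text-similarity fallback, plus a small rulebook. This is not the ML approach itself: Python's `difflib` string similarity is swapped in as a simple stand-in for the NLP matching. The product keys, names, the rulebook entry and the exchange rate are all hypothetical.

```python
import difflib

# Hypothetical master data: product key (UPC/EAN) -> harmonized product name.
MASTER_BY_KEY = {
    "012345678905": "BrandA Chips Classic 100g",
    "012345678912": "BrandB Cola Zero 330ml",
}

# Hypothetical Country*Category rulebook for unit and currency alignment.
RULEBOOK = {
    ("UK", "Beverages"): {"volume_divisor_to_litres": 1000, "fx_rate_to_usd": 1.25},
}


def map_product(product_key, description, master_by_key, cutoff=0.8):
    """Map by product key first; fall back to string similarity on the name."""
    if product_key in master_by_key:
        return master_by_key[product_key], "key"
    by_lower = {name.lower(): name for name in master_by_key.values()}
    close = difflib.get_close_matches(
        description.lower(), list(by_lower), n=1, cutoff=cutoff
    )
    if close:
        return by_lower[close[0]], "fuzzy"
    return None, "unmapped"


def harmonize(record, rulebook):
    """Apply the country/category rule to align volume unit and currency."""
    rule = rulebook[(record["country"], record["category"])]
    return {
        **record,
        "volume_litres": record["volume"] / rule["volume_divisor_to_litres"],
        "value_usd": record["value"] * rule["fx_rate_to_usd"],
    }


row = harmonize(
    {"country": "UK", "category": "Beverages", "volume": 1500, "value": 200.0},
    RULEBOOK,
)
```

Records that come back as "unmapped" are candidates for the new-addition review of the master data.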
4 Process Scalability
• Reusability: templatize various collaterals such as data dictionary, BRD, etc. for standardization and speed
• Implementation of industry wise Integrated Data Model for seamless onboarding of new dataset/new geography
End-to-End Data Harmonization Process
Input Data
• External: Syndicated Sales, HH Panel, ePOS, etc.
• Internal: Finance, Inventory, etc.
Data Receipt Checks & File Transfer/Ingestion
• System notification for receipt of new data from the 3rd party data provider
• Alert in case of non-receipt by stipulated timeline (based on delivery calendar)
• File transfer/ingestion to designated area/tables
Data Validation & Cleaning
• File structure validation and required correction/alert for anomaly (# of columns, column order, header)
• Data validation based on predefined rules including: decimal conversion, duplicates, null/missing values, negative values, hierarchical level value mismatch (Product, Market, Channel Type), trend break
• Data cleaning based on validation results
Harmonization & Master Data Management
• Alignment across: Time Period, Currency, Unit of Measurement, Attribute values through ML (products, geography, retailers, demographics, etc.)
• Identification of new values and addition in Master data files
• Specific checks during restatement of historical data
Transformation
• Data Aggregations
• Transform the data and file structure as per business/use case requirement
Output Data
• Harmonized data files
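The validation stage can be illustrated with a few rule checks. The sketch below is a simplified assumption of how such rules might look: the expected header, the field names and the trend-break threshold are hypothetical, and a trend break is flagged when the latest value deviates from the mean of earlier values by more than a set percentage.

```python
EXPECTED_HEADER = ["product", "market", "week", "sales_value"]


def validate_header(header, expected):
    """Check number of columns, then column names and order."""
    if len(header) != len(expected):
        return ["column count mismatch"]
    if header != expected:
        return ["column order or header mismatch"]
    return []


def validate_rows(rows, key_fields, numeric_field):
    """Flag duplicates, missing values and negative values by row index."""
    issues, seen = [], set()
    for index, row in enumerate(rows):
        key = tuple(row.get(field) for field in key_fields)
        if key in seen:
            issues.append((index, "duplicate"))
        seen.add(key)
        value = row.get(numeric_field)
        if value is None:
            issues.append((index, "missing value"))
        elif value < 0:
            issues.append((index, "negative value"))
    return issues


def trend_break(series, threshold_pct=50.0):
    """True if the latest point deviates from the prior mean by > threshold."""
    baseline = sum(series[:-1]) / len(series[:-1])
    return abs(series[-1] - baseline) / baseline * 100 > threshold_pct


# Hypothetical input rows.
ROWS = [
    {"product": "A", "market": "UK", "week": 1, "sales_value": 10.0},
    {"product": "A", "market": "UK", "week": 1, "sales_value": 10.0},
    {"product": "B", "market": "UK", "week": 1, "sales_value": None},
    {"product": "C", "market": "UK", "week": 1, "sales_value": -5.0},
]

row_issues = validate_rows(ROWS, ["product", "market", "week"], "sales_value")
```

The list of issues would then drive the cleaning step, for example dropping duplicates or routing missing values for correction.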
04 Business Value Drivers Enabled by Data Sources
Actionable Insights by Consolidation of Internal & External Datasets for Consumer Insights, Marketing & Commercial Functions (1/2)

1. Competitive Landscape
Develop media strategy and execution plan
Question: In which Subcategory/Segments our brands are leading?
Metrics: Segment Sales, Competitor Sales, Market Share
Data sources: Nielsen, IRI, Walmart Luminate etc.
Question: Which competitor brands are driving category growth?
Metrics: Segment Sales, Competitor Sales, Market Share
Question: What is root cause of market share decline?
Metrics: Sales Val, Vol; WD, ND; Retail Selling Price; Sales of promoted products

2. Revenue Growth Management
Formulate strategy for Pricing, Promotion & Assortment
Question: How to price the products optimally?
Metrics: Sales Units, Value; Discount; Sales Volume on promoted price; Features/display
Data sources: Nielsen, IRI, Walmart Luminate etc.; Finance - spend; Promotion calendar
Question: Which products are suitable to sell at which retailer/channel?
Metrics: Sales Units, Value; Product bundling; Competitor mapping
Data sources: Nielsen, IRI, Walmart Luminate etc.
Question: What promotions to run at consumer microsegments?
Metrics: Type of promo; Promo/campaign details; Sales units, value
Data sources: Nielsen, IRI, Walmart Luminate etc.; Finance - spend; Promotion calendar

3. Precision Marketing
• Closer connect with the shopper
• Deeper brand engagement
• Optimal media spend
Question: Who are the key audience to target?
Metrics: Audience Segmentation; Demographic & Psychographic profiling
Data sources: cDNA; Nielsen Media; Kantar Media; AIMIA
Question: How do we personalize our consumer connects?
Metrics: Consumer Journey data; Conversion rate
Data sources: Clickstream - GA, AA, etc.; AIMIA
Question: How do we optimize media investments?
Metrics: Impressions; Conversion rate; GRP; TRP; Media Spends; Sales; Campaign details
Data sources: Nielsen Media, Kantar Media; Nielsen, IRI, etc.; GA, AA, etc.; Finance - spend; Media calendar

4. eComm, Dcomm & Omni-channel Execution
• Improved forecasts and fulfillment rate
• Shopper preferred one-stop destination
• Seamless brick & click experience
Question: How do we optimize our product offerings?
Metrics: Sales; Shipment
Data sources: Amazon, etc.; D2C data; Internal shipment
Question: How do we maximize conversion opportunities and build perfect eStore?
Metrics: Consumer Journey data; Conversion rate; Reviews & Ratings
Data sources: GA, AA, etc.; AIMIA; 1010, Numerator, etc.; Amazon
Question: What is our brand’s competitive position?
Metrics: Sales value, volume; Traffic and conversion rate; Price; Out of Stock
Data sources: 1010, Numerator, Nielsen, etc.
Actionable Insights by Consolidation of Internal & External Datasets for Consumer Insights, Marketing & Commercial Functions (2/2)

Question: What are the key category trends?
Metrics: Category and New Product Sales - Value/Volume, Market share; Social listening
Data sources: Nielsen, IRI, GFK, etc.; AA/GA, etc.; 1010, Numerator; Social Media - FB, IG, YT, Twitter, etc.
Question: What is the expected market size of the new product?
Metrics: Category and New Product Sales - Value/Volume, Market share
Data sources: Nielsen, IRI, GFK, etc.; 1010, Numerator
Question: How do we compare against competition?
Metrics: Product Scoring; Ratings & Reviews; Brand Love & Tension score
Data sources: 1010, Numerator, etc.; AA/GA; Primary Research

Question: Identify key demand occasions
Metrics: Social Listening; Search Terms
Data sources: Google Analytics (GA), Adobe Analytics (AA), Tealium, etc.; Google Trends; Primary Research
Question: Why they buy a particular brand?
Metrics: Share of Wallet; Brand Penetration
Data sources: Household Panel - Kantar, IRI, Nielsen, GFK, etc.

Question: Store clustering & which Stores to visit in a route in a day?
Metrics: Store wise Sales; Time spent in a store; # SKUs in a store
Data sources: Route Plan; sDNA; Employee productivity; Store locations
Question: NBA & Assortment Optimization - Which products & what quantity to be sold at which store?
Metrics: Store wise Sales; # SKUs in a store; Store cluster; Must Sell SKU list
Data sources: POS sales; Nielsen, IRI, Walmart Luminate etc.
Question: Is the Sales Rep complying to BU/Segment level guidelines & priorities?
Metrics: Time spent in a store; # SKUs in a store; Planogram compliance; OOS %
Data sources: Route Plan; Employee productivity; Shelf Images; Walmart Luminate

Question: Which stores are facing OOS regularly?
Metrics: On Shelf Availability; Out of Stock%
Data sources: Nielsen, Walmart Luminate
Question: Are my stores compliant to the planogram guidelines?
Metrics: Product Facings; Count of products
Data sources: Planogram images; Shelf images
www.tigeranalytics.com