
New Product

Idea Generation
via market research, user interviews, text analysis*

MARK 3088 Product Analytics

Lecturer: Dr. Yu-Ting Lin


Agenda
• Using LLMs for Marketing Insights
• Sentiment Analysis
• Topic Modeling

2
It all starts with text…

Like piecing together shards to learn about a distant civilization, text provides a window into marketing insights (Ashlee Humphreys, 2020)
3
Increasing Data Size & Complexity

4
Ways to deal with it

5
[Figure: text analysis approaches arranged along two axes, Generalizability vs. Specificity and Speed & Interpretability vs. Complexity & Performance. Approaches shown: Dictionaries; Aspect-Based Sentiment (LLMs); Topic Modelling (e.g., LDAs); Sentiment Analysis (ML Models). Example review used in each panel: “The food was very fresh. Very noisy atmosphere. But the weather was great. Will visit again”]

6
Summary courtesy of Prof. Stephan Ludwig
High-level Overview of Popular Language Models

Type | Model Name | #Parameters | Release | Open Source | #Tokens | Training dataset
Encoder-Only | BERT | 110M, 340M | 2018 | ✓ | 137B | BooksCorpus, English Wikipedia
Encoder-Only | RoBERTa | 355M | 2019 | ✓ | 2.2T | BooksCorpus, English Wikipedia, CC-NEWS, STORIES (a subset of Common Crawl), Reddit
Encoder-Only | ALBERT | 12M, 18M, 60M, 235M | 2019 | ✓ | 137B | BooksCorpus, English Wikipedia
Encoder-Only | XLNet | 110M, 340M | 2019 | ✓ | 32.89B | BooksCorpus, English Wikipedia, Giga5, Common Crawl, ClueWeb 2012-B
GPT Family | GPT-4 | 1.76T | 2023 | × | 13T | -
LLaMA Family | LLaMA2 | 7B, 13B, 34B, 70B | 2023 | ✓ | 2T | Online sources

7
Most Marketing Practitioners & Researchers
Prefer One-click Solutions…


8
What we want is…

Performance
Generalizability
No Complexity
One-Click Solutions

9
So, let's start building our own LLM tools

Co-Authors: Stephan Ludwig, Xiaohao Yang, Ehsan Abedin, Peter Danaher, Lan Du, Yu-Ting Lin, Dhruv Grewal
10
Costs: $137,000 and counting…
So, let’s start building our own LLM tools

Co-Authors: Stephan Ludwig, Xiaohao Yang, Ehsan Abedin, Peter Danaher, Lan Du, Yu-Ting Lin, Dhruv Grewal

11
Model Comparison
Correlation Between True and Predicted Sentiment Scores for Alternative Methods Using Test Data

Method | LIWC | EV 2.0 | VADER | BERT | GPT | LX
Correlation | 0.26 | 0.52 | 0.51 | 0.66 | 0.62 | 0.72
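
A comparison like the one above rests on correlating human-coded ("true") sentiment scores with a method's predictions on held-out test data. The sketch below assumes the Pearson correlation coefficient is meant (the slide does not say which coefficient was used), and the two score lists are made-up numbers, not the study's data.

```python
import math

def pearson_correlation(true_scores, predicted_scores):
    """Pearson correlation between two equally long lists of numbers."""
    if len(true_scores) != len(predicted_scores) or len(true_scores) < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    n = len(true_scores)
    mean_true = sum(true_scores) / n
    mean_pred = sum(predicted_scores) / n
    # Covariance and variances (the 1/n factors cancel in the ratio).
    cov = sum((t - mean_true) * (p - mean_pred)
              for t, p in zip(true_scores, predicted_scores))
    var_true = sum((t - mean_true) ** 2 for t in true_scores)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted_scores)
    if var_true == 0 or var_pred == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_true * var_pred)

# Hypothetical human-coded scores and model predictions for five test reviews.
human_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
model_scores = [1.5, 1.0, 3.5, 3.0, 5.0]
r = pearson_correlation(human_scores, model_scores)
```

A value near 1 means the method ranks and spaces reviews much like the human coders do; a value near 0 means its scores carry little information about the true sentiment.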

13
Where next? Well, you decide…

14
Text Mining/Analysis Tools Comparison

Columns: Free | Web-based | Word cloud/word frequency | Network analysis | Classification/clustering | Sentiment analysis | Topic modelling

NVivo: X X X
Leximancer: X X X X X
Voyant: X X X X
Orange Text Mining: X X X X X X
R and RStudio: X X X X X X
Python: X X X X X X

Summary courtesy of The University of Queensland Library


15
Widget Catalogue from Orange

16
Customer Sentiment
– a vital metric to understand how customers perceive the product or service experience

17
What is Customer Sentiment?
• Sentiment refers to the positivity or negativity expressed in text.

• Sentiment analysis of reviews, social media posts or comments,
live chat, telephone call transcripts and other unstructured data
inputs from customer interactions can provide more nuanced
insight into how people actually “feel” about your product.

18
Applications of Sentiment Analysis
• Product Development:
Feedback on product features and performance helps in refining existing
products and developing new ones that better meet customer needs.
• Marketing Strategies:
Understanding sentiment helps in crafting marketing messages that
resonate with customers and addressing any negative perceptions.
• Customer Support:
Identifying common issues and sentiments related to customer support
interactions can improve service quality and response times.
• Sales Insights:
Positive sentiment can be leveraged to highlight successful features and
use cases, while negative sentiment can identify barriers to sales.

19
Types of Sentiment Analysis
1. Fine-grained sentiment analysis via a specific score or rating
2. Emotion detection
3. Polarity detection
4. Multilingual sentiment analysis
5. Visual sentiment analysis
6. Intent analysis (ft. topic modelling)
7. Aspect-based sentiment analysis (ft. topic modelling)

Image courtesy of Qualtrics

20


Unit of Analysis
• Document-level (useful for professional reviews or press
coverage)
• Sentence-level (for short comments and evaluations)
• Sub-sentence-level (for picking out the meaning in phrases or
short clauses within a sentence)
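
To make the three units concrete, here is a minimal sketch that scores the same kind of review at document, sentence, and sub-sentence level. The three-word lexicon and the naive splitting rules are assumptions for illustration only; real tools use much larger dictionaries and proper sentence segmentation.

```python
import re

# Hypothetical mini-lexicon for illustration only: +1 positive, -1 negative.
LEXICON = {"fresh": 1, "great": 1, "noisy": -1}

def score(text):
    """Sum the lexicon scores of the words in a piece of text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)

def split_sentences(document):
    """Naive sentence splitter: break after '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [p for p in parts if p]

def split_clauses(sentence):
    """Naive sub-sentence splitter: break on commas, semicolons and 'but'."""
    parts = re.split(r",|;|\bbut\b", sentence, flags=re.IGNORECASE)
    return [p.strip() for p in parts if p.strip()]

review = ("The food was very fresh. Very noisy atmosphere. "
          "But the weather was great. Will visit again.")

# Document level: one score for the whole review.
document_score = score(review)
# Sentence level: one score per sentence, so mixed opinions stay visible.
sentence_scores = [(s, score(s)) for s in split_sentences(review)]
# Sub-sentence level: one score per clause within a single sentence.
clause_scores = [(c, score(c))
                 for c in split_clauses("The food was fresh, but the room was noisy.")]
```

The document-level score hides the complaint about noise, while the sentence-level and clause-level scores keep the positive and negative parts apart.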

21
Methods of Sentiment Analysis
1. Liu Hu: lexicon-based and language-specific. Scores sentiment as
a single number that is negative, zero, or positive.
2. VADER: rule-based analysis in addition to a lexicon. Returns 4 scores:
positive, negative, neutral, and compound.
3. Multilingual sentiment: lexicon-based, for several languages.
4. SentiArt: sentiment analysis based on vector space models,
returning text valence.
5. LiLaH sentiment: manual translations of the NRC Emotion Lexicon.
6. Custom dictionary: add your own positive and negative
sentiment dictionaries. The accepted source type is a .txt file with
each word on its own line. The final score is computed in the
same way as Liu Hu.
7. If Auto commit (now called ‘apply automatically’) is on, the
sentiment-tagged corpus is communicated automatically.
Alternatively, press Commit.
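
As a rough illustration of the custom dictionary option, the sketch below reads one-word-per-line .txt files and computes a single score per text. The normalisation (positive hits minus negative hits, divided by the number of words, times 100) is an assumption about how a Liu Hu style score is formed, so check the Orange documentation before relying on it. The file names and word lists are made up.

```python
import re
import tempfile
from pathlib import Path

def load_word_list(path):
    """Read a dictionary file with one word per line into a set."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return {line.strip().lower() for line in lines if line.strip()}

def dictionary_sentiment(text, positive_words, negative_words):
    """Return 100 * (positive hits - negative hits) / number of words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    positive_hits = sum(word in positive_words for word in words)
    negative_hits = sum(word in negative_words for word in words)
    return 100.0 * (positive_hits - negative_hits) / len(words)

# Write two tiny, made-up dictionaries to temporary .txt files for the demo.
with tempfile.TemporaryDirectory() as folder:
    positive_path = Path(folder) / "positive.txt"
    negative_path = Path(folder) / "negative.txt"
    positive_path.write_text("fresh\ngreat\n", encoding="utf-8")
    negative_path.write_text("noisy\n", encoding="utf-8")
    positive = load_word_list(positive_path)
    negative = load_word_list(negative_path)

review = ("The food was very fresh. Very noisy atmosphere. "
          "But the weather was great. Will visit again.")
review_score = dictionary_sentiment(review, positive, negative)
```

Dividing by the number of words keeps long and short reviews comparable, which a raw count of hits would not.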

Image courtesy of Orange


22
Lexicon-based Sentiment Analysis vs. Rule-based Sentiment Analysis

Basis of Analysis
• Lexicon-based: Uses predefined lists of sentiment-laden words.
• Rule-based: Uses a set of linguistic rules to determine sentiment.

How it works (lexicon-based)
• Sentiment Lexicon: It uses dictionaries of words annotated with sentiment scores. Each word in the lexicon is associated with a sentiment score indicating its positive or negative sentiment.
• Word Matching: The text is analyzed by matching words from the text with words in the sentiment lexicon.
• Score Aggregation: The overall sentiment score of the text is computed by aggregating the sentiment scores of the individual words or phrases found in the text.

How it works (rule-based)
• Rule Set: It relies on a set of rules defined by experts, which may include syntactic and semantic rules, such as negation handling, intensifiers, and parts of speech tagging.
• Pattern Matching: The text is analyzed according to these rules to identify sentiment-bearing structures and patterns.
• Context Awareness: Rules can be designed to handle more complex linguistic phenomena, such as the impact of negation on sentiment and the role of modifiers and context.

Complexity
• Lexicon-based: Simpler and quicker to set up.
• Rule-based: More complex and requires detailed linguistic knowledge.

Handling of Context
• Lexicon-based: Nope
• Rule-based: Yes

Flexibility and Customization
• Lexicon-based: Nope
• Rule-based: Yes

23
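
The difference in context handling can be sketched in a few lines. The word lists, weights, and the two rules below (intensifiers scale a score, negation flips it) are simplified assumptions for illustration; they are not the actual rule set of VADER or any other tool.

```python
import re

# Hypothetical resources for illustration; real tools ship far larger lists.
LEXICON = {"good": 1.0, "great": 2.0, "fresh": 1.0, "bad": -1.0, "noisy": -1.0}
NEGATIONS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "extremely": 2.0}

def lexicon_score(text):
    """Lexicon-based: add up word scores, ignoring context."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0.0) for word in words)

def rule_based_score(text):
    """Rule-based: adjust each sentiment word using the words just before it."""
    words = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    for i, word in enumerate(words):
        if word not in LEXICON:
            continue
        value = LEXICON[word]
        window = words[max(0, i - 2):i]  # up to two preceding words
        for previous in window:
            if previous in INTENSIFIERS:
                value *= INTENSIFIERS[previous]  # rule 1: intensifiers scale the score
        if any(previous in NEGATIONS for previous in window):
            value = -value  # rule 2: negation flips the polarity
        total += value
    return total

plain = lexicon_score("The food was not good")
with_rules = rule_based_score("The food was not good")
```

Here the plain lexicon approach scores "not good" as positive because it only sees "good", while the rule-based version flips the sign.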
The following steps need to be performed when applying any approach.
Consider this problem instance:
“Sam is a great guy.”

24
Topic Modelling

25
What is Topic Modelling?
• A statistical model used in natural language processing (NLP)
and text mining to discover abstract topics within a collection
of documents.
• It helps to understand the underlying themes present in the
text data by identifying groups of words that frequently occur
together.
• These topics can provide insights into the structure and
content of large text datasets.

26
Applications of Topic Modelling
• Market Research: Analyzing product reviews to identify
common praises and complaints.
• Customer Support: Categorizing customer support tickets by
topic to prioritize and address common issues.
• Media and Journalism: Analyzing news articles to track how
topics evolve and how different media outlets cover them.

27
Key Concepts in Topic Modelling
• Document: A single piece of text, such as an article, report, or
any written material.
• Corpus: A collection of documents.
• Topic: A collection of words that frequently appear together,
representing a theme or subject in the text.
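
These three concepts map directly onto simple data structures. The sketch below uses a made-up corpus of three reviews, counts words per document, and then counts how often word pairs occur in the same document, which is the raw signal topic models build on.

```python
import re
from collections import Counter
from itertools import combinations

# A corpus is a collection of documents; these three reviews are made up.
corpus = [
    "Fast delivery and friendly delivery driver",
    "Friendly staff and fresh food",
    "Fresh food but slow delivery",
]

def tokenize(document):
    """Lowercase a document and split it into word tokens."""
    return re.findall(r"[a-z]+", document.lower())

# Document-term counts: one Counter of word frequencies per document.
document_term_counts = [Counter(tokenize(doc)) for doc in corpus]

# Co-occurrence counts: in how many documents does each word pair appear together?
cooccurrence = Counter()
for counts in document_term_counts:
    for pair in combinations(sorted(counts), 2):
        cooccurrence[pair] += 1
```

Pairs such as ("food", "fresh") that recur across documents are candidates for belonging to the same topic.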

28
Topic Modelling vs Text Classification

Image courtesy of datacamp


29
Popular Topic Modelling Algorithms
1. Latent Dirichlet Allocation (LDA)
2. Latent Semantic Analysis (LSA)
3. Hierarchical Dirichlet Process

30
1. Choice of algorithm:
   a. Latent Semantic Indexing (LSI, also known as Latent Semantic Analysis). Returns both negative and positive words and topic weights.
   b. Latent Dirichlet Allocation (LDA)
   c. Hierarchical Dirichlet Process (HDP)

2. Parameters for the algorithm.
LSI and LDA accept only the number of topics modelled, with
the default set to 10. HDP, however, has more parameters. As
this algorithm is computationally very demanding, you are
strongly advised to try it on a subset, or to set all the
required parameters in advance and only then run the algorithm
(connect the input to the widget).

Image courtesy of Orange

31
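
To show what the number-of-topics parameter controls, here is a minimal pure-Python sketch of LDA fitted with collapsed Gibbs sampling. This is one common way to fit LDA and is not necessarily what Orange runs internally; the function name `fit_lda`, the hyperparameter values, and the toy documents are all assumptions for illustration.

```python
import random

def fit_lda(documents, num_topics=10, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling on tokenized documents.

    Returns (doc_topic, topic_word_counts, vocabulary).
    """
    rng = random.Random(seed)
    vocabulary = sorted({word for doc in documents for word in doc})
    word_id = {word: i for i, word in enumerate(vocabulary)}
    vocab_size = len(vocabulary)
    doc_topic_counts = [[0] * num_topics for _ in documents]
    topic_word_counts = [[0] * vocab_size for _ in range(num_topics)]
    topic_totals = [0] * num_topics
    assignments = []
    # Start from a random topic assignment for every word occurrence.
    for d, doc in enumerate(documents):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic_counts[d][k] += 1
            topic_word_counts[k][word_id[word]] += 1
            topic_totals[k] += 1
        assignments.append(doc_assignments)
    for _ in range(iterations):
        for d, doc in enumerate(documents):
            for n, word in enumerate(doc):
                w = word_id[word]
                k = assignments[d][n]
                # Remove the current assignment before resampling it.
                doc_topic_counts[d][k] -= 1
                topic_word_counts[k][w] -= 1
                topic_totals[k] -= 1
                # Weight of each topic given all other assignments.
                weights = [
                    (doc_topic_counts[d][t] + alpha)
                    * (topic_word_counts[t][w] + beta)
                    / (topic_totals[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][n] = k
                doc_topic_counts[d][k] += 1
                topic_word_counts[k][w] += 1
                topic_totals[k] += 1
    # Smoothed per-document topic proportions.
    doc_topic = [
        [(count + alpha) / (len(doc) + num_topics * alpha) for count in counts]
        for counts, doc in zip(doc_topic_counts, documents)
    ]
    return doc_topic, topic_word_counts, vocabulary

# Four made-up, already tokenized reviews: two about delivery, two about food.
docs = [
    "delivery late courier delivery".split(),
    "courier late parcel delivery".split(),
    "taste fresh food taste".split(),
    "food fresh menu taste".split(),
]
doc_topic, topic_word, vocab = fit_lda(docs, num_topics=2)
# The three most frequent words assigned to each topic.
top_words = [
    [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:3]]
    for row in topic_word
]
```

The default of 10 topics mirrors the widget default mentioned above; for a corpus this small, 2 is a more sensible choice, which is exactly the kind of judgment the number-of-topics parameter requires.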
Benefits and Challenges

Benefits
• Unsupervised learning doesn’t require labelled data
• Scalability to handle large datasets efficiently
• Provides a high-level understanding of text data

Challenges
• Choosing the number of topics requires domain knowledge
• Topics may not always be easily interpretable
• Quality of results depends on preprocessing steps and quality of text data

32
Feedback on PM Career & Related Terminology

33
What do I do as a Product Manager?

Chloe Shih: Product • Prev. Discord, Meta, TikTok, Google • Forbes 30U30
34
Software Development Methodologies

Image courtesy of ResearchGate


35
AGILE vs. CRISP-DM
Complementary PM Frameworks

Cross Industry Standard Process for Data Mining


36
For Your Week 7 and Onwards…

37
“Champo Carpets” Case Study
Improving B2B sales using machine learning algorithms

https://champoofficial.com/
38
Business Understanding
• One of the largest carpet manufacturing companies based in India.
• At the beginning of 2020, the company employed 1,500 people and had the
capacity to produce 200,000 carpets and floor coverings per month.
• As part of its sales and marketing, Champo Carpets shared sample designs
with its potential customers, based on which the customers placed orders.
• However, its sample-to-order conversion ratio was low compared to the
industry average. This had cost repercussions and meant lost opportunities.
• You’re tasked to help them out!
1. Target their product accurately to the right clients
2. Design an appropriate recommendation system

39
Data Understanding & Preparation
Exploratory Data Analysis
1. Order category (sample or order)
2. Revenue generated during 2017-2019
3. Carpet type and units sold
4. Countries vs. revenue generated
5. Customers and revenue generated
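
The five views above are all group-and-aggregate operations. The sketch below shows the idea on a handful of made-up rows; the column names and values are assumptions and do not reflect the actual schema or figures of the case dataset.

```python
from collections import Counter, defaultdict

# Made-up rows; the column names are assumptions, not the case's actual schema.
orders = [
    {"category": "Sample", "year": 2017, "country": "USA", "customer": "C-01",
     "carpet_type": "Type A", "units": 1, "revenue": 0.0},
    {"category": "Order", "year": 2017, "country": "USA", "customer": "C-01",
     "carpet_type": "Type A", "units": 40, "revenue": 5200.0},
    {"category": "Sample", "year": 2018, "country": "UK", "customer": "C-02",
     "carpet_type": "Type B", "units": 2, "revenue": 0.0},
    {"category": "Order", "year": 2018, "country": "UK", "customer": "C-03",
     "carpet_type": "Type B", "units": 25, "revenue": 1800.0},
    {"category": "Order", "year": 2019, "country": "USA", "customer": "C-04",
     "carpet_type": "Type B", "units": 10, "revenue": 700.0},
]

def total_by(rows, key, value):
    """Sum a numeric column for each distinct value of a grouping column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)

category_counts = Counter(row["category"] for row in orders)  # 1. sample vs. order
revenue_by_year = total_by(orders, "year", "revenue")          # 2. revenue per year
units_by_type = total_by(orders, "carpet_type", "units")       # 3. units per carpet type
revenue_by_country = total_by(orders, "country", "revenue")    # 4. revenue per country
revenue_by_customer = total_by(orders, "customer", "revenue")  # 5. revenue per customer
```

With a real dataset the same aggregations are usually done with a dataframe library or in Orange itself; the point here is only what each view computes.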

40
Let’s take stock of the situation…
We are the managers of a Product Analytics project for a
company. We are following the CRISP-DM working methodology.
We have already gone through the steps of
• understanding the business problem
• understanding the data and preparing the data
(tedious steps, often complicated by problems on the
customer's side and in the data itself)
We are now ready for the modelling phase, which seems to us the
most creative and fulfilling.

41
Data/ML Pipelines

[Figure: numbered data/ML pipeline. Labels recovered: Dataset (1), Data Cleaning (2), Feature Engineering (3), Split (4), Learning Algorithm (5), Train Model (6), Training Data, New Data, Score Model, Evaluate Model, Predict (10)]
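
The stages in the pipeline can be sketched end to end on toy data. Everything below is an assumption for illustration: the dataset is made up, the engineered feature (area) is arbitrary, and a nearest-centroid classifier stands in for whichever learning algorithm is actually chosen.

```python
import random

# Dataset: made-up sample shipments as (length_m, width_m, converted).
dataset = [
    (1.0, 1.0, 0), (1.2, 0.8, 0), (0.9, 1.1, 0), (1.1, 1.0, 0), (None, 1.0, 0),
    (3.0, 2.0, 1), (2.8, 2.2, 1), (3.2, 1.9, 1), (3.1, 2.1, 1), (2.9, None, 1),
]

# Data cleaning: drop rows with missing measurements.
clean = [row for row in dataset if None not in row]

# Feature engineering: derive the area from length and width.
examples = [([length, width, length * width], label)
            for length, width, label in clean]

# Split: shuffle, then keep 75% for training and the rest for testing.
rng = random.Random(0)
rng.shuffle(examples)
cut = int(0.75 * len(examples))
train, test = examples[:cut], examples[cut:]

def train_nearest_centroid(rows):
    """Learning algorithm + train model: one mean feature vector per class."""
    grouped = {}
    for features, label in rows:
        grouped.setdefault(label, []).append(features)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }

def predict(model, features):
    """Return the class whose centroid is closest to the feature vector."""
    def squared_distance(centroid):
        return sum((f - c) ** 2 for f, c in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))

model = train_nearest_centroid(train)

# Score model: predict the held-out rows. Evaluate model: compute accuracy.
predictions = [predict(model, features) for features, _ in test]
accuracy = sum(p == label for p, (_, label) in zip(predictions, test)) / len(test)

# Predict: apply the trained model to new, unseen data.
new_shipment = [3.0, 2.1, 3.0 * 2.1]
new_prediction = predict(model, new_shipment)
```

Keeping the test rows out of training is what makes the accuracy figure an honest estimate of how the model will behave on new data.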

42
Case Preparation Questions
1. What kind of analytics and ML algorithms can be used by
Champo Carpets to solve their problems, and in general, for
value creation?
2. Develop ML models to help identify product features that
contribute toward conversion (or non-conversion) of samples
sent to customers.

43
What did we examine today?
Key concepts

• Sentiment Analysis
• Topic modelling: Latent Semantic Analysis (LSA) and Latent
Dirichlet Allocation (LDA)

Questions to consider

• Ask yourself: How do these concepts relate to my work and projects
(or the work/projects I want to be involved in)? How can they help
me? Importantly, how could I adapt them so that they do?

44
Q&A
Thank you!

See you tomorrow
