Legacy Segmentation Projects: Chudi Okoye
Chudi Okoye
Magenta People: T-Mobile Consumer Target Segment | 2016/17 | Nielsen ConneXions | Survey | Quarterly | No longer licensed? | Not sure if still licensed. Need to verify status and location of legacy data | Logistic, Descriptives
  Mobile Insights | Survey | Quarterly | Still licensed | Julie Liabraaten’s team
Value Segments | 2017/18 | Mobile Insights | Survey | Quarterly | Still licensed | Julie Liabraaten’s team | Logistic, Descriptives
Underpenetrated Segments: Pop and Base Profiles | 2018/19 | Merkle DataSource | Mixed | Annual | Still licensed | user_dw.TMO_Consumer_ALL_Attr_20180220 (Merkle pop data. Indivs ~240m; HHs ~127m) | Decision Tree, SQL, Descriptives
  Base data: user_dw.TMO_CONSUMER_ALL_ATTR_SUBSCRIBER_ACCT_20180220 (Merkle in TMO: HHs ~8m; Subs ~20m)
Profile of T-Mobile Multicultural Base | 2018/20 | Merkle DataSource | Mixed | Annual | Still licensed | SQL, Descriptives
  Base data: user_dw.Chudi_Merkle_Base_Combined (Merkle base with some TMO customer metrics: Subs ~8m)
[Figure: segmentation input domains. Labels as extracted: Cluster, Ratio, Analysis, Demos, Brand Performance, Retail Channel & Purchase, Device Level Usage, Customer Service Satisfaction, Churn & Switching Drivers]
Segmentation is derived from two primary domains: wireless behaviors and demographics.
– Attitudinal data was not available for segmentation, but was incorporated in post-hoc segment profiling
The result of this analysis indicates that T-Mobile attracts the most
engaged, though not necessarily the most profitable, wireless
consumers. In addition, T-Mobile segments exhibit fairly high switching
propensity.
Budding Aficionados (17%)
Demographics: Young - mainly 18-34; mostly subprime/near prime with low-to-mid income; high minority penetration
Mobile Engagement: Very high tech engagement, heavy usage, price-value oriented, DUP; strong new-to-wireless share
High Indexing Carrier(s): T-Mobile, AT&T
Mobile Minimalists (12%)
Demographics: Older – 55+; prime/near prime with low-to-mid income; mostly Caucasian with strong rural penetration
Mobile Engagement: Very low tech, strong prepaid preference (PayGo plurality); low plan/device spend; price motivated
High Indexing Carrier(s): Other (non-Tier 1)
Motivated Mid-Incomers (27%)
Demographics: Middle-age to older; near prime with low-to-mid income; mostly Caucasian but strong minority; with kids
Mobile Engagement: Low-to-mid tech, fair prepaid share (prefer unlimited prepaid); midlevel spend; DUP; price motivated
High Indexing Carrier(s): Metro PCS, Other (non-Tier 1). Competitive space
Mobile Enthusiasts (24%)
Demographics: Fairly young – 25-44; mainly near prime with high income; heavy minority penetration, with kids at home
Mobile Engagement: Strong tech, mostly postpaid; high spend; high DUP take-up; high switching; price/network motivated
High Indexing Carrier(s): T-Mobile
Note: Index compares segment share within carrier base to share within market base
Entrenched Majors (Market Share 21%)
Basic Demographics: Middle-age to older; mostly prime/near prime with high income; mostly Caucasian; Kids at home: ◕
Mobile Engagement: Mid-to-high tech engagement; primarily postpaid; Plan/device spend: ●; moderate DUP take-up; Usage level: ◕
Carrier Skew: Verizon (Sprint/AT&T
Budding Aficionados (Market Share 17%)
Basic Demographics: Young - mainly 18-34; mostly sub/near prime with low-to-mid income; high minority penetration; Kids at home: ◐
Mobile Engagement: Very high tech engagement; high family plan share; Plan/device spend: ◕; moderate-to-high DUP take-up; Usage level: ●
Carrier Skew: T-Mobile, AT&T; Strong new-to-
Mobile Minimalists (Market Share 12%)
Basic Demographics: Older – 55+; prime/near prime with low-to-mid income; mostly Caucasian with strong rural penetration; Kids at home: ◔
Mobile Engagement: Very low tech engagement; strong prepaid (PayGo) preference; Plan/device spend: ◔; very low DUP take-up; Usage level: ◔
Carrier Skew: Tier2 prepaid providers
Motivated Mid-Incomers (Market Share 27%)
Basic Demographics: Middle-age to older; prime/near prime with mid income; mostly Caucasian; Kids at home: ◐
Mobile Engagement: Low-to-mid tech engagement; fairly strong prepaid (prefer unlimited plan); Plan/device spend: ◐; moderate DUP take-up; Usage level: ◐
Carrier Skew: Metro PCS, Other Tier2. Competitive
Mobile Enthusiasts (Market Share 24%)
Basic Demographics: Fairly young – 25-44; mainly near prime with high income; heavy minority penetration; Kids at home: ●
Mobile Engagement: Strong tech engagement; mostly postpaid with family plan; Plan/device spend: ●; high DUP take-up; Usage level: ●
Carrier Skew: T-Mobile (Sprint/AT&T
Lifestyle (items as extracted; column assignment not recoverable): Watch Family/Kid movies; Country Lifestyles; Family Oriented; Risk Takers; Belong to Veterans Club; Some Domestic Travel; Enjoy Travel, Entertaining; Attend Sports Games; Visit theme parks, water parks; Enjoy Art, Travel, Gardening; Enjoy Music, Dancing; Enjoy Travel, Music
Retail & Shopping (items as extracted; column assignment not recoverable): Brand Loyal; Shop at QVC, Land’s End; Brick & Mortar Shoppers; Online Shoppers; Influenced by What’s Hot and What’s Not; Impulsive Shoppers; Shop at Rent-A-Center, Tractor Supply Company; Shop at Ann Taylor, Pottery Barn; Brand Conscious; Catalog Shoppers
Tech Engagement: ◕ | ● | ◔ | ◐ | ●
Share (columns: Entrenched Majors | Budding Aficionados | Mobile Minimalists | Motivated Mid-Incomers | Mobile Enthusiasts)
Share of T-Mobile base: 18% | 19% | 7% | 26% | 29%
Share of MetroPCS base: 13% | 12% | 8% | 39% | 28%
T-Mobile customers likely to switch (next 12 months): 11% | 11% | 7% | 11% | 20%
MetroPCS customers likely to switch (next 12 months): 14% | 18% | 12% | 14% | 22%
T-Mobile/MetroPCS non-customers likely to switch: 11% | 10% | 7% | 10% | 18%
T-Mobile top-choice consideration (non-customer likely switchers): 11% | 7% | 6% | 10% | 9%
MetroPCS top-choice consideration (non-customer likely switchers): 2% | 5% | 3% | 5% | 3%
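The index compares a segment's share within a carrier base to its share of the market. The sketch below recomputes it from the market share row and the two base share rows. It is illustrative Python, not the study's code. Two assumptions: the index is scaled so that 100 means parity, and the five columns run in the order Entrenched Majors, Budding Aficionados, Mobile Minimalists, Motivated Mid-Incomers, Mobile Enthusiasts, which the matching share figures suggest.

```python
# Assumption: column order inferred from the matching market-share figures.
SEGMENTS = ["Entrenched Majors", "Budding Aficionados", "Mobile Minimalists",
            "Motivated Mid-Incomers", "Mobile Enthusiasts"]
MARKET_SHARE = [21, 17, 12, 27, 24]         # market share row (%)
TMOBILE_BASE_SHARE = [18, 19, 7, 26, 29]    # share of T-Mobile base (%)
METROPCS_BASE_SHARE = [13, 12, 8, 39, 28]   # share of MetroPCS base (%)


def segment_index(base_share, market_share):
    """Index = 100 * share within carrier base / share within market base."""
    return [round(100 * b / m) for b, m in zip(base_share, market_share)]


tmobile_index = dict(zip(SEGMENTS, segment_index(TMOBILE_BASE_SHARE, MARKET_SHARE)))
metropcs_index = dict(zip(SEGMENTS, segment_index(METROPCS_BASE_SHARE, MARKET_SHARE)))

for name in SEGMENTS:
    print(f"{name:24s} T-Mobile {tmobile_index[name]:4d}   MetroPCS {metropcs_index[name]:4d}")
```

Under these assumptions the T-Mobile index is highest for Mobile Enthusiasts (121) and lowest for Mobile Minimalists (58).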
1 Scope of Study
Leveraged existing syndicated data sets versus new custom research
• Nielsen Mobile Insights data for core segmentation
• Additional Nielsen assets for profiling and geo-mapping: MRI, SMS, ConneXions and PrimeLocation
2 Key Findings
T-Mobile Segments: T-Mobile base over-indexes among Mobile Enthusiasts and Budding Aficionados. The least indexed in T-Mobile base are Entrenched Majors and Mobile Minimalists.
Key profiles of the segments: T-Mobile high-index segments skew family and younger, minority, lower income, largely coastal and urban. Price is the primary trigger for inbound switching from Verizon and AT&T.
Churn and tenure may be a concern: Mobile Enthusiasts and Budding Aficionados (indexing high for T-Mobile) have the highest short-tenured base. A majority of Mobile Enthusiasts within T-Mobile base (51%) and Metro PCS base (52%) are short-tenure customers (less than 2 years), compared to 28% of the same segment at Verizon and 39% for the industry as a whole.
Ethnicity and churn: Hispanics and African Americans have higher churn rates and lower tenure than Asians and Caucasians, especially at lower credit class levels.
Messaging: Focus on personal and professional organization and efficiency; emphasize advanced productivity features | Highlight different ways to communicate via phone; highlight easy way to stay connected with friends and family | Promote inexpensive, simple plans without messy contracts, features, or variety | Emphasize plans as the most economical choice | Emphasis on latest technology and handsets; highlight device capabilities
Offer: Integrated voice, data, Hotspot/Data cards plans; employee business discounts | Bundled package with unlimited / high number of minutes and “social” data plans | Heavily discounted or free phones – highlighting their basic functionality and overall reliability | Discounted, basic phone; largest “circle” of free calls; earlier nighttime calling | Integrated voice, data and Hotspot plans; do not need to offer the heaviest discounts
[Figure: ConneXions segment map. Groups: “Magenta Land” with Core (8) and Stretch (6); “Purple Land” with SMEs (12) and SMMs (6); “Common Market”. Segment labels as extracted: Techs and the City, Generation WiFi, Gearing up, Time Shifters, Cyber strivers, The Pragmatics, Big City, Small Tech, You & I Tunes, Calling Circles, Cinemaniacs, Digital Dreamers, WiFi Warriors, IM Nation, Landline Living, Bundled Burbs, New Technorati, Discounts & Deals, Plugged-in Families, Video Vistas, The Unconnected, Cyber Sophisticates, Tech Nesters, Last to Adopt, Broadband Boulevards, Video Homebodies, High-Tech Society, Leisurely Adopters, Opting Out, Plug & Play, Techtown Lites, Antenna Land]
Overlap Reduction: Reduce Magenta/Purple People overlap (3)
Magenta People
1 A ConneXions segment is identified as Magenta Core if (a) the segment meets a threshold index of 120 when comparing that segment’s share within T-Mobile customer base to its share of industry base; or (b) the segment scores a minimum index of 110 AND has at least 2% share of T-Mobile base
2 A segment is identified as Magenta Stretch if T-Mobile has gross adds share (SoGA) within that segment that is >= T-Mobile’s overall SoGA
3 A segment overlapping as Magenta and Purple is assigned as primary target to the cohort with >= 60% share of the segment based on a logistic model (variables used for model: gender, age, ethnicity, income, credit status, education, employment, wireless plan type, number of lines, wireless spend)
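The three footnoted rules can be written as small predicates. This is a minimal Python sketch for illustration only. The function names and all example inputs are hypothetical, and the source does not say what happens when neither cohort reaches 60%, so that case returns None here.

```python
def is_magenta_core(index, tmo_base_share):
    """Rule 1: index >= 120, or index >= 110 with at least 2% of T-Mobile base."""
    return index >= 120 or (index >= 110 and tmo_base_share >= 0.02)


def is_magenta_stretch(segment_soga, overall_soga):
    """Rule 2: T-Mobile gross adds share in the segment >= its overall SoGA."""
    return segment_soga >= overall_soga


def primary_cohort(magenta_share, threshold=0.60):
    """Rule 3: an overlapping segment goes to the cohort holding >= 60% of it.

    magenta_share is the modelled Magenta share of the segment; the Purple
    share is taken as the remainder (an assumption of this sketch).
    """
    if magenta_share >= threshold:
        return "Magenta"
    if 1 - magenta_share >= threshold:
        return "Purple"
    return None  # not covered by the source rule


# Hypothetical inputs, for illustration only.
examples = {
    "core_by_index": is_magenta_core(index=125, tmo_base_share=0.01),
    "core_by_index_and_share": is_magenta_core(index=112, tmo_base_share=0.03),
    "not_core": is_magenta_core(index=112, tmo_base_share=0.01),
    "stretch": is_magenta_stretch(segment_soga=0.18, overall_soga=0.15),
    "overlap_owner": primary_cohort(magenta_share=0.64),
}
print(examples)
```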
Note: Leveraging Nielsen Mobile Insights adds wireless deep-dive and enables individual-level target profiling
ME | Marketing, Pricing & Analytics T-Mobile Confidential 26
[Figure: ConneXions segment bubble chart. Labels as extracted: Tech Nests, New Technorati]
Text Legend: Original Purple People SMEs and SMMs
Note: A segment overlapping as Magenta and Purple is assigned as primary target to the cohort with >= 60% share of the segment based on a logistic model
Consumer status (low- or mid-scale+) determined by: income, credit status, education, employment, number of lines and wireless spend
Segment Key: Magenta Media | Magenta Stretch | Magenta Non-Media | Purple Secondary | Purple Primary | Common Market | Non-Target
Data Source: Nielsen SMS data, BCMI Analysis/Depiction
Bubble area depicts segment market size, scaled to 10%
A Marketing Usage
Acquisition: Use for improved targeting in DM campaigns to drive response rate, lift and ROI
Retention: Apply in customer analytics for better understanding of pain points within base to mitigate potential deacts
Cross sell: Improve understanding of incremental needs within base to help drive less risky cross sell initiatives
Local: Better understand local market Magenta opportunity and improve trade area analyses
Media buying: Refine media plan against specific Magenta segments, nationally or in specific markets
C Beyond Marketing
Incorporate in propensity modeling to drive network investment
Verizon dominates the Magenta Stretch base (45% postpaid, 41% blended), but has a smaller share of gross adds (30% postpaid, 26% blended)
Magenta Stretch make up a sizable share of Verizon and AT&T likely switchers (13%); T-Mobile is the top consideration for a significant proportion of Magenta Stretch likely switchers (17%)
Like other segments, Magenta Stretch are strongly value-driven in carrier selection, although they also place a premium on network reliability
Magenta Stretch are affluent, upscale, tech savvy and spend the most on wireless
Magenta Stretch tend to be fairly loyal, but are more likely to leave due to poor network experience
Improving T-Mobile’s Magenta Stretch composition could serve as a bulwark against the growing prepaid threat
Magenta Media Target Age
• Current media buying target skews millennial: 18-34
• Magenta media target includes 18-34 (43%) and 35-54 (39%), suggesting a different buying skew
• Decision needed to change media buying age skew to 18-49
The value segment analysis in this report was based on the final segmentation results
2 Profiles & Motivations
Plan switchers are mainly value-oriented, with price given as the primary (though not exclusive) reason for choosing a new carrier
Postpaid-to-prepaid switchers: skew older, are less Caucasian, less educated, less fulltime-employed, tilt more subprime, and have lower income than proto-postpaid switchers
3 Prepaid Perception
Prepaid brand perception has steadily improved in recent time, probably helping to drive post-to-pre switching
4 Sprint Attack
Sprint %age contribution to Metro’s inbound gross adds has doubled since the Sprint attack campaign, growing from 7% to 15%
More of prepaid gross adds remain within the prepaid category versus converting to postpaid
Price Shoppers dominate post-to-pre switching, but Value Optimizer interest has increased
Cross-plan switching makes up a small but notable component of industry switching, more prominently within Tier II carrier base
Among Top 4, T-Mobile and AT&T show slightly higher vulnerability, possibly due to having strong prepaid brands (Metro and Cricket)
Among Big 4, T-Mobile has the highest percent of in-brand plan switchers (11%) changing from individual postpaid to prepaid
T-Mobile has even stronger Big 4 lead (at 37%) among individual postpaid subs switching to corporate brand prepaid
This dynamic could be intensified by TMO One, and is potentially ARPU dilutive – though possibly margin accretive
AT&T and Verizon contribute 58% of post-to-pre plan/carrier switchers, vs 18% by T-Mobile
Metro and Cricket are the primary destinations of post-to-pre switchers, jointly accounting for 51%
T-Mobile and AT&T contribute more pre-to-post switchers than their share of prepaid base
T-Mobile’s share of outbound pre-to-post switchers is higher than its share of postpaid base; Verizon is less
4. Multicultural Profiling
2 Key Findings on Multiculturals
Base penetration is high: Multicultural penetration of T-Mobile base is far higher than multicultural share of competitive base.
Credit status is mixed: Asians have the highest prime base of any ethnic group, but Hispanics and African Americans trail Caucasians on prime composition.
Churn and tenure may be a concern: Hispanics and African Americans have higher churn rates and lower tenure than Asians and Caucasians, especially at lower credit class levels.
But economics are attractive: On the whole, Multiculturals have fairly compelling economics; Asians and Hispanics in particular have more lines per account, better margins, and hence higher CLV than Caucasians.
Drivers: Churn and tenure vary more by credit class; margin by number of lines; and CLV by both.
3 Conclusions & Recommendations
Report finds ethnicity to be an important differentiator of customer value, if mediated by such factors as number of lines, credit status, age, urbanicity, and so on.
Multicultural targeting should skew towards prime and multiple lines to drive stickiness and higher value.
Whites have a dominant share of competitive base: versus T-Mobile, greater by 34 ppts at VZ and 25 ppts at AT&T
[Figure: US Adult Population and Ethnic Share. Merkle Pop count and percent for Hispanic, African American, Asian and White. Count labels as extracted: 154.3m, 37.4m, 27.3m, 10.5m. Annotation: New model corrects 2018 pop]
[Figure: Postpaid Base Penetration: TMO vs. Competition. Shares of T-Mobile, Verizon, AT&T and Sprint base for Hispanic, African American, Asian and White]
if Mega_Internet_Use IS ONE OF: 0 AND Electronics_bkt IS ONE OF: N AND Density IS ONE OF: RURAL OR SMALL_TOWN, SUBURBAN_FRINGE AND Age_bkt2 >= 7 or MISSING 1% 0.2050
1 Scope of Study
Study identifies consumer segments currently underrepresented in T-Mobile base (mirrors findings in Feb SLT deck).
It leverages Merkle population data, joined with T-Mobile’s unit economics, and it is complemented by Mobile Insight data to provide refinements to the segmentation.
2 Key Findings
The couple segments have slightly better margins and CLV than T-Mobile base.
Opportunity also exists to offer AVDs to potential customers working with select companies in under-penetrated areas to increase sales while minimizing dilution from intrinsic customers
[Figure: Base in Millions (axis 0.0 to 300.0), with bars for National HH Base, base Without Network or Distribution (red), Total Qualified Base, and the five segments. Data labels as extracted: 239.56, 89.5, 150.1, 2.2, 2.8, 2.1, 3.5, 139.5; penetration labels: 47%, 63%, 43%, 55%]
Segment: Non-STR Caucasian Single High Income | Non-STR Caucasian Couple High Income | STR Caucasian Single | STR Caucasian Couple | Other
Penetration Index (Segment Avg. / Population Avg.): 47% | 63% | 43% | 55% | 104%
Size (Individuals/Adults): 2.2M | 2.8M | 2.1M | 3.5M | 139.5M
Age: Less than 55 yrs old (the four named segments) | Mixed
Ethnicity: Caucasian (the four named segments) | Mixed
Household Size (Adults): Single | Two | Single | Two | Mixed
Geographical Density: Urban, Suburban core, Suburban fringe | Urban, Suburban core, Suburban fringe | Rural, Small Town, or Greenfield | Rural, Small Town, or Greenfield | Mixed
Household Income (Median): More than $100K | More than $100K | Any | Any | Mixed
Source: Merkle/IDW
Source: Merkle/IDW
Merkle DataSource
DataSource aggregates the best-of-the-best data from dozens of sources
DataSource leverages analytics to mine data variables and create powerful derived elements allowing for targeting and segmenting audiences effectively
DataSource empowers profitable marketing decisions
30+ major data sources | 2500+ selectable data elements | Daily new movers | Direct linkage keys with Audience Platforms & Datalogix/Oracle | Monthly new homeowners | 95%+ coverage of all U.S. households | 90-95% match rates on data overlay applications
Applications: Data Enrichment | Phone Append | Reverse Phone Append | E-append | Reverse E-append | eCOA | Onboarding | Market Sizing | Lists / Audience Selection | Segmentation | Predictive Modeling | Syndication | Real-time Marketing
Online | Digital
Target Variable
— As there is no standing variable in DataSource to indicate T-Mobile membership, Merkle should create a target variable flagging current T-Mobile
customers by matching current-customer Household_ID from T-Mobile database back to DataSource records
— A target variable should be created, with successful matches (T-Mobile customers) coded as (1) and non-matches (non-customers) coded as (0)
— Merkle should use this new customer membership variable as the target variable in the test, after confirming match rate to T-Mobile
Sample
— The overall sample for the test should be 150k, comprising 24k T-Mobile customers and 126k non-customers, in line with T-Mobile’s market share
— To achieve the above split against the expected match rate, Merkle should obtain an initial pull of 500k (80k T-Mobile and 420k non-customers)
• The initial T-Mobile customer pull (80k) should be randomly selected, and specified as 90% postpaid and 10% prepaid to reflect T-Mobile base split
• Once secured, the T-Mobile pull should be matched to DataSource records and then combined with the random sample of non-customers (420k)
— Merkle should build the final sample dataset from the over-sampled pull, and review this with T-Mobile before proceeding with the test
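The sample figures above can be checked with a few lines of arithmetic. This is an illustrative Python sketch, not part of the Merkle specification. The source does not state the expected match rate; the last value below is only the share of pulled T-Mobile records that must survive matching for the 24k target to be met.

```python
FINAL_SAMPLE = 150_000      # overall test sample
FINAL_TMO = 24_000          # T-Mobile customers in the final sample
INITIAL_PULL = 500_000      # over-sampled initial pull
INITIAL_TMO = 80_000        # T-Mobile customers in the initial pull
POSTPAID_SPLIT = 0.90       # postpaid share of the T-Mobile pull

final_non_customers = FINAL_SAMPLE - FINAL_TMO            # 126k in the source
initial_non_customers = INITIAL_PULL - INITIAL_TMO        # 420k in the source
tmo_share_final = FINAL_TMO / FINAL_SAMPLE                # customer share of final sample
tmo_share_initial = INITIAL_TMO / INITIAL_PULL            # customer share of initial pull
postpaid_pull = round(INITIAL_TMO * POSTPAID_SPLIT)       # postpaid records to pull
prepaid_pull = INITIAL_TMO - postpaid_pull                # prepaid records to pull
required_survival_rate = FINAL_TMO / INITIAL_TMO          # implied, not stated in the source

print(final_non_customers, initial_non_customers)
print(f"{tmo_share_final:.0%} {tmo_share_initial:.0%}")
print(postpaid_pull, prepaid_pull, f"{required_survival_rate:.0%}")
```

Both the final sample and the initial pull carry the same 16% customer share, so the over-sample preserves the intended split.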
Descriptive Analysis
— Once test sample is set, Merkle should run initial descriptive analyses to obtain the distributions of valid and missing cases for each DS variable
• For categorical variables: obtain the distinct values, count for each category and number/percentage of missing values
• For continuous variables: obtain the distributions, distinct values and number/percentage of missing values
— Merkle should send the descriptive analyses results to T-Mobile for review and further guidance before proceeding with the test
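The descriptive checks requested above could look like the following. This is a minimal Python sketch with made-up values. The variable names and data are hypothetical, and missing values are represented as None.

```python
from collections import Counter
from statistics import mean, median


def describe_categorical(values):
    """Distinct values, count per category, and number/percentage missing."""
    counts = Counter(v for v in values if v is not None)
    n_missing = sum(v is None for v in values)
    return {"distinct": sorted(counts), "counts": dict(counts),
            "n_missing": n_missing, "pct_missing": 100 * n_missing / len(values)}


def describe_continuous(values):
    """Basic distribution, distinct-value count, and number/percentage missing."""
    valid = [v for v in values if v is not None]
    n_missing = len(values) - len(valid)
    return {"n_valid": len(valid), "n_distinct": len(set(valid)),
            "min": min(valid), "median": median(valid),
            "mean": mean(valid), "max": max(valid),
            "n_missing": n_missing, "pct_missing": 100 * n_missing / len(values)}


# Hypothetical records, for illustration only.
urbanicity = ["URBAN", "RURAL", None, "URBAN"]
income = [120, 180, None, 190, 180]

cat_summary = describe_categorical(urbanicity)
num_summary = describe_continuous(income)
print(cat_summary)
print(num_summary)
```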
Merkle should use the following steps to prepare the DataSource variables for test modelling:
A: Categorical Variables
SAS 9.4 offers several options to specify the method of parameterization and reference category for logistic models, the most common being effect coding
Merkle should use the effect coding method to create dummies for each categorical variable, creating k-1 dummies for each variable (where k=number of levels contained within each categorical variable)
— Merkle should specify the effect coding parameterization and reference category for each variable using the CLASS statement in SAS along with the PARAM=EFFECT and REF=“<referencename>” options in the logistic step (note that effect coding is also the default in PROC LOGISTIC and will be selected automatically even if the PARAM=EFFECT option is not specified; PARAM=REF would instead request reference cell coding)
Merkle should create a single dummy variable for all the missing cases
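For readers unfamiliar with effect coding, the sketch below shows what the k-1 columns look like: each non-reference level gets its own 0/1 column, and the reference level is coded -1 in every column. It is illustrative Python, not the SAS implementation. The variable and its levels are hypothetical, and missing cases are assumed to be handled separately by their own dummy.

```python
def effect_code(values, reference):
    """Return the k-1 column names and the effect-coded rows for a categorical list."""
    levels = sorted(set(values) - {reference})   # one column per non-reference level
    rows = []
    for value in values:
        if value == reference:
            rows.append([-1] * len(levels))      # reference level: -1 in every column
        else:
            rows.append([1 if value == level else 0 for level in levels])
    return levels, rows


# Hypothetical three-level variable with URBAN as the reference category.
density = ["RURAL", "URBAN", "SUBURBAN", "URBAN"]
columns, coded = effect_code(density, reference="URBAN")
print(columns)
for original, row in zip(density, coded):
    print(original, row)
```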
For all categorical variables, Merkle should perform a weight of evidence (WOE) transformation and obtain the information value (IV) estimate for each variable using the SAS proc hpbin procedure (WOE = LN(% of non-events / % of events), and IV = ∑ (% of non-events - % of events) * WOE)
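The WOE and IV formulas above can be worked through on a toy example. This Python sketch is for illustration and does not reproduce PROC HPBIN. The two categories and their counts are made up, and every category is assumed to contain at least one event and one non-event (otherwise the logarithm is undefined).

```python
import math
from collections import Counter


def woe_iv(rows):
    """rows: (category, event) pairs with event in {0, 1}. Returns (WOE per category, IV)."""
    events = Counter(cat for cat, event in rows if event == 1)
    non_events = Counter(cat for cat, event in rows if event == 0)
    total_events = sum(events.values())
    total_non_events = sum(non_events.values())
    woe, iv = {}, 0.0
    for cat in sorted(set(events) | set(non_events)):
        pct_events = events[cat] / total_events
        pct_non_events = non_events[cat] / total_non_events
        woe[cat] = math.log(pct_non_events / pct_events)     # WOE = LN(% non-events / % events)
        iv += (pct_non_events - pct_events) * woe[cat]        # IV sums over categories
    return woe, iv


# Toy data: category A has 3 events and 1 non-event, category B the reverse.
toy_rows = [("A", 1)] * 3 + [("A", 0)] + [("B", 1)] + [("B", 0)] * 3
woe_by_category, information_value = woe_iv(toy_rows)
print(woe_by_category, information_value)
```

With these counts WOE is ln(1/3) for A and ln(3) for B, and the IV works out to ln(3).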
B: Continuous Variables
Merkle should transform all continuous variables into bins, taking the following steps:
1. Create a single dummy variable for all the missing cases for each variable (the proportion of missing cases will be known from the descriptive analysis)
2. Bin all remaining valid cases, with the bins generated by grouping records into deciles (such that the total number of bins = % of valid cases / 10)
• For example, if a continuous variable has 90% valid cases, Merkle should specify nine bins using the SAS proc rank procedure and specify groups=9 in the program
3. Once the bins for valid cases are created, Merkle should explore the distributions and, if extreme skewness is detected, use the SAS proc transreg power transformation
procedure to optimize the scaling and reduce skewness, so as to improve the predictive power of the variable
As with categorical variables, Merkle should also perform weight of evidence transformation and obtain the information value estimate for each
variable, again using the SAS proc hpbin algorithm but this time with the SAS numbin option
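Steps 1 and 2 for continuous variables can be sketched as follows: flag missing cases with one dummy, then split the valid cases into (% valid / 10) rank-based bins. This is illustrative Python and a simplification. It does not reproduce PROC RANK's tie handling exactly, and the sample values are made up.

```python
def decile_bins(values):
    """Return (number of bins, bin index per record, missing dummy per record)."""
    valid = sorted(v for v in values if v is not None)
    n_bins = max(1, round(10 * len(valid) / len(values)))   # e.g. 90% valid -> 9 bins
    first_position = {}
    for position, value in enumerate(valid):
        first_position.setdefault(value, position)           # tied values share one bin
    bins, missing_flag = [], []
    for value in values:
        if value is None:
            bins.append(None)
            missing_flag.append(1)
        else:
            bins.append(first_position[value] * n_bins // len(valid))
            missing_flag.append(0)
    return n_bins, bins, missing_flag


# Hypothetical variable: 18 valid values and 2 missing (90% valid).
sample_values = list(range(1, 19)) + [None, None]
n_bins, bins, missing_flag = decile_bins(sample_values)
print(n_bins, bins, missing_flag)
```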
The following example shows the program steps required to produce the ROC Curve output, among others (separate runs for age and income):
/* Data creation (actual step will use server data) */
data MerkleTest;
  input customer income age;
  datalines;
0 140 25
0 200 35
0 190 45
0 180 55
1 120 65
0 190 45
1 180 55
1 120 65
0 190 45
0 180 55
1 120 65
1 170 75
;
run;

/* Logistic step specifying ROC output (age run; swap age for income in the income run) */
ods graphics on;
proc logistic data=MerkleTest
    plots(only)=(roc(id=obs) effect);
  /* event='1' models the probability of being a customer rather than the default lower value */
  model customer(event='1') = age / scale=none
                                    clparm=wald
                                    clodds=pl
                                    rsquare
                                    ctable
                                    lackfit;
  units age=10;
run;
ods graphics off;
SAS will not produce the data for the ROC curve (i.e. the coordinates of sensitivity and specificity) automatically; this can be generated by specifying the following option in the MODEL statement:
— outroc=ROCTable (this outputs a table named “ROCTable” with columns for sensitivity, _SENSIT_, and 1 - specificity, _1MSPEC_)
Merkle should generate a classification table to provide data that we can use to perform further model evaluation analysis such as accuracy tests:
— Merkle should specify the ctable option in the logistic step to generate an actual-by-predicted classification table, using the pprob= option to specify the probability
levels for the classification table: pprob= (0.5, 0.6, 0.7, 0.8, 0.9, 1) should be specified
— Merkle can also generate a classification table (confusion matrix) by specifying the PREDPROBS=INDIVIDUAL option in the OUTPUT statement (e.g. output
out=<outputfilename>), and then use the PROC FREQ statement in reference to the output file (data= <outputfilename>) to generate the confusion matrix. However,
we recommend the ctable option since it allows for multiple levels of probability to be specified.
As additional measures of model performance, Merkle should generate the gain and lift statistics to estimate and report the following metrics:
1. “Positive Lift/Gain %” to show the cumulative proportion of predicted true positives gained in the propensity deciles with higher lift scores than the random baseline
2. “Cumulative Lift” representing the combined lift estimate for propensity deciles with higher lift scores than the random baseline
Merkle will need to use the SAS GainLift macro to generate the lift and gain statistics in logistic regression, as there is seemingly no standard
procedure for this in Base SAS
— Merkle can download the macro text file for this procedure from here, and if necessary they can review and become familiar with the analytic process here
• Some guidance to Merkle analyst, if it helps: Save the macro to a location on your desktop and then do a %inc ‘location of macro code.sas’; or run it before the data step. Then to
invoke the macro you call %gainlift(arguments as needed)
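The two lift metrics described above can be illustrated with a small calculation. This Python sketch is not the SAS GainLift macro. It assumes records are ranked by predicted probability and split into equal-sized groups, and the scores and outcomes below are made up.

```python
def gain_lift_table(scores, actuals, n_groups=10):
    """Rank by score (descending), split into groups, and compute lift per group."""
    ranked = sorted(zip(scores, actuals), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    base_rate = sum(actuals) / n                      # random baseline event rate
    table = []
    for g in range(n_groups):
        chunk = ranked[g * n // n_groups:(g + 1) * n // n_groups]
        events = sum(actual for _, actual in chunk)
        table.append({"group": g + 1, "n": len(chunk), "events": events,
                      "lift": (events / len(chunk)) / base_rate})
    return table


def positive_lift_summary(table):
    """Summarise the groups whose lift beats the random baseline (lift > 1)."""
    total_n = sum(row["n"] for row in table)
    total_events = sum(row["events"] for row in table)
    above = [row for row in table if row["lift"] > 1.0]
    n_above = sum(row["n"] for row in above)
    events_above = sum(row["events"] for row in above)
    return {"positive_gain_pct": 100 * events_above / total_events,
            "cumulative_lift": (events_above / n_above) / (total_events / total_n)}


# Hypothetical predictions for ten records, split into five groups of two.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_actuals = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
lift_table = gain_lift_table(toy_scores, toy_actuals, n_groups=5)
lift_summary = positive_lift_summary(lift_table)
print(lift_table)
print(lift_summary)
```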
Finally, Merkle should obtain other test statistics in the model output, including: likelihood ratio test results, AIC (Akaike information criterion), SC (Schwarz criterion), Wald, R², Hosmer-Lemeshow, and Gini coefficient (this can be calculated manually from the AUC: GiniCoeff = 2*AUC-1)
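The Gini calculation can be tried on the MerkleTest toy data. The Python sketch below computes the AUC as the share of customer/non-customer pairs in which the customer is older (ties count half), then applies GiniCoeff = 2*AUC-1. It uses age directly as the score, which matches a single-predictor logistic model only under the assumption that the fitted age coefficient is positive.

```python
# (customer, age) pairs from the MerkleTest data step.
MERKLE_TEST = [(0, 25), (0, 35), (0, 45), (0, 55), (1, 65), (0, 45),
               (1, 55), (1, 65), (0, 45), (0, 55), (1, 65), (1, 75)]


def auc(scores, labels):
    """Pairwise (Mann-Whitney) AUC: P(score of a positive > score of a negative)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in positives for q in negatives)
    return wins / (len(positives) * len(negatives))


labels = [customer for customer, _ in MERKLE_TEST]
ages = [age for _, age in MERKLE_TEST]

auc_age = auc(ages, labels)
gini_age = 2 * auc_age - 1
print(f"AUC = {auc_age:.4f}, Gini = {gini_age:.4f}")
```

On this data 34 of the 35 customer/non-customer pairs are ordered correctly (counting ties as half), so the AUC is 34/35 and the Gini coefficient is 33/35.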
Summary Report & Model Outputs
T-Mobile expects Merkle to deliver a summary report covering the key statistics for each of the tested DataSource variables.
The summary report should be delivered as an Excel spreadsheet with the following columns:
Variable Name | Population Coverage | ROC AUC | Gini Coefficient | Information Value | Positive Lift/Gain % | Cumulative Lift | Sig. (p-value) | C-Statistic | AIC | SC | Likelihood Ratio | Wald | Odds Ratio Estimate | Profile Likelihood
MS0001
MS0002
MS0003
MS1000
T-Mobile also expects Merkle to send over all the key model outputs from SAS. We recognize that this could be cumbersome. As such, we
will work with Merkle as testing progresses to determine which specific outputs T-Mobile will require.
Merkle should set up a call or onsite visit to review the results with T-Mobile after they have delivered the report.
Source: http://cancerdiscovery.aacrjournals.org/content/3/2/148
Source: Wikipedia
Tested Variables
Merkle Source | Count
Number of Tested Premium DataSource Variables | 1,011
Number of Premium DataSource Variables dropped due to missing threshold of 85% | 77
Total Tested and Untested Premium DataSource Variables | 1,088
Number of Tested Core DataSource Variables | 224
Number of Core DataSource Variables dropped due to missing threshold of 85% | 142
Total Tested and Untested Core DataSource Variables | 366
Number of Tested Marketplace DataSource Variables | 40
Number of Marketplace DataSource Variables dropped due to missing threshold of 85% | 0
Total Tested and Untested Marketplace DataSource Variables | 40
Number of Tested DataSource Variables | 1,275
Number of DataSource Variables dropped due to missing threshold of 85% | 219
Total Tested and Untested DataSource Variables | 1,494
— T-Mobile does not have access to the full set of Merkle DataSource variables, only the few that have been onboarded to the MDB platform. As such, Merkle analysts were engaged to perform this variable test
— Merkle ended up testing the entire set of variables in their master database, including the premium list and other stand-alone sets not in their primary database
— Merkle had a total of 1,494 variables from the aggregated databases; after applying an 85% missing value filter, 219 were dropped and 1,275 were tested
Sample
Test Sample | Percent | Count
Prospect | 77.6% | 116,400
TMUS (TMO & Metro) | 22.4% | 33,600
Total Test Sample | 100% | 150,000
TMUS Sample | Percent | Count
T-Mobile | 70% | 23,520
MetroPCS | 30% | 10,080
Total TMUS Sample | 100% | 33,600
We have several metrics with which to evaluate the variables. Some evaluate goodness of the model, some assess correlations between the independent and dependent variables, some assess significance, and some assess the actual predictive power of the variables
Among the actual predictors, some like Wald are parametric so should probably not be used
This leaves us mainly with AUC, IV, Lift/Gain, and Gini to evaluate the variables; AUC and IV are the most important
To evaluate, we need to isolate from the rest the 100+ variables that are already loaded into IDW and deployable in the segmentation model
We’ve worked with Merkle to identify the onboarded 100+ in their variable test output, but for a variety of reasons, they can only match 48 of the 122 variables (39.3%)
For this reason, we recommend the following approach for the immediate segmentation input:
Retain the ad-hoc segmentation (10 segments) input variables if they meet the theoretical threshold of 0.5 AUC
— Age
— HH Income
— Presence of Children
Incorporate additional variables from the cohort of onboarded 100+, if they meet the threshold of 0.65 AUC and/or 0.3 IV, as long as the test result was statistically significant
— Ethnicity
— Urbanicity
— HH spending/wireless spend
For the remaining 100+ cohort not in the Merkle test, load them into the segmentation and have the model evaluate their utility
Work with the architecture team and Merkle to onboard the additional variables
Update segmentation after the next batch of variables have been onboarded
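The retention thresholds in the recommendations above can be expressed as a simple screen. This Python sketch is one reading of them, not the team's actual procedure. "And/or" is treated as "either", statistical significance is assumed to mean p < 0.05 (the source gives no cutoff), and all metric values shown are hypothetical.

```python
def keep_variable(auc, iv, p_value, in_adhoc_segmentation, alpha=0.05):
    """Apply the retention thresholds to one tested variable."""
    if in_adhoc_segmentation:
        return auc >= 0.5                      # existing inputs: 0.5 AUC threshold
    significant = p_value < alpha              # assumption: alpha = 0.05
    return significant and (auc >= 0.65 or iv >= 0.3)


# Hypothetical test results, for illustration only.
candidates = {
    "existing_input": dict(auc=0.58, iv=0.10, p_value=0.001, in_adhoc_segmentation=True),
    "strong_auc": dict(auc=0.70, iv=0.20, p_value=0.001, in_adhoc_segmentation=False),
    "strong_iv": dict(auc=0.60, iv=0.35, p_value=0.010, in_adhoc_segmentation=False),
    "not_significant": dict(auc=0.70, iv=0.40, p_value=0.200, in_adhoc_segmentation=False),
    "weak": dict(auc=0.55, iv=0.05, p_value=0.001, in_adhoc_segmentation=False),
}
decisions = {name: keep_variable(**metrics) for name, metrics in candidates.items()}
print(decisions)
```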
Variable test
— Determine the relationships among input variables and assess the predictive power of input data against named
targets (targets = TMO Membership in MDB and Likelihood to Switch in MI)
Behavioral segmentation
— Leverages advanced algorithm and wide range of demographic, behavioral and psychographic variables to
build market-level segmentation