Statistics in the Business

Statistics TEPE Group 1

5910751485 Warapong
5910755627 Techin
5910756062 Chaiyatorn
Company Profile
Batex International Co., Ltd
บริษัท บาเท็กซ์ อินเตอร์เนชั่นแนล จำกัด
Founded: 2001
Industry: B2B Chemical Trade, Import, Export
Plastics industry, Industrial detergents

Revenue (2018): 75 Million Baht

Employees: 130 (Office), 720 (Factory)
Name: Ms. Nutkamol Rerkkasemsan
Education: BSc. Chemistry @Thammasat (2002)
MSc. Chemical Engineering @KMUTNB (2004)
MSc. Business Statistics @STOU (2005)
Position: Head of Sales at Batex International Co., Ltd
Job Description:
- Manage the sales workforce
- Perform business analytics and market research to predict and optimize sales
The use of Statistics in Batex Int'l
Business Statistics is used extensively to build business models, both internally and externally.
-Customer Segmentation
-Sales Forecasting
-Enterprise Optimization
-Marketing Analytics
-Pricing Analytics
-Risk and Credit
-Supply Chain
-HR Analytics
-Transportation Analytics
Customer Segmentation

The company wants to divide its base of 3,000+ customers into five clusters to better serve each customer's needs.

● Factories and suppliers are surveyed every few years
● Customers are divided into clusters with different characteristics
● Sales techniques are customized for each cluster
Customer Segmentation 1. Obtaining the data

The company used its connections to survey over 3,000 companies that buy chemicals or might be interested in buying them.

The questions cover:
1) Attitude towards the chemical trade and the use of chemicals (29 questions)
2) Demographics (5 questions)
3) Demand price of chemicals (8 questions)
Customer Segmentation 2. Factor Analysis
Goal: reduce the 29 attitude questions to a smaller set of meaningful factors.

A correlation matrix is used to see whether the answers to two questions are correlated or unrelated.
[values near 1 or -1 = strongly correlated, 0 = uncorrelated]
[Figure: correlation matrix of the survey questions; columns Q1 to Q12 shown]
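As a rough illustration of the correlation matrix, here is a minimal sketch in plain Python. The answer lists `q1`, `q2`, `q3` are made-up values for illustration, not the company's survey data, and the function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long lists of answers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # Assumes neither question has identical answers from everyone
    # (that would make sx or sy zero).
    return cov / (sx * sy)

def correlation_matrix(columns):
    """Correlation of every question with every other question."""
    return [[pearson(a, b) for b in columns] for a in columns]

# Hypothetical answers from five respondents on a 1-5 scale.
q1 = [1, 2, 3, 4, 5]
q2 = [5, 4, 3, 2, 1]   # moves exactly opposite to q1
q3 = [2, 4, 3, 4, 2]   # unrelated to q1

matrix = correlation_matrix([q1, q2, q3])
for row in matrix:
    print([round(v, 2) for v in row])
```

Questions whose answers move together (or exactly opposite) are candidates for the same factor; questions with a correlation near 0 are not.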



Customer Segmentation 2. Factor Analysis
The eigenvalue and explained variance are calculated for each candidate factor. A scree plot is drawn
to see how many factors are needed to explain the behavior of customers.

5 factors would explain 50% of the variation in customers' behavior, but the team decided to use 10 factors,
as they explain 66.61% of it.
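The percentages above are cumulative shares of variance. A minimal sketch of that arithmetic, assuming a list of eigenvalues is already available; the eigenvalues below are invented for illustration and are not the company's.

```python
def cumulative_explained(eigenvalues):
    """Cumulative share of total variance, largest eigenvalue first."""
    total = sum(eigenvalues)
    running, shares = 0.0, []
    for ev in sorted(eigenvalues, reverse=True):
        running += ev
        shares.append(running / total)
    return shares

def factors_needed(eigenvalues, target):
    """Smallest number of factors whose cumulative share reaches target."""
    for k, share in enumerate(cumulative_explained(eigenvalues), start=1):
        if share >= target:
            return k
    return len(eigenvalues)

# Hypothetical eigenvalues (not from the survey).
eigenvalues = [4.0, 2.0, 1.0, 0.5, 0.5]
shares = cumulative_explained(eigenvalues)
print(shares)
print(factors_needed(eigenvalues, 0.80))  # 3
```

A scree plot is simply these eigenvalues plotted in descending order; the cut-off is a judgment call, as in the team's choice of 10 factors over 5.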
Customer Segmentation 3. Extracting Factors
The company then condenses the 29 questions into the 10 factors by grouping the
most closely related questions. The grouping comes from an extraction and rotation analysis based on the
factor analysis table. The company assigns each question a weight within its factor.

Factor   Questions (weight)
1        1 (30%), 2 (25%), 17 (25%), 26 (20%)
2        3 (40%), 7 (40%), 16 (20%)
3        4, 18, 20 (33.3% each)
4        5, 19 (50% each)
5        6, 21, 22 (33.3% each)
6        8, 14, 23, 24 (25% each)
7        10 (50%), 15 (25%), 25 (25%)
8        9, 27, 28, 29 (25% each)
9        11
10       12, 13 (50% each)
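One plausible way to apply these weights is a weighted sum of a respondent's answers per factor. The sketch below assumes that reading; the weights for factors 1 and 2 come from the table, while the answers and the function name are hypothetical.

```python
# Question weights per factor (factors 1 and 2 only, from the table).
FACTOR_WEIGHTS = {
    1: {1: 0.30, 2: 0.25, 17: 0.25, 26: 0.20},
    2: {3: 0.40, 7: 0.40, 16: 0.20},
}

def factor_scores(answers, weights=FACTOR_WEIGHTS):
    """Weighted sum of one respondent's answers for each factor.

    answers maps question number -> numeric answer.
    """
    return {
        factor: sum(w * answers[q] for q, w in question_weights.items())
        for factor, question_weights in weights.items()
    }

# Hypothetical answers from one company on a 1-5 scale.
answers = {1: 4, 2: 2, 17: 4, 26: 5, 3: 5, 7: 5, 16: 3}
scores = factor_scores(answers)
print(scores)
```

Applying this to every respondent turns the 29 answer columns into 10 factor columns.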
Customer Segmentation 3. Extracting Factors
The survey data is then rewritten in terms of the 10 factors.
Example data of 5 companies:
Customer Segmentation 4. Clustering
A dendrogram is drawn to decide the number of clusters. It shows how differently customers
answered the questions. The R command hclust(*, "ward") is used.

Based on the dendrogram, the company decides to divide the customers into five clusters
based on how similar their answers are.
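To show the idea behind Ward clustering, here is a small plain-Python sketch. It is not the R implementation: it assumes Ward's merge cost on squared Euclidean distances, uses a naive search over all pairs, and runs on invented 2-D points rather than the 10 factor scores.

```python
from itertools import combinations

def ward_clusters(points, k):
    """Agglomerative clustering with Ward's merge cost, stopped at k clusters.

    Returns the clusters as sorted lists of point indices.
    """
    # Each cluster is (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            (ma, ca), (mb, cb) = clusters[a], clusters[b]
            na, nb = len(ma), len(mb)
            dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
            # Increase in within-cluster variance if a and b are merged.
            cost = na * nb / (na + nb) * dist2
            if best is None or cost < best[0]:
                best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        na, nb = len(ma), len(mb)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append((ma + mb, centroid))
    return sorted(sorted(members) for members, _ in clusters)

# Hypothetical customers described by two factor scores each.
points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0)]
groups = ward_clusters(points, 3)
print(groups)  # [[0, 1], [2, 3], [4]]
```

A dendrogram records the order and cost of these merges; cutting it at a chosen height gives the number of clusters.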
Customer Segmentation 5. Summarizing Data
The company summarizes the survey results for each cluster. The company
can now train its salespeople according to the results.

Sales Forecasting Sales Statistics

The marketing manager wants to forecast total sales

and understand which factors influence them, in order to:
(1) plan sales tactics
(2) estimate the staffing required
(3) allocate money to advertising and promotions
(4) make better policy decisions on price, advertising,
and product development spending
Sales Forecasting Sales Statistics
Customer data is measured against the volume of chemical sales.
Sales is the dependent variable; the other variables are independent variables.
Sales Forecasting 1. Correlation
The company measures which variables are most strongly correlated with sales.
R-squared value: [1 = most correlated, 0 = uncorrelated]
Sales Forecasting 1. Correlation
Graphs are also used to inspect each relationship.
Shown: the dependent variable against the first independent variable (PDI vs. Sales)

Sales Forecasting 2. Regression Model

All the independent variables are combined to form a regression model

so the company can predict sales from the variable inputs. Model:
Sales = b0 + b1*PDI + b2*DEALS + b3*PRICE + b4*R.D + b5*INVEST + b6*ADVERTIS + b7*EXPENSE + b8*TOTINDAD
Sales Forecasting 2. Regression Model
Sales = 3027.6336 + 3.3723*PDI + 4.6953*DEALS - 18.1112*PRICE - 9.9033*R.D + 1.6895*INVEST + 8.2907*ADVERTIS
+ 4.4434*EXPENSE - 0.4427*TOTINDAD
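The fitted equation can be evaluated directly. The sketch below uses the intercept and coefficients from the slide; the input values are invented, since the slides do not give the units or typical ranges of each variable.

```python
INTERCEPT = 3027.6336
COEFFICIENTS = {
    "PDI": 3.3723,
    "DEALS": 4.6953,
    "PRICE": -18.1112,
    "R.D": -9.9033,
    "INVEST": 1.6895,
    "ADVERTIS": 8.2907,
    "EXPENSE": 4.4434,
    "TOTINDAD": -0.4427,
}

def predict_sales(inputs):
    """Predicted sales for one set of variable values (dict: name -> value)."""
    return INTERCEPT + sum(
        coef * inputs[name] for name, coef in COEFFICIENTS.items()
    )

# Hypothetical inputs: everything at zero except PDI.
inputs = {name: 0.0 for name in COEFFICIENTS}
inputs["PDI"] = 100.0
prediction = predict_sales(inputs)
print(round(prediction, 4))
```

The sign of each coefficient shows the direction of its effect: in this model, a higher PRICE lowers predicted sales, while more ADVERTIS raises it.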
Sales Forecasting 3. Checking Model

A residual plot is drawn to check that the model is valid across the entire
data set. Residuals scattered randomly around zero with no pattern = model is probably okay
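A residual is the observed value minus the model's prediction. The sketch below fits a one-variable least-squares line to invented data and computes the residuals. The company's model has eight variables, so this is only a simplified illustration of the check.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = my - b * mx
    return a, b

def residuals(x, y, a, b):
    """Observed minus predicted value for every data point."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Hypothetical data (not the company's).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

a, b = fit_line(x, y)
res = residuals(x, y, a, b)
print([round(r, 3) for r in res])
```

Plotting `res` against the predicted values gives the residual plot; a curve or funnel shape would suggest the linear model is missing something.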
HR Analytics

The company has salespeople who travel around the country.

The biggest problem is attrition (loss of motivation, followed by leaving),
which damages the company.

The company builds a model that predicts whether any

given employee is at risk of resigning,

in order to better understand why employees want to leave.

HR Analytics 1. Collecting Data

The following data is collected for every employee:

[Figure: employee data table. Legible coding: 1 = resigning, 0 = not resigning; 1 = none, 2 = frequent, 3 = rarely]
HR Analytics 1. Collecting Data

The data is divided into attrition (employees who have resigned or have considered resigning)
and non-attrition (employees who do not want to resign).

The attrition data is then examined.

HR Analytics 2. Logistic Regression

The logistic regression takes the variables and outputs a probability between 0 and 1
(near 1 = resigning, near 0 = not resigning)
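A minimal sketch of how a fitted logistic regression turns variables into a probability. The variable names, coefficients and intercept below are invented for illustration; the slides do not give the company's fitted values.

```python
from math import exp

def resignation_probability(features, coefficients, intercept):
    """Logistic model: probability of resigning, between 0 and 1."""
    z = intercept + sum(coefficients[name] * features[name]
                        for name in coefficients)
    return 1.0 / (1.0 + exp(-z))

# Hypothetical fitted values (assumptions, not from the slides).
coefficients = {"travel_frequency": 0.8, "years_at_company": -0.3}
intercept = -1.0

settled = {"travel_frequency": 2, "years_at_company": 2}
at_risk = {"travel_frequency": 3, "years_at_company": 1}

p_settled = resignation_probability(settled, coefficients, intercept)
p_at_risk = resignation_probability(at_risk, coefficients, intercept)
print(round(p_settled, 3), round(p_at_risk, 3))
```

A cut-off (0.5 is a common choice) converts the probability into a resigning / not resigning prediction.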
HR Analytics 3. CART Analysis

This is similar to a probability tree diagram: it takes in each of the variables and outputs
the confidence level that the employee will leave. [orange = stay, blue = leave]
HR Analytics 4. Checking Model

The HIT analysis gives the probability that the model correctly identifies whether an
employee will leave or not
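Assuming the HIT analysis refers to the hit rate (the share of employees classified correctly), it can be computed from predicted and actual labels as sketched below. The labels are invented, and the confusion-matrix counts are included because they use the same comparison.

```python
def confusion_counts(actual, predicted):
    """Counts of true/false positives and negatives (1 = resigning)."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["TP"] += 1
        elif a == 0 and p == 0:
            counts["TN"] += 1
        elif a == 0 and p == 1:
            counts["FP"] += 1
        else:
            counts["FN"] += 1
    return counts

def hit_rate(actual, predicted):
    """Share of employees whose outcome was predicted correctly."""
    counts = confusion_counts(actual, predicted)
    return (counts["TP"] + counts["TN"]) / len(actual)

# Hypothetical labels for eight employees.
actual    = [1, 0, 0, 1, 0, 1, 0, 0]
predicted = [1, 0, 1, 1, 0, 0, 0, 0]

counts = confusion_counts(actual, predicted)
rate = hit_rate(actual, predicted)
print(counts, rate)
```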

These are only three of the methods the company uses. It uses around 15 methods in total,
including ROC, Logistic Regression, Confusion Matrix and Gains Analysis.

But the company is moving towards a machine learning model.

The Importance of Statistics
● Needed in every part of the business
From HR to manufacturing to sales to quality control
● Allows the company to simplify big data into simple analysis
● Aids in quick decision making with a high confidence level
Thank You

