
October 15, 2012

Customer Analysis
By using self-organizing maps

Thomas Asikis (70005), Konstantinos Stavrou (70134)
asikis.thomas@gmail.com | kn.stavrou@gmail.com

Customer SOM Analysis


Executive Summary
A quick Intro
In this document we describe the customers and try to identify characteristics they share. To do that we used Self-Organizing Maps (SOMs) generated with the software Viscovery SOMine. We focus mainly on churn and on the characteristics of the people who tend to move their subscription to another mobile company. In total we generated and used 4 SOM models, one of which serves as our primary model.

Churn
Churn is the most crucial factor of this report. Out of a total of 4 segments, it shows up in 2. These 2 segments hold 14.79% of the total customers (Image 1). Segment 3 also has high values of calls and charge, which makes it a promising group for profits. We need to focus especially on this group and try to reduce its churn rate. The high values of average call time and total day calls, accompanied by a high charge, indicate that these customers are probably unsatisfied with the amount of money they pay, and that is why they leave. As for segment 4, we can clearly see that they are customers with low rates and a high number of customer service calls. This shows us that they probably churned due to dissatisfaction with the customer service, or because they faced problems that were not solved correctly.

Image 1 The distribution of customers in segments.

Predictions
By placing the new customers in the main SOM (images 8 & 9), we reached the following conclusion:
2 out of 5 customers will be placed in Segment 1, Loyal & Low Revenue Customers.
1 customer will be placed in Segment 2, Trusting Newcomers.
1 customer will be placed in Segment 3, Leaving Profit.
1 customer will be placed in Segment 4, Unsatisfied Newcomers.

This means that 2 out of 5 customers will churn. The customer in segment 4 is likely to call the customer service. Customer 3323 belongs to segment 3. He has a lot of calls and minutes of call time, as well as a high average time per call and a high total charge. He has churned and he is likely to churn again. The rest of the customers belong to the most solid groups and follow the characteristics that we describe for each of those groups.

Conclusion
After studying the whole report and the data we got from the SOMs, we reached the following conclusions:

Segment 3, which has the highest values of average call time and charge, is a group with churn. We should try to reverse that. Since this group talks the most during the day and the night, we should try to offer them economical packages for these times of the day. Since they have big charges, we could also make them some special offers in order to help them save money.

Segment 4 is populated by customers who have low values of calls and charge but an excessively high number of customer service calls. This means that those customers face issues with our services and call the customer service for a solution. Since they churn, we can assume that the customer service could not help them find a solution, or that the solution offered did not satisfy them. Since this segment is a small group of customers, we could ignore it. If we choose not to, we can focus on the customer service and on the feedback we get from the customers about it.

Segment 1 is our most loyal group and a big source of revenue. It is also a solid group without big deviations from the average. This means that no actions are needed at present, since it is a segment with pretty good values.

Segment 2 is the promising group. Its customers are not likely to churn and have the second highest total charge, which means that they are profitable for us. We can try to make them more loyal and establish a trusting relationship with them by offering them lower prices for evening and night call minutes, which make up the biggest part of their total calls.

Looking at the segmentation of the new customers, we can say that their churn ratio is high: 2 out of 5 customers are likely to leave. To avoid that, we should try to offer them better contracts, especially the customer who belongs to segment 3, because he has a highly profitable profile.

Customers Profiles
After careful study of the models and the corresponding SOMs (images 3, 4, 5, 6, 7 in Appendix), we reached some conclusions about the customers, and the following customer profiles emerged:

Segment 1, Loyal & Low Revenue Customers. The biggest group of customers. They make more calls than any other group, yet their profitability is low because of their low charge. They do not churn. The data about them does not show big deviations in general. They also have a much bigger account length than the other customer groups. This confirms their loyalty and also makes them a predictable and stable customer group.

Segment 2, Trusting Newcomers. Customers with mostly low account lengths. They do not churn. Most of their data values are near the average values. They use the customer service more than segments 1 and 3. This shows us that, as newcomers, they tend to call the customer service about issues they may face. They are an important group, because they can be tomorrow's Loyal & Low Revenue Customers.

Segment 3, Leaving Profit. Customers with the highest charge, the highest average call time and a pretty high number of calls. They tend to talk more in the daytime. They churn. This group needs particular attention, because if its churn is eliminated, the revenue will probably increase.

Segment 4, Unsatisfied Newcomers. Apart from being the smallest group of customers and having a high churn ratio, this group shows some interesting statistics. Its calls and charge numbers are lower than the average. However, the total number of customer service calls and its percentage of the total calls are exceptionally high. This shows us that these customers face issues which our customer service cannot solve in a satisfactory way. This group's statistics may indicate some problems in the way the customer service operates.

Reading the profiles above, we can see that the customers who churned include the ones with the highest average time per call and the highest charge. In contrast, the loyal customers have the biggest number of calls but not the highest charge (see table 2 in appendix). The customers of the 3rd segment spend the most time in calls (for day, evening and night). The first 2 segments have the most international calls. The first segment has the most calls in the evening.

SOM Development
Model: How it was developed
First of all we examined the attributes carefully and got acquainted with the problem. Then, before working with the tool, we tried to think what other attributes we could derive from those that were already given to us. So we came up with total calls, total minutes, total charge, CSP (Customer Service Percentage: the share of a customer's total calls that were made to the customer service), average call time and ICP (International Calls Percentage: the share of the total calls that were international ones).

We think CSP is important because it shows us how many of the calls someone made were directed to the customer service. If that customer then changed company, we will know that maybe there was a problem with the customer service, or that there were many problems with his connection (and thus he made frequent calls to the customer service). ICP is not that important, but it can show us details about customers who make a lot of international calls. Usually these customers are a unique group with many common characteristics, and it would be useful to study it. The total minutes that someone talked on the phone are also important, maybe more important than the total calls, because they show the actual usage of the network. For the same reason we also use the average call time, which combines the total calls and the total minutes, so that we can see how long each customer talks per call on average.

About the tuning of the attributes: we started cutting off some outliers, but we thought that they might be important data that we do not want to lose. So we only applied a logarithmic transformation to ICP and CSP in order to make their values smoother. We want to study even the outliers that differ hugely from the other values, because they might hide important data.

Concerning the training of the map, we used 500 nodes and an accurate training schedule, because we do not have so much data that we need to worry about speed.
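The derived attributes and the logarithmic transformation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names follow the attribute names of our tables, the totals include international usage (which agrees with the segment averages of Table 2 in the appendix), log(1 + x) stands in for the logarithmic transformation because the report does not state the exact variant, and the example customer is hypothetical.

```python
import math

def derive_attributes(c):
    """Return the derived attributes for one customer record (a dict)."""
    periods = ("day", "eve", "night", "intl")
    total_calls = sum(c[f"total {p} calls"] for p in periods)
    total_minutes = sum(c[f"total {p} minutes"] for p in periods)
    total_charge = sum(c[f"total {p} charge"] for p in periods)

    # Guard against a customer who made no calls at all.
    csp = c["number customer service calls"] / total_calls if total_calls else 0.0
    icp = c["total intl calls"] / total_calls if total_calls else 0.0
    avg_call_time = total_minutes / total_calls if total_calls else 0.0

    return {
        "total calls": total_calls,
        "total minutes": total_minutes,
        "total charge": total_charge,
        "average call time": avg_call_time,
        "CSP": csp,
        "ICP": icp,
        # Assumption: log(1 + x), so that a percentage of zero stays defined.
        "log CSP": math.log1p(csp),
        "log ICP": math.log1p(icp),
    }

# Hypothetical customer, not taken from the data set.
example = {
    "total day calls": 100, "total eve calls": 100,
    "total night calls": 95, "total intl calls": 5,
    "total day minutes": 180.0, "total eve minutes": 200.0,
    "total night minutes": 210.0, "total intl minutes": 10.0,
    "total day charge": 30.6, "total eve charge": 17.0,
    "total night charge": 9.45, "total intl charge": 2.7,
    "number customer service calls": 3,
}
derived = derive_attributes(example)
```

For this hypothetical customer the sketch gives 300 total calls, an average call time of 2 minutes and a CSP of 0.01, i.e. 1% of the calls went to the customer service.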
We also used all the variables in the training of the map, because we believe that every variable may hold a hidden pattern. Of course, not every variable is equally important for the purpose of our study. The most important ones for our case are account length, churn, average call time and the number of customer service calls, because these attributes can show us important details about those who left (e.g. if they made a lot of calls to the customer service, if they talked a lot and how long they had the account). There are more variables that are important, but we have to focus on some so as not to clutter our research. The priorities are set as shown in the following table:

Table of variable weights

Attribute                        Priority
Account length                   1,00
Number vmail messages            0,20
Total day minutes                0,90
Total eve minutes                0,85
Total night minutes              0,80
Total day calls                  0,80
Total eve calls                  0,70
Total night calls                0,75
Total day charge                 0,70
Total eve charge                 0,65
Total night charge               0,70
Total intl minutes               0,60
Total intl calls                 0,50
Total intl charge                0,40
Number customer service calls    1,00
Churn                            1,00
Total calls                      0,80
Total charge                     0,70
Total minutes                    0,90
Average call time                1,00
Customer Service Percentage      0,70
International Calls Percentage   0,50

Table 1 Variable weights for our main model.
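To give an intuition of what such priorities do, the sketch below scales each attribute difference by its priority before computing the distance between a customer record and a map node, and then picks the closest node (the best matching unit). This is an assumption made for illustration only: the report does not describe how Viscovery SOMine applies priorities internally, the function names are our own, and the node and record values are hypothetical, already standardized numbers.

```python
import math

# Priorities copied from Table 1 (a subset, to keep the example short).
PRIORITY = {
    "account length": 1.00,
    "number vmail messages": 0.20,
    "churn": 1.00,
    "average call time": 1.00,
}

def weighted_distance(record, node, priority):
    """Euclidean distance with every attribute difference scaled by its priority."""
    return math.sqrt(sum(
        (priority[name] * (record[name] - node[name])) ** 2
        for name in priority
    ))

def best_matching_unit(record, nodes, priority):
    """Index of the node that is closest to the record under the weighted distance."""
    return min(
        range(len(nodes)),
        key=lambda i: weighted_distance(record, nodes[i], priority),
    )

# Hypothetical standardized values, not taken from the data set.
nodes = [
    {"account length": 0.0, "number vmail messages": 2.0,
     "churn": 0.0, "average call time": 0.0},
    {"account length": 1.0, "number vmail messages": 0.0,
     "churn": 0.0, "average call time": 0.0},
]
record = {"account length": 0.0, "number vmail messages": 0.0,
          "churn": 0.0, "average call time": 0.0}

uniform = {name: 1.0 for name in PRIORITY}
bmu_with_priorities = best_matching_unit(record, nodes, PRIORITY)
bmu_without_priorities = best_matching_unit(record, nodes, uniform)
```

With the priorities of Table 1 the difference in vmail messages counts for little, so the record lands on the first node. With equal priorities the same record lands on the second node, which shows how a low priority keeps an attribute from dominating the map.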

Basic and Complementary Models


No priorities Model
This model muddles our research. We can see that the big cluster of customers who churned still stands out, but the other one is divided into two clusters that blend with the rest of the customers. This model also focuses a lot on vmail messages, which are of no big concern to us. Still, we can see that high total charge remains concentrated in the big cluster of customers who churned.

No new variables, no priorities


In this model things look better, with the customers who changed company gathered in one segment. However, most of the segments are rather small and do not reveal much about our customers' behavior, because the data is too spread out and homogeneous. This model again focuses more on the vmail messages and on the international calls, which still are not that important for us.

No new variables - priorities


Yet again the customers who churned are gathered in one segment, but this time the vmail messages and the international calls are spread throughout the clusters. So far so good. We get some more details than from the other models, because with our priorities set we can focus more on the important parts of our data. Customer service calls and their percentage seem more stable and are divided between two clusters. But many other attributes that are still important are scattered and do not help much in drawing conclusions.

Main model: Both new variables and priorities


With everything in play and all the priorities set, we get a good model to work on. It is still not perfectly clear, but it is focused on what we actually want to consider: why our customers churn, how they used our network and how much they were charged. These facts are easily distinguished in the map, and much more precise conclusions can be drawn.

Appendix
The arrangement of the segments inside the SOM. The segments define the customer profiles, because the segmentation was done according to the customers' attributes.
S 1 Loyal & Low Revenue Customers.
S 2 Trusting Newcomers.
S 3 Leaving Profit.
S 4 Unsatisfied Newcomers.

Image 2 SOM of the main model.

Below is the diagram which describes the distribution of the customers inside the SOM. It was used to make our basic observations and predictions about the customer segments, and it was also used along with the maps to reach better conclusions.

Image 3 Data distribution across segments.

Image 4 Churn on the SOM. Red represents the customers who churned and blue the customers who did not.

Image 5 Average call time SOM. Red represents high values and blue represents low.

Image 6 Total calls SOM. Red represents high values and blue represents low.

Image 7 Total charge SOM. Red represents high values and blue represents low.

Image 8 SOM with new customer ids as labels.

Image 9 SOM with churn for new customers.

Attribute                        S1        S2        S3        S4
Frequency                        51,34%    34,17%    9,81%     4,68%
Account length                   113,7     81,4      101,8     104,6
Churn                            0         0         1         1
Number customer service calls    0,985     2,147     1,278     4,224
Number vmail messages            8,78      8,34      3,94      7,59
Total day calls                  103,4     95,7      101,8     100,4
Total eve calls                  101,7     97,6      101       99,7
Total night calls                100,5     99,4      101,6     97,9
Total intl calls                 4,54      4,52      4,38      3,71
Total calls                      310,1     297,1     308,7     301,8
Total day minutes                169,8     183,3     239,3     139
Total eve minutes                194,6     205,8     225,4     185,1
Total night minutes              193,5     210,1     211,6     191,9
Total intl minutes               10,26     10,01     10,82     10,45
Total minutes                    568       609,3     687,2     526,4
Total day charge                 28,86     31,16     40,69     23,63
Total eve charge                 16,54     17,49     19,16     15,73
Total night charge               8,71      9,46      9,52      8,63
Total intl charge                2,771     2,702     2,922     2,821
Total charge                     56,87     60,82     72,29     50,82
Average call time                1,85      2,081     2,258     1,761
CSP                              0,00318   0,0073    0,00418   0,01413
ICP                              0,01471   0,01541   0,01426   0,01236

Table 2 Statistical data for each cluster/segment/group of customers.
