
Contents

• What is Machine Learning (ML)?
• Evolution of ML
• What is VUCA?
• Role of ML in VUCA Management
• Different ML techniques
• Applications of ML in E-commerce
• Conclusion
What is Machine Learning?
• A branch of artificial intelligence concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data.

• ML comes under data science, an umbrella term covering all aspects of data processing, from data analysis and preparation to useful decision making such as customer segmentation, recommendation, personalization, and customization in the e-commerce market.

• ML is the technique responsible for developing algorithms and programs that can learn on their own, without human intervention.

• In 1959, Arthur Samuel, an American pioneer in the fields of computer gaming and artificial intelligence, coined the term "Machine Learning" while working at IBM.
Evolution of ML
What is VUCA?
• VUCA is an acronym first used by the U.S. Army in the early 1990s to describe the extreme conditions of modern warfare, later epitomized by the conflicts in Afghanistan and Iraq.

• It represents:
  • Volatility
  • Uncertainty
  • Complexity
  • Ambiguity

• These conditions were entirely new and completely changed the nature of warfare. Unsurprisingly, today's business environment has changed in a very similar way. Like warfare, business in the 21st century will never be the same again.

• In ever-dynamic market conditions, subject to constant change and challenge, only the intelligent machines that can adapt will survive.

• These conditions can be analyzed using "VUCA": Volatility, Uncertainty, Complexity and Ambiguity.

• VUCA guides in identifying, preparing for, and getting ready for potential events in all four categories mentioned.
Continue
Role of Machine Learning in VUCA
Management
• Predicting risk factors and taking the appropriate decisions in an organization tends to be impossible for the human brain alone.

• Humans are natural pattern seekers and problem solvers, while machines are fantastic at performing billions and trillions of calculations per second.

• Today's machines are trained with expert machine learning techniques (rules), packaged in software, so that proper decisions can be taken.

• Once data are fed to the system, analytics does everything that one's instinct would do.

• It interprets the available data, predicts what is going to happen, and makes a decision based on its prediction. Like a human being, analytics improves with experience.
Continue
• A great business decision utilizes both human input and advanced analytical tools. This two-pronged approach is the key to forming a market-leading business.

• Machine learning starts with data: the more data, the better the results are likely to be. Having lots of data to work with in many different areas lets the techniques of machine learning be applied to a broader set of problems.

• Once machine learning has the right data, it uses algorithms to work with the data, typically applying statistical analysis such as regression, along with methods like two-class boosted decision trees and multiclass decision forests.

• The goal is simply to determine which combination of machine learning algorithm and data generates the most useful results, and to generate a model for decision making, as sketched below.
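A hedged sketch of that selection step: the data set and candidate models below are illustrative stand-ins (gradient boosting standing in for "two-class boosted decision tree", a random forest for "multiclass decision forest"), compared by cross-validation.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)  # stand-in binary data set

# Candidate algorithm/data combinations, scored by 5-fold cross-validation;
# the best-scoring combination becomes the decision-making model
candidates = {
    "logistic regression": LogisticRegression(max_iter=5000),
    "boosted decision trees": GradientBoostingClassifier(),
    "decision forest": RandomForestClassifier(random_state=0),
}
for name, model in candidates.items():
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean accuracy {score:.3f}")
```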
Different ML techniques
• Broadly, there are 3 types of Machine Learning algorithms:

1. Supervised Learning
Examples: Regression, Decision Tree, Random Forest, kNN, Logistic Regression, etc.

2. Unsupervised Learning
Examples: Apriori algorithm, K-Means.

3. Reinforcement Learning
Example: Markov Decision Process.
Continue
• Common Machine Learning algorithms are:
1. Linear Regression
2. Logistic Regression
3. Decision Tree
4. Random Forest
5. SVM (Support Vector Machine)
6. Naive Bayes
7. kNN (k-Nearest Neighbours)
8. K-Means
9. Dimensionality Reduction Algorithms
Continue
1. Linear Regression:
• It is used to estimate real values (cost of houses, number of calls, total sales, etc.) based on continuous variables.

• Here, the relationship between the independent and dependent variables is established by fitting the best line.

• This best-fit line is known as the regression line and is represented by the linear equation Y = a*X + b.

• In this equation:
  • Y – Dependent variable
  • a – Slope
  • X – Independent variable
  • b – Intercept

• The coefficients a and b are derived by minimizing the sum of squared differences between the data points and the regression line.
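A minimal sketch of fitting Y = a*X + b by least squares; the house-size/price numbers are hypothetical.

```python
import numpy as np

# Hypothetical data: house sizes (m^2) vs. prices (in thousands)
X = np.array([50, 70, 90, 110, 130], dtype=float)    # independent variable
Y = np.array([150, 200, 260, 300, 370], dtype=float)  # dependent variable

# Fit Y = a*X + b by minimizing the sum of squared residuals
a, b = np.polyfit(X, Y, deg=1)
print(f"slope a = {a:.3f}, intercept b = {b:.3f}")

# Predict for a new observation using the fitted regression line
x_new = 100.0
print(f"predicted Y for X={x_new}: {a * x_new + b:.1f}")
```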
Continue
2. Logistic Regression:
• It is a classification algorithm.

• It is used to estimate discrete values (binary values like 0/1, yes/no, true/false) based on a given set of independent variable(s).

• In simple words, it predicts the probability of occurrence of an event by fitting data to a logit function. Hence, it is also known as logit regression.

• Since it predicts a probability, its output values lie between 0 and 1.
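A small sketch, assuming made-up hours-studied/pass data, using scikit-learn's LogisticRegression to recover both the probability (between 0 and 1) and the discrete class.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: hours studied vs. pass (1) / fail (0)
X = np.array([[1], [2], [3], [4], [5], [6]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression()
model.fit(X, y)

# The fitted logit function gives probabilities between 0 and 1
print(model.predict_proba([[3.5]]))  # [[P(fail), P(pass)]]
print(model.predict([[3.5]]))        # discrete class label (0 or 1)
```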
Continue
3. Decision Tree:
• It is a type of supervised learning algorithm that is mostly used for classification problems.

• It works for both categorical and continuous dependent variables.

• In this algorithm, we split the population into two or more homogeneous sets. This is done based on the most significant attributes/independent variables, so as to make the groups as distinct as possible.
Continue
• For example, a population can be classified into four different groups based on multiple attributes to identify 'if they will play or not'.

• To split the population into distinct homogeneous groups, it uses various techniques like Gini, Information Gain, Chi-square, and entropy, as in the sketch below.
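A minimal sketch, assuming a made-up numeric encoding of weather attributes; scikit-learn's criterion parameter selects the splitting technique named above ("gini" or "entropy" for information gain).

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical encoding of the 'play or not' example:
# weather: 0=Sunny, 1=Overcast, 2=Rainy; humidity: 0=Normal, 1=High
X = [[0, 1], [0, 0], [1, 1], [1, 0], [2, 1], [2, 0]]
y = [0, 1, 1, 1, 0, 1]  # 1 = will play, 0 = will not play

# The tree repeatedly splits on the most significant attribute
tree = DecisionTreeClassifier(criterion="gini", max_depth=3)
tree.fit(X, y)
print(tree.predict([[1, 1]]))  # predict for Overcast / High humidity
```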
Continue
4. Random Forest:
• Random Forest is a trademarked term for an ensemble of decision trees.

• In a Random Forest, we have a collection of decision trees (hence "Forest"). To classify a new object based on its attributes, each tree gives a classification, and we say the tree "votes" for that class. The forest chooses the classification having the most votes (over all the trees in the forest).

• Each tree is planted and grown as follows (see the sketch after this list):

  • If the number of cases in the training set is N, then a sample of N cases is taken at random, but with replacement. This sample will be the training set for growing the tree.
  • If there are M input variables, a number m << M is specified such that at each node, m variables are selected at random out of the M, and the best split on these m is used to split the node. The value of m is held constant while the forest grows.
  • Each tree is grown to the largest extent possible. There is no pruning.
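A minimal sketch of these steps with scikit-learn's RandomForestClassifier; the Iris data set is just a convenient stand-in.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# n_estimators trees, each grown on a bootstrap sample of N cases;
# max_features="sqrt" picks m << M candidate variables at each node
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                bootstrap=True, random_state=0)
forest.fit(X, y)

# Each tree votes; the forest predicts the majority class
print(forest.predict(X[:3]))
```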
Continue
5. SVM:
• It is a classification method. In this algorithm, we plot each data item as a point in n-dimensional space (where n is the number of features we have), with the value of each feature being the value of a particular coordinate.

• For example, if we only had two features, like the height and hair length of an individual, we would first plot these two variables in two-dimensional space, where each point has two coordinates (these coordinates are known as Support Vectors).
Continue
• Now, we find a line that splits the data between the two differently classified groups.

• This is the line such that the distances from the closest point in each of the two groups are as large as possible.

• In such an example, the line which splits the data into the two differently classified groups is the one whose two closest points are farthest from it. This line is our classifier. Then, depending on which side of the line new testing data lands, that is the class we assign to the new data.
Continue
6. Naive Bayes:
• A Naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature.

• A Naive Bayesian model is easy to build and particularly useful for very large data sets.

• Bayes' theorem provides a way of calculating the posterior probability P(c|x) from P(c), P(x) and P(x|c):

P(c|x) = P(x|c) * P(c) / P(x)
Continue
• Where,
  • P(c|x) is the posterior probability of the class (target) given the predictor (attribute).
  • P(c) is the prior probability of the class.
  • P(x|c) is the likelihood, i.e., the probability of the predictor given the class.
  • P(x) is the prior probability of the predictor.

• Example: Let's understand it using an example. Consider a training data set of weather and the corresponding target variable 'Play'. Now, we need to classify whether players will play or not based on the weather condition, by following the steps below.

Step 1: Convert the data set to a frequency table (reconstructed here from the probabilities used below; this is the standard 14-day weather/'Play' example):

Weather  | Play = Yes | Play = No | Total
Sunny    |     3      |     2     |   5
Overcast |     4      |     0     |   4
Rainy    |     2      |     3     |   5
Total    |     9      |     5     |  14

Step 2: Create a likelihood table by finding the probabilities, e.g. Overcast probability = 4/14 ≈ 0.29 and probability of playing = 9/14 ≈ 0.64.

Step 3: Now, use the Naive Bayesian equation to calculate the posterior probability for each class. The class with the highest posterior probability is the outcome of the prediction.
Continue
• Problem: Players will play if the weather is sunny. Is this statement correct?

• We can solve it using the method discussed above: P(Yes|Sunny) = P(Sunny|Yes) * P(Yes) / P(Sunny)

• Here we have P(Sunny|Yes) = 3/9 = 0.33, P(Sunny) = 5/14 = 0.36, and P(Yes) = 9/14 = 0.64.

• Now, P(Yes|Sunny) = 0.33 * 0.64 / 0.36 = 0.60, which is the higher probability, so the statement is correct.

• Naive Bayes uses a similar method to predict the probabilities of different classes based on various attributes. This algorithm is mostly used in text classification and for problems having multiple classes.
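A tiny sketch that reproduces this calculation from the frequency counts above, in pure Python.

```python
# Frequencies from the weather/'Play' frequency table above
n_total = 14
n_yes = 9
n_sunny = 5
n_sunny_given_yes = 3

p_yes = n_yes / n_total                        # P(Yes) = 9/14 ≈ 0.64
p_sunny = n_sunny / n_total                    # P(Sunny) = 5/14 ≈ 0.36
p_sunny_given_yes = n_sunny_given_yes / n_yes  # P(Sunny|Yes) = 3/9 ≈ 0.33

# Bayes' theorem: P(Yes|Sunny) = P(Sunny|Yes) * P(Yes) / P(Sunny)
p_yes_given_sunny = p_sunny_given_yes * p_yes / p_sunny
print(f"P(Yes|Sunny) = {p_yes_given_sunny:.2f}")  # 0.60
```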
Continue
7. kNN (k-Nearest Neighbours):
• It can be used for both classification and regression problems.

• k-Nearest Neighbours is a simple algorithm that stores all available cases and classifies new cases by a majority vote of their k nearest neighbours.

• The case is assigned to the class most common amongst its k nearest neighbours, as measured by a distance function.

• It is robust to noisy training data and effective when the training set is large.
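A small sketch with scikit-learn; the six labelled points stand in for the stored cases.

```python
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical labelled cases, all stored by the algorithm
X = [[1, 1], [1, 2], [2, 1], [6, 6], [7, 6], [6, 7]]
y = [0, 0, 0, 1, 1, 1]

# A new case is assigned by majority vote of its k=3 nearest
# neighbours, measured here by Euclidean distance
knn = KNeighborsClassifier(n_neighbors=3, metric="euclidean")
knn.fit(X, y)
print(knn.predict([[2, 2], [6, 5]]))  # -> [0 1]
```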


Continue
8. K-Means:
• It is a type of unsupervised algorithm which solves the clustering problem.

• Its procedure follows a simple and easy way to classify a given data set into a certain number of clusters (assume k clusters).

• Data points inside a cluster are homogeneous, and heterogeneous with respect to peer groups.

• Cluster formation:
  • K-Means picks k points as the initial centroids (mean values) of the clusters, e.g. at random.
  • Each data point is assigned to its nearest centroid, forming k clusters.
  • The centroid (mean) of each cluster is recomputed from its current members; these become the new centroids.
  • With the new centroids, repeat steps 2 and 3: find the closest centroid for each data point and reassign it to the corresponding cluster. Repeat this process until convergence, i.e., until the centroids no longer change.
Continue
Example:
Given data set: {2, 3, 4, 10, 11, 12, 20, 25, 30}, with k = 2 and initial centroids c1 = 4, c2 = 12 (say).

Step 1: k1 = {2, 3, 4}, k2 = {10, 11, 12, 20, 25, 30}
→ c1 = (2+3+4)/3 = 9/3 = 3 and c2 = (10+11+12+20+25+30)/6 = 108/6 = 18

Step 2: with c1 = 3 and c2 = 18:
k1 = {2, 3, 4, 10}, k2 = {11, 12, 20, 25, 30}
→ c1 = 4.75 ≈ 5 and c2 = 19.6 ≈ 20

Step 3: with c1 = 5 and c2 = 20:
k1 = {2, 3, 4, 10, 11, 12}, k2 = {20, 25, 30}
→ c1 = 7 and c2 = 25

Step 4: with c1 = 7 and c2 = 25:
k1 = {2, 3, 4, 10, 11, 12}, k2 = {20, 25, 30}
→ we get the same clusters and centroid values, i.e., the centroids do not change, so the algorithm has converged. A sketch in code follows.
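A minimal sketch that replays this worked example with scikit-learn's KMeans, starting from the same initial centroids.

```python
import numpy as np
from sklearn.cluster import KMeans

data = np.array([2, 3, 4, 10, 11, 12, 20, 25, 30],
                dtype=float).reshape(-1, 1)

# Start from the same initial centroids as the worked example (c1=4, c2=12)
init = np.array([[4.0], [12.0]])
km = KMeans(n_clusters=2, init=init, n_init=1).fit(data)

print(km.cluster_centers_.ravel())  # converges to [ 7. 25.]
print(km.labels_)                   # cluster membership of each point
```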
Continue
9. Dimensionality Reduction Algorithms:
• E-commerce companies capture ever more details about customers, such as their demographics, web-crawling history, likes and dislikes, purchase history, and feedback, to give them more personalized attention than the nearest grocery shopkeeper can.

• To identify the most significant variables among these huge data, dimensionality reduction algorithms help us, along with various other approaches like Decision Tree, Random Forest, PCA (Principal Component Analysis), Factor Analysis, identification based on the correlation matrix, the missing value ratio, and others.
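A brief sketch of feature reduction with scikit-learn's PCA, using the Iris data as a stand-in for high-dimensional customer data.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)  # 4 original features

# Project onto the 2 principal components that retain the most variance
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                # (150, 2): fewer, more significant variables
print(pca.explained_variance_ratio_)  # variance retained per component
```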
Applications of ML in E-commerce
• Segmentation, Personalization, & Targeting
• Pricing Optimization
• Fraud Protection
• Search Ranking
• Product Recommendations
• Customer Support & Self Service
• Supply & Demand Prediction
Conclusion
• The VUCA perceptions in the corporate world are tumultuous because they influence enterprise revenue and the national economy at large.

• It is quite easy to predict that the rapid growth of the ever-changing digital market and the customers' inclination towards the e-commerce world lead to volatility, uncertainty, complexity, and ambiguity in an organization's decision-making process.

• An ML approach would be the best fit for making daily decisions and investments to enhance work-practice productivity.

• In highly competitive and volatile markets such as e-commerce and stocks, machines that learn and adapt continuously through machine learning techniques, applied after VUCA-based data analysis, will be highly profitable.
THANK YOU
