SVM ML
• VUCA represents:
• Volatility
• Uncertainty
• Complexity
• Ambiguity
• These conditions were entirely new and fundamentally changed the nature of warfare. Unsurprisingly,
today's business environment has changed in a very similar way: like warfare, business in the
21st century will never be the same again.
• In ever-dynamic market conditions, subject to constant change and challenge, only the
intelligent machines that can adapt will survive.
• These conditions can be analyzed using "VUCA": Volatility, Uncertainty, Complexity and
Ambiguity.
• VUCA guides in identifying and preparing for potential events in all four of the
categories mentioned.
Role of Machine Learning in VUCA
Management
• Predicting the risk factors in an organization and taking the appropriate decisions
accordingly is nearly impossible for a human brain alone.
• Humans are natural pattern seekers and problem solvers, while machines are
fantastic at performing billions and trillions of calculations per second.
• Once data are fed to the system, analytics does everything that one's
instinct would do.
• It interprets the available data, predicts what is going to happen, and makes a
decision based on its prediction. Like a human being, analytics improves
with experience.
• A great business decision utilizes both human input and advanced analytical
tools. This two-pronged approach is the key to forming a market-leading
business.
• Machine learning starts with data: the more data, the better the results are likely
to be. Having lots of data to work with in many different areas lets the
techniques of machine learning be applied to a broader set of problems.
• Once machine learning has the right data, it uses algorithms to work with the
data, typically applying statistical analysis such as regression, along with
methods like the two-class boosted decision tree and the multiclass decision forest.
1. Supervised Learning
Examples of Supervised Learning: Regression, Decision Tree, Random Forest,
kNN, Logistic Regression etc.
2. Unsupervised Learning
Examples of Unsupervised Learning: Apriori algorithm, K-means.
3. Reinforcement Learning
Example of Reinforcement Learning: Markov Decision Process
• Common Machine Learning Algorithms are,
1. Linear Regression
2. Logistic Regression
3. Decision Tree
4. Random Forest
5. SVM (Support Vector Machine)
6. Naive Bayes
7. kNN (k-Nearest Neighbour)
8. K-Means
9. Dimensionality Reduction Algorithms
1. Linear Regression:
• It is used to estimate real values (cost of houses, number of calls, total sales etc.)
based on continuous variables.
• It fits the best possible straight line through the data points. This best-fit line is
known as the regression line and is represented by the linear equation
Y = a*X + b.
• In this equation:
• Y – Dependent variable
• a – Slope
• X – Independent variable
• b – Intercept
• The coefficients a and b are derived by minimizing the sum of squared
differences between the data points and the regression line.
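The least-squares fit described above can be sketched in a few lines of plain Python (the sample data here is invented for illustration and lies exactly on a straight line):

```python
def fit_line(xs, ys):
    """Least-squares fit of Y = a*X + b: minimizes the sum of
    squared vertical distances between the points and the line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope a = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2)
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - a * mean_x  # the fitted line passes through the means
    return a, b

# Toy data lying exactly on Y = 2*X + 1
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)  # 2.0 1.0
```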
2. Logistic Regression:
• It is a classification algorithm, despite its name.
• Since it predicts a probability, its output values lie between 0 and 1 (a linear
combination of the inputs is passed through the sigmoid function 1 / (1 + e^-z)).
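A minimal sketch of how logistic regression maps a linear score into a probability between 0 and 1 (plain Python; the weights and bias here are invented for illustration, as if already trained):

```python
import math

def sigmoid(z):
    """Squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, weights, bias):
    """Linear score of the features, passed through the sigmoid,
    gives the predicted probability of the positive class."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

# Hypothetical trained weights for a two-feature classifier
p = predict_proba([1.0, 2.0], weights=[0.4, -0.2], bias=0.1)
print(round(p, 3))  # 0.525
```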
3. Decision Tree:
• It is a type of supervised learning algorithm that is mostly used for
classification problems. It splits the data into increasingly homogeneous
subsets based on the most significant input variables.
4. Random Forest:
• A Random Forest is an ensemble of decision trees. Each tree is grown as follows:
• If the number of cases in the training set is N, a sample of N cases is taken at
random but with replacement. This sample is the training set for growing the
tree.
• If there are M input variables, a number m << M is specified such that at each node,
m variables are selected at random out of the M, and the best split on these m is
used to split the node. The value of m is held constant while the forest is grown.
• Each tree is grown to the largest extent possible. There is no pruning.
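The two randomization steps above — bootstrap sampling of the N cases, and picking m of the M variables at each node — can be sketched as follows (plain Python; the case and feature names are invented, and the actual tree-growing logic is omitted):

```python
import random

def bootstrap_sample(training_set):
    """Draw N cases at random WITH replacement from the N cases."""
    n = len(training_set)
    return [random.choice(training_set) for _ in range(n)]

def random_feature_subset(all_features, m):
    """At each node, consider only m of the M input variables (m << M)."""
    return random.sample(all_features, m)

random.seed(42)
data = ["case%d" % i for i in range(10)]       # N = 10 cases
features = ["f%d" % i for i in range(16)]      # M = 16 variables

sample = bootstrap_sample(data)                # training set for one tree
subset = random_feature_subset(features, m=4)  # variables tried at one node
print(len(sample), len(subset))  # 10 4
```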
5. SVM:
• It is a classification method. In this algorithm, we plot each data item as
a point in n-dimensional space (where n is the number of features we have),
with the value of each feature being the value of a particular
coordinate.
• For example, if we only had two features, such as the height and hair length of
an individual, we would first plot these two variables in two-dimensional
space, where each point has two coordinates. (The points closest to the
separating line are known as Support Vectors.)
• Now, we find a line that splits the data between the two
differently classified groups.
• This will be the line for which the distance to the closest point in
each of the two groups is as large as possible.
• The line that splits the data into the two groups while keeping the closest
point of each group farthest from it is our classifier. Then, depending on
which side of the line a test point lands, that is the class we assign to
the new data.
6. Naive Bayes:
• A Naive Bayes classifier assumes that the presence of a particular
feature in a class is unrelated to the presence of any other feature.
• A Naive Bayesian model is easy to build and particularly useful for very
large data sets.
• To classify, first tabulate the class frequencies and feature likelihoods from
the training data, then use the Naive Bayesian equation
P(class | feature) = P(feature | class) * P(class) / P(feature)
to calculate the posterior probability for each class. The class with the
highest posterior probability is the outcome of the prediction.
• Problem: Players will play if the weather is sunny. Is this statement
correct?
• Now, P(Yes | Sunny) = P(Sunny | Yes) * P(Yes) / P(Sunny)
= 0.33 * 0.64 / 0.36 = 0.60, which is the higher probability, so the
statement is likely correct.
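The arithmetic above can be checked directly (plain Python; the exact fractions are inferred from the rounded probabilities on the slide, i.e. 3 of 9 "Yes" days were sunny, 9 of 14 days were "Yes", and 5 of 14 days were sunny — these counts are an assumption):

```python
from fractions import Fraction

p_sunny_given_yes = Fraction(3, 9)   # P(Sunny | Yes) ≈ 0.33
p_yes = Fraction(9, 14)              # P(Yes)         ≈ 0.64
p_sunny = Fraction(5, 14)            # P(Sunny)       ≈ 0.36

# Bayes' rule: P(Yes | Sunny) = P(Sunny | Yes) * P(Yes) / P(Sunny)
posterior = p_sunny_given_yes * p_yes / p_sunny
print(float(posterior))  # 0.6
```

Working with exact fractions rather than the rounded decimals is what recovers the slide's 0.60 exactly.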
7. kNN (k-Nearest Neighbour):
• A new case is assigned to the class that is most common amongst its K
nearest neighbours, measured by a distance function.
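A minimal sketch of classifying a new case by majority vote among its K nearest neighbours (plain Python; the toy points and labels are invented):

```python
from collections import Counter

def knn_classify(new_point, data, k):
    """data: list of ((x, y), label) pairs. Assign the label most
    common among the k points closest to new_point (Euclidean distance)."""
    def dist(p):
        return ((p[0] - new_point[0]) ** 2 + (p[1] - new_point[1]) ** 2) ** 0.5
    nearest = sorted(data, key=lambda item: dist(item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]  # majority label

data = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
        ((6, 6), "B"), ((6, 7), "B"), ((7, 6), "B")]
print(knn_classify((2, 2), data, k=3))  # A
```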
8. K-Means:
• Its procedure follows a simple and easy way to classify a given data set
through a certain number of clusters (assume k clusters).
• Data points inside a cluster are homogeneous, and heterogeneous with respect to peer
clusters.
• Cluster formation:
• Step 1: K-means picks k points as the initial centroids (mean values) of the clusters,
at random.
• Step 2: Each data point is assigned to the cluster of its nearest centroid.
• Step 3: The centroid (mean) of each cluster is recomputed from its current members;
these are the new centroids.
• Step 4: With the new centroids, repeat steps 2 and 3: find the closest centroid for each
data point and re-form the k clusters. Repeat this process until convergence occurs,
i.e. until the centroids no longer change.
Example:
Given the data set {2, 3, 4, 10, 11, 12, 20, 25, 30}
and k = 2,
pick initial centroids c1 = 4 and c2 = 12 (say).
Step 1:
k1 = {2, 3, 4}, k2 = {10, 11, 12, 20, 25, 30}
c1 = (2+3+4)/3 = 9/3 = 3 and c2 = (10+11+12+20+25+30)/6 = 108/6 = 18
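The worked example above can be carried all the way to convergence with a short sketch (plain Python; the first iteration reproduces c1 = 3 and c2 = 18 from Step 1):

```python
def kmeans_1d(points, centroids):
    """Iterate assignment + mean-update until the centroids stop changing."""
    while True:
        # Step 2: assign each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Step 3: recompute each centroid as the mean of its cluster
        new_centroids = [sum(c) / len(c) for c in clusters]
        if new_centroids == centroids:   # Step 4: converged
            return centroids, clusters
        centroids = new_centroids

data = [2, 3, 4, 10, 11, 12, 20, 25, 30]
centroids, clusters = kmeans_1d(data, [4, 12])
print(centroids)  # [7.0, 25.0]
print(clusters)   # [[2, 3, 4, 10, 11, 12], [20, 25, 30]]
```

Note that the final clusters differ from the Step-1 assignment: points such as 10, 11 and 12 migrate between clusters as the centroids move on later iterations.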
• It is quite easy to predict that the rapid growth of the ever-changing digital
market in recent society, and customers' inclination towards the
e-commerce world, lead to volatility, uncertainty, complexity and
ambiguity in an organization's decision-making process.
THANK YOU