
Data Science I

Overall Points:       / 100


Exercise 7: Classification & Clustering
Submission Deadline: July 10 2023, 07:00 UTC

University of Oldenburg
Summer 2023
Instructors: Maria Fernanda "MaFe" Davila Restrepo, Wolfram "Wolle" Wingerath

Submitted by: Alp Akalin, Burak Uzun, Mehmet Yalcin

Part 1: Logistic Regression & Gradient Descent


      / 25

1.) Suppose we are training a model using stochastic gradient descent. How do we know if we are
converging to a solution?

Solution:

There are a few ways to know if we are converging to a solution when training a model using stochastic
gradient descent (SGD).

--Monitoring the loss function.

The loss function is a measure of how well the model is performing. As the model converges, the loss
should decrease. Because SGD computes each update from a single sample (or a small batch), the loss
fluctuates from step to step, so it should decrease on average rather than monotonically. If the smoothed
loss is not decreasing, it may mean that the model is not converging.

--Monitoring the training accuracy.

The training accuracy is a measure of how well the model is able to predict the correct output for the
training data. As the model converges, the training accuracy should increase. If the training accuracy is not
increasing, it may mean that the model is not converging.

--Plotting the learning curve.

The learning curve is a graph of the loss function or training accuracy over the course of training. A well-
converging model will show a decreasing loss function or increasing training accuracy over time.

In addition to these methods, there are a few other factors that can indicate whether or not a model is
converging using SGD. These factors include:

--The learning rate.

The learning rate is a hyperparameter that controls how much the model parameters are updated at each
step. A lower learning rate will result in slower convergence, but it may also be more stable. A higher
learning rate will result in faster convergence, but it may also be more likely to diverge.

--The batch size.

The batch size is the number of training samples that are used to calculate the gradient at each step. A
larger batch size will result in more accurate gradients, but it will also be more computationally expensive.
A smaller batch size will be less computationally expensive, but it may result in less accurate gradients.

--The number of epochs.

The number of epochs is the number of times that the entire training dataset is passed through during
training. More epochs give the model more opportunity to converge, but training takes longer, and too
many epochs can lead to overfitting on the training data.
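As an illustration, here is a minimal sketch (a hypothetical helper, not part of the exercise) of such a
convergence check: it compares the average loss over the two most recent windows of SGD steps and
reports convergence once the improvement falls below a tolerance.

In [ ]: import numpy as np

def has_converged(loss_history, window=10, tol=1e-4):
    # Compare the mean loss of the two most recent windows; SGD losses
    # are noisy, so averaging smooths out step-to-step fluctuations.
    if len(loss_history) < 2 * window:
        return False
    prev = np.mean(loss_history[-2 * window:-window])
    curr = np.mean(loss_history[-window:])
    return prev - curr < tol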

2.) Do gradient descent methods always converge to the same point? Please explain your reasoning.

Solution:

No, gradient descent methods do not always converge to the same point. This is because the convergence
of gradient descent methods depends on the following factors:

**The starting point.

The starting point of the gradient descent algorithm is important because it determines the direction in
which the algorithm will move. If the starting point is close to a local minimum, then the algorithm is more
likely to converge to that local minimum.

**The learning rate.

The learning rate is a hyperparameter that controls how much the model parameters are updated at each
step. If the learning rate is too small, then the algorithm will converge very slowly. If the learning rate is too
large, then the algorithm may not converge at all, or it may converge to a different local minimum.

**The loss function.

The loss function is the function that the gradient descent algorithm is trying to minimize. If the loss
function is non-convex, then there may be multiple local minima. In this case, the gradient descent
algorithm may converge to any one of the local minima.

In general, gradient descent methods are more likely to converge to the same point if the starting point is
close to a global minimum, the learning rate is small, and the loss function is convex. However, there is no
guarantee that gradient descent methods will always converge to the same point, even if all of these
factors are satisfied.

Here are some additional reasons why gradient descent methods may not converge to the same point:

**Noise in the data.

If the data used to train the model is noisy, then the gradient descent algorithm may not be able to
converge to a single point.

**Constraints on the model parameters.


If the model parameters are constrained to lie within a certain range, then the gradient descent algorithm
may not be able to converge to a point that lies outside of this range.

**Abrupt changes in the loss function.

If the loss function has abrupt changes, then the gradient descent algorithm may not be able to follow the
gradient accurately.

In conclusion, gradient descent methods do not always converge to the same point. This is because the
convergence of gradient descent methods depends on a number of factors, including the starting point,
the learning rate, the loss function, the noise in the data, and the constraints on the model parameters.
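To make this concrete, here is a small sketch (an illustrative one-dimensional example, not part of the
exercise) in which plain gradient descent on the non-convex function f(x) = x^4 - 3x^2 + x ends up in
different local minima depending on the starting point:

In [ ]: def gradient_descent(x, lr=0.01, steps=1000):
    # Gradient of f(x) = x^4 - 3x^2 + x
    grad = lambda z: 4 * z**3 - 6 * z + 1
    for _ in range(steps):
        x -= lr * grad(x)
    return x

print(gradient_descent(-2.0))  # converges to the local minimum near x = -1.30
print(gradient_descent(+2.0))  # converges to the local minimum near x = +1.13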

3.) Consider the following labeled data points:

In [254]: import matplotlib.pyplot as plt

# The data points in 2D feature space


data_points = [(1, 2), (2, 3), (3, 3), (4, 5), (5, 6), (5, 7)]

# The labels corresponding to the data points


labels = [0, 0, 0, 1, 1, 1]
class_0 = [point for point, label in zip(data_points, labels) if label == 0]
class_1 = [point for point, label in zip(data_points, labels) if label == 1]

# Plot the data points


plt.scatter(*zip(*class_0), color='blue', label='class 0')
plt.scatter(*zip(*class_1), color='red', label='class 1')
plt.legend(loc='upper left')
plt.show()

a) Please train a logistic regression model using sklearn.linear_model and draw the decision
boundary.
Solution:

In [6]: import numpy as np


import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression

# The data points in 2D feature space


data_points = np.array([(1, 2), (2, 3), (3, 3), (4, 5), (5, 6), (5, 7)])

# The labels corresponding to the data points


labels = np.array([0, 0, 0, 1, 1, 1])

# Train a logistic regression model


model = LogisticRegression()
model.fit(data_points, labels)

# Separate data points by class for the sake of plotting


class_0 = [point for point, label in zip(data_points, labels) if label == 0]
class_1 = [point for point, label in zip(data_points, labels) if label == 1]

# Create a meshgrid of points to visualize the decision boundary


x_min, x_max = data_points[:, 0].min() - 1, data_points[:, 0].max() + 1
y_min, y_max = data_points[:, 1].min() - 1, data_points[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02), np.arange(y_min, y_max, 0.02))

# Predict the class labels for the meshgrid points


Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

# Plot the decision boundary


plt.contourf(xx, yy, Z, alpha=0.8)

# Set the axis limits and labels


plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')

# Plot the data points and show the plot


plt.scatter(*zip(*class_0), color='blue', label='class 0')
plt.scatter(*zip(*class_1), color='red', label='class 1')
plt.legend(loc='upper left')
plt.show()
b) Please explain how you got from the model parameters to the equation for the separating line.

Solution:

The logistic regression model in the code is a binary classifier, which means it predicts one of two
classes; in this case, the classes are 0 and 1. The model parameters are the weights and the bias of the
underlying linear function: the weights are the coefficients of the features, and the bias is a constant
offset.

The decision boundary of the logistic regression model is the line along which the model assigns equal
probability (0.5) to both classes. The model computes

p = sigma(w1 * x1 + w2 * x2 + b), where sigma(z) = 1 / (1 + e^(-z))

is the logistic (sigmoid) function, w1 and w2 are the learned weights, and b is the bias. Since
sigma(z) = 0.5 exactly when z = 0, setting p = 0.5 is equivalent to

w1 * x1 + w2 * x2 + b = 0

Solving for x2 gives the equation of the separating line:

x2 = -(w1 * x1 + b) / w2

The line can be plotted by evaluating this equation for different values of x1.
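For reference, here is a minimal sketch of carrying this out on the fitted model from part a): sklearn
stores the weights in model.coef_ and the bias in model.intercept_, so the line can be plotted directly.

In [ ]: w1, w2 = model.coef_[0]  # learned feature weights
b = model.intercept_[0]          # learned bias term

# Boundary: w1*x1 + w2*x2 + b = 0  =>  x2 = -(w1*x1 + b) / w2
xs = np.linspace(0, 6, 100)
plt.plot(xs, -(w1 * xs + b) / w2, 'k--', label='decision boundary')
plt.legend()
plt.show()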
In the code, the decision boundary is plotted using the contourf() function. The contourf() function takes a
meshgrid of points and plots a filled contour plot. The color of the plot indicates the predicted class label
for each point.

The code also plots the data points in different colors, depending on their class label. The blue points are
class 0, and the red points are class 1. The decision boundary separates the blue points from the red
points.

Part 2: Support Vector Machines & Classification


      / 25

4.) Consider the following set of data points in 2D space where the labels are stored in the variable y
and the feature vectors are stored in the variable X:

In [25]: import numpy as np


import matplotlib.pyplot as plt
from sklearn import datasets

X, y = datasets.make_classification(n_samples=100, n_features=2, n_informative=2, n_redundant=0)

# Separate data points by class for the sake of plotting

X0 = [X[i] for i in range(len(X)) if y[i]==0]


X1 = [X[i] for i in range(len(X)) if y[i]==1]

# Create scatter plot

fig = plt.figure(figsize=(3, 6))


plt.scatter([x[0] for x in X0], [x[1] for x in X0], color='red', label='Class 0')
plt.scatter([x[0] for x in X1], [x[1] for x in X1], color='blue', label='Class 1')

# Set the labels and title

plt.xlabel('X1')
plt.ylabel('X2')
plt.title('data')
plt.legend()
plt.show()
a) Please split the dataset into a training set and a test set. You can use
sklearn.model_selection.train_test_split for this purpose.

Solution:

In [26]: import numpy as np


import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split

X, y = datasets.make_classification(n_samples=100, n_features=2, n_informative=2, n_redundant=0)

# Split the dataset into training set and test set


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Separate data points by class for the sake of plotting


X0_train = [X_train[i] for i in range(len(X_train)) if y_train[i] == 0]
X1_train = [X_train[i] for i in range(len(X_train)) if y_train[i] == 1]
X0_test = [X_test[i] for i in range(len(X_test)) if y_test[i] == 0]
X1_test = [X_test[i] for i in range(len(X_test)) if y_test[i] == 1]

# Create scatter plot for training set


fig = plt.figure(figsize=(6, 6))
plt.scatter([x[0] for x in X0_train], [x[1] for x in X0_train], color='red', label='Class 0 (train)')
plt.scatter([x[0] for x in X1_train], [x[1] for x in X1_train], color='blue', label='Class 1 (train)')

# Create scatter plot for test set


plt.scatter([x[0] for x in X0_test], [x[1] for x in X0_test], color='lightcoral', marker='x', label='Class 0 (test)')
plt.scatter([x[0] for x in X1_test], [x[1] for x in X1_test], color='lightblue', marker='x', label='Class 1 (test)')

# Set the labels and title


plt.xlabel('X1')
plt.ylabel('X2')
plt.title('Data (Train/Test)')
plt.legend()
plt.show()

b) For at least two different kernel functions (e.g. 'linear', 'rbf', 'poly', or 'sigmoid'), do the
following:

1. Please train a Support Vector Machine classifier on the training set. Use the SVC class from
sklearn.svm for the task.
2. Please evaluate the classifier on the test set using accuracy as the metric. You can use the
sklearn.metrics.accuracy_score to achieve this.
3. Now please visualize the decision boundary of both classifiers. You can create a meshgrid of
points in the feature space, predict the label of each point with the trained classifier, and then
use a contour plot to visualize the decision boundary.

Solution:
In [29]: import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = datasets.make_classification(n_samples=100, n_features=2, n_informative=2, n_redundant=0)

# Split the dataset into training set and test set


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a linear SVM classifier


clf_linear = SVC(kernel='linear')
clf_linear.fit(X_train, y_train)

# Train a radial basis function (RBF) SVM classifier


clf_rbf = SVC(kernel='rbf', gamma='scale')
clf_rbf.fit(X_train, y_train)

# Train a polynomial SVM classifier


clf_poly = SVC(kernel='poly', degree=3)
clf_poly.fit(X_train, y_train)

# Evaluate the classifiers on the test set


print('Linear SVM accuracy:', clf_linear.score(X_test, y_test))
print('RBF SVM accuracy:', clf_rbf.score(X_test, y_test))
print('Polynomial SVM accuracy:', clf_poly.score(X_test, y_test))

Linear SVM accuracy: 0.85

RBF SVM accuracy: 0.85

Polynomial SVM accuracy: 0.75

In [32]: from sklearn.metrics import accuracy_score

# Evaluate the classifiers on the test set using accuracy


y_pred_linear = clf_linear.predict(X_test)
accuracy_linear = accuracy_score(y_test, y_pred_linear)
print("Accuracy (Linear Kernel):", accuracy_linear)

y_pred_rbf = clf_rbf.predict(X_test)
accuracy_rbf = accuracy_score(y_test, y_pred_rbf)
print("Accuracy (RBF Kernel):", accuracy_rbf)

Accuracy (Linear Kernel): 0.85

Accuracy (RBF Kernel): 0.85

In [33]: import numpy as np


import matplotlib.pyplot as plt

# Create a meshgrid of points in the feature space


x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02), np.arange(y_min, y_max, 0.02))
Z_linear = clf_linear.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
Z_rbf = clf_rbf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# Plot the decision boundary using a contour plot


plt.figure(figsize=(10, 6))
plt.contourf(xx, yy, Z_linear, alpha=0.8, cmap=plt.cm.RdBu)
plt.scatter([x[0] for x in X0_train], [x[1] for x in X0_train], color='red', label='Class 0')
plt.scatter([x[0] for x in X1_train], [x[1] for x in X1_train], color='blue', label='Class 1')
plt.xlabel('X1')
plt.ylabel('X2')
plt.title('Decision Boundary (Linear Kernel)')
plt.legend()
plt.show()

plt.figure(figsize=(10, 6))
plt.contourf(xx, yy, Z_rbf, alpha=0.8, cmap=plt.cm.RdBu)
plt.scatter([x[0] for x in X0_train], [x[1] for x in X0_train], color='red', label='Class 0')
plt.scatter([x[0] for x in X1_train], [x[1] for x in X1_train], color='blue', label='Class 1')
plt.xlabel('X1')
plt.ylabel('X2')
plt.title('Decision Boundary (RBF Kernel)')
plt.legend()
plt.show()
Part 3: Distance Metrics & Nearest-Neighbor Methods


      / 25

5.) Is the edit distance on text strings a metric? Please provide an answer and explain your
reasoning.

Solution:

Edit distance (Levenshtein distance) quantifies how different two strings are by counting the minimum
number of operations (insert, delete, substitute) needed to change one string into the other. It is widely
used, especially in NLP. Yes, it is a metric: it satisfies all four metric axioms:

-Non-negativity: The edit distance between two strings is never negative, since it counts operations.

-Identity of indiscernibles: The distance between a string and itself is zero, because no operations are
required; conversely, a distance of zero means the strings are identical.

-Symmetry: The edit distance from A to B equals the distance from B to A, because every insertion in
one direction corresponds to a deletion in the other, and substitutions are their own inverses.

-Triangle inequality: The edit distance between strings A and C is no greater than the sum of the edit
distances between A and B and between B and C, since transforming A to B and then B to C is one valid
(if not necessarily optimal) way to transform A to C.

Overall, the edit distance satisfies all metric axioms and is a flexible way to assess how similar
strings are to one another.
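For completeness, here is a short sketch (the standard dynamic-programming formulation, not required by
the exercise) of computing the edit distance, which also makes the symmetry property easy to check:

In [ ]: def edit_distance(a: str, b: str) -> int:
    # dp[i][j] = edit distance between the prefixes a[:i] and b[:j]
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # delete all of a[:i]
    for j in range(n + 1):
        dp[0][j] = j  # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[m][n]

print(edit_distance("kitten", "sitting"))  # 3
print(edit_distance("sitting", "kitten"))  # 3 (symmetry)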

6.) Construct a two-class point set on n ≥ 10 points in two dimensions, where every point would be
misclassified according to its nearest neighbor (kNN classification with k = 1).

Solution:

In [8]: import numpy as np
import matplotlib.pyplot as plt

n = 10  # total number of points (n >= 10)

# Place the points on a line with alternating class labels. Each point's
# nearest neighbor is then exactly 1 unit away and belongs to the other
# class, so 1-NN misclassifies every single point.
points = np.array([[float(i), 0.0] for i in range(n)])
labels = np.array([i % 2 for i in range(n)])

class1_points = points[labels == 0]
class2_points = points[labels == 1]

# Plotting the point set
plt.scatter(class1_points[:, 0], class1_points[:, 1], c='red', label='Class 1')
plt.scatter(class2_points[:, 0], class2_points[:, 1], c='blue', label='Class 2')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend()
plt.title('Two-Class Point Set (alternating labels)')
plt.show()

# Verify with leave-one-out 1-NN: for each point, find its nearest
# neighbor among the remaining points and compare labels.
misclassified = 0
for i in range(n):
    distances = np.linalg.norm(points - points[i], axis=1)
    distances[i] = np.inf  # exclude the point itself
    nearest = np.argmin(distances)
    if labels[nearest] != labels[i]:
        misclassified += 1

print(f"{misclassified} of {n} points are misclassified by 1-NN")
7.) How does classification performance change when you classify by the 3 nearest neighbors (kNN
classification with k = 3)?

Solution:

In [11]: from collections import Counter

k = 3  # number of nearest neighbors

# Leave-one-out kNN on the same point set, now with a majority vote
# among the k = 3 nearest neighbors.
misclassified_k3 = 0
for i in range(n):
    distances = np.linalg.norm(points - points[i], axis=1)
    distances[i] = np.inf               # exclude the point itself
    nn_idx = np.argsort(distances)[:k]  # indices of the k nearest neighbors
    vote = Counter(labels[nn_idx]).most_common(1)[0][0]
    if vote != labels[i]:
        misclassified_k3 += 1

print(f"{misclassified_k3} of {n} points are misclassified by {k}-NN")

For the alternating construction from task 6, performance does not improve: each interior point sits
directly between two opposite-class points, and even the endpoints see two opposite-class points among
their 3 nearest neighbors, so the majority vote is wrong for every point. In general, however,
increasing k smooths the decision boundary and tends to make classification more robust to noise and
outliers, at the cost of blurring fine class structure.

Part 4: Networks & Clustering


      / 25

8.) For each of the following graph-theoretic properties, please provide a use case or an example of
a real-world network that satisfies / does not satisfy the property.

a) Directed vs. undirected.

Solution:

Directed graphs (also known as digraphs) have edges that are assigned a specific direction or orientation.
In a directed graph, the edges have a starting vertex and an ending vertex, indicating the direction of the
relationship between the vertices. Examples of real-world networks that can be modeled as directed
graphs include:

1. Social Media Networks: In a social media network like Twitter, the "follows" relationships between
users form a directed graph: an edge points from the follower to the account being followed, and
the relationship need not be reciprocated.

2. Internet Web Pages: The World Wide Web can be represented as a directed graph, where web pages
are nodes, and hyperlinks between pages represent directed edges. The directed nature of the graph
signifies the flow of information from one web page to another.

On the other hand, undirected graphs have edges that do not have a specific direction. The relationships
between nodes are symmetric, and the edges represent bi-directional connections. Examples of real-world
networks that can be modeled as undirected graphs include:

1. Friendship Networks: In a social network where friendships are modeled, the relationships are often
undirected. If two individuals are friends, the connection is symmetric, and the graph representing
their friendship network would be undirected.

2. Road Networks: In a road network, the connections between intersections or cities can be represented
as undirected edges. The fact that one road connects two locations in one direction implies that the
reverse direction is also possible.

The choice between using a directed or undirected graph to model a real-world network depends on the
nature of the relationships or connections being represented. If the relationships have a clear
directionality, a directed graph is appropriate. However, if the relationships are symmetric or bidirectional,
an undirected graph is suitable.
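For instance, a minimal sketch (assuming the networkx package, which is not otherwise used on this
sheet) contrasting the two representations:

In [ ]: import networkx as nx

# Directed: a Twitter-style "follows" relation is one-way
follows = nx.DiGraph()
follows.add_edge("alice", "bob")
print(follows.has_edge("alice", "bob"))  # True
print(follows.has_edge("bob", "alice"))  # False

# Undirected: mutual friendship is symmetric by construction
friends = nx.Graph()
friends.add_edge("alice", "bob")
print(friends.has_edge("bob", "alice"))  # True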

b) Weighted vs. unweighted.

Solution:


Weighted graphs assign a weight or numerical value to each edge, indicating the strength, distance, cost,
or any other relevant measure associated with the connection between vertices. In contrast, unweighted
graphs do not have assigned weights, and all edges are considered to have equal importance or distance.
Here are examples of real-world networks that can be modeled as weighted or unweighted graphs:

1. Transportation Networks:

Weighted: A transportation network, such as a subway system or airline routes, can be represented as a
weighted graph. The weights on the edges can represent the distance, travel time, or cost between
different locations.

Unweighted: In a simple road map where the focus is on connectivity rather than specific distances or
travel times, an unweighted graph can be used to represent the road network. Each road segment is
treated as equally connected to its neighboring intersections.

2. Social Networks:

Weighted: In a social network, where relationships can have different strengths or closeness, a
weighted graph can be employed. The weights on the edges can represent the strength of friendship,
frequency of interaction, or any other measure of relationship intensity.

Unweighted: In certain social networks, such as a follow graph on Twitter or an unweighted co-authorship
network, the connections between individuals are considered binary, indicating the presence or absence
of a relationship. In this case, an unweighted graph suffices to represent the network structure.

The decision to use a weighted or unweighted graph depends on the specific attributes or properties of
the network being modeled. If there are significant variations in the relationships, distances, or strengths
between vertices, a weighted graph provides a more detailed representation. Conversely, if the emphasis is
on connectivity or binary relationships without considering varying weights, an unweighted graph is
sufficient.
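As an illustration, a small sketch (again assuming networkx) of a weighted transportation graph in which
edge weights encode travel time, so that shortest-path queries take the weights into account:

In [ ]: import networkx as nx

metro = nx.Graph()
metro.add_edge("Airport", "Center", weight=25)  # travel time in minutes
metro.add_edge("Airport", "Harbor", weight=10)
metro.add_edge("Harbor", "Center", weight=10)

# The route via Harbor (20 minutes) beats the direct edge (25 minutes)
print(nx.shortest_path(metro, "Airport", "Center", weight="weight"))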

c) Simple vs. non-simple.

Solution:


In graph theory, a simple graph is an undirected graph that does not contain any self-loops (an edge
connecting a vertex to itself) or multiple edges (more than one edge connecting the same pair of vertices).
A non-simple graph, therefore, refers to a graph that has self-loops or multiple edges. Let's consider
examples of real-world networks that can be categorized as simple or non-simple:

1. Social Networks:

Simple: In most social networks, such as friendship networks or online social platforms, the
relationships between individuals are typically modeled as simple graphs. Each connection represents a
unique relationship, and self-loops or multiple connections between the same individuals are not
considered.

Non-simple: In certain cases, such as modeling interactions on online forums or social networks that
allow users to follow themselves or have multiple connections with the same individuals, the graph
representation might include self-loops or multiple edges. These scenarios lead to non-simple graphs.

2. Transportation Networks:

Simple: Road networks, where each road segment connects two distinct intersections, can be modeled as
simple graphs. There are no self-loops or multiple roads connecting the same pair of intersections.

Non-simple: In some cases, transportation networks have self-loops or multiple edges. For example, in a
transportation network where a road or rail line loops back on itself or where multiple parallel roads
or rail lines exist between two locations, the graph representation would be non-simple.

The decision to model a network as simple or non-simple depends on the specific characteristics and
requirements of the network being studied. In most cases, simple graphs are used when the focus is on
the basic relationships or connections between entities, while non-simple graphs are used when there are
additional considerations such as self-loops or multiple connections that need to be accounted for.
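A brief sketch (assuming networkx) of how non-simple features are represented: a MultiGraph permits
parallel edges, and a self-loop is simply an edge from a node to itself.

In [ ]: import networkx as nx

g = nx.MultiGraph()
g.add_edge("A", "B")  # two parallel rail lines between the same stations
g.add_edge("A", "B")
g.add_edge("C", "C")  # a loop road returning to the same intersection

print(g.number_of_edges("A", "B"))  # 2 parallel edges
print(list(nx.selfloop_edges(g)))   # [('C', 'C')]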

d) Sparse vs. dense.

Solution:


In graph theory, the terms sparse and dense refer to the number of edges in a graph relative to the total
possible number of edges. Let's consider examples of real-world networks that can be categorized as
sparse or dense:

1. Social Networks:

Sparse: In social networks, the number of connections between individuals tends to be relatively small
compared to the total number of possible connections. This is especially true in large-scale social
networks where individuals typically have a limited number of connections compared to the total
population. Therefore, social networks are often considered sparse graphs.

Dense: In certain social networks or small-scale communities where individuals tend to have a high
number of connections or there is a higher level of interconnectedness, the graph representation may be
considered dense.

2. Collaboration Networks:

Sparse: Collaboration networks, such as co-authorship networks or scientific collaboration networks,
are often sparse due to the limited number of collaborations between researchers or individuals in a
specific field. While there may be some highly collaborative communities, overall the number of
collaborations is much smaller than the total possible.

Dense: In specialized collaboration networks or tightly knit research communities where researchers
frequently collaborate with each other, the graph representation may be dense.

The characterization of a graph as sparse or dense is relative and depends on the specific context and
network being analyzed. Density compares the number of edges to the maximum possible number of edges,
which for a simple undirected graph on |V| vertices is |V|(|V| - 1)/2. A graph with far fewer edges than
this maximum is considered sparse, while a graph whose edge count approaches it is considered dense.
The threshold for calling a graph sparse or dense is subjective and can vary based on the specific
application or research domain.
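As a quick check, a small sketch (assuming networkx) comparing densities: nx.density divides the number
of edges by the maximum possible number of edges.

In [ ]: import networkx as nx

sparse = nx.path_graph(100)    # 100 nodes, only 99 of 4950 possible edges
dense = nx.complete_graph(10)  # 10 nodes, all 45 possible edges

print(nx.density(sparse))  # 0.02 -> sparse
print(nx.density(dense))   # 1.0  -> dense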

e) Embedded vs. topological.


Solution:


In graph theory, embedded and topological graphs refer to different aspects of representing graphs in
physical space or considering only the connectivity patterns without specific spatial arrangements. Let's
explore examples of real-world networks that can be categorized as embedded or topological:

1. Road Networks:

Embedded: Road networks are often represented as embedded graphs since they have a physical layout in
geographical space. The positions and connections of roads correspond to their real-world locations.
The spatial information of intersections, road segments, and their connectivity is considered in the
graph representation.

Topological: While road networks can be embedded graphs, they can also be studied from a topological
perspective without considering the precise geographical locations. The focus may be on the
connectivity patterns, the relationships between intersections, or the hierarchy of roads, irrespective
of their exact positions.

2. Social Networks:

Topological: Social networks, such as online social platforms, are often represented as topological
graphs, emphasizing the connectivity patterns and relationships between individuals. The graph
structure captures the connections and interactions between users, but it does not consider the
physical embedding of individuals in space.

Embedded: In some cases, social networks can be represented as embedded graphs when the spatial aspect
is relevant. For example, in location-based social networks where connections between individuals are
influenced by physical proximity or geographic factors, the graph representation may incorporate
spatial information.

The distinction between embedded and topological graphs depends on the focus of the analysis and the
relevance of spatial information. In embedded graphs, the physical positions or coordinates of vertices are
considered, whereas topological graphs focus on connectivity patterns without considering the precise
spatial arrangement. The choice between embedded or topological representation depends on the specific
characteristics of the network and the research questions being addressed.

f) Labeled vs. unlabeled.

Solution:


In graph theory, labeled and unlabeled graphs refer to whether the vertices or edges of a graph are
assigned labels or identifiers. Let's explore examples of real-world networks that can be categorized as
labeled or unlabeled:

1. Social Networks:

Labeled: In social networks, it is common to have labeled graphs where the vertices represent
individuals or users and are assigned labels such as usernames or unique identifiers. This labeling
allows for identification and tracking of specific individuals within the network.

Unlabeled: Alternatively, social networks can be represented as unlabeled graphs when the focus is on
the structural properties and relationships rather than the specific identification of individuals. The
vertices represent individuals without any specific labels or identifiers attached to them.

2. Biological Networks:

Labeled: Biological networks, such as protein-protein interaction networks or gene regulatory networks,
are often labeled graphs. Each vertex or edge represents a biological entity (protein, gene, etc.) and
is assigned a specific label or identifier based on its biological name, symbol, or accession number.

Unlabeled: In some biological networks where the focus is on the structural patterns or connectivity,
an unlabeled representation may be used. The vertices represent biological entities without explicit
labels attached to them, and the analysis revolves around the network structure rather than individual
entity labels.

The choice between using labeled or unlabeled graphs depends on the specific requirements and goals of
the analysis. Labeled graphs provide additional information and enable identification and tracking of
specific entities within the network. Unlabeled graphs, on the other hand, focus on the overall network
structure and are useful for studying connectivity patterns and relationships without emphasizing
individual labels or identifiers.

It's important to note that in some cases, a graph can have a combination of labeled and unlabeled
components. For example, a social network might have labeled vertices representing users, while the
edges remain unlabeled and represent connections between users.

9.) Again, consider the following set of data points in 2D space:

In [256]: X_knn = np.array([[1, 2], [3, 4], [1, 3], [4, 1], [3, 2], [6, 8], [8, 7], [6, 7], [7, 7], [8, 8]])  # last point assumed; truncated in the export
y_knn = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

# Separate data points by class for the sake of plotting


X_knn_0 = [X_knn[i] for i in range(len(X_knn)) if y_knn[i]==0]
X_knn_1 = [X_knn[i] for i in range(len(X_knn)) if y_knn[i]==1]

# Create scatter plot


plt.figure()
plt.scatter([x[0] for x in X_knn_0], [x[1] for x in X_knn_0], color='red', label='Class 0')
plt.scatter([x[0] for x in X_knn_1], [x[1] for x in X_knn_1], color='blue', label='Class 1')

# Set the labels and title


plt.xlabel('X1')
plt.ylabel('X2')
plt.title('data')
plt.legend()
plt.show()
Please perform a KNN classification of the new point p = [4, 4] for k=3 using
sklearn.neighbors.KNeighborsClassifier . Use the Euclidean distance as a metric. (Note: Please
plot the new point p in the same plot with the other points from the dataset, but color it differently
from the points in the dataset.)

Solution:

In [1]: import numpy as np


import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier

# Given dataset
X_knn = np.array([[1, 2], [3, 4], [1, 3], [4, 1], [3, 2], [6, 8], [8, 7], [6, 7], [7, 7], [8, 8]])  # last point assumed; truncated in the export
y_knn = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

# Train the KNN classifier


k = 3
model = KNeighborsClassifier(n_neighbors=k)
model.fit(X_knn, y_knn)

# The new point


p = np.array([4, 4]).reshape(1, -1)

# Perform the classification


predicted_class = model.predict(p)
print("The predicted class of the new point p is: ", predicted_class)

# Plot the original data


X_knn_0 = X_knn[y_knn == 0]
X_knn_1 = X_knn[y_knn == 1]
plt.scatter(X_knn_0[:, 0], X_knn_0[:, 1], color='red', label='Class 0')
plt.scatter(X_knn_1[:, 0], X_knn_1[:, 1], color='blue', label='Class 1')

# Plot the new point


plt.scatter(p[0, 0], p[0, 1], color='green', label='New point')

plt.xlabel('X1')
plt.ylabel('X2')
plt.title('Data points and the new point')
plt.legend()
plt.show()

The predicted class of the new point p is: [0]
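To see which points cast the votes, the classifier's kneighbors method (standard in sklearn) returns
the Euclidean distances and indices of the k nearest training points:

In [ ]: distances, indices = model.kneighbors(p)
print("Nearest neighbors:", X_knn[indices[0]])
print("Their labels:", y_knn[indices[0]])
print("Distances:", distances[0])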

Finally: Submission
Save your notebook and submit it (as both notebook and PDF file). And please don't forget to ...

... choose a file name according to convention (see Exercise Sheet 1, but please add your group
name as a suffix like _group01 ) and to
... include the execution output in your submission!
