Graph Theory
In a weighted graph, each connection between two nodes has a certain value
associated with it. For example, it may represent the number of bonds in a
molecule of ethene, as below.
[Figure: an ethene molecule drawn as a graph, with the C=C double bond as an edge of weight 2 and each C–H single bond as an edge of weight 1]
This is also particularly useful for neural networks because weights between
neurones are important parameters in generating accurate output. In general,
a weight can be thought of as the “magnitude” of an edge, denoting the value of whatever metric is relevant to the problem.
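As a concrete sketch (the node names and the helper function here are illustrative, not from the original), a weighted graph such as the ethene example could be stored as a dictionary mapping each node to (neighbour, weight) pairs:

```python
# One way to store a weighted graph: each node maps to a list of
# (neighbour, weight) pairs. The weights follow the ethene example:
# the C=C double bond has weight 2, each C-H single bond weight 1.
weighted_graph = {
    "C1": [("C2", 2), ("H1", 1), ("H2", 1)],
    "C2": [("C1", 2), ("H3", 1), ("H4", 1)],
    "H1": [("C1", 1)],
    "H2": [("C1", 1)],
    "H3": [("C2", 1)],
    "H4": [("C2", 1)],
}

def edge_weight(graph, u, v):
    """Return the weight of the edge between u and v, or None if absent."""
    for neighbour, weight in graph[u]:
        if neighbour == v:
            return weight
    return None

print(edge_weight(weighted_graph, "C1", "C2"))  # 2
```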
[Figure: an example weighted graph]
This is applicable in the internet layer of the TCP/IP stack, where routers determine the shortest route from source to destination during packet switching.
Furthermore, directions can denote journeys. In problems like the Travelling
Salesman Problem, this is significant because journeys are often uni-directional.
A constraint may include beginning at a certain source node and ending at a
given destination node. Therefore, directions can help model fundamental constraints such as direction of travel. Within the class of directed graphs, there are several further distinctions; one significant property of a directed graph is whether it is cyclic or acyclic. A directed cyclic graph contains a cycle: when traversing the nodes, it is possible to go around that cycle indefinitely. The graph above does not have this feature, because, starting at node 1, one has no choice but to go from 1 to 2 to 3 to 4. Therefore it is acyclic. However, adding just one (directed) edge to this graph makes it cyclic.
[Figure: the directed graph above with one extra edge added, creating a cycle]
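The cyclic/acyclic distinction can also be checked programmatically. The sketch below (an assumed implementation, not from the original text) uses a DFS that tracks which nodes are on the current path; reaching one of them again means a cycle exists:

```python
# Detect a cycle in a directed graph: nodes currently on the recursion
# path are "visiting"; meeting one of them again is a back edge, which
# closes a cycle.
def has_cycle(adj_list):
    visiting, done = set(), set()

    def dfs(node):
        visiting.add(node)
        for neighbour in adj_list.get(node, []):
            if neighbour in visiting:  # back edge: cycle found
                return True
            if neighbour not in done and dfs(neighbour):
                return True
        visiting.remove(node)
        done.add(node)
        return False

    return any(dfs(node) for node in adj_list if node not in done)

# The acyclic path 1 -> 2 -> 3 -> 4 from the text:
acyclic = {1: [2], 2: [3], 3: [4], 4: []}
# Adding one directed edge (4 -> 1) creates a cycle:
cyclic = {1: [2], 2: [3], 3: [4], 4: [1]}
print(has_cycle(acyclic), has_cycle(cyclic))  # False True
```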
Graphs that can be “embedded in the plane” are classified as planar. This
means that, whilst preserving the graph as G(V, E) with the set of vertices V
and the set of edges E, it is possible to configure it such that none of the edges
intersect each other (except at their common endpoints). For instance, the left-
most graph below is actually planar because it can be embedded in the plane
such that no edges touch each other except at shared nodes. It does not appear
to be planar because the edge (6, 1) crosses the edge (2, 3), yet this is easily
solved in the rightmost graph below. Because they are fundamentally the same
graph as G(V, E) is entirely preserved, their planarity is the same: the second
graph is visibly planar, and thus so is the first. A graph is only non-planar if
there is no way to embed it in the plane.
[Figure: two drawings of the same graph on nodes 1–6. In the left drawing the edge (6, 1) crosses the edge (2, 3); in the right drawing the graph is redrawn with no crossings.]
In the complete graph K4 below, the edge (1, 3) appears to cross the edge (2, 4), and yet, when the latter of those edges is redrawn, the graph ostensibly becomes planar. So, K4 is planar.
[Figure: two drawings of K4 on nodes 1–4. In the left drawing the diagonal edges cross; in the right drawing one diagonal is redrawn outside the square so that no edges cross.]
[Figure: the three utilities graph K3,3, connecting each of the utilities U1, U2, U3 to each of the houses H1, H2, H3]
The nodes bounding a face (a region enclosed by edges, including the outer, infinitely large region) alternate between house and utility, and so each face is bounded by at least 4 edges, because each house connects to a given utility exactly once. If the graph is planar, any edge touches exactly 2 faces. Counting edge–face incidences both ways gives 4F ≤ 2E, and so the number of faces is at most half the number of edges.
F ≤ E/2
Further, Euler’s Characteristic Formula states that V − E + F = 2 for a connected planar graph, where V is the number of nodes, E is the number of edges, and F is the number of faces. Rearranging,
F = E − V + 2
Substituting this into the inequality above,
E − V + 2 ≤ E/2
2E − 2V + 4 ≤ E
E ≤ 2V − 4
For the three utilities graph, each of the 3 houses connects to each of the 3 utilities, and so
E = 3 × 3 = 9
V = 6
2V − 4 = 2(6) − 4 = 8
This implies that 9 ≤ 8, and so there must be a flaw in the original assump-
tion (that it would be possible to create a planar graph to connect the houses
to the utilities). Thus, it is impossible to connect the houses and utilities in a
planar arrangement.
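The arithmetic above is simple enough to verify directly; a minimal check in Python:

```python
# Verify that K3,3 (3 houses, 3 utilities) violates the planarity
# bound E <= 2V - 4 derived above.
V = 6          # number of nodes
E = 3 * 3      # each of the 3 houses connects to each of the 3 utilities
bound = 2 * V - 4
print(E, bound, E <= bound)  # 9 8 False
```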
Graphs have all sorts of useful applications, even in abstract Computer Sci-
ence problems. So, it is important to represent them as efficiently accessible
data structures. In an adjacency list, each node is mapped to a list of all of its
neighbours. Consider the below graph.
[Figure: an undirected graph on nodes 1–5]
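Assuming the edges in the figure are (1, 2), (2, 3), (2, 4), and (4, 5), the adjacency list could be written in Python as:

```python
# Adjacency list: each node maps to a list of its neighbours.
# In an undirected graph, every edge appears in both directions.
adj_list = {
    1: [2],
    2: [1, 3, 4],
    3: [2],
    4: [2, 5],
    5: [4],
}
print(adj_list[2])  # neighbours of node 2: [1, 3, 4]
```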
In an adjacency matrix, each cell (i, j) is marked 1 if there is an edge between nodes i and j, and 0 otherwise. For instance, in the graph above there is an edge between nodes 2 and 3; therefore, the cells (2, 3) and (3, 2) are marked 1.
However, there is no edge between nodes 3 and 5, so the cells (3, 5) and (5, 3) are marked 0. The adjacency matrix for our graph is below. This could be implemented in code using a two-dimensional list with one sublist for each row.
    1  2  3  4  5
1   0  1  0  0  0
2   1  0  1  1  0
3   0  1  0  0  0
4   0  1  0  0  1
5   0  0  0  1  0
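As a sketch of that two-dimensional-list implementation (the helper function is illustrative, and the edges are assumed to be (1, 2), (2, 3), (2, 4), and (4, 5)):

```python
# The adjacency matrix as a two-dimensional list, one sublist per row.
# Rows and columns are 0-indexed here, so node n lives at index n - 1.
adj_matrix = [
    [0, 1, 0, 0, 0],  # node 1
    [1, 0, 1, 1, 0],  # node 2
    [0, 1, 0, 0, 0],  # node 3
    [0, 1, 0, 0, 1],  # node 4
    [0, 0, 0, 1, 0],  # node 5
]

def connected(u, v):
    """True if there is an edge between nodes u and v (1-indexed)."""
    return adj_matrix[u - 1][v - 1] == 1

print(connected(2, 3), connected(3, 5))  # True False
```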
One method of traversal is the Depth First Search (DFS). It has time complexity O(n + E), where n is the number of nodes and E is the number of edges; the time taken for the algorithm to run grows linearly with the number of nodes and edges. A simple way to implement it is recursively, as in the Python code listing below. During a recursive DFS, an empty set is initialised to keep track of which nodes have been visited; the call stack itself keeps track of the path currently being explored. The search begins from the source node.
def dfs(adj_list, source_node, visited=None):
    # Keep one visited set shared across all recursive calls.
    if visited is None:
        visited = set()
    if source_node not in visited:
        print(source_node)
        visited.add(source_node)
        for neighbour in adj_list[source_node]:
            dfs(adj_list, neighbour, visited)
One commonly cited method, the greedy method, is a heuristic that takes the optimal choice at each individual stage. Notably, this does not mean that the overall resulting path is the shortest, as it is a superficial method; it only looks one edge deep into the proposed path.
For example, in the below graph, if we are aiming to find the shortest path from node 1 to node 4, we are initially faced with the choice of going to node 2 or to node 3. Node 1 to node 2 has a weight of 6, whereas node 1 to node 3 has a weight of only 2. So, node 3 is visited. From here, the only choice is to go directly to node 4, which has a weight of 8. The total distance using this algorithm is 10, whereas taking the alternative route (through node 2) has a distance of only 7.
[Figure: a weighted directed graph with edges (1, 2) of weight 6, (1, 3) of weight 2, (2, 4) of weight 1, and (3, 4) of weight 8]
The described algorithm, given the adjacency list {1: [[2, 6], [3, 2]], 2: [[4, 1]], 3: [[4, 8]], 4: []}, would output ([1, 3, 4], 10).
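A sketch of this greedy heuristic, assuming the adjacency list format above (note that, as described, it is not guaranteed to find the shortest path, and it assumes a path to the destination always exists):

```python
# Greedy path-finding: at each node, follow the single cheapest
# outgoing edge. The adjacency list maps each node to
# [neighbour, weight] pairs, matching the format in the text.
def greedy_path(adj_list, source, destination):
    path, total = [source], 0
    node = source
    while node != destination:
        # Take the locally optimal choice, one edge deep.
        neighbour, weight = min(adj_list[node], key=lambda edge: edge[1])
        path.append(neighbour)
        total += weight
        node = neighbour
    return path, total

adj_list = {1: [[2, 6], [3, 2]], 2: [[4, 1]], 3: [[4, 8]], 4: []}
print(greedy_path(adj_list, 1, 4))  # ([1, 3, 4], 10)
```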
A notable problem that employs techniques in graph theory is the Travelling Salesman Problem (TSP). The TSP is an NP-hard problem that asks for the shortest route in a weighted, undirected graph that traverses every node exactly once and returns to the starting node.
As in the shortest path problem, the TSP can also be approached using a type of greedy algorithm known as the Nearest Neighbour Heuristic. In exactly the same way as before, the nearest unvisited neighbour is always pursued. In graphs with few nodes, it is not a bad solution, because there are few combinations of paths. Yet, as the number of nodes increases, so does the number of valid paths, and it becomes an increasingly weak solution. One way to evaluate the accuracy of the Nearest Neighbour Heuristic is to find the ratio of the distance of its solution to the true minimum distance; that way, a percentage score can be obtained. For instance, if it outputs a distance of 51, and the true minimum distance is 50, then it is 2% off (too high, of course, as it can never be lower than the true minimum). Another greedy algorithm can be used, whereby we repeatedly attempt to add the edges with the smallest weight. An edge is invalid, and therefore rejected, when it creates a cycle (unless it is the final connection) or gives a node a third edge, since every node in the final tour must have exactly 2 edges.
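A sketch of the Nearest Neighbour Heuristic on a complete weighted graph (the graph, stored as a dictionary of dictionaries of edge weights, is illustrative):

```python
# Nearest Neighbour Heuristic for the TSP: always travel to the
# closest unvisited node, then return to the start.
def nearest_neighbour_tour(weights, start):
    tour, unvisited = [start], set(weights) - {start}
    total = 0
    node = start
    while unvisited:
        # Greedily visit the closest unvisited node.
        nearest = min(unvisited, key=lambda n: weights[node][n])
        total += weights[node][nearest]
        tour.append(nearest)
        unvisited.remove(nearest)
        node = nearest
    total += weights[node][start]  # return to the starting node
    tour.append(start)
    return tour, total

weights = {
    1: {2: 4, 3: 1, 4: 3},
    2: {1: 4, 3: 2, 4: 5},
    3: {1: 1, 2: 2, 4: 6},
    4: {1: 3, 2: 5, 3: 6},
}
print(nearest_neighbour_tour(weights, 1))  # ([1, 3, 2, 4, 1], 11)
```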
Given a reasonable solution, there are certain algorithms that can be applied to refine it into an even better solution. One of the most common improvement algorithms is an n-opt improvement, defined as considering all combinations of n edges and reconfiguring them to shorten the path, whilst keeping the path a valid solution. Doing this for all possible groups of n edges makes the path “n-opt optimal”. For instance, in the below graph, to make a 2-opt improvement, we would look at all the pairs of edges, the edges being (1, 5), (2, 4), (2, 3), (2, 5), and (1, 3). We might start with the first two and swap them around; in other words, we would consider an alternative graph with the edges (1, 2) and (4, 5) instead of (1, 5) and (2, 4). Do this for all possible pairs of edges, keeping the new graph whenever the sum of its weights is less than the sum of the weights of the current solution. Once no pair of edges can be improved in this way, the path is 2-opt optimal.
[Figure: the example tour on nodes 1–5 before and after a 2-opt swap]
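A 2-opt improvement pass can be sketched as follows. This uses the standard segment-reversal formulation, which is equivalent to removing a pair of edges and reconnecting their endpoints; the graph and weights here are illustrative, not those of the figure:

```python
def tour_length(weights, tour):
    # Sum of the weights along consecutive edges of the tour.
    return sum(weights[a][b] for a, b in zip(tour, tour[1:]))

def two_opt(weights, tour):
    # Repeatedly reverse a segment of the tour (i.e. swap out a pair
    # of edges) whenever doing so shortens it; stop when no swap helps.
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 2):
            for j in range(i + 1, len(tour) - 1):
                candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
                if tour_length(weights, candidate) < tour_length(weights, tour):
                    tour = candidate
                    improved = True
    return tour

# Four nodes at the corners of a square: sides weigh 2, diagonals 3.
weights = {
    1: {2: 2, 3: 3, 4: 2},
    2: {1: 2, 3: 2, 4: 3},
    3: {1: 3, 2: 2, 4: 2},
    4: {1: 2, 2: 3, 3: 2},
}
# The tour [1, 3, 2, 4, 1] crosses itself (length 10); 2-opt untangles it.
print(two_opt(weights, [1, 3, 2, 4, 1]))  # [1, 2, 3, 4, 1], length 8
```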