Kolmogorov-Arnold Networks (KANs) Are Being Used To Boost Graph Deep Learning Like Never Before
KANs have gained a lot of attention since they were published in April 2024.
They are being used to solve several machine-learning problems that previously used Multi-layer Perceptrons
(MLPs), and their results have been impressive.
Researchers have now extended KANs to graph-structured data, calling the new neural network architecture Graph Kolmogorov-Arnold Networks (GKANs).
https://levelup.gitconnected.com/kolmogorov-arnold-networks-kans-are-being-used-to-boost-graph-deep-learning-like-never-before-2d39fec7dfc3 1/19
29/06/2024, 12:51 Kolmogorov-Arnold Networks (KANs) Are Being Used To Boost Graph Deep Learning Like Never Before | by Dr. Ashish Bama…
They found that GKANs achieve higher accuracy in semi-supervised learning tasks on a real-world graph
dataset (Cora) than the traditional ML models used for Graph Deep Learning, i.e. Graph Convolutional Networks
(GCNs).
Here is a story where we dive deep into GKANs, learn how they are used with graph-structured data, and discuss
how they surpass traditional approaches in Graph Deep Learning.
Graph visualised (Image from author’s upcoming book ‘Computer Science In 100 Images’)
Some examples of real-world data that is structured in the form of graphs include:
Biological molecules such as Proteins (Amino acids as nodes and bonds as edges)
Graph Deep Learning is a set of methods developed to learn from such graph-structured data and solve
problems based on this learning.
Graph Embedding generation (generating low dimensional representations of higher dimensional graphs)
Graph anomaly detection (figuring out abnormal nodes or edges that do not match the expected pattern in a
graph)
Traditionally, these problems have been solved using Graph Neural Networks (GNNs) and their variants (notably
Graph Convolutional Networks), which use MLPs at their core.
Graph Neural Networks visualised (Image from author’s book ‘AI In 100 Images’)
In other words, GCNs are based on the assumption that node labels y are mathematically dependent on both
the node features X and the graph’s structure (i.e. its adjacency matrix A):
y = f(X, A)
A multi-layer GCN updates the node representations by aggregating the information from neighbouring nodes
using a layer-wise propagation rule:
Layer-wise propagation rule employed by a Graph Convolutional Network (Image from original research paper)
where:
Ã = A + I is the augmented adjacency matrix, i.e. the graph’s adjacency matrix A with a self-connection
added for each node (I is the identity matrix).
D̃ is the diagonal degree matrix of Ã, where the i-th diagonal element is the degree of node i in the
augmented graph.
D̃^(-1/2) Ã D̃^(-1/2) symmetrically normalizes Ã so that each node’s influence is appropriately scaled
by its degree. This normalized adjacency matrix is usually denoted Â.
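The three definitions above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the paper's code: the function name `normalized_adjacency` and the toy path graph are assumptions made for the example, and the adjacency matrix is assumed to be dense with non-negative entries.

```python
import numpy as np

def normalized_adjacency(A):
    """Return D~^(-1/2) (A + I) D~^(-1/2) for a dense adjacency matrix A."""
    A_tilde = A + np.eye(A.shape[0])      # add a self-connection to every node
    deg = A_tilde.sum(axis=1)             # node degrees in the augmented graph
    d_inv_sqrt = 1.0 / np.sqrt(deg)       # degrees are >= 1 for non-negative A
    # Scale row i and column j by 1/sqrt(deg_i) and 1/sqrt(deg_j)
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# Toy undirected path graph: 0 - 1 - 2 (hypothetical example data)
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
A_hat = normalized_adjacency(A)
```

For this toy graph the augmented degrees are 2, 3 and 2, so `A_hat[0, 0]` is 1/2, `A_hat[1, 1]` is 1/3, and the matrix stays symmetric.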
A simple two-layer GCN’s forward propagation (used for node classification) can be expressed as follows:
Forward propagation in a 2-layer GCN for graph node classification (Image from original research paper)
where:
W(0) and W(1) are the weight matrices for the first and second layers, respectively. These weights are
optimized using Gradient Descent.
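A two-layer GCN of this kind is commonly written as Z = softmax(Â ReLU(Â X W(0)) W(1)). The sketch below assumes that standard form and only shows the forward pass. The helper names, the toy graph, and the random (untrained) weights are all assumptions for illustration.

```python
import numpy as np

def normalized_adjacency(A):
    # A_hat = D~^(-1/2) (A + I) D~^(-1/2)
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def gcn_forward(A_hat, X, W0, W1):
    """Two-layer GCN forward pass: softmax(A_hat ReLU(A_hat X W0) W1)."""
    H = np.maximum(A_hat @ X @ W0, 0.0)   # first layer: aggregate, project, ReLU
    return softmax(A_hat @ H @ W1)        # second layer: aggregate, project, softmax

rng = np.random.default_rng(0)
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])           # toy graph with 3 nodes
X = rng.normal(size=(3, 4))               # 4 input features per node
W0 = rng.normal(size=(4, 8))              # untrained weights, hidden size 8
W1 = rng.normal(size=(8, 2))              # untrained weights, 2 classes
Z = gcn_forward(normalized_adjacency(A), X, W0, W1)
```

Each row of `Z` is a probability distribution over the classes for one node. In practice W0 and W1 would be fitted by gradient descent rather than drawn at random.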
Now that we know about GCNs, let’s move on to KANs.
They are a promising alternative to the currently popular MLPs that are based on the Universal Approximation
Theorem.
The core idea behind KANs is to use learnable univariate activation functions (shaped as a B-Spline) on the
edges and simple summations on the nodes of a neural network.
This contrasts with MLPs that use learnable weights on the edges while having a fixed activation function on
the neural network nodes.
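To make the contrast concrete, here is a toy KAN-style layer. It is a simplified sketch, not the reference KAN implementation: it uses degree-1 B-splines (piecewise-linear "hat" functions) on a fixed uniform grid and leaves out any additional terms a full KAN layer may include. The names `hat_basis` and `ToyKANLayer`, the grid range [-1, 1], and the grid size are hypothetical choices for the example.

```python
import numpy as np

def hat_basis(x, grid):
    """Degree-1 B-spline ("hat") basis on a uniform grid.

    x has any shape (...); the result has shape (..., len(grid)).
    """
    step = grid[1] - grid[0]
    return np.clip(1.0 - np.abs(x[..., None] - grid) / step, 0.0, None)

class ToyKANLayer:
    """Each edge (input i -> output j) carries its own learnable 1-D function."""

    def __init__(self, in_dim, out_dim, grid_size=5, x_min=-1.0, x_max=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.grid = np.linspace(x_min, x_max, grid_size)
        # coef[i, j, :] holds the spline coefficients of the function on edge i -> j
        self.coef = rng.normal(scale=0.1, size=(in_dim, out_dim, grid_size))

    def __call__(self, x):
        basis = hat_basis(x, self.grid)                     # (N, in_dim, G)
        # phi_ij(x_i) = sum_g coef[i, j, g] * basis[n, i, g]; nodes just sum over i
        return np.einsum("nig,iog->no", basis, self.coef)   # (N, out_dim)

layer = ToyKANLayer(in_dim=3, out_dim=2)
x = np.array([[0.3, -0.2, 0.9],
              [0.0, 0.5, -1.0]])
y = layer(x)   # shape (2, 2)
```

An MLP layer would instead compute a fixed activation of `x @ W`. Here the learnable parameters are the spline coefficients on each edge, and the node operation is a plain sum.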
A comparison between MLPs and KANs (Image from the research paper titled ‘KAN: Kolmogorov–Arnold Networks’
published on arXiv)
On the other hand, given the same number of parameters, KANs take longer to train than MLPs.
The researchers aimed to find out whether GKANs could effectively learn from both labelled and unlabelled data
in a semi-supervised setting and outperform traditional graph learning methods.
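In semi-supervised node classification, the usual setup is that every node takes part in message passing, but the loss is computed only on the nodes whose labels are known. The sketch below assumes that standard setup. The function name `masked_cross_entropy` and the toy numbers are hypothetical.

```python
import numpy as np

def masked_cross_entropy(probs, labels, train_mask):
    """Mean negative log-likelihood over labelled nodes only."""
    idx = np.flatnonzero(train_mask)        # indices of the labelled nodes
    picked = probs[idx, labels[idx]]        # predicted probability of the true class
    return -np.mean(np.log(picked + 1e-12)) # small epsilon guards against log(0)

# Predicted class probabilities for 4 nodes and 2 classes (toy values)
probs = np.array([[0.7, 0.3],
                  [0.4, 0.6],
                  [0.5, 0.5],
                  [0.2, 0.8]])
labels = np.array([0, 1, 0, 1])             # entries for unlabelled nodes are ignored
train_mask = np.array([True, False, False, True])
loss = masked_cross_entropy(probs, labels, train_mask)
```

Only nodes 0 and 3 contribute to `loss` here. The unlabelled nodes still influence the predictions through the graph structure, which is what lets the model learn from both kinds of data.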
The team developed two GKAN architectures, which are described below.
In other words, the node embeddings are first aggregated using the normalized adjacency matrix, and then they
are passed through the KAN layer.
where:
H(l) and H(l+1) represent the node feature matrix at layers l and l+1 , respectively
the KANLayer operation applies learnable univariate activation functions (B-splines) to the aggregated node
features ÂH(l)
Forward propagation model for GKAN Architecture 1 with L layers (Image created by author)
In other words, the node features are first passed through the KAN layer and then aggregated using the
normalized adjacency matrix.
where:
H(l) and H(l+1) represent the node feature matrix at layers l and l+1 , respectively
the KANLayer operation applies learnable univariate activation functions (B-splines) to each element of the
input node features H(l)
Forward propagation model for GKAN Architecture 2 with L layers (Image created by author)
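The difference between the two architectures is only the order of aggregation and the KAN layer: Architecture 1 computes KANLayer(Â H(l)), while Architecture 2 computes Â KANLayer(H(l)). The sketch below illustrates one layer of each under stated assumptions: `toy_kan_layer` is a hypothetical stand-in that uses piecewise-linear (degree-1 spline) edge functions on a fixed grid, and the graph, features, and coefficients are toy values, not the paper's configuration.

```python
import numpy as np

def toy_kan_layer(H, coef, grid):
    """Stand-in for KANLayer: one piecewise-linear function per edge, summed per output."""
    step = grid[1] - grid[0]
    basis = np.clip(1.0 - np.abs(H[..., None] - grid) / step, 0.0, None)  # (N, in, G)
    return np.einsum("nig,iog->no", basis, coef)                          # (N, out)

def gkan_arch1_layer(A_hat, H, coef, grid):
    return toy_kan_layer(A_hat @ H, coef, grid)   # aggregate first, then KAN layer

def gkan_arch2_layer(A_hat, H, coef, grid):
    return A_hat @ toy_kan_layer(H, coef, grid)   # KAN layer first, then aggregate

# Normalized adjacency of a toy path graph 0 - 1 - 2
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
A_tilde = A + np.eye(3)
d = 1.0 / np.sqrt(A_tilde.sum(axis=1))
A_hat = A_tilde * d[:, None] * d[None, :]

rng = np.random.default_rng(0)
grid = np.linspace(-1.0, 1.0, 5)
H = rng.uniform(-0.5, 0.5, size=(3, 4))           # 3 nodes, 4 features each
coef = rng.normal(size=(4, 2, 5))                 # 4 inputs, 2 outputs, 5 grid points
out1 = gkan_arch1_layer(A_hat, H, coef, grid)
out2 = gkan_arch2_layer(A_hat, H, coef, grid)
```

Because the edge functions are nonlinear, `out1` and `out2` generally differ. If every edge function were the identity, the two orderings would give the same result, which is a quick way to sanity-check an implementation.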
The Cora dataset is a citation network that consists of documents as nodes and the citation links between these
documents as edges.
The dataset has 7 classes, and each document is described by 1433 features.
Next, the performance of both GKAN architectures was compared to that of a conventional GCN with a comparable
number of parameters, on both training and test data, using subsets of the first 100 and the first 200 features
out of the 1433 available in the dataset.
GKANs achieved higher accuracy than GCNs for both 100 and 200 feature sets.
On the first 100 features of the dataset, both GKAN architectures achieved higher accuracy than the GCN.
Notably, the GKAN Architecture 2 achieved 61.76% accuracy compared to 53.5% for GCN.
Performance of different architectures on the first 100 features of the Cora dataset, where k is the polynomial degree
in the spline functions, g is the spline grid size, and h is the size of hidden layers. (Image from original research paper)
Similarly, both GKAN architectures achieved higher accuracy than the GCN on the first 200 features of the
dataset, with the GKAN Architecture 2 achieving 67.66% accuracy compared to 61.24% for GCN.
Performance of different architectures on the first 200 features of the Cora dataset (Image from original research
paper)
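To put the two reported comparisons side by side, the short script below computes the absolute and relative accuracy gains of GKAN Architecture 2 over the GCN. It uses only the four accuracy figures quoted in the text; the dictionary layout and the `gains` helper are illustrative assumptions.

```python
# Accuracies (%) quoted in the text for the first 100 and first 200 Cora features
results = {
    100: {"GCN": 53.5, "GKAN Architecture 2": 61.76},
    200: {"GCN": 61.24, "GKAN Architecture 2": 67.66},
}

def gains(gcn_acc, gkan_acc):
    """Return (absolute gain in percentage points, relative gain in %)."""
    absolute = gkan_acc - gcn_acc
    relative = 100.0 * absolute / gcn_acc
    return absolute, relative

for n_features, acc in results.items():
    absolute, relative = gains(acc["GCN"], acc["GKAN Architecture 2"])
    print(f"first {n_features} features: +{absolute:.2f} points ({relative:.1f}% relative)")
```

The gap is about 8.3 percentage points with 100 features and about 6.4 points with 200 features, so the advantage narrows somewhat as more input features become available.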
The training and test accuracy plots below showed that GKANs achieved higher accuracy during both the
training and testing phases.
Training and Test Accuracy plots for different architectures (Image from original research paper)
It was also noted that GKAN architectures showed a sharper decrease in loss values during training and required
fewer epochs to be trained.
Training and Test Loss for different architectures (Image from original research paper)
It was found that the following led to the most effective GKANs.
Training Time
Although GKANs showed high accuracy and better efficiency with faster convergence, researchers noted that
their training process was relatively slow, requiring future optimizations.
KANs have opened up a new avenue for improved graph learning and could also be a promising alternative
for other graph learning approaches (including Graph Autoencoders, Graph Transformers, and more)
that use MLPs at their core.
What are your thoughts on them? Have you used KANs in your projects yet? Let me know in the comments below!
Further Reading
Research paper titled ‘GKAN: Graph Kolmogorov-Arnold Networks’ on ArXiv
Software implementation of GKANs on GitHub (yet to be publicly released by the research team)
Author’s story titled ‘Kolmogorov-Arnold Networks (KANs) Might Change AI As We Know It, Forever’, that explains
KANs in detail
GitHub repository featuring a curated list of projects using Kolmogorov-Arnold Networks (KANs)