
Kolmogorov-Arnold Networks (KANs) Are Being Used To Boost Graph Deep Learning Like Never Before

A deep dive into how Graph Kolmogorov-Arnold Networks (GKANs) are improving Graph Deep Learning to surpass traditional approaches

Dr. Ashish Bamania · Published in Level Up Coding · 8 min read · 2 days ago

Image generated with DALL-E 3

KANs have gained a lot of attention since they were first introduced in April 2024.

They are being used to solve several machine-learning problems that previously used Multi-layer Perceptrons
(MLPs), and their results have been impressive.

A team of researchers recently applied KANs to graph-structured data.

They called this new neural network architecture Graph Kolmogorov-Arnold Networks (GKANs).

And how did it go, you ask?


They found that GKANs achieve higher accuracy in semi-supervised learning tasks on a real-world graph
dataset (Cora) than the traditional ML models used for Graph Deep Learning, i.e. Graph Convolutional Networks
(GCNs).

This is a big step for KANs!

Here is a story where we dive deep into GKANs, learn how they are used with graph-structured data, and discuss
how they surpass traditional approaches in Graph Deep Learning.

But First, What Is Graph Deep Learning?


Graphs are mathematical structures that consist of nodes (or vertices) and edges (or links) connecting these
nodes.

Graph visualised (Image from author’s upcoming book ‘Computer Science In 100 Images’)


Some examples of real-world data that is structured in the form of graphs include:

Social network connections (Users as nodes and relationships as edges)

Recommendation systems (Items as nodes and user interactions as edges)

Chemical Molecules and compounds (Atoms as nodes and bonds as edges)

Biological molecules such as Proteins (Amino acids as nodes and bonds as edges)

Transportation networks like Roadways (Intersections as nodes and pathways as edges)

Graph Deep Learning is a set of methods developed to learn from such graph-structured data and solve
problems based on this learning.

Some of these Graph Learning problems involve:

Graph classification (labelling a graph according to its properties)

Node classification (predicting the label of a new node)

Link prediction (predicting the existence of relations/ edges between nodes)

Graph generation (creating new graphs based on existing ones)

Community detection (identifying clusters of densely connected nodes within a graph)

Graph Embedding generation (generating low dimensional representations of higher dimensional graphs)

Graph clustering (grouping similar graph nodes together)

Graph anomaly detection (figuring out abnormal nodes or edges that do not match the expected pattern in a
graph)

Traditionally, these problems have been solved using Graph Neural Networks (GNNs) and their variants (notably
Graph Convolutional Networks), which use MLPs at their core.


Graph Neural Networks visualised (Image from author’s book ‘AI In 100 Images’)

Let’s explore Graph Convolutional Networks (GCNs) in a bit more detail.

What Are Graph Convolutional Networks?


A Graph Convolutional Network (GCN) combines a graph’s node features with its topology (or how the nodes are
connected in space). This allows it to effectively capture the dependencies and relationships in the graph.

In other words, GCNs are based on the assumption that node labels y are mathematically dependent on both
the node features X and the graph’s structure (i.e. its adjacency matrix A ).

This can be mathematically expressed as:

y = f(X, A)


A multi-layer GCN updates the node representations by aggregating the information from neighbouring nodes
using a layer-wise propagation rule:

Layer-wise propagation rule employed by a Graph Convolutional Network (Image from original research paper)
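Since the image may not be visible here, this rule in the standard GCN formulation (which matches the definitions below) can be written as:

```latex
H^{(l+1)} = \sigma\left( \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)} \right)
```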

where:

à = A + I is the Augmented adjacency matrix or the graph's adjacency matrix with added self-connections
for each node. It is the sum of the graph’s adjacency matrix with its identity matrix I.

D̃ is the diagonal degree matrix of Ã, where each diagonal element is the degree of node i in the augmented graph

D̃^(-1/2) Ã D̃^(-1/2) symmetrically normalizes Ã so that each node's influence is appropriately scaled by its degree. This normalized adjacency matrix is usually denoted Â.

H(l) is the matrix of node features at layer l

H(0) represents the initial node features (or X)

W(l) is the trainable weight matrix at layer l

σ represents an activation function (e.g. ReLU)

A simple two-layer GCN’s forward propagation (used for node classification) can be expressed as follows:

Forward propagation in a 2-layer GCN for graph node classification (Image from original research paper)
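Written out, this two-layer forward pass (with the normalized adjacency matrix Â) is:

```latex
Z = \mathrm{softmax}\left( \hat{A}\, \mathrm{ReLU}\!\left( \hat{A} X W^{(0)} \right) W^{(1)} \right)
```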

where:

 is the normalized adjacency matrix

X is the input feature matrix

W(0) and W(1) are the weight matrices for the first and second layers, respectively. These weights are
optimized using Gradient Descent.
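As a minimal sketch (not the paper's code, and with illustrative names), here is how the symmetric normalization of Ã and this two-layer forward pass could look in PyTorch:

```python
import torch
import torch.nn.functional as F

def normalize_adjacency(A: torch.Tensor) -> torch.Tensor:
    """Compute A_hat = D̃^(-1/2) (A + I) D̃^(-1/2): add self-loops, then
    symmetrically normalize by node degree."""
    A_tilde = A + torch.eye(A.size(0))
    deg_inv_sqrt = A_tilde.sum(dim=1).pow(-0.5)
    D_inv_sqrt = torch.diag(deg_inv_sqrt)
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt

def gcn_forward(A_hat, X, W0, W1):
    """Two-layer GCN: Z = softmax(A_hat · ReLU(A_hat · X · W0) · W1)."""
    H1 = F.relu(A_hat @ X @ W0)               # first graph convolution + ReLU
    return F.softmax(A_hat @ H1 @ W1, dim=1)  # second convolution + class probabilities
```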


Overview of a two-layer GCN (Image from original research paper)

Now that we know about GCNs, let's move on to learning about KANs.

Next, What Are KANs?


A Kolmogorov-Arnold Network (KAN) is a novel neural network architecture based on the Kolmogorov-Arnold representation theorem.

KANs are a promising alternative to the currently popular MLPs, which are based on the Universal Approximation Theorem.

The core idea behind KANs is to use learnable univariate activation functions (shaped as a B-Spline) on the
edges and simple summations on the nodes of a neural network.

This contrasts with MLPs that use learnable weights on the edges while having a fixed activation function on
the neural network nodes.
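For context, the Kolmogorov-Arnold representation theorem states that any multivariate continuous function on a bounded domain can be written as a finite composition of univariate functions and addition:

```latex
f(x_1, \ldots, x_n) = \sum_{q=1}^{2n+1} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)
```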

A comparison between MLPs and KANs (Image from the research paper titled ‘KAN: Kolmogorov–Arnold Networks’ published on arXiv)

When compared with MLPs, KANs:

Lead to smaller computational graphs

Are more parameter-efficient and accurate

Converge faster and achieve lower losses

Have steeper scaling laws

Are highly interpretable

On the other hand, given the same number of parameters, they take longer to train than MLPs.

The Birth Of GKANs


Considering the advantages that KANs offer, researchers devised a novel hybrid architecture called Graph Kolmogorov-Arnold Networks (GKANs) that extends the use of KANs to graph-structured data.

They aimed to find out whether GKANs could effectively learn from both labelled and unlabelled data in a semi-supervised setting and outperform traditional graph learning methods.

The team developed two GKAN architectures, which are described below.

GKAN Architecture 1: Activations After Summation


In this architecture, the learnable univariate activation functions are applied to the aggregated node features
after the summation step.

In other words, the node embeddings are first aggregated using the normalized adjacency matrix, and then they
are passed through the KAN layer.

The layer-wise propagation rule for GKAN Architecture 1 is shown below.

Layer-wise propagation rule for GKAN Architecture 1 (Image created by author)
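Based on the description above, this rule can be written as (aggregation with Â first, then the KAN layer):

```latex
H^{(l+1)} = \mathrm{KANLayer}\!\left( \hat{A}\, H^{(l)} \right)
```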

where:

H(l) and H(l+1) represent the node feature matrices at layers l and l+1, respectively

 is the normalized adjacency matrix

the KANLayer operation applies learnable univariate activation functions (B-splines) to the aggregated node features ÂH(l)


The forward propagation model for the architecture is expressed as:

Forward propagation model for GKAN Architecture 1 with L layers (Image created by author)

Overview of a two-layer GKAN Architecture 1 (Image from original research paper)

GKAN Architecture 2: Activations Before Summation


In this architecture, the learnable univariate activation functions are applied to the node features before the aggregation (summation) step.

In other words, the node features are first passed through the KAN layer and then aggregated using the normalized adjacency matrix.

The layer-wise propagation rule for GKAN Architecture 2 is shown below.

Layer-wise propagation rule for GKAN Architecture 2 (Image created by author)
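Based on the description above, this rule can be written as (KAN layer first, then aggregation with Â):

```latex
H^{(l+1)} = \hat{A}\; \mathrm{KANLayer}\!\left( H^{(l)} \right)
```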

where:

H(l) and H(l+1) represent the node feature matrices at layers l and l+1, respectively

 is the normalized adjacency matrix

the KANLayer operation applies learnable univariate activation functions (B-splines) to each element of the input node features H(l)


The forward propagation model for the architecture is expressed as:

Forward propagation model for GKAN Architecture 2 with L layers (Image created by author)

Overview of a two-layer GKAN Architecture 2 (Image from original research paper)
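To make the difference between the two architectures concrete, here is a minimal PyTorch-style sketch. `KANLayer` is assumed to be any module that applies learnable spline activations to a feature matrix (e.g. from a KAN library); this is not the research team's implementation:

```python
import torch.nn as nn

class GKANArch1(nn.Module):
    """Architecture 1: aggregate neighbours first, then apply the KAN layer."""
    def __init__(self, kan_layers):
        super().__init__()
        self.kan_layers = nn.ModuleList(kan_layers)  # assumed KANLayer modules

    def forward(self, A_hat, X):
        H = X
        for kan in self.kan_layers:
            H = kan(A_hat @ H)   # summation over neighbours, then spline activations
        return H

class GKANArch2(nn.Module):
    """Architecture 2: apply the KAN layer first, then aggregate neighbours."""
    def __init__(self, kan_layers):
        super().__init__()
        self.kan_layers = nn.ModuleList(kan_layers)

    def forward(self, A_hat, X):
        H = X
        for kan in self.kan_layers:
            H = A_hat @ kan(H)   # spline activations, then summation over neighbours
        return H
```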

The Performance Of GKANs


GKANs vs. GCN
Both GKAN architectures were first trained on the Cora dataset.

The Cora dataset is a citation network that consists of documents as nodes and the citation links between these
documents as edges.

The dataset contains 7 document classes, and each document is represented by 1,433 features.

Next, their performance was compared to that of a conventional GCN with a comparable number of parameters, over both the training and test data, using subsets of the first 100 and 200 features out of the 1,433 available in the dataset.
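As a rough sketch of what such a semi-supervised comparison involves (illustrative names, not the researchers' code): every node's features are propagated through the graph, but the loss is computed only on the small labelled subset.

```python
import torch.nn.functional as F

def train_step(model, A_hat, X, y, labelled_mask, optimizer):
    """One training step for semi-supervised node classification (e.g. on Cora)."""
    model.train()
    optimizer.zero_grad()
    logits = model(A_hat, X)                  # predictions for every node in the graph
    loss = F.cross_entropy(logits[labelled_mask], y[labelled_mask])  # labelled nodes only
    loss.backward()
    optimizer.step()
    return loss.item()
```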

And the results were quite incredible!

GKANs achieved higher accuracy than GCNs for both the 100-feature and 200-feature sets.

On the first 100 features of the dataset, both GKAN architectures achieved higher accuracy than the GCN.

Notably, the GKAN Architecture 2 achieved 61.76% accuracy compared to 53.5% for GCN.


Performance of different architectures on the first 100 features of the Cora dataset, where k is the polynomial degree
in the spline functions, g is the spline grid size, and h is the size of hidden layers. (Image from original research paper)

Similarly, both GKAN architectures achieved higher accuracy than the GCN on the first 200 features of the
dataset, with the GKAN Architecture 2 achieving 67.66% accuracy compared to 61.24% for GCN.

Performance of different architectures on the first 200 features of the Cora dataset (Image from original research paper)

The training and test accuracy plots below showed that GKANs achieved higher accuracy during both the
training and testing phases.

Training and Test Accuracy plots for different architectures (Image from original research paper)

It was also noted that the GKAN architectures showed a sharper decrease in loss values during training and required fewer epochs to converge.


Training and Test Loss for different architectures (Image from original research paper)

Influence Of Parameters on GKANs


Researchers also evaluated how different parameters impacted the performance of GKANs.

These parameters were:

k: the degree of the polynomial in the spline functions

g: the grid size for the spline functions

h: the size of hidden layers in the network

It was found that the following settings led to the most effective GKANs:

Lower polynomial degrees (k = 1 out of {1, 2, 3})

Intermediate grid sizes (g = 7 out of {3, 7, 11})

Moderate hidden layer sizes (h = 12 out of {8, 12, 16})

Training Time
Although GKANs achieved higher accuracy and converged in fewer epochs, the researchers noted that their overall training time was relatively long, requiring future optimizations.

KANs have opened up a new avenue for improved graph learning, and they could also be a promising replacement for the MLPs at the core of other graph learning approaches (including Graph Autoencoders, Graph Transformers, and more).

What are your thoughts on them? Have you used KANs in your projects yet? Let me know in the comments below!

Further Reading
Research paper titled ‘GKAN: Graph Kolmogorov-Arnold Networks’ on ArXiv

Software implementation of GKANs on GitHub (yet to be publicly released by the research team)


Author’s story titled ‘Kolmogorov-Arnold Networks (KANs) Might Change AI As We Know It, Forever’, which explains KANs in detail

GitHub repository featuring a curated list of projects using Kolmogorov-Arnold Networks (KANs)

Here are my mailing list links if you’d like to stay connected to my work:

Get an email whenever Dr. Ashish Bamania publishes: bamania-ashish.medium.com

Ashish’s Substack (Sharing Everything That I Have Learned & Have Been Learning About, Unfiltered): ashishbamania.substack.com

Byte Surgery (🚀 A Deep Dive Into The Best Of Software Engineering ⚙️): bytesurgery.substack.com

Subscribe to Dr. Ashish Bamania on Gumroad: bamaniaashish.gumroad.com
