How the AI Behind TikTok Works
TikTok is a video-sharing app that lets users create
and share short videos. It impresses users with its
precisely personalized "just for you" recommendations.
It is highly addictive and very popular
among Gen Z. Behind the scenes, it is powered by
artificial intelligence technologies.
TikTok Architecture
The architecture of the TikTok recommendation
system includes three components: big data
frameworks, machine learning, and microservices
architecture.
1. Big data frameworks are the starting point of the
system. They provide real-time data stream
processing, data computing, and data storage.
2. Machine learning is the brain of the
recommendation system. A range of machine
learning and deep learning algorithms and
techniques are applied to build models and generate
recommendations to suit individual preferences.
3. Microservices architecture is the underlying
infrastructure that lets the whole system serve
requests quickly and efficiently.
Big Data Frameworks
No data, no intelligence.
Most data come from the users’ smartphones.
That includes the operating system, the installed apps,
and so on. More importantly, TikTok pays special attention
to the users’ activity logs, such as watch time, swipes,
likes, shares, and comments.
The log data are collected and aggregated through
Flume and Scribe, then piped into a Kafka
queue. Apache Storm then processes the data streams
in real time alongside other components of the Apache
Hadoop ecosystem.
The Apache Hadoop ecosystem is a distributed system
for data processing and storage. It includes
MapReduce, the first generation of distributed data
processing systems, which processes data in parallel
in batches. YARN is a framework for job
scheduling and cluster resource management. HDFS
is a distributed file system. HBase is a scalable,
distributed database that supports structured data
storage for large tables. Hive is a data warehouse
infrastructure that provides data summarization and
querying. ZooKeeper is a high-performance
coordination service.
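The MapReduce model mentioned above can be illustrated with a
minimal single-process sketch in Python. This is not Hadoop code, and
the log records and field names (video_id, watch_seconds) are
hypothetical; the point is only to show the map, shuffle, and reduce
phases that a real cluster would run in parallel across machines.

```python
from collections import defaultdict

# Hypothetical activity-log records; the field names are illustrative only.
LOGS = [
    {"user": "u1", "video_id": "v1", "watch_seconds": 12.0},
    {"user": "u2", "video_id": "v1", "watch_seconds": 30.0},
    {"user": "u1", "video_id": "v2", "watch_seconds": 5.0},
    {"user": "u3", "video_id": "v2", "watch_seconds": 7.5},
    {"user": "u3", "video_id": "v1", "watch_seconds": 3.0},
]


def map_phase(record):
    """Emit one (video_id, watch_seconds) pair per log record."""
    yield record["video_id"], record["watch_seconds"]


def shuffle_phase(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values of one key into a total watch time."""
    return key, sum(values)


def total_watch_time(records):
    """Run map, shuffle, and reduce in sequence over the records."""
    pairs = (pair for record in records for pair in map_phase(record))
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    print(total_watch_time(LOGS))
```

In a real deployment the map and reduce functions would run on many
nodes at once, with HDFS holding the input and output.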
As data volumes grew rapidly, real-time data
processing frameworks came into the picture.
Apache Spark is the third-generation framework
that helps with near real-time distributed
processing for big data workloads. Spark improves on
the performance of MapReduce by processing data in
memory. In the last couple of years, TikTok has adopted
the fourth-generation framework Flink, which is
designed to do real-time stream processing
natively.
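To make the contrast with batch processing concrete, here is a
minimal sketch of a tumbling-window aggregation, the kind of
per-event computation a stream processor performs. It is plain Python
rather than Flink or Spark code; the event format (timestamp,
video_id) and the 10-second window are assumptions made for
illustration, and events are assumed to arrive in timestamp order.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds=10):
    """Yield (window_start, Counter of events per video) as each window closes.

    `events` is an iterable of (timestamp_seconds, video_id) tuples assumed
    to arrive in timestamp order. Real stream processors also handle late
    and out-of-order data, which this sketch does not.
    """
    current_start = None
    counts = Counter()
    for timestamp, video_id in events:
        window_start = int(timestamp // window_seconds) * window_seconds
        if current_start is None:
            current_start = window_start
        elif window_start != current_start:
            # The previous window is complete: emit it and start a new one.
            yield current_start, counts
            current_start, counts = window_start, Counter()
        counts[video_id] += 1
    if current_start is not None:
        # Flush the last window, which is still open when the input ends.
        yield current_start, counts


if __name__ == "__main__":
    likes = [(1, "v1"), (4, "v2"), (9, "v1"), (12, "v1"), (25, "v2")]
    for start, per_video in tumbling_window_counts(likes):
        print(start, dict(per_video))
```

Results are emitted as soon as a window closes, instead of waiting for
a whole batch job to finish over the stored logs.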
The database systems include MySQL, MongoDB,
and many others.
Machine Learning
This is the core of how TikTok earned its reputation
as a hyper-personalized, addictive algorithm.
Once the vast datasets pour in, the next steps are content
analysis, user profiling, and context analysis.
Neural-network deep learning frameworks such as
TensorFlow are used to perform computer vision and
natural language processing (NLP). Computer vision
deciphers the images in photos and videos. NLP
includes classification, labeling, and evaluation.
Classic machine learning algorithms are used,
including logistic regression (LR), convolutional
neural network (CNN), recurrent neural network
(RNN), and gradient boosting decision trees (GBDT).
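As an illustration of the simplest algorithm in that list, the
following is a minimal logistic regression sketch in plain Python. It
predicts the probability that a user likes a video from two
hypothetical features (fraction of the video watched, and whether the
user follows the creator). The features, the toy data, and the
hyperparameters are all assumptions; nothing here reflects TikTok's
actual models.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Predicted probability of a positive label for feature vector x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic_regression(features, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(features[0])
    n = len(features)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = predict_proba(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy data: [fraction of video watched, follows creator (0 or 1)] -> liked.
X = [[0.1, 0], [0.2, 0], [0.3, 1], [0.8, 0], [0.9, 1], [1.0, 1]]
Y = [0, 0, 0, 1, 1, 1]

if __name__ == "__main__":
    w, b = train_logistic_regression(X, Y)
    print(predict_proba(w, b, [0.95, 1]), predict_proba(w, b, [0.1, 0]))
```

A production system would train on far more features and examples,
typically with a library rather than hand-written gradient descent.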
The common recommendation approaches are
applied, such as content-based filtering (CBF),
collaborative filtering (CF), and more advanced
matrix factorization (MF).
The secret weapons that TikTok uses to read your
mind are:
1. Algorithm experimentation platform: The engineers
experiment with mixes of multiple machine
learning algorithms, such as LR and deep neural
networks (DNN), then run A/B tests and adjust
accordingly.
2. Extensive classification and labeling: The models
are based on user engagement, such as watch
time, swipes, and the commonly used likes or shares
(what you do, as a reflection of your subconscious,
says more about you than what you say). The
number of user features, vectors, and categories is
greater than in most recommendation systems in
the world, and they keep adding more.
3. User feedback engine: It updates the models after
retrieving feedback from users over multiple
iterations. The experience management platform is
built on this engine and ultimately improves the
predictions and recommendations.
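The A/B testing step in item 1 can be sketched with a standard
two-proportion z-test, which checks whether the difference in an
engagement rate between two variants is larger than chance would
explain. The metric (like rate) and the counts are made up for
illustration; the source does not say which statistical test TikTok's
platform uses.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Return (z, two-sided p-value) for the difference of two proportions."""
    rate_a = success_a / total_a
    rate_b = success_b / total_b
    # Pooled rate under the null hypothesis that both variants are equal.
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: likes out of impressions for variants A and B.
    z, p = two_proportion_z_test(1000, 10000, 1100, 10000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the new algorithm mix really changed the
metric, which is the signal engineers need before adjusting further.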
To solve the cold-start problem in
recommendations, a recall strategy is used. It
selects thousands of candidates that have proven
to be popular and high quality from tens of millions
of videos.
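A minimal sketch of such a recall step, assuming each video carries a
popularity count and a quality score: filter out videos below a quality
bar, then keep the k most popular. The field names, the threshold,
and the scoring are hypothetical, and a real system would combine many
recall channels rather than a single rule.

```python
import heapq

# Hypothetical video pool; field names and values are illustrative only.
VIDEOS = [
    {"id": "v1", "popularity": 900, "quality": 0.90},
    {"id": "v2", "popularity": 1500, "quality": 0.40},
    {"id": "v3", "popularity": 700, "quality": 0.80},
    {"id": "v4", "popularity": 300, "quality": 0.95},
    {"id": "v5", "popularity": 1200, "quality": 0.75},
]


def recall_candidates(videos, k, min_quality=0.7):
    """Return the ids of the k most popular videos that pass a quality bar."""
    eligible = (v for v in videos if v["quality"] >= min_quality)
    # nlargest avoids sorting the whole pool when k is much smaller than it.
    top = heapq.nlargest(k, eligible, key=lambda v: v["popularity"])
    return [v["id"] for v in top]


if __name__ == "__main__":
    print(recall_candidates(VIDEOS, k=3))
```

The candidates returned by recall are then handed to the ranking
models, which are too expensive to run over the full video pool.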
Meanwhile, some of the AI work has been moved to
the client side for a super-fast response. That
includes real-time training, modeling, and inference
done on the devices. Machine learning
frameworks such as TensorFlow Lite or ByteNN are
used on the client side.
Microservices Architecture
TikTok has embraced cloud-native infrastructure.
The recommendation components, such as user
profiling, predictions, cold-start, recall, and the user
feedback engine, are served as APIs. The services
are hosted in clouds such as Amazon AWS and
Microsoft Azure. As the outcome of the system, the
curated videos are pushed to the users through
the cloud.
TikTok employs Kubernetes-based containerization
technology. Kubernetes is known as a container
orchestrator. It is the toolset to automate the
application's life cycle. Kubeflow is dedicated to
making deployments of machine learning workflows
on Kubernetes simple, portable, and scalable.
As part of the cloud-native stack, a service mesh is
another tool, one that handles service-to-service
communication. It controls how different parts of an
application share data with one another. It inserts
features or services at the platform layer, rather than
the application layer.
Due to the requirement of high concurrency, the
services are built with the Go language and gRPC. At
TikTok, Go has become the dominant language in
service development because of its good built-in
network and concurrency support. gRPC is a Remote
Procedure Call framework to build and connect
services efficiently.
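The services themselves are written in Go with gRPC. Purely to show
the idea of calling several services concurrently, here is a small
sketch using Python's asyncio. The two service functions are
hypothetical stubs that simulate network latency; they are not gRPC
calls and do not correspond to any real TikTok API.

```python
import asyncio


async def fetch_user_profile(user_id):
    """Hypothetical stub standing in for a remote user-profiling service."""
    await asyncio.sleep(0.01)  # simulate network latency
    return {"user_id": user_id, "interests": ["cooking", "travel"]}


async def fetch_recall_candidates(user_id):
    """Hypothetical stub standing in for a remote recall service."""
    await asyncio.sleep(0.01)  # simulate network latency
    return ["v5", "v1", "v3"]


async def build_feed_request(user_id):
    """Call both services concurrently and combine their responses."""
    profile, candidates = await asyncio.gather(
        fetch_user_profile(user_id),
        fetch_recall_candidates(user_id),
    )
    return {"profile": profile, "candidates": candidates}


if __name__ == "__main__":
    print(asyncio.run(build_feed_request("u1")))
```

Because the two calls overlap, the total wait is close to the slower
call rather than the sum of both, which matters when a single feed
request touches many services.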
TikTok succeeds because it goes the extra
mile to provide the best user experience. Its engineers build
in-house tools to maximize performance at a
low level (the system level). For example, ByteMesh is
an improved version of a service mesh, KiteX is a
high-performance Golang gRPC framework, and Sonic is
an enhanced Golang JSON library. Other in-house
tools or systems include parameter servers, ByteNN,
and Abase, to name a few.
As Xiang Liang, a TikTok machine learning principal,
put it, sometimes the infrastructure beneath is more
important than the (machine learning) algorithms
above it.
Figure: TikTok Recommendation System