Journal of Human Genetics


Genes, the brain, and artificial intelligence in evolution

Naoyuki Kamatani1

Received: 3 June 2020 / Revised: 9 July 2020 / Accepted: 19 July 2020

© The Author(s), under exclusive licence to The Japan Society of Human Genetics 2020

Three important systems, genes, the brain, and artificial intelligence (especially deep learning), have similar goals, namely,
the maximization of likelihood or minimization of cross-entropy. Animal brains have evolved through predator-prey
interactions in which maximizing survival probability and transmission of genes to offspring were the main objectives.
Coordinate transformation for a rigid body necessary to win predator-prey battles requires a huge amount of matrix
operations in the brain similar to those performed by a powerful GPU. Things (molecules), information (genes), and
energy (ATP) are essential for using Maxwell’s demon model to understand how a living system maintains a low level of
entropy. However, while the history of medicine and biology saw molecular biology and genetics disciplines flourish, the
study of energy has been limited, despite estimates that >10% of all human genes encode energy-related proteins. Since there
are a large number of molecular and genetic diseases, many energy-related diseases must exist as well. In addition to
mitochondrial disease, common diseases such as neurodegenerative diseases, muscle diseases, cardiomyopathy, and
diabetes are candidates for diseases related to cellular energy shortage. We are developing an ATP enhancer, a drug to treat
such diseases. I predict that in the future, the frontier of medicine and biology will involve energy and entropy, and the
frontier of science will be about the cognitive processes that scientists’ brains use to study mathematics and physics. That
will be understood by comparing the abilities that were necessary to survive battles between predators and prey during
evolutionary history.

* Naoyuki Kamatani
StaGen Co., Ltd., 4-11-6 Kuramae, KUGA Bldg. 8F, Taito-ku, Tokyo 111-0051, Japan

Introduction

Genes, the brain, and artificial intelligence (especially deep learning) are important systems for
understanding how organisms developed, with differences as well as similarities between them. What is
clear, however, is the causal relationship between the three. In other words, genes led to the
development of the brain, and human brains created artificial intelligence (Fig. 1). The order of
development in evolution also shows that genes emerged first, the brain next, and finally artificial
intelligence. Since artificial intelligence was originally intended to mimic the brain, there is an
arbitrary component to the similarity between the two.

The Big History of genes, the brain, and artificial intelligence

The Big History of genes, the brain, and artificial intelligence, and their relationship to organismic
evolution, began with the formation of the Earth, which occurred about 4.6 billion years ago [1]. Life
then started as bacteria or archaea about 4 billion years ago, which was also when the system of genes,
perhaps using DNA, began [1]. Thereafter, for about 2 billion years, little evolutionary improvement
occurred, but about 2 billion years ago, bacteria and archaea underwent an internal symbiosis and
acquired a huge energy-generating ability that led to a rapid increase in the sophistication of life
[2]. Using that enormous energy-generating capacity, organisms became nucleated and multicellular, and
acquired sexual reproductive functions [2]. Then, about 700 million years ago, they acquired systems of
muscles powered by ATP, which efficiently converted chemical energy into kinetic energy [3]. That is,
animals emerged [4]. That allowed organisms to move on their own, enabling them to acquire food and
reproduce effectively. In order to achieve those movement-related goals, they needed sensory organs
that allow for input of
external data [5] and a nervous system that processes the information obtained from the sensory organs
and transmits it to the muscles [6]. In other words, animals possessed input devices (sensory organs),
information-processing devices (the brain and nervous system), and output devices (muscles). This
system is, of course, very similar to the deep-learning algorithm, which has input, an
information-processing system, and output (Fig. 1).

Fig. 1 Three systems of genes, brain and AI have analogous structures and the objectives are to
maximize the probabilities of achieving the objectives. An arrow indicates a causality and an arrowhead
indicates the direction of an activity

This system has evolved to be more and more efficient by repeated interactions between predators and
prey, which have led animals to evolve in a direction that maximizes the probability that the predator
would capture the prey and the probability that the prey would escape from the predator. This is
because components of the system that increase the probability of success (survival) are selected for
and passed on to future generations. As a result, the brain has evolved into an extremely sophisticated
system in humans, and from that, sciences such as mathematics and physics have emerged. Humans
subsequently went on to make prototypes of artificial intelligence that mimicked the brain.

Considering the above steps, it is clear that the three systems developed in the order of genetic
system → brain system → artificial intelligence, and there is a causal relationship between them
(Fig. 1). The genetic system and the brain system are likely to have evolved toward maximizing the
probabilities of individual survival and transmission of the genetic data to offspring (Fig. 1). Then,
what principles are at the basis of the three systems?

Energy and entropy

Schrödinger, a founder of quantum theory, proposed in his book “What is life?” that living matter
evades the decay to equilibrium by maintaining a low level of entropy [7]. He considered entropy as
representative of disorder and negative entropy as that of order, referring to concepts derived from
statistical physics. He thought of the organism as a system that maintains order against disorder.
However, the origin of the negative entropy by which an organism maintained decreased levels of entropy
was not entirely clear at the time.

Entropy was originally a concept defined in thermodynamics by Clausius and others, but Boltzmann and
others added to its significance from the aspect of statistical physics. Entropy was associated with
living organisms by Schrödinger and then a meaning in information theory was added by Shannon. Deep
learning is designed as a system that aims to minimize cross-entropy [8].

The cross-entropy is interpreted to be proportional to the logarithm of the likelihood function
multiplied by −1 (in general, the proportional constants are meaningless for likelihood functions) as
will be explained in the following sentences [9].

Among a total of M different events, let Q_i denote the probability of the ith event. Imagine that an
experiment was performed a total of N times, and N_i denotes the number of times the ith event was
observed. Then the likelihood function L for the observed events will be:

    L = \prod_{i=1}^{M} Q_i^{N_i},    (1)

and the logarithm of L is:

    \log L = \sum_{i=1}^{M} N_i \log Q_i = N \sum_{i=1}^{M} P_i \log Q_i,    (2)

where the proportion of the ith event is P_i = N_i / N and \sum_{i=1}^{M} P_i = 1. Note that the right
side of Eq. (2) is proportional to the cross-entropy:

    -\sum_{i=1}^{M} P_i \log Q_i,    (3)

although the sign is different. If P_i is fixed and Q_i can vary in Eq. (3), the minimum value of this
cross-entropy is the entropy, i.e., -\sum_{i=1}^{M} P_i \log P_i. This is equivalent to the fact that
the maximum value of the likelihood function is the maximum likelihood. Of course, the entropy itself
can be reduced even further by varying P_i. Although the least squares method appears to be different
from likelihood and cross-entropy, it is, in fact, equivalent to the maximum-likelihood method and the
minimization of the cross-entropy under the assumption of a normal distribution.

In the case of deep learning applied to supervised learning tasks, P_i is given externally by the data
from supervisors, and the objective is to minimize the cross-entropy by moving Q_i. Q_i is a function
containing many parameters, and changes in those parameters change Q_i. That is, for the supervised
learning task performed in deep learning, we aim to bring the cross-entropy down to the entropy.
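
The relationships among Eqs. (1) to (3) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the counts N_i, the model probabilities Q_i, the data points, and all function names are hypothetical values chosen only for illustration. It verifies that log L equals −N times the cross-entropy, that the cross-entropy is never smaller than the entropy, and that, for normally distributed errors with a fixed variance, the least squares estimate of a mean coincides with the maximum-likelihood estimate.

```python
import math


def log_likelihood(counts, q):
    """Eq. (2): log L = sum_i N_i * log(Q_i); terms with N_i = 0 contribute nothing."""
    return sum(n * math.log(qi) for n, qi in zip(counts, q) if n > 0)


def cross_entropy(p, q):
    """Eq. (3): -sum_i P_i * log(Q_i); terms with P_i = 0 contribute nothing."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical observed counts N_i for M = 3 events (illustrative only).
counts = [50, 30, 20]
N = sum(counts)
p = [n / N for n in counts]  # P_i = N_i / N

# Hypothetical model probabilities Q_i (illustrative only).
q = [0.4, 0.4, 0.2]

ll = log_likelihood(counts, q)
ce = cross_entropy(p, q)
entropy = cross_entropy(p, p)  # cross-entropy evaluated at Q_i = P_i


def sum_of_squares(ys, mu):
    """Least squares criterion for a single mean parameter mu."""
    return sum((y - mu) ** 2 for y in ys)


def gaussian_nll(ys, mu, sigma=1.0):
    """Negative log-likelihood of ys under a normal distribution N(mu, sigma^2)."""
    const = 0.5 * math.log(2.0 * math.pi * sigma ** 2)
    return sum(const + (y - mu) ** 2 / (2.0 * sigma ** 2) for y in ys)


# Hypothetical data; both criteria are minimized over the same coarse grid.
ys = [1.0, 2.0, 4.0]
grid = [k / 100 for k in range(0, 501)]
mu_ls = min(grid, key=lambda m: sum_of_squares(ys, m))
mu_ml = min(grid, key=lambda m: gaussian_nll(ys, m))

print("log L:", ll, " -N * cross-entropy:", -N * ce)
print("cross-entropy:", ce, " entropy:", entropy)
print("least squares estimate:", mu_ls, " maximum-likelihood estimate:", mu_ml)
```

The grid search is used only to keep the sketch free of external libraries; any numerical optimizer could be used in its place.
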
When there is no supervisor, only P_i is present, so the objective is to minimize entropy.

The system of genes developed to allow organisms to survive and produce offspring during the course of
evolution [10]. Genes in an individual that has survived and been able to leave offspring are passed to
the next generation. In other words, if we consider the probability of transmitting genes to an
offspring as a likelihood function and the genes as parameters, they are considered to have evolved due
to the principle of maximum likelihood by which the probability is maximized by changes that occur in
the genes. This is also equivalent to the minimization of cross-entropy. In the supervised state where
the environment does not change, P_i in Eq. (3) is fixed and only Q_i changes, but during real
evolution both P_i and Q_i change because the environment can also change. When environmental change is
unpredictable, the goal is to minimize entropy since there is no supervisor. In other words, both
genetic systems and deep-learning algorithms are based on the maximum-likelihood method that is
equivalent, in principle, to the cross-entropy and entropy minimization. In this case, the parameters
that can be varied to maximize the likelihood are genes in the case of heredity and weights and biases
in the case of deep learning. That is, the genetic and deep-learning systems operate under similar
principles of likelihood maximization or entropy minimization.

However, the major difference between the two is that the former uses predominantly linear functions,
while the latter uses predominantly nonlinear functions. This may be related to the different update
rates of the parameters. The system of inheritance updates parameters about every 20 years in humans.
In comparison, in the case of the brain and deep learning, parameter updates are instantaneous.
Therefore, the former can process only a limited number of parameters, while the latter can process a
huge number of parameters. The number of genes is about 20,000 in humans and the number of nucleotides
is about 3 billion [11]. In comparison, the number of synapses in a human brain is thought to be 150
trillion [12]. If a genetic system were to process such a large number of parameters, the number of
updates would be so small that it would be difficult to manage the system.

Principles of the brain and its evolution

It is unclear what principle drives brain evolution and development, but it is likely that the
maximum-likelihood method is at work, especially in the cortex [13]. Animals have survived through
numerous battles between predators and prey for about 700 million years. Therefore, those interactions
should have altered the synaptic plasticity of the brain to maximize the probability that the predator
catches the prey and the probability that the prey escapes from the predator. That refers, in other
words, to the maximization of the likelihood and the minimization of cross-entropy and entropy. Such a
system should have evolved through repeated predator-prey battles, which are analogous to the recent
combination of reinforcement learning with deep neural networks (deep reinforcement learning) that was
used to make a deep-learning algorithm more efficient at playing Go by battling itself [14].

As described above, genes, the brain, and deep-learning systems are very similar in that they are based
on the maximum-likelihood method, i.e., the minimization of cross-entropy and entropy. It should be
noted that these three systems can be considered as sequential layers. I have previously proposed a
six-layer structure for genomics that describes the entire scientific field for genomics [15]. In this
context, I proposed that looking for an element missing in one of the two different layers with similar
structures but present in the other can lead to a new discovery [15]. In many cases, the missing
element is not absent but hidden and not easily recognized. If one can recognize the hidden element,
that will lead to a major discovery. This method can be applied to the three sequential layers (genes,
brain, and artificial intelligence) of the present study.

Then, what system in the predator’s brain makes it effective at capturing the prey? It would not be an
arithmetic system like Newtonian mechanics, but a system similar to Lagrangian mechanics or Hamiltonian
mechanics [16] that is adaptive to the generalized coordinates. The reason is that not only the prey
but also the predator moves, and thus, the predator needs to catch the prey based on the images
captured by the predator’s own moving retinas. The predator can obtain data regarding angles and
distances of objects at a point in time, and therefore, the brains of predator and prey are likely to
be using the polar coordinate system to which Lagrangian or Hamiltonian mechanics can be applied,
rather than Cartesian coordinates used in Newtonian mechanics. When we see perspective drawings, we
understand that the polar but not the Cartesian coordinate system is in accord with the natural visual
system in animals. For example, the Lagrangian is a function of generalized coordinates, their time
derivatives and time. The most necessary data for the predator would be the position at which the prey
could be caught relative to their own position. Their internal arithmetic systems need to be able to
handle coordinate transformation. Furthermore, the position of the prey should also be recognized as a
rigid body, not a point. It is necessary to process information based on the image data reflected in
the retinas of the predator’s two eyes and correctly transmit course changes to the muscles to achieve
the goal of acquiring the prey. The brain’s need for massive matrix operations to describe a moving
rigid body can be understood by analogous needs of a GPU to perform
massive matrix operations on computer graphics [17, 18]. In addition, when a rigid body moves and
rotates, it is necessary to calculate the eigenvalues and eigenvectors of the matrix to determine its
center of gravity. As described above, to win the battle between predator and prey, the brain must be
equipped with a matrix arithmetic system that performs enormous amounts of linear operations and
analytical mechanics capable of coordinate axis transformation. The function of differentiation is
needed to find the speed, and the function of integration is also needed to find the time and distance
to achieve the objective. Using these, predator and prey perform operations to minimize cross-entropy
and entropy, which are equivalent to operations to find maximum-likelihood points using the
maximum-likelihood method.

Things (molecules), information (genes), and energy (ATP) are important for applying Maxwell’s demon
model to living organisms

When considering the relationship between entropy and living organisms, the model of Maxwell’s demon
works as a good reference [19]. The elements that appear in the model of Maxwell’s demon are molecules,
the demon, and energy as well as two rooms connected by a small window [19] (Fig. 2). If, as
Schrödinger said, an organism is a system that maintains itself at a low level of entropy, then the
model of Maxwell’s demon is really helpful for discussing negative entropy and the necessary role that
energy plays in maintaining that low entropy level. Since the rooms describe a closed system, one can
imagine them to represent the whole body of an individual organism. If we apply this system to an
organism, gas molecules are represented by small molecules and macromolecules, the demon is represented
by genes because they are in charge of the information system, and energy is represented by ATP.

Fig. 2 Maxwell’s demon model. A demon (eye) controls a small door between two rooms of gas. As
individual gas molecules approach the door, the demon quickly opens and shuts the door so that only
fast molecules are passed from the left room to the right room, while only slow molecules are passed
into the other. In this process, energy is necessary because, otherwise, the second law of
thermodynamics does not hold. The elements necessary for generating negative entropy (order) are
“things (molecules),” “information (demon),” and “energy (ATP)”

Historically, a main focus of biological research has been the components that make up living systems,
and from such investigations, the disciplines of biochemistry and molecular biology emerged. Recently,
a main focus has also been on identifying the genes that encode those components, and along with
development of methods to understand differences in those genes between individuals, the field of
genetics has taken center stage. In other words, biology has been actively studied from the two aspects
of “things” and “information.” However, research in terms of energy has not yet been fully developed.
In fact, at least in humans, almost all molecules and genes (and genome sequences) have already been
identified. This may mean that not much remains to be discovered in such research fields as molecular
biology and genetics, which have been focusing on things and information. Of course, I understand that
various characteristics such as concentrations, distributions, relationships, roles, functions, and
other aspects of molecules and genes are still left to be elucidated, but it seems obvious to me that
we need a new broad and fundamental concept besides things and information. From that foundation, I
predict that bioenergetic research will develop to take a position alongside molecular biology and
genetic research. Thereafter, I predict that the study of entropy in the context of these three
elements, namely, things, information, and energy, will be the focus of future biology and medicine.

The study of energy in biology and medicine

In Maxwell’s demon model, three elements are important to maintain a decreased level of entropy: things
(molecules), information (genes), and energy (ATP). In that context, molecular biology and genetics
have been the focuses of modern biology and medicine, but there have been very few studies regarding
the role of energy in biomedical research. The critical importance of energy for living organisms is
also suggested by the number and proportion of genes associated with ATP. The metabolic pathways
involved in ATP production are the glycolytic system and oxidative phosphorylation in mitochondria.
While the number of genes involved in glycolysis is not large, mitochondria, on the other hand, are
intracellular organelles for ATP synthesis, and many genes coding for proteins in mitochondria,
including ATP synthase, are either directly or indirectly involved in ATP production. Mitochondrial
proteins are thought to be encoded by about 1500 chromosomal genes [20], and there are also many
proteins associated with ATP consumption, including kinases and ligases, both of which require ATP. The
former are enzymes that transfer the
phosphate at the γ position of ATP to other compounds, while the latter are enzymes that combine two
compounds together using the energy obtained from ATP degradation. The targets of kinases are not
limited to proteins, but even limiting oneself to genes for protein kinases, their number exceeds 500
[21]. Ligases are not limited to enzymes using ubiquitin, but even if limited to genes for the E3
ubiquitin ligases, the number is thought to be 600–700 in the human genome [22]. Thus, there are about
1500 genes involved in ATP production and at least 1100 genes involved in ATP consumption; the latter
number cannot simply be added to the former because some of the proteins involved in ATP consumption
are likely to be present in the mitochondria. However, it is estimated that genes encoding ATP-related
proteins account for at least 10% of all human genes.

Energy deficiency and diseases

As described above, energy, along with things and information, is important for living organisms. In
addition, the percentage of genes related to ATP in the entire human genome is very large. That
contrasts with the disease situation, wherein a large number of diseases are known to be related to
things (molecular diseases) and information (genetic diseases), but the number ascribed to energy
(energy diseases) is quite small. Taken together, those data suggest the existence of many energy
(ATP) related diseases as well as many diseases related to molecules and genes. However, with the
exception of mitochondrial disease [23–25], there have not been many studies that actively elucidate
the mechanisms of diseases in terms of energy.

Mitochondria are essential organelles that use oxygen to produce ATP from the chemical energy of
carbon-bearing small molecules. The energy possessed by these small molecules can be used effectively
in vivo only after it is converted to ATP. Mitochondrial disease is a disease caused by mutations in
genes related to mitochondria that cause mitochondrial dysfunction [23–25]. The resulting disorders
include brain, hearing, visual, muscle, and heart disorders as well as diabetes. Therefore, diseases in
which these organs are impaired are candidates for energy-related diseases. For example, we already
know that some forms of diabetes and cardiomyopathy are caused by mutations in mitochondrial genes
[26, 27]. One of the candidate energy-related diseases of interest is neurodegenerative disease. In
fact, many researchers have proposed, based on a variety of evidence including that from genetic
diseases, that Alzheimer’s and Parkinson’s diseases are due to mitochondrial dysfunction [28–30].
Mitochondrial function declines by 8% for every 10-year increase in age [31]. Thus, neurodegenerative
diseases may be caused by reduced energy production in aged people due to reduced neuronal
mitochondrial function. In that regard, a genetic disease caused by mutations in the VCP gene is
especially informative, as mutations in that gene cause both neurodegenerative disease and inclusion
body myopathy [32]. The inclusion bodies of inclusion body myositis, which is fairly common in the
elderly, contain the same proteins that accumulate in neurodegenerative diseases: amyloid-β,
α-synuclein, tau, and TDP-43 [33]. The VCP gene encodes a AAA-ATPase, which is involved in
ubiquitin-dependent proteolysis [34], but degradation of a single protein molecule by the
ubiquitin-based proteasome system requires a large energy expenditure: on the order of 300–400 ATP
molecules [35]. Taken together, one can easily envision a situation wherein decreased ATP production
due to mitochondrial dysfunction could lead to decreased proteolytic function by the proteasome system,
and consequently, to increased protein deposition and manifestation of neurodegenerative diseases such
as Alzheimer’s disease and Parkinson’s disease [36].

Based on such a hypothesis, we are developing an ATP enhancer, i.e., a combination of a xanthine
oxidase inhibitor and inosine. By the administration of the ATP enhancer, increased ATP has already
been shown in healthy individuals [37], and dramatic improvements in biomarkers in two patients with
mitochondrial disease were demonstrated [38]. We also showed that MDS-UPDRS Part III score was
significantly improved by the ATP enhancer in Parkinson’s disease [39].

The future frontier of medicine and biology

I have already mentioned that the important concepts for describing the world are things, information,
and energy. These are represented in the industrial world, for example, by the manufacturing,
information, and energy industries. In biology, great progress has been made in molecular biology (for
things) and genetics (for information), but not much progress has been made on energy. ATP-related
genes are estimated to account for more than 10% of human genes, and the energy industry accounts for
about 8% of the global industry. Overall, I predict that the next frontier of biology and medicine will
be energy. This is because all three concepts (things, information, and energy) are essential for
decreasing entropy, and the ability to maintain a decreased level of entropy is the basic principle
upon which living things depend.

These three concepts are also important for describing diseases. First, many molecular diseases exist,
with examples being diabetes, which is a disease of sugar and insulin, and hyperlipidemia, which is a
disease of fat. Even if the disease is not directly linked to a molecule, indirect connections may
exist. For example, hypertension is thought to indirectly involve a variety of molecules, including
sodium,
adrenaline, renin, and angiotensin, and neurodegenerative disorders such as Alzheimer’s disease are
thought to involve molecules called amyloid-β and tau. Second, there are many diseases of genes, which
represent information. Thus, many genetic diseases as well as common diseases are considered to be
related to genetic information. While there are many diseases related to molecules, which are things,
and genes, which are information, there are fewer known diseases related to ATP, which is energy. I
predict that, in addition to molecules and genes, ATP will be the next frontier, and life will be
understood as a system that uses things, information, and energy to maintain a low level of entropy.

The future frontier of science

Until now, science has disregarded how scientists’ brains themselves are involved in the scientific
process. That situation occurred because we did not know the details underlying the brain’s cognitive
mechanisms. However, with the advent of deep learning, it is now possible to build systems that mimic,
albeit partially, the brain. I predict that, in the future, we will begin to understand the mechanisms
by which the brain performs the thought processes that occur during scientific endeavors. Specifically,
I expect that the systems of mathematics and physics that scientists have conceived of as plausible
will be deemed to have originated in cognitive systems that are embedded in scientists’ brains. In
order to investigate that, we need to understand what the brain’s purpose has been in evolution and
what systems it has built to achieve that purpose. This is because the brain has not developed
suddenly, but has been built up over about 700 million years of evolutionary history. In particular, I
speculate that the struggle between predator and prey played a major role in the construction of the
brain’s systems. Of course, the process of mating has also contributed to its development. That also
means that by using [...] developed their understanding of mathematics and physics. For example,
Einstein’s theory of relativity has as its principles that physical laws are equivalent and that the
speed [...] light, and it is necessary to use this information to perform calculations in the brain and
capture the prey efficiently. The cognitive systems that make up the scientists’ brains are an
inevitable result of evolution and are simply the structure best suited to chasing and catching prey or
escaping from predators. Predator and prey are likely to be using the same ever-changing polar
coordinate system. Even when considering quantum theory and looking at the Schrödinger equation,
scientists may be using the Hamiltonian because it is a system optimized for predator-prey combat.
Certainly, the Hamiltonian represents energy of a system that has been fully incorporated into physics
but not into the study of brain or AI systems. It is clear from the example of computer graphics that a
huge number of matrix operations are required to draw the movement of an object, which is not a point
but a rigid body that moves and rotates. An equivalent ability should have been required and
implemented in the brain for a predator to chase and catch prey. We all know that efficient
implementation of deep-learning algorithms requires GPUs that have been developed to perform a huge
amount of matrix operations for computer graphics. Thus, a huge amount of linear arithmetic should also
be needed to win the battle between predator and prey. Brains with such built-in functions were also
used by the scientists who developed the fields of mathematics and physics and those who evaluate their
validity. In the future, studies will be performed about how the algebraic structure that constitutes
the basis of mathematical thinking is related to the evolution of the brain.

In summary, I predict that, in the future, we will study how cognitive processes in mathematics and
physics relate to the evolution of the brain, especially how they have been enhanced by reinforcement
learning through predator-prey interactions. Research on artificial intelligence, which mimics the
human brain, will contribute to this research. Although I focused on the battle between predator and
prey as one representative example, struggles between different animals with equivalent strength as
well as for mating activities should have also played important roles.
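
As an illustration of the matrix operations and coordinate transformations discussed above for a rigid body that moves and rotates, the following minimal sketch applies a rotation matrix and a translation to a small two-dimensional body and then expresses one of its points in polar coordinates (distance and bearing) relative to an observer. It is not a model of any neural computation; the body shape, the angle, the displacement, and all function names are hypothetical and chosen only for illustration.

```python
import math


def rotate_translate(points, theta, shift):
    """Apply a 2D rotation matrix (angle theta, radians) and then a translation."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1]) for x, y in points]


def to_polar(point, observer=(0.0, 0.0)):
    """Return (distance, bearing in radians) of a point as seen from the observer."""
    dx, dy = point[0] - observer[0], point[1] - observer[1]
    return math.hypot(dx, dy), math.atan2(dy, dx)


def distance(a, b):
    """Euclidean distance between two 2D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Hypothetical rigid body described by three points (illustrative only).
body = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0)]

# Hypothetical motion: rotate by 90 degrees, then shift by (3, 4).
moved = rotate_translate(body, math.pi / 2, (3.0, 4.0))

# Rigidity: the distance between any two points of the body is unchanged.
before = distance(body[0], body[1])
after = distance(moved[0], moved[1])

# What an observer at the origin would need: distance and bearing of the first point.
r, bearing = to_polar(moved[0])

print("distance between points before and after:", before, after)
print("polar coordinates of first point:", r, bearing)
```

Every point of the body requires the same matrix multiplication at every time step, which is the kind of repeated linear arithmetic that GPUs were developed to perform for computer graphics.
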
