
Birth of AI: 1950-1956

Alan Turing published his paper “Computing Machinery and Intelligence,” which proposed
what eventually became known as the Turing Test, which experts used to measure
computer intelligence. The term “artificial intelligence” was coined and came into
popular use.

Generative AI enables users to quickly generate new content based on a variety of
inputs. Inputs and outputs to these models can include text, images, sounds,
animation, 3D models, or other types of data.
This technology is no longer new; it entered the mainstream in late 2022. While
you may have played with (and enjoyed!) the likes of ChatGPT and Midjourney, they’re
barely more than surface-level distractions.

Corporate use for generative AI is far more sophisticated. If used to its full extent, it
will reduce product-development life cycle time, design drugs in months instead of
years, compose entirely new materials, generate synthetic data, optimize parts design,
automate creativity… In fact, experts predict that by 2025, 30% of outbound
marketing messages from large organizations will be synthetically generated, and by
2030, a major blockbuster film will be released with 90% of the film generated by AI.

Going beyond the most headline-grabbing use cases, studies have shown that generative AI
increases productivity for a variety of tasks, with specific benefits for low-ability
workers and less experienced employees. Put simply, these tools will level the playing
field.

This is happening today, and will continue to happen, with increasing success, over
the coming decade. That is, if we can navigate the many risks associated with
generative AI. I’m particularly worried about deep fakes, copyright issues,
and malicious uses for fake news.

How Does Generative AI Work?

Generative AI models use neural networks to identify the patterns and structures
within existing data to generate new and original content.

One of the breakthroughs with generative AI models is the ability to leverage different
learning approaches, including unsupervised or semi-supervised learning for training.
This has given organizations the ability to more easily and quickly leverage a large
amount of unlabeled data to create foundation models. As the name suggests,
foundation models can be used as a base for AI systems that can perform multiple
tasks.

Examples of foundation models include GPT-3 and Stable Diffusion, which allow users
to leverage the power of language. For example, popular applications like ChatGPT,
which draws from GPT-3, allow users to generate an essay based on a short text
request. On the other hand, Stable Diffusion allows users to generate photorealistic
images given a text input.

How to Evaluate Generative AI Models?

The three key requirements of a successful generative AI model are:

1. Quality: Especially for applications that interact directly with users, having high-
quality generation outputs is key. For example, in speech generation, poor speech
quality is difficult to understand. Similarly, in image generation, the desired outputs
should be visually indistinguishable from natural images.
2. Diversity: A good generative model captures the minority modes in its data
distribution without sacrificing generation quality. This helps reduce undesired biases
in the learned models.
3. Speed: Many interactive applications require fast generation, such as real-time image
editing to allow use in content creation workflows.

Figure 1: The three requirements of a successful generative AI model.
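
Two of the three requirements above, diversity and speed, can be approximated with very small measurements. The sketch below is only illustrative: `distinct_n` (the share of unique word n-grams across samples) is one rough diversity proxy for text, `mean_latency_seconds` times an arbitrary generation function, and `toy_generate` is a hypothetical stand-in for a real model. Quality is left out because it usually needs human ratings or task-specific metrics.

```python
import time
from typing import Callable, List


def distinct_n(samples: List[str], n: int = 2) -> float:
    """Fraction of unique word n-grams across all samples (rough diversity proxy)."""
    ngrams = []
    for text in samples:
        words = text.split()
        ngrams.extend(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


def mean_latency_seconds(generate: Callable[[str], str], prompts: List[str]) -> float:
    """Average wall-clock time per call to `generate`."""
    start = time.perf_counter()
    for prompt in prompts:
        generate(prompt)
    return (time.perf_counter() - start) / len(prompts)


def toy_generate(prompt: str) -> str:
    # Hypothetical placeholder; a real system would call a generative model here.
    return prompt + " and then some generated text"


if __name__ == "__main__":
    prompts = ["a cat sits", "a dog runs", "a cat runs"]
    outputs = [toy_generate(p) for p in prompts]
    print("distinct-2:", round(distinct_n(outputs, 2), 3))
    print("mean latency (s):", mean_latency_seconds(toy_generate, prompts))
```

A score near 1.0 means almost every n-gram is unique, while a low score suggests the model keeps repeating itself.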

How to Develop Generative AI Models?

There are multiple types of generative models, and combining the positive attributes of
each results in the ability to create even more powerful models.

Below is a breakdown:
• Diffusion models: Also known as denoising diffusion probabilistic models (DDPMs),
diffusion models are generative models that determine vectors in latent space through
a two-step process during training. The two steps are forward diffusion and reverse
diffusion. The forward diffusion process slowly adds random noise to training data,
while the reverse process learns to remove that noise to reconstruct the data samples.
Novel data can be generated by running the reverse denoising process starting from
entirely random noise.

A diffusion model can take longer to train than a variational autoencoder (VAE) model,
but thanks to this two-step process, the model can be trained across hundreds or even
thousands of noise levels, which means that diffusion models generally offer the
highest-quality output when building generative AI models.

Additionally, diffusion models are also categorized as foundation models, because they
are large-scale, offer high-quality outputs, are flexible, and are considered best for
generalized use cases. However, because of the reverse sampling process, running
diffusion models is a slow, lengthy process.

• Variational autoencoders (VAEs): VAEs consist of two neural networks typically
referred to as the encoder and decoder.
When given an input, an encoder converts it into a smaller, more dense representation
of the data. This compressed representation preserves the information that’s needed
for a decoder to reconstruct the original input data, while discarding any irrelevant
information. The encoder and decoder work together to learn an efficient and simple
latent data representation. This allows the user to easily sample new latent
representations that can be mapped through the decoder to generate novel data.
While VAEs can generate outputs such as images faster, the images generated by
them are not as detailed as those of diffusion models.
• Generative adversarial networks (GANs): Introduced in 2014, GANs were considered
to be the most commonly used methodology of the three before the recent success of
diffusion models. GANs pit two neural networks against each other: a generator that
generates new examples and a discriminator that learns to classify content as either
real (from the domain) or fake (generated).

The two models are trained together and get smarter as the generator produces better
content and the discriminator gets better at spotting the generated content. This
procedure repeats, pushing both to continually improve after every iteration until the
generated content is indistinguishable from the existing content.
While GANs can provide high-quality samples and generate outputs quickly, the
sample diversity is weak, therefore making GANs better suited for domain-specific
data generation.
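
To make the forward diffusion step from the list above more concrete, here is a minimal sketch in plain Python. It assumes the standard DDPM update, x_t = sqrt(1 - beta_t) * x_(t-1) + sqrt(beta_t) * noise, and a linear noise schedule; the schedule endpoints and the function names are illustrative assumptions, not values taken from this article. The reverse process is not shown, because it needs a trained neural network to predict the noise.

```python
import math
import random
from typing import List


def linear_beta_schedule(num_steps: int, beta_start: float = 1e-4,
                         beta_end: float = 0.02) -> List[float]:
    """Noise variances that grow linearly from beta_start to beta_end (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def forward_diffusion(x0: List[float], betas: List[float],
                      rng: random.Random) -> List[List[float]]:
    """Return the trajectory [x_0, x_1, ..., x_T] of progressively noisier samples."""
    trajectory = [list(x0)]
    x = list(x0)
    for beta in betas:
        # Shrink the current sample slightly and mix in fresh Gaussian noise.
        x = [math.sqrt(1.0 - beta) * xi + math.sqrt(beta) * rng.gauss(0.0, 1.0)
             for xi in x]
        trajectory.append(x)
    return trajectory


def remaining_signal(betas: List[float]) -> float:
    """Factor by which the original sample has been scaled after all steps."""
    alpha_bar = 1.0
    for beta in betas:
        alpha_bar *= 1.0 - beta
    return math.sqrt(alpha_bar)


if __name__ == "__main__":
    betas = linear_beta_schedule(1000)
    trajectory = forward_diffusion([1.0, -1.0], betas, random.Random(0))
    print("steps:", len(trajectory) - 1)
    print("remaining signal factor:", remaining_signal(betas))
```

With enough steps the remaining signal factor approaches zero, so the final sample is essentially pure noise. That is why generation can start from random noise and run the learned reverse process.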

Another factor in the development of generative models is the architecture
underneath. One of the most popular is the transformer network. It is important to
understand how it works in the context of generative AI.

Transformer networks: Like recurrent neural networks, transformers are designed to
handle sequential input data. Unlike recurrent networks, they process that data
non-sequentially, looking at the whole input at once.

Two mechanisms make transformers particularly adept for text-based generative AI
applications: self-attention and positional encodings. Together, these mechanisms
represent word order and allow the algorithm to focus on how words relate to each
other over long distances.

A self-attention layer assigns a weight to each part of an input. The weight signifies
the importance of that input in context to the rest of the input. Positional encoding is
a representation of the order in which input words occur.
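
The two mechanisms just described can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions: the positional encoding uses the sinusoidal scheme from the original transformer paper (one common choice among several), and the attention function uses the inputs directly as queries, keys, and values, with no learned projection matrices or multiple heads. The example embeddings are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]


def positional_encoding(position: int, dim: int) -> Vector:
    """Sinusoidal encoding of a position: sine on even indices, cosine on odd ones."""
    encoding = []
    for i in range(dim):
        angle = position / (10000 ** ((2 * (i // 2)) / dim))
        encoding.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return encoding


def softmax(scores: Vector) -> Vector:
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def self_attention(inputs: List[Vector]) -> Tuple[List[Vector], List[Vector]]:
    """Scaled dot-product self-attention with queries = keys = values = inputs."""
    dim = len(inputs[0])
    # weights[i][j]: how much position i attends to position j.
    weights = [softmax([dot(q, k) / math.sqrt(dim) for k in inputs]) for q in inputs]
    # Each output is the weighted average of all input vectors.
    outputs = [[sum(w * v[j] for w, v in zip(row, inputs)) for j in range(dim)]
               for row in weights]
    return weights, outputs


if __name__ == "__main__":
    # Hypothetical 4-dimensional embeddings for a three-token input.
    embeddings = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0], [1.0, 1.0, 0.0, 0.0]]
    with_position = [
        [e + p for e, p in zip(vec, positional_encoding(pos, len(vec)))]
        for pos, vec in enumerate(embeddings)
    ]
    weights, _ = self_attention(with_position)
    for row in weights:
        print([round(w, 3) for w in row])
```

Each printed row shows how strongly one token attends to every token in the input, including itself. Real transformers learn separate projections for queries, keys, and values and run several attention heads in parallel.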

A transformer is made up of multiple transformer blocks, also known as layers. For
example, a transformer block has self-attention layers, feed-forward layers, and
normalization layers, all working together to decipher and predict streams of tokenized
data, which could include text, protein sequences, or even patches of images.

What are the Applications of Generative AI?

Generative AI is a powerful tool for streamlining the workflow of creatives, engineers,
researchers, scientists, and more. The use cases and possibilities span all industries
and individuals.

Generative AI models can take inputs such as text, image, audio, video, and code and
generate new content in any of the modalities mentioned. For example, they can turn
text inputs into an image, turn an image into a song, or turn video into text.

Here are the most popular generative AI applications:

• Language: Text is at the root of many generative AI models and is considered to be the
most advanced domain. One of the most popular examples of language-based
generative models is the large language model (LLM). Large language models are
being leveraged for a wide variety of tasks, including essay generation, code
development, translation, and even understanding genetic sequences.
• Audio: Music, audio, and speech are also emerging fields within generative AI.
Examples include models being able to develop songs and snippets of audio clips with
text inputs, recognize objects in videos and create accompanying noises for different
video footage, and even create custom music.
• Visual: One of the most popular applications of generative AI is within the realm of
images. This encompasses the creation of 3D images, avatars, videos, graphs, and
other illustrations. There’s flexibility in generating images with different aesthetic
styles, as well as techniques for editing and modifying generated visuals. Generative AI
models can create graphs that show new chemical compounds and molecules that aid
in drug discovery, create realistic images for virtual or augmented reality, produce 3D
models for video games, design logos, enhance or edit existing images, and more.
• Synthetic data: Synthetic data is extremely useful to train AI models when data
doesn’t exist, is restricted, or is simply unable to address corner cases with the
highest accuracy. The development of synthetic data through generative models is
perhaps one of the most impactful solutions for overcoming the data challenges of
many enterprises. It spans all modalities and use cases and is possible through a
process called label efficient learning. Generative AI models can reduce labeling costs
by either automatically producing additional augmented training data or by learning
an internal representation of the data that facilitates training AI models with less
labeled data.
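
As a very small illustration of producing additional augmented training data, the sketch below multiplies a labeled dataset by adding random jitter to numeric features while keeping the labels. Note the swap: this uses simple Gaussian noise as a stand-in, not a generative model, and the dataset, noise level, and function name are hypothetical.

```python
import random
from typing import List, Tuple

Example = Tuple[List[float], str]  # (features, label)


def augment(dataset: List[Example], copies: int, noise_std: float,
            rng: random.Random) -> List[Example]:
    """Return the original examples plus `copies` jittered variants of each one."""
    augmented = list(dataset)
    for features, label in dataset:
        for _ in range(copies):
            # Perturb every feature slightly; the label is reused unchanged.
            jittered = [x + rng.gauss(0.0, noise_std) for x in features]
            augmented.append((jittered, label))
    return augmented


if __name__ == "__main__":
    labeled = [([0.9, 0.1], "cat"), ([0.2, 0.8], "dog")]
    bigger = augment(labeled, copies=3, noise_std=0.05, rng=random.Random(42))
    print(len(labeled), "labeled examples expanded to", len(bigger))
```

A generative model plays the same role with far more realistic variation, for example by synthesizing whole new images or sentences for a given label.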

The impact of generative models is wide-reaching, and their applications are only
growing. Listed are just a few examples of how generative AI is helping to advance and
transform the fields of transportation, natural sciences, and entertainment.

• In the automotive industry, generative AI is expected to help create 3D worlds and
models for simulations and car development. Synthetic data is also being used to train
autonomous vehicles. Being able to road test the abilities of an autonomous vehicle in
a realistic 3D world improves safety, efficiency, and flexibility while decreasing risk
and overhead.
• The field of natural sciences greatly benefits from generative AI. In the healthcare
industry, generative models can aid in medical research by developing new protein
sequences to aid in drug discovery. Practitioners can also benefit from the automation
of tasks such as scribing, medical coding, medical imaging, and genomic analysis.
Meanwhile, in the weather industry, generative models can be used to create
simulations of the planet and help with accurate weather forecasting and natural
disaster prediction. These applications can help to create safer environments for the
general population and allow scientists to predict and better prepare for natural
disasters.
• All aspects of the entertainment industry, from video games to film, animation, world
building, and virtual reality, are able to leverage generative AI models to help
streamline their content creation process. Creators are using generative models as a
tool to help supplement their creativity and work.

What are the Challenges of Generative AI?

As an evolving space, generative models are still considered to be in their early stages,
giving them space for growth in the following areas.

1. Scale of compute infrastructure: Generative AI models can boast billions of
parameters and require fast and efficient data pipelines to train. Significant capital
investment, technical expertise, and large-scale compute infrastructure are necessary
to maintain and develop generative models. For example, diffusion models could
require millions or billions of images to train. Moreover, to train on such large datasets,
massive compute power is needed, and AI practitioners must be able to procure and
leverage hundreds of GPUs to train their models.
2. Sampling speed: Due to the scale of generative models, there may be latency present
in the time it takes to generate an instance. Particularly for interactive use cases such
as chatbots, AI voice assistants, or customer service applications, conversations must
happen immediately and accurately. As diffusion models become increasingly popular
due to the high-quality samples that they can create, their slow sampling speeds have
become increasingly apparent.
3. Lack of high-quality data: Oftentimes, generative AI models are used to produce
synthetic data for different use cases. However, while troves of data are being
generated globally every day, not all data can be used to train AI models. Generative
models require high-quality, unbiased data to operate. Moreover, some domains don’t
have enough data to train a model. As an example, few 3D assets exist and they’re
expensive to develop. Such areas will require significant resources to evolve and
mature.
4. Data licenses: Further compounding the issue of a lack of high-quality data, many
organizations struggle to get a commercial license to use existing datasets or to build
bespoke datasets to train generative models. This is an extremely important process
and key to avoiding intellectual property infringement issues.
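
For a rough sense of the scale mentioned in the first challenge, the memory needed just to hold a model's weights is the parameter count multiplied by the bytes per parameter. The sketch below does that arithmetic; the 7-billion-parameter figure is a hypothetical example, and the estimate covers weights only. Training needs considerably more memory for gradients, optimizer state, and activations.

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gib(num_params: int, dtype: str = "fp32") -> float:
    """Memory needed to store the weights alone, in GiB (1 GiB = 1024**3 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024 ** 3


if __name__ == "__main__":
    num_params = 7_000_000_000  # hypothetical model size
    for dtype in ("fp32", "fp16", "int8"):
        print(dtype, round(weight_memory_gib(num_params, dtype), 1), "GiB")
```

Halving the precision halves the weight memory, which is one reason lower-precision formats are popular for serving large models.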

Many companies such as NVIDIA, Cohere, and Microsoft have a goal to support the
continued growth and development of generative AI models with services and tools to
help solve these issues. These products and platforms abstract away the complexities
of setting up the models and running them at scale.

What are the Benefits of Generative AI?

Generative AI is important for a number of reasons. Some of the key benefits of
generative AI include:

1. Generative AI algorithms can be used to create new, original content, such as images,
videos, and text, that’s indistinguishable from content created by humans. This can be
useful for applications such as entertainment, advertising, and creative arts.
2. Generative AI algorithms can be used to improve the efficiency and accuracy of
existing AI systems, such as natural language processing and computer vision. For
example, generative AI algorithms can be used to create synthetic data that can be
used to train and evaluate other AI algorithms.
3. Generative AI algorithms can be used to explore and analyze complex data in new
ways, allowing businesses and researchers to uncover hidden patterns and trends that
may not be apparent from the raw data alone.
4. Generative AI algorithms can help automate and accelerate a variety of tasks and
processes, saving time and resources for businesses and organizations.

Overall, generative AI has the potential to significantly impact a wide range of
industries and applications and is an important area of AI research and development.

Note: Demonstrating the capabilities of generative models, this section, “What are the
Benefits of Generative AI?” was written by the generative AI model ChatGPT.
