Professional Documents
Culture Documents
Generative AI Lifecycle Patterns. Part 2: Maturing GenAI: Patterns… | by Ali Arsanjani | Sep, 2023 | Medium
We will explore a non-exhaustive list of techniques that are often combined into composite patterns for dealing with the typical problems and challenges encountered as you seek to adopt Gen AI at the enterprise level.
You can use this as a checklist of patterns to adopt when using Gen AI at production scale in an enterprise or industrial environment. You can also use it to prepare your enterprise for Gen AI by raising awareness of some of the many skills you may need to overcome common challenges on the adoption journey.
https://dr-arsanjani.medium.com/the-generative-ai-lifecycle-1b0c7d9463ec 1/24
Iterations and Cycles. LangChain has been a highly popular and useful library for creating chains of tasks for Gen AI. But these chains are not one-and-done; they are fundamentally experiments, so we must prepare ourselves, our teams, and our enterprises for repeated cycles through these chains of tasks, iterating on our experiments as we go.
Below is a diagram that includes almost all of the cycles and iterations we discuss in this article. Please use it as an indicative reference for the art of the possible, and adapt or extend it to accommodate the nuances of your specific enterprise needs.
In-context learning (ICL) has goals very similar to few-shot learning: to enable models to learn from contextual data without extensive tuning. However, fine-tuning a model involves a supervised learning setup on a target dataset, whereas in ICL the model is prompted with a series of input–label pairs without updating the model's parameters.
Experience has shown that LLMs can perform quite an array of complex tasks through ICL, even ones as complex as mathematical reasoning problems [1].
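As a minimal sketch (the sentiment task, labels, and prompt layout here are invented for illustration), an ICL prompt simply concatenates input–label pairs ahead of the new input; no parameters are updated:

```python
# Build an in-context learning prompt: the model sees labeled examples
# inline and infers the task, with no gradient updates.
def build_icl_prompt(examples, query):
    """examples: list of (input, label) pairs; query: the new input."""
    lines = [f"Input: {x}\nLabel: {y}" for x, y in examples]
    lines.append(f"Input: {query}\nLabel:")
    return "\n\n".join(lines)

demos = [
    ("The movie was delightful", "positive"),
    ("Service was slow and rude", "negative"),
]
prompt = build_icl_prompt(demos, "A genuinely fun experience")
print(prompt)
```

The resulting string is what you would send to the foundation model; the trailing `Label:` invites the completion.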
Chain it.
Beyond the basic Prompt → FM → Adapt → Completion pattern, we typically need to extract data from somewhere, perhaps run a predictive AI algorithm, and then send the results to a generative AI foundation model. This Chain of Tasks (CoTA, to be distinguished from Chain of Thought, CoT) pattern is exemplified as:
Chain: Extract data/analytics → Run a predictive [set of] ML model[s] → Send the result to an LLM → Generate an output
You can use a library like LangChain to orchestrate such a chain of tasks.
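A minimal sketch of the CoTA pattern in plain Python, with every stage stubbed out (the extraction step, the predictive model, and the LLM call are all hypothetical placeholders you would replace with your warehouse query, your trained model, and an LLM client such as a LangChain chain):

```python
# Sketch of the CoTA pattern: extract -> predict -> prompt an LLM.
def extract_data():
    # Placeholder for a real data/analytics extraction step.
    return {"customer_id": 42, "monthly_spend": [310, 295, 420]}

def run_predictive_model(features):
    # Placeholder predictive model: flag churn risk from a spend drop.
    spend = features["monthly_spend"]
    churn_risk = "high" if spend[-1] < 0.9 * spend[0] else "low"
    return {"churn_risk": churn_risk, **features}

def send_to_llm(result):
    # Placeholder for an LLM call: we build the prompt the model
    # would receive rather than invoking a real API.
    return (f"Customer {result['customer_id']} has {result['churn_risk']} "
            f"churn risk. Draft a short retention email.")

output = send_to_llm(run_predictive_model(extract_data()))
```

Each stage only needs to agree on the dictionary it passes along, which is what makes the chain easy to iterate on.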
LangChain includes Models, Chains and Agents.
Chains. Chains are sequences of operations that LangChain can perform on text
or other data. Chains can be used to perform tasks such as text analysis,
summarization, and translation.
Agents. Agents are programs that use LLMs to make decisions and take actions.
Agents can be used to build applications such as chatbots and code analysis
tools.
LangChain also provides integrations with other tools and APIs, as well as end-to-end chains of the tasks needed to complete a workflow. For example:
Integrations with other tools: LangChain can be integrated with other tools, such as Google Search and a Python REPL, to extend its capabilities.
LangChain agents are particularly powerful because they can use LLMs to make
decisions and take actions in a dynamic and data-driven way. For example, a
LangChain agent could be used to build a chatbot that can learn from its
interactions with users and improve its performance over time.
Chatbots: LangChain can be used to build chatbots that can interact with users
in a natural and informative way.
Code analysis: LangChain can be used to analyze code and identify potential
bugs or security vulnerabilities.
Overall, LangChain is a powerful framework that can be used to build a wide variety
of applications using LLMs. It is particularly well-suited for building dynamic and
data-responsive applications.
LangChain agents use an LLM to decide what actions to take and the order to take
them in. They make future decisions by observing the outcome of prior actions.
This allows LangChain agents to learn and adapt over time, becoming more
effective at completing tasks.
Maturity Level 2. The above section is a very typical set of patterns used in an iterative
cycle to leverage Generative AI. Now let’s explore a more mature level that augments the
above.
Tune it.
As you evaluate the model's responses, you may find them wanting even after substantial effort in prompt engineering and in-context learning. Here you may need to tune the foundation model: adapt it to a domain, an industry, a type of output format, or a preference for brevity over rambling output (e.g., as in classification of a set of symptoms).
Adapter tuning involves adding a new layer to the LLM that is specific to the task at hand. The new layer is trained on a small dataset of labeled examples. This allows the LLM to learn the specific features of the task without having to fine-tune all of its parameters.
LoRA (Low-Rank Adaptation) involves approximating the update to the LLM's parameters with a low-rank matrix, using a form of matrix factorization. The low-rank factors are then fine-tuned on a small dataset of labeled examples. This allows the LLM to learn the specific features of the task without having to fine-tune all of its parameters.
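The arithmetic behind LoRA's savings can be sketched in a few lines of NumPy; the shapes, rank, and alpha/r scaling below follow the common convention but are chosen arbitrarily for illustration, and no training loop is shown:

```python
import numpy as np

# Toy illustration of the LoRA idea: instead of updating a full
# d_out x d_in weight matrix W, train a low-rank update B @ A and
# add it (scaled by alpha/r) to the frozen W.
rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 512, 512, 8, 16

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weights
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, rank r
B = np.zeros((d_out, r))                    # trainable, starts at zero

W_adapted = W + (alpha / r) * (B @ A)       # effective weights

full_params = W.size
lora_params = A.size + B.size
print(f"trainable params: {lora_params} vs {full_params} "
      f"({100 * lora_params / full_params:.1f}%)")
```

Because B starts at zero, the adapted weights initially equal the frozen ones, so tuning begins from the pretrained behavior; only A and B (here about 3% of the parameters) are trained.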
This enables the very important introduction of domain-specific LLMs. For example, see how Vertex AI can do this for you at a nominal cost.
Maturity Level 3. Now let's retrieve data before we send the prompt and contextualize the input even more, decreasing the likelihood of hallucination by the LLM.
RAG it.
Access similar documents using semantic search. How is this done? A set of documents you supply is chunked (read: split) up (sentence by sentence, by paragraph, by page, etc.), converted into embeddings with an embedding model such as textembedding-gecko@latest, and then stored in a vector database such as Google's Vertex AI Vector Search. Retrieval is done via an Approximate Nearest Neighbor (ANN), aka semantic search, algorithm. This input may significantly decrease the possibility of the model hallucinating and provide the model with enough relevant context to be more knowledgeable about the topic and return more 'sensible' and relevant completions. This process is known as Retrieval Augmented Generation, or RAG. So RAG it.
The second step is augmenting the prompt with the context retrieved from the vector store.
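A toy version of this retrieve-then-augment step might look like the following; a real system would use an embedding model (such as textembedding-gecko) and an ANN index (such as Vertex AI Vector Search), for which a bag-of-words vector and an exact nearest-neighbor scan stand in here:

```python
import math
from collections import Counter

# Toy RAG retrieval: embed chunks, find the nearest one to the query,
# and prepend it to the prompt as context.
def embed(text):
    # Stand-in for a real embedding model: bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = [
    "Refunds are processed within 5 business days.",
    "Our office is closed on public holidays.",
    "Shipping is free for orders over 50 dollars.",
]
index = [(c, embed(c)) for c in chunks]  # the "vector store"

query = "How long do refunds take?"
best = max(index, key=lambda item: cosine(embed(query), item[1]))[0]

# Augment the prompt with the retrieved context before calling the LLM.
prompt = f"Context: {best}\n\nQuestion: {query}\nAnswer:"
```

The shape of the flow (chunk, embed, store, retrieve by similarity, augment) is the same one a production RAG stack follows at scale.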
Ground it.
Use an expanded search capability to increase factual grounding by allowing/requesting the model to return a reference to where it found the response it just gave. RAG provides grounding prior to submission to the LLM; grounding proper happens after the model emits its output tokens: find a citation and send it back. Many vendors, such as Google Cloud AI, provide multiple ways of factual grounding.
RAG is a framework for augmenting LLMs with access to external knowledge bases.
This allows LLMs to generate more accurate and informative text, even on complex
and challenging tasks. RAG works by first retrieving relevant passages from the
knowledge base. The LLM then uses these passages to generate its response.
The main difference between factual grounding and RAG is that factual grounding
focuses on ensuring that the LLM’s generated text is consistent with factual
knowledge, while RAG focuses on generating more accurate and informative text.
Also, factual grounding typically uses a knowledge base of factual statements, while RAG can draw on any type of external knowledge base, including text documents, code repositories, and databases.
Factual grounding is typically used as a pre-training step, while RAG can be used as
a post-training step. This means that factual grounding is typically used to improve
the accuracy and relevance of LLMs on a variety of tasks, while RAG is typically
used to improve the accuracy and relevance of LLMs on specific tasks.
FLARE it.
Forward-Looking Active Retrieval Augmented Generation. FLARE is a variation of RAG in which you actively decide when and what to retrieve: the model predicts the upcoming sentence to anticipate future content, and whenever that draft contains low-confidence tokens, it is used as the query to retrieve relevant documents.
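The FLARE control flow can be sketched as follows; the generator and retriever are stubs with invented outputs, standing in for an LLM that reports per-token confidences and for a vector store:

```python
# Sketch of the FLARE loop: draft the next sentence, and if any token
# falls below a confidence threshold, use the draft itself as a
# retrieval query and regenerate with the retrieved context.
THRESHOLD = 0.6

def draft_next_sentence(context):
    # Stub: a real LLM would return the sentence and its token
    # confidences; the date here is deliberately a shaky guess.
    return "The treaty was signed in 1887.", [0.9, 0.95, 0.9, 0.9, 0.4, 0.3]

def retrieve(query):
    # Stub for a vector-store lookup.
    return ["The treaty was signed in 1889 in Paris."]

def flare_step(context):
    sentence, confs = draft_next_sentence(context)
    if min(confs) < THRESHOLD:
        docs = retrieve(sentence)   # the draft IS the retrieval query
        return ("regenerate", docs)
    return ("accept", sentence)

action, payload = flare_step("History of the treaty:")
```

The key difference from plain RAG is visible in the branch: retrieval is triggered by the model's own uncertainty about what it is about to say, not done once up front.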
Maturity Level 4. We are getting into a very interesting domain here, where you can start to ask your LLM how it is reasoning and what steps it takes to accomplish its task.
In a CoT diagram, each sentence is a direct continuation of the previous one. This
forms a linear chain.
OK you get the idea for the Graph of Thought. GoT it?
Graph of Thought (GoT) is a framework that models the reasoning process of large
language models (LLMs) as a graph. In a Chain of Thought, each sentence is a direct
continuation of the previous one, forming a linear chain. In a Tree of Thought, the
main idea branches off into several related ideas.
GoT allows for dynamic data flow without a fixed sequence. This flexibility is
important in AI, where data can come from multiple sources and may need to be
processed non-linearly.
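One way to sketch a Graph of Thought is as a DAG of thoughts evaluated in dependency order, with a stubbed "LLM" merging parent thoughts at join nodes (the trip-planning content and the merge function are invented for illustration):

```python
from graphlib import TopologicalSorter

# Thoughts are nodes in a DAG; edges name which earlier thoughts feed a
# later one. Unlike a chain, a node may merge several branches.
thoughts = {
    "problem":  "Plan a 3-city trip under budget.",
    "flights":  "Cheapest flight order: A -> C -> B.",
    "hotels":   "Hotels cheapest midweek in C.",
    "plan":     None,  # to be produced by merging the branches
}
edges = {"flights": {"problem"}, "hotels": {"problem"},
         "plan": {"flights", "hotels"}}

def combine(parents):
    # Stand-in for an LLM call that merges parent thoughts.
    return " / ".join(sorted(parents))

# Evaluate nodes in dependency order, filling in merged thoughts.
for node in TopologicalSorter(edges).static_order():
    if thoughts.get(node) is None:
        thoughts[node] = combine(thoughts[p] for p in edges[node])
```

The "plan" node only exists because two independent branches flow into it, which is exactly the non-linear data flow a chain cannot express.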
Chain it.
The Chain of Thought model does very well in its simplicity, intuitiveness, and ease of training. It follows a linear, step-by-step process well suited to tasks naturally aligned with sequential logic. But this imposes limitations on the model's ability to handle complex reasoning tasks that may require considering multiple variables, alternative options, or outcomes. Once it sets "its mind" on a particular 'chain,' the model may find it challenging to backtrack or explore other avenues, which can lead to less-than-optimal outcomes.
The Tree of Thought model, by branching into alternatives, carries a risk of possible overfitting, and its branching can make it harder to trace the model's exact reasoning path, which does not help its interpretability.
The Graph of Thought model stands out for its ability to handle high-complexity
tasks involving multiple interconnected variables. Its flexibility allows it to model
non-linear and interconnected relationships, making it highly suitable for real-
world problems with complex, interrelated variables. However, this complexity
demands significant computational resources and sophisticated algorithms for
effective training. The Graph of Thought model is also the most challenging to
interpret; its non-linear interconnected structure doesn’t lend itself to
straightforward explanations, making it difficult to understand the reasoning
behind its decisions and use it for explainability.
ReAct is designed for tasks in which the LLM is allowed to perform certain actions. For example, an LLM may be able to interact with external APIs to retrieve information. ReAct addresses issues that LLMs sometimes face, such as producing incorrect facts and compounding errors.
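A ReAct loop can be sketched with a scripted stand-in for the LLM; the Thought/Action/Observation transcript format follows the ReAct pattern, while the tool, the script, and the bracketed action syntax here are invented for illustration:

```python
# Sketch of a ReAct loop: the "LLM" (a fixed script here) alternates
# Thought and Action steps; each Action runs a tool and its Observation
# is appended to the transcript, grounding the next step.
TOOLS = {"lookup": lambda q: {"capital of France": "Paris"}.get(q, "unknown")}

SCRIPT = [  # stand-in for successive LLM completions
    "Thought: I need the capital of France.",
    "Action: lookup[capital of France]",
    "Finish[Paris]",
]

def react(script):
    transcript = []
    for step in script:
        transcript.append(step)
        if step.startswith("Action:"):
            tool, arg = step[len("Action: "):].rstrip("]").split("[", 1)
            transcript.append(f"Observation: {TOOLS[tool](arg)}")
        elif step.startswith("Finish["):
            return step[len("Finish["):-1], transcript
    return None, transcript

answer, trace = react(SCRIPT)
```

Feeding each Observation back before the next completion is what lets a real ReAct agent correct course instead of compounding an early mistake.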
Conclusion
This article reviewed a set of patterns, including combinations of techniques commonly encountered as you cycle through deriving business value from Gen AI: as you seek to make Gen AI enterprise-ready, and conversely as you seek to mature the enterprise from prototype to product for and with Gen AI.
Few-shot learning
Supervised learning
Factual grounding
Here are some specific references that you may find helpful.
[1] LLMs can perform complex tasks through ICL, even as complex as solving
mathematical reasoning problems:
This paper shows that LLMs can perform complex tasks given only a few examples, using a technique called chain-of-thought prompting. The authors demonstrate that LLMs can solve mathematical reasoning problems, translate languages, and perform other complex tasks with high accuracy.
A Survey on In-context Learning, Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng,
Zhiyong Wu, Baobao Chang, Xu Sun, Jingjing Xu, Lei Li, Zhifang Sui.
This survey paper provides a comprehensive overview of ICL for LLMs. The authors
discuss the different ways in which ICL can be used to train LLMs to perform new
tasks, and they provide examples of ICL being used to solve complex tasks such as
mathematical reasoning and code generation.
Jason Wei, et al. “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.” arXiv preprint arXiv:2201.11903 (2022).
This paper introduces the chain-of-thought prompting technique for getting LLMs to perform complex tasks. The authors demonstrate that LLMs prompted with chains of thought can solve mathematical reasoning problems, even when the problems are presented in a new format. They “explore how generating a chain of thought — a series of intermediate reasoning steps — significantly improves the ability of large language models to perform complex reasoning.” In particular, they show how such reasoning capabilities are an emergent behavior that surfaces “naturally in sufficiently large language models …, where a few chain of thought demonstrations are provided as exemplars in prompting. Experiments on three large language models show that chain of thought prompting improves performance on a range of arithmetic, commonsense, and symbolic reasoning tasks.”
Brown, Tom B., et al. “Language models are few-shot learners.” arXiv preprint
arXiv:2005.14165 (2020).
Raffel, Colin, et al. “Exploring the limits of transfer learning with a unified text-
to-text transformer.” arXiv preprint arXiv:1910.10683 (2019).
Prototypical networks are a type of prototype classifier that is used for few-shot
learning. Few-shot learning is a classification technique that uses a small dataset to
adapt to a specific task. Prototypical networks are based on the idea that each class
can be represented by the mean of its examples in a representation space learned
by a neural network.
Snell, Jake, Kevin Swersky, and Samy Bengio. “Prototypical networks for few-
shot learning.” arXiv preprint arXiv:1703.05175 (2017).
They demonstrate that their approach achieves state-of-the-art results on two few-
shot image classification benchmarks, performs well on few-shot regression, and
accelerates fine-tuning for policy gradient reinforcement learning with neural
network policies.
The paper on retrieval augmented generation (RAG), “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks” (arXiv preprint arXiv:2005.11401 (2020)), was written by Patrick Lewis, et al. RAG is a framework for augmenting large language models (LLMs) with access to external knowledge bases. This allows LLMs to generate more accurate and informative text, even on complex and challenging tasks.
RAG has been shown to be effective for a variety of tasks, including question
answering, summarization, and translation. It is a promising new approach to
generative AI, and it has the potential to revolutionize the way we interact with
computers.
Baolin Peng, et al. “Check Your Facts and Try Again: Improving Large Language Models
with External Knowledge and Automated Feedback”. This paper proposes a
method for improving the factual accuracy of LLMs by providing them with
feedback on their generated text. The feedback is based on a knowledge base of
factual statements.
Shunyu Yao, et al. “Tree of Thoughts: Deliberate Problem Solving with Large Language Models.” arXiv preprint arXiv:2305.10601 (2023).
Maciej Besta, et al. “Graph of Thoughts: Solving Elaborate Problems with Large
Language Models”
Terence L. van Zyl, et al. “Machine Learning for Socially Responsible Portfolio Optimisation”
Ali Arsanjani — Director, Google AI/ML & GenAI | Ex: WW Tech Leader, Chief Principal AI/ML Solution Architect, AWS | IBM Distinguished Engineer and CTO, Analytics & ML