LLM Review
A REVIEW FOR LE AI HACKATHON

LANGUAGE MODELS?
A language model predicts the next word: "The room in the hotel was ___"
• Amazing: 0.4
• Disappointing: 0.3
• Haunted: 0.01
Language models have been around for a while
WHAT ARE LARGE LANGUAGE MODELS (LLM)?
Large language models are huge artificial neural networks trained on the word prediction task
• Not magic – function optimization
• Only trained to predict the next word *
Embeddings: text-embedding-ada-002

import openai

def get_embedding(text, model="text-embedding-ada-002"):
    text = text.replace("\n", " ")
    return openai.Embedding.create(input=[text], model=model)['data'][0]['embedding']
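The function above returns a vector; semantically similar texts get nearby vectors. A minimal sketch of how such vectors are compared, using toy 3-dimensional vectors in place of the real 1536-dimensional ada-002 embeddings:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings (values are made up)
hotel_review   = [0.9, 0.1, 0.2]
room_comment   = [0.8, 0.2, 0.25]
weather_report = [0.1, 0.9, 0.4]

sim_related   = cosine_similarity(hotel_review, room_comment)
sim_unrelated = cosine_similarity(hotel_review, weather_report)
```

Here `sim_related` comes out much higher than `sim_unrelated`, which is exactly the property the similarity search later in this deck relies on.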
PROMPT ENGINEERING
The prompt: our main way to control the model's behavior

From: task-specific model training and fine-tuning
To: zero-shot and few-shot learning (emergent behaviour)
PROMPT ENGINEERING
The prompt: the "programming language" of the model

Instructions: (Answer the user query given the specified hotel description. If there is no information in the description to answer the query, answer "I don't know".)

system: "You are LeGPT, you are an expert in travel, you can answer questions in reference to provided context. You answer questions in a fun and engaging way."
(System prompts are used for general instructions to the model – they are more useful in GPT-4)

user: "I want to take my family on a vacation in December, where should I go?"
(User prompts are used for the user interaction within the conversation)

assistant: "December is a great time to take a family vacation! If you're looking for a fun and festive experience, I suggest visiting one of the many Christmas markets in Europe. Germany, Austria, and Switzerland are known for their beautiful markets..."
(Assistant prompts are used for the model response within the conversation)
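A minimal sketch of how these roles are assembled into a chat API payload. Only the payload construction runs here; the call itself is commented out since it needs an API key, and the model name is an assumption:

```python
# Each turn is a dict with a "role" (system / user / assistant) and "content".
messages = [
    {"role": "system",
     "content": "You are LeGPT, an expert in travel. You answer questions "
                "in reference to provided context, in a fun and engaging way."},
    {"role": "user",
     "content": "I want to take my family on a vacation in December, "
                "where should I go?"},
]

# The actual call would look roughly like this (model name is illustrative):
# response = openai.ChatCompletion.create(model="gpt-4", messages=messages)
# reply = response["choices"][0]["message"]["content"]
# messages.append({"role": "assistant", "content": reply})
```

The assistant's reply is appended back into `messages`, which is how the conversation on this slide is built up turn by turn.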
PROMPT ENGINEERING - TIPS AND TRICKS
• Tell the model its role: "As an expert in..."
• Be as explicit and elaborate as possible:
  • ...If you don't have the answer, say: "I don't know"
  • ...No more than 60 words, but it can be less than 60 words
• Temperature = 1 ==> increased randomness: different outputs every time, higher "creativity"
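The temperature knob can be sketched as scaling the logits before the softmax: dividing by a small temperature sharpens the distribution (near-deterministic output), while a large temperature flattens it (more randomness). Numbers below are illustrative:

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature before normalizing:
    T < 1 sharpens the distribution, T > 1 flattens it."""
    scaled = [x / temperature for x in logits]
    exps = [math.exp(x) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                                   # made-up scores
sharp = softmax_with_temperature(logits, temperature=0.2)  # near-deterministic
flat  = softmax_with_temperature(logits, temperature=2.0)  # more "creative"
```

With T = 0.2 the top word takes almost all the probability mass; with T = 2.0 the alternatives stay plausible, so repeated sampling gives different outputs.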
CONTEXT - SHORT TERM MEMORY
• LLMs are stateless, all the context needs to be passed in the prompt....
Example API responses: "total_tokens": 263 for a short prompt vs. "total_tokens": 488 once more context is included
* Tokens are the atoms of the language model – each token can be a whole word or just part of a word
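Statelessness in practice can be sketched like this: every request must re-send the whole conversation so far, which is why `total_tokens` grows turn by turn. The helper and messages below are illustrative, not a specific SDK's API:

```python
# The model keeps no memory between calls, so the application holds the
# history and ships all of it with every request.
history = [{"role": "system", "content": "You are a helpful travel assistant."}]

def build_request(history, user_message):
    """Append the new user turn and return the full payload to send."""
    history.append({"role": "user", "content": user_message})
    return list(history)  # the ENTIRE history is sent, every time

first = build_request(history, "Where should I go in December?")
# ...store the assistant's reply, then the next turn re-sends everything:
history.append({"role": "assistant",
                "content": "Try the Christmas markets in Europe."})
second = build_request(history, "Which one is best for kids?")
```

Each request payload is strictly longer than the last, so token usage (and cost) grows until the context window is exhausted.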
CONTEXT - FEW SHOTS
Providing the model with examples of the desired behavior greatly improves performance:
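A sketch of assembling a few-shot prompt; the task and the worked examples are made up for illustration:

```python
# Few-shot prompting: show input -> output pairs before the real query,
# so the model infers the pattern instead of being told the rules.
examples = [
    ("The room was spotless and the staff were lovely.", "positive"),
    ("The pool was closed and breakfast was cold.", "negative"),
]

def few_shot_prompt(examples, query):
    lines = ["Classify the sentiment of each hotel review."]
    for review, label in examples:
        lines.append(f"Review: {review}\nSentiment: {label}")
    # Leave the final label blank for the model to fill in
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

prompt = few_shot_prompt(examples, "Great location but very noisy at night.")
```

The prompt ends mid-pattern (`Sentiment:`), so the model's most likely continuation is exactly the label we want.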
[Diagram: RAG pipeline. Indexing: documents -> embedding generation -> embedding vectors -> store in vector DB*. Querying: query -> embedding generation -> kNN similarity search -> similar docs -> context -> LLM -> response.]
* Pinecone / Chroma / FAISS
Tutorial link
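A minimal sketch of the retrieval half of that pipeline, with toy vectors standing in for real embeddings and a linear scan standing in for the vector DB's kNN index:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy document store: text -> made-up embedding vector
doc_store = {
    "Hotel A has a heated indoor pool.":       [0.9, 0.1, 0.1],
    "Hotel B is next to the train station.":   [0.1, 0.9, 0.1],
    "Hotel C offers a kids' club and a pool.": [0.8, 0.2, 0.3],
}

def retrieve(query_vec, k=2):
    """kNN by cosine similarity (a real vector DB indexes this search)."""
    ranked = sorted(doc_store,
                    key=lambda d: cosine(query_vec, doc_store[d]),
                    reverse=True)
    return ranked[:k]

query_vec = [0.85, 0.15, 0.2]  # toy embedding of "Which hotels have a pool?"
context = retrieve(query_vec)
prompt = "Answer using only this context:\n" + "\n".join(context)
```

The retrieved docs are pasted into the prompt as context, which is where the "augmented" in RAG comes from.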
RETRIEVAL AUGMENTED GENERATION
• Use cases:
• Search
• Recommendation
• In context QA
• …
RETRIEVAL AUGMENTED GENERATION
• Pros
  • Reduces hallucinations dramatically!
  • Augments the LLM with new and external knowledge (like organizational knowledge)
  • Can reduce cost (embeddings are cheap and computed one time)
  • Leverages LLM strength for generation
  • Allows referencing sources
• Cons
  • Complexity:
    • preprocessing the data
    • vector DB ops
  • Cost: vector DB costs
AUGMENTING LLMS - TOOLS
• External APIs meant to augment LLMs, such as:
  • Search
  • Calculator
  • DB query
  • Bash / Python interpreter
  • Other AI models (HuggingGPT)
  • Humans!
  • …
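A sketch of the tool loop for the calculator case: the model emits a tool name and its input, the application executes the tool, and the result is fed back into the next prompt. The request format here is an assumption for illustration, not any specific framework's API:

```python
import ast
import operator

def calculator(expression):
    """Tiny safe calculator supporting + - * / on numbers."""
    ops = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in ops:
            return ops[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expression, mode="eval"))

TOOLS = {"calculator": calculator}

# Pretend the model asked for this tool call (format is illustrative):
model_request = {"tool": "calculator", "input": "17 * 23"}
result = TOOLS[model_request["tool"]](model_request["input"])
# `result` would now be fed back to the model as an observation
```

LLMs are weak at arithmetic, so delegating the computation and returning only the result is exactly the kind of augmentation this slide lists.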