transcription course 2


Video 1:

In this lesson, you'll get an overview of the various Mistral models. At Mistral, we aspire to build
the best foundation models. Currently, we offer six models for all use cases and business
needs. Let's dive in. We offer two open-source models whose weights you can download and use anywhere without restrictions. Mistral 7B is our first model, released last September. It outperforms LLaMA models of similar and even greater sizes. You can see in the chart that the orange bar represents Mistral 7B and the other colored bars represent various LLaMA models. On the horizontal axis, labels like knowledge, reasoning, and code are aggregates of standard evaluation benchmarks. We measure performance on a variety of tasks. For example, the reasoning benchmark includes HellaSwag, ARC Easy, ARC Challenge, and other tasks. The coding benchmark includes HumanEval and MBPP. Mistral 7B fits on one GPU, perfect for you to get started experimenting with. Mixtral 8x7B is a sparse mixture-of-experts
model we released last December. At a high level, imagine you have 8 experts to help you process your text. Instead of using all the experts, we choose only the top experts to help. In more detail, the foundation of our model is a transformer block, which is composed of two layers: a feedforward layer and a multi-head attention layer. Each input token goes through the same layers. How can we add capacity to the model? We duplicate the feedforward layer n times. But now how do we decide which input token goes to which feedforward layers? We use a router to map each token to the top-k feedforward layers and ignore the rest. As a result, even though Mixtral has 46.7 billion total parameters, it only uses 12.9 billion parameters per token, providing great performance with fast inference.
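To make the routing idea concrete, here is a minimal sketch of top-k expert routing in plain Python/NumPy. It only illustrates the concept; the expert count, k value, and layer shapes are assumptions, not Mixtral's actual implementation.

```python
# Minimal sketch of top-k mixture-of-experts routing (illustrative only).
import numpy as np

def moe_layer(tokens, experts, router_w, k=2):
    """tokens: (n_tokens, d). experts: list of callables d -> d. router_w: (d, n_experts)."""
    logits = tokens @ router_w                      # router scores per token and expert
    outputs = np.zeros_like(tokens)
    for i, tok in enumerate(tokens):
        top = np.argsort(logits[i])[-k:]            # pick the top-k experts for this token
        weights = np.exp(logits[i][top])
        weights /= weights.sum()                    # softmax over the selected experts only
        for w, e in zip(weights, top):
            outputs[i] += w * experts[e](tok)       # weighted sum of the chosen experts
    return outputs

# Toy usage: 8 feedforward "experts", each token routed to only 2 of them.
d, n_experts = 16, 8
experts = [(lambda W: (lambda x: x @ W))(np.random.randn(d, d)) for _ in range(n_experts)]
router_w = np.random.randn(d, n_experts)
print(moe_layer(np.random.randn(4, d), experts, router_w).shape)  # (4, 16)
```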
It outperforms LLaMA 2 70B on most benchmarks with eight times faster inference, and it matches or
outperforms GPT-3.5 on most standard benchmarks. And we're excited to be contributing to
some of the amazing models that are available to the AI community. And they're under the open
source Apache 2.0 license, which means you can download the model weights of both models, fine-tune and customize them for your own use cases, and use them without any restrictions. For example, you can create your own AI applications for commercial use. We are committed to open models and will release more open models in the future. We also offer four optimized enterprise-grade models. Mistral Small is best for low-latency use cases. Mistral Medium is suitable for your
language-based tasks that may only require moderate reasoning, such as data extraction,
summarization, and email writing. Mistral Large is our flagship model for your most sophisticated needs with advanced reasoning capabilities. Mistral Large approaches the performance of GPT-4. It has native multilingual capabilities: it strongly outperforms LLaMA 2 70B on common-sense and reasoning benchmarks in French, German, Spanish, and Italian. It offers a 32K-token context window. In fact, all of our served models have a 32K context window length. It excels at instruction following and is natively capable of function calling. You can use function calling with both Mistral Small and Mistral Large. By the way, if you're not familiar with function calling, we'll go through how
to do function calling in this course. And finally, we have an embedding model, which offers the
state-of-the-art embeddings for text and can be used for many use cases like clustering and
classification. Many customers leverage our model for a wide variety of applications across
industries, such as banking, telecom, media, logistics, and fintech, among others. They have deployed our models for tasks like RAG, content synthesis, code generation, insights generation,
and more. Throughout this course, you will learn how you can use Mistral models on use cases
similar to these. To get started using our models right away, you can use Le Chat, our chat interface, to interact directly with our models. Let's go to chat.mistral.ai. We can ask in the chat to write a Python function to sort a given string. Now we get a nice-looking Python function. Please note that Le Chat is currently free to use, so you just need to sign up and you can use it. If you are interested in running our open-source models locally, you can use Transformers, llama.cpp, or Ollama. Note that a lot of people use a quantized version of our models in order to load the model into memory. Keep in mind that model quantization may harm model
performance, so don't be surprised if the quantized model doesn't perform as well as expected.
Even though it is fun to download Mistral models on your own machine, please keep in mind
that for faster inference, more reliable performance, and to scale, we do recommend using a
hosted service such as La Plateforme, where you can access not only the open-source models but also the enterprise models with a simple API call. To access our models through La Plateforme,
you can create your own API keys. To set up your API keys, go to console.mistral.ai, create your
own account, set up your billing information, and click on API keys to create new API keys. We
offer pricing that is cheaper than some of the other hosting platforms. This course will focus on using our API endpoints, and we'll dive into many use cases using our API in the rest of the class.
For the rest of the class, you don't actually need to create an account. We'll provide a Mistral
API key for you to use in the class. In the next lesson, you will learn how to use Mistral API and
learn some of the prompting techniques. Let's get started with the next lesson.

Video 2:
In this lesson, you will learn how to access and prompt Mistral models via API calls, and perform
various tasks like classification, information extraction, personalization, and summarization.
Let's start coding. In the classroom, the libraries are already installed for you, but if you're
running this on your own machine, you will want to install the following: pip install mistralai.
Let's just comment this out, because we don't need to run this in this session. We have a helper
function here to help you load the Mistral API key, and another helper function here for you to
load the Mistral models, so you can get started running Mistral API easily. Okay, ask the model,
hello, what can you do? Feel free to pause the video and change the prompt however you like.
At the end of this lesson, I will walk you through the code in the helper function, so that you can see how the API calls work and use the API outside of this classroom environment. First, let's
take a look at how you can use our models to classify bank customer inquiries. In this prompt,
you are a bank customer service bot. Your task is to assess customer intent and categorize
customer inquiry. We have a list of predefined categories. If the text doesn't fit in any of the
categories, classify it as customer service. Then you can see here we're providing some
examples for the model to know exactly what we're expecting. Okay, if we want to make sure
our prompt doesn't have any spelling or grammar errors, we can ask the model to correct the spelling and grammar first. Let's run this, and then let's print the response. We can see that it made
some grammar corrections. For example, customer inquiry is now the customer inquiry. Now we
can use this corrected prompt and replace the inquiry with the actual inquiry. I am inquiring
about the availability of your cards in the EU. And then let's run this cell, and we get country
support, which is what we expect. Now let's run another inquiry. What is the weather today?
Because this is not in any of the predefined categories, the model correctly categorizes it as
customer service. Now let's come back to take a closer look at this prompt and see what kind of
prompt technique we used. First of all, we use role playing to provide our model a role, which is
a bank customer service bot. This gives the model a persona and context. Second, we used few-shot learning, where we give a few examples in the prompt. Few-shot learning can often improve model performance, especially when the task is difficult or when we want the model to respond in a specific manner. Third, we use delimiters like hashes or angle brackets to mark the boundaries between different sections of the text. In our example, we use triple hashes to indicate examples and angle brackets to indicate the customer inquiry. And finally, in the case where a model is verbose, we can add "do not provide explanations or notes" to make sure the output is concise. If you're wondering which delimiter to use, it doesn't matter; just choose whichever you prefer.
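As a rough illustration of these techniques (role, few-shot examples, and delimiters), here is what such a prompt might look like in Python. The category list and the mistral(...) helper are assumptions based on this lesson's setup, not the exact classroom code.

```python
# Illustrative few-shot classification prompt (not the exact course prompt).
# Assumes a `mistral(prompt)` helper like the one used in this lesson.
prompt = """
You are a bank customer service bot. Your task is to assess customer intent
and categorize the customer inquiry after <<<>>> into one of these categories:
card arrival, change pin, exchange rate, country support, customer service.
If the text doesn't fit any category, classify it as customer service.
Do not provide explanations or notes.

###
Inquiry: How do I know if I will get my card, or if it is lost?
Category: card arrival
###
Inquiry: Can I get a refund for the exchange rate fee?
Category: exchange rate
###

<<<
Inquiry: I am inquiring about the availability of your cards in the EU.
>>>
Category:
"""
# response = mistral(prompt)   # expected output: "country support"
```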
Next, I would like to show you an example of information extraction. We have seen many cases where information extraction can be useful. In this example, let's say you have
some medical notes and you would like to extract some information from this text. In this
prompt, we provide the medical notes and ask the model to return JSON format with the
following JSON schema, where we define what we want to extract, the type of this variable and
the list of output options. For example, for diagnosis, the model should output one of these four
options. Let's run this one. When we run the model, we get exactly the format we defined. Here in the mistral helper function, we set is_json to True to enable JSON mode. We'll go through the Python API calls at the end of the lesson. Let's take a look at this prompt again. One strategy we use here is that we explicitly ask in the prompt to return JSON format; it's important to ask for JSON when we enable JSON mode. Another strategy we use here is that we define a JSON schema. We use this JSON schema in the prompt to ensure the consistency and structure of the JSON output. Note that even if we don't set is_json to True, the output may still be in JSON format, but we recommend enabling JSON mode to get reliable JSON output.
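To make the two strategies concrete, here is an illustrative extraction prompt that embeds a JSON schema and is sent with JSON mode enabled. The schema fields, the example note, and the mistral(..., is_json=True) helper signature are assumptions based on this lesson, not the exact classroom code.

```python
import json

medical_notes = "A 60-year-old male patient presented with ..."  # example note, truncated

# Hypothetical schema: field names and allowed values are illustrative.
schema = {
    "age": {"type": "integer"},
    "gender": {"type": "string", "enum": ["male", "female", "other"]},
    "diagnosis": {"type": "string", "enum": ["migraine", "diabetes", "arthritis", "acne"]},
}

prompt = (
    "Extract information from the following medical notes:\n"
    f"{medical_notes}\n\n"
    "Return a JSON object that follows this JSON schema exactly:\n"
    f"{json.dumps(schema, indent=2)}"
)
# response = mistral(prompt, is_json=True)   # JSON mode: the helper sets the response format
# data = json.loads(response)
```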
Next, let's take a look at how our models can create personalized email responses to address customer questions, because large language models
are really good at personalization tasks. Here's an e-mail where the customer, Anna, is asking
the mortgage lender about the mortgage rate. And here is our prompt. You are a mortgage
lender customer service bot, and your task is to create personalized e-mail responses to
address customer questions. Answer the customer's inquiry using the provided facts below. And
then we have some numbers about the interest rates in the prompts. And similar to what we
have seen before, we use the string format to add the actual e-mail content to this e-mail
variable here. Let's run this cell, and let's run the Mistral model. As you can
see, we get a personalized email to Anna answering her questions based on the facts provided.
With this kind of prompt, you can imagine that you can easily create your own customer service
bot to answer questions about your product. It's important to use clear and concise language
when presenting these facts or your product information. This can help the model to provide
accurate and quick responses to customer queries. Finally, we have summarization.
Summarization is a common task for large language models, and our models can do a really good
job at summarization. Let's say you want to summarize this newsletter from The Batch. And here's the prompt I tried: you are a commentator; your task is to write a report on a newsletter. When presented with the newsletter, come up with interesting questions to ask, and answer each question. Afterward,
combine all the information and write a report in the markdown format. Then I have a section to
insert the content of the newsletter and a section for instructions. First, to summarize key points.
Second is to generate three distinct and thought-provoking questions. Third is to write an
analysis report. Let's run our Mistral model. And this is exactly what we asked for. We get a
summary, we get interesting questions, and we get the analysis report. Of course, you can
always ask the model to summarize the newsletter without these instructions. If you have a
complex task, providing step-by-step instructions usually helps the model to use a series of
intermediate reasoning steps to solve complex tasks. In our example, using these steps might
help the model think in each step and generate a more comprehensive report. One interesting
strategy here is that we ask the model to automatically guide the reasoning and understanding
process by generating examples with explanations and steps. Another strategy that's used often
is that you can ask the model to output in a certain format, for example, using a markdown
format. So that's all the prompts I want to show you in this lesson. In this lesson, we used a
helper function to help us load the Mistral model. Here's how the API call works. We first need to
define the Mistral client. You will want to replace this API key variable with your own API key if
you're running Mistral models outside the classroom environment. We also need to define the
chat messages. A chat message can start with a user message, a system message, or an assistant message. A system message usually sets the behavior and context for the AI assistant, but it's optional. You can have both a system message and a user message, or just a user message; experiment and see which kind of messages produces better results. In this lesson, we'll just have everything in the user message. Then we define how we can get the model response, where we need to define the model and the messages. If we enable JSON mode, we need to add a line here, response_format with type json_object, to specify that we want the response format as JSON. There are several other optional arguments we can change here. You can check the API specs to see all the details.
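Putting that together, a minimal version of the helper might look like the sketch below. It assumes the mistralai Python client available when this course was recorded (the MistralClient/ChatMessage interface); newer releases rename these classes, so check the current docs.

```python
import os
from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage

def mistral(user_message, model="mistral-small-latest", is_json=False):
    # Replace the environment variable with your own API key outside the classroom.
    client = MistralClient(api_key=os.environ["MISTRAL_API_KEY"])
    messages = [ChatMessage(role="user", content=user_message)]

    if is_json:
        # JSON mode: also remember to ask for JSON explicitly in the prompt itself.
        response = client.chat(
            model=model,
            messages=messages,
            response_format={"type": "json_object"},
        )
    else:
        response = client.chat(model=model, messages=messages)

    return response.choices[0].message.content

print(mistral("Hello, what can you do?"))
```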
Okay, so that's it for this lesson. We learned how to prompt Mistral models to do various tasks. In the next lesson, we'll take a look at how to choose which Mistral model to use for which use case. See you in the next
lesson.

Video 3:
In this lesson, you will learn how to select the appropriate Mistral models depending on your use
case. Let's take a look. Mistral AI provides five API endpoints featuring five leading language
models: Mistral 7B, Mixtral 8x7B, Mistral Small, Mistral Medium, and Mistral Large. Looking at model performance, such as on MMLU (Massive Multitask Language Understanding), Mistral
Large performs the best. In fact, Mistral Large outperforms all the other models across various
benchmarks including reasoning, multilingual tasks, mathematics, and coding. However,
performance might not be the only consideration here. For your applications, you might also
want to consider pricing. We offer competitive pricing on our models and it's worth considering
the performance pricing trade-offs. Mistral models are behind many large language model
applications at scale. Here's a brief overview on the types of use cases we see along with their
respective Mistral model. Simple tasks that one can do in bulk, such as classification, customer
support, or text generation are powered by Mistral Small. Intermediate tasks that require
moderate reasoning like data extraction, summarization, writing emails, and so on are powered
by Mistral Medium. Complex tasks that require advanced reasoning capabilities or are highly
specialized, like synthetic text generation, code generation, RAG, or agents, are powered by
Mistral Large. Let's take a look at some examples. In this notebook, let's first use our helper
function to load the API key. You can replace this with your own API key if you're running this
outside of the class environment. Let's define a Mistral function to call the Mistral Python API
easily. We have seen this code in the previous lesson.
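Since the helper takes a model name, switching tiers is just a matter of changing that argument. The model identifiers below are the ones used around the time of this course and may have changed since; the prompts are placeholders.

```python
# Same helper, different model tiers (identifiers may have changed since recording).
reply_small = mistral("Classify this email as spam or not spam: ...", model="mistral-small-latest")
reply_medium = mistral("Compose a welcome email for a new customer ...", model="mistral-medium-latest")
reply_large = mistral("Given this data, compute the difference in payment dates ...", model="mistral-large-latest")
```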
For simple tasks like classification, we can use a smaller model like Mistral Small. For example, let's classify an email as spam or not spam. Let's run our Mistral model. Mistral Small was able to classify the email correctly as spam. In fact, all of our models can give good results here. Using Mistral Small is more cost-effective and it's faster, so we recommend using Mistral Small for simple tasks. Mistral Medium is great for intermediate tasks
that require language transformation. For example, we can ask the model to compose an email
for new customers who have just made their first purchase with our product. Make sure you
have the order details in the prompt. Let's run Mistral Medium and take a look at the response. Now we get a nice-looking email addressed to customer Anna with her order details. Mistral Large is great for complex tasks that require advanced reasoning capabilities or that are highly specialized. In this example, let's ask Mistral Large to calculate the difference in payment dates
between the two customers whose payment amounts are close to each other in a given data
set. Let's try Mistral Small first. Here you can see Mistral Small gives the incorrect final answer, but since our model results are probabilistic, if you run this multiple times, it might sometimes give you the correct result. Okay, now let's run Mistral Large and print the model response. As you can see, Mistral Large can break this question down into multiple steps and is able to give us the right answer. Let's try another fun example: given the purchase details, how much did I spend on each category? Restaurants, groceries, stuffed animals, and props. And then we have the transaction details here. Let's first run Mistral Small. You can see there are some mistakes here. For example, the World Food Wraps purchase should belong to restaurants. Now let's try Mistral Large. It happens to categorize World Food Wraps correctly as a restaurant and gives
us the correct answer for each category. In the next example, let's say you're about to meet a
really important person named Andrew and you're hoping to make a good impression with him.
But you only have 20 minutes to chat. When you see him, he asks you: by the way, how do I find two numbers that add up to a third number? How can Mistral help you make a good impression? Let's take a look at this coding task. Mistral Large is the top performer on coding tasks, so let's give it a try. Great, so now let's run this function, and let's see if the code passes these tests. Let's copy and paste here. Great, so it looks like our function works, and Andrew will be very happy that you gave the right answer.
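The transcript doesn't show the generated code, but a typical solution to "find two numbers in a list that add up to a target" looks like the sketch below; treat it as an illustrative answer rather than the model's exact output.

```python
def two_sum(nums, target):
    # Return indices of two numbers that add up to target, or None if no pair exists.
    seen = {}  # value -> index
    for i, n in enumerate(nums):
        if target - n in seen:
            return seen[target - n], i
        seen[n] = i
    return None

# Quick tests, like the ones run in the lesson.
assert two_sum([2, 7, 11, 15], 9) == (0, 1)
assert two_sum([3, 2, 4], 6) == (1, 2)
assert two_sum([1, 2], 10) is None
```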
In addition, Mistral Large has been specifically trained to understand and generate text in multiple languages, especially in French, German, Spanish, and Italian. Here is
an example asking in French: which is heavier, a pound of iron or a kilogram of feathers? Let's run Mistral Large. I don't understand French, but I hope it answers correctly that a kilogram of feathers is heavier. Okay, so far we have seen many use cases with our models, but we have not talked about external tools. Connecting our models to external tools can help us build applications that are even more powerful. In the next lesson, you will learn how to use function calling to connect Mistral models to tools. See you in the next lesson.

Video 4:
In this lesson, you'll get to implement function calling with Mistral. Function calling allows Mistral
models to connect to external tools, making it easy for you to build applications catering to
specific use cases and practical problems. Let's check it out. At a high level, there are four steps
with function calling. Step one is for users to define tools and user query. A tool can be a
user-defined function or an external API. For example, users can write two functions to extract
the payment status and payment date information. And then when a user asks a question,
what's the status of my payment? This question is directly related to the payment status tool,
and we should use this tool to address the question. Step two is for our model to generate
function arguments when applicable. Based on the tools and the user query, our model is able
to determine that the function we should use is payment status and the function argument is
transaction ID equals T1001. Step three is for users to execute the function to obtain tool results. We simply pass the function arguments directly to the function, and we get the
result from the function, which is paid in this example. Step four is for a model to generate a
final answer. According to the information available, your transaction ID T1001 has been paid. Is
there anything else I can assist you with? Okay, now hopefully you have a good understanding
of how function calling works with Mistral. Let's take a look at the code. Let's import pandas as
pd and assume we have the following data. We have this data frame of transaction ID and the
payment information, and we would like to ask questions about this data frame. So how can we
ask questions about this data without function calling? We could simply pass this data in the prompt, and then ask the question: given the following data, what is the payment status for transaction ID T1001? Again, let's load our mistral function and run the Mistral model. Okay,
we can see the answer, the payment status for the transaction with the ID T1001 is paid, and
the result tells us how it was able to get the answer from the data. As you can see, the Mistral model is
pretty good if you give it the right prompts. But in normal use cases, you may have a pretty large
database of transactions. If you feed your entire database into a large language model, this may exceed the context window. A more efficient, affordable, and reliable way to do this is to use function calling to run code that performs this kind of search. You just need to prompt the large language model so it knows
when to call each function. Let's take a look at how we do function calling step-by-step. Users
first need to define all the necessary tools for their use cases. For example, here we have a
function, retrieve payment status. If we define the function argument transaction ID is T1001, we
get the status paid from this function. Let's try another function, retrieve payment date. And we
can retrieve the payment date information based on a transaction ID. And then we get the date.
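For reference, the two tools are ordinary Python functions over the data frame, roughly like the sketch below. The toy data and the JSON-string return format mirror the lesson loosely; the details are illustrative.

```python
import json
import pandas as pd

# Toy transaction data, similar in shape to the data frame used in this lesson.
df = pd.DataFrame({
    "transaction_id": ["T1001", "T1002", "T1003"],
    "payment_amount": [125.50, 89.99, 120.00],
    "payment_date": ["2021-10-05", "2021-10-06", "2021-10-07"],
    "payment_status": ["Paid", "Unpaid", "Paid"],
})

def retrieve_payment_status(df, transaction_id):
    # Look up the status for one transaction and return it as a JSON string.
    row = df[df["transaction_id"] == transaction_id]
    if row.empty:
        return json.dumps({"error": "transaction id not found"})
    return json.dumps({"status": row["payment_status"].item()})

def retrieve_payment_date(df, transaction_id):
    # Same idea, but return the payment date.
    row = df[df["transaction_id"] == transaction_id]
    if row.empty:
        return json.dumps({"error": "transaction id not found"})
    return json.dumps({"date": row["payment_date"].item()})

print(retrieve_payment_status(df, "T1001"))  # {"status": "Paid"}
```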
So how do Mistral models understand these functions? In order for Mistral models to understand the functions, you can describe the function specs with a JSON schema. For example, here is the JSON schema for the function retrieve_payment_status. Let's specify the tool type, which is a function in this case, the function name, and the function description. This will tell our model what this function does, the
parameters of the function, which includes the arguments of the function, the type of the
argument, and the description of the argument. Let's also specify the required function
argument, which is the transaction ID here. Please keep in mind the name of the function should
be exactly the name of the function we just defined. Also, please make sure you give a good
enough description of your function and a good description of the argument so that our
language model can understand which function to use and the needed argument to use.
Similarly, let's define the JSON spec for the retrieve_payment_date function. Let's combine these two JSON specs into a list called tools. Then we organize the two functions into a dictionary where the keys are the function names and the values are the functions with the data frame already bound. This allows us to call each function based on its function name. To give you an example, we can call the retrieve_payment_status function with the argument transaction_id equal to T1001 using this names_to_functions dictionary.
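Here is roughly what that looks like in code. The schema follows the tool format of the Mistral chat API at the time of the course; field names mirror the lesson, but treat the snippet as a sketch.

```python
import functools

# JSON spec describing retrieve_payment_status to the model.
tools = [
    {
        "type": "function",
        "function": {
            "name": "retrieve_payment_status",
            "description": "Get the payment status of a transaction",
            "parameters": {
                "type": "object",
                "properties": {
                    "transaction_id": {
                        "type": "string",
                        "description": "The transaction id.",
                    }
                },
                "required": ["transaction_id"],
            },
        },
    },
    # ... a second entry for retrieve_payment_date, defined the same way ...
]

# Map function names to callables with the data frame already bound.
names_to_functions = {
    "retrieve_payment_status": functools.partial(retrieve_payment_status, df=df),
    "retrieve_payment_date": functools.partial(retrieve_payment_date, df=df),
}

print(names_to_functions["retrieve_payment_status"](transaction_id="T1001"))  # {"status": "Paid"}
```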
Okay, next we need to define a user query. Suppose a user asks the following question: what's the status of my transaction? A standalone large language model will not be able to answer this question, as it doesn't have access to the necessary data or the business logic. But what if we have the exact tool we can use to answer this question? We could potentially provide an answer. So let's see how function calling works in the next step. Step two is for our model to generate function arguments. So how do Mistral models know about these functions, and how do they know which function to use? We provide both the user query and the tool specs to the Mistral models. You can have the language model automatically choose whether or not to use a tool with tool_choice set to auto, you can set tool_choice to any to force tool use, or, if you don't want to use any tool, you can set tool_choice to none to prevent tool use.
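A sketch of that call, assuming the client version used at course time (newer client releases rename these methods, so check the docs):

```python
import os
from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage

client = MistralClient(api_key=os.environ["MISTRAL_API_KEY"])
chat_history = [ChatMessage(role="user", content="What's the status of my transaction?")]

response = client.chat(
    model="mistral-large-latest",
    messages=chat_history,
    tools=tools,            # the JSON tool specs defined above
    tool_choice="auto",     # "any" forces a tool call; "none" prevents tool use
)
print(response.choices[0].message.content)
```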
Okay, let's run this code. Let's see the content of the message: to check the status of your
transaction, I need the transaction ID. Could you please provide it? As you can see, our model
was able to identify if there is any essential information missing for the function, and it will ask
for this essential information. Now let's save the history of the conversation we have in the chat
history and add a user message. My transaction ID is T1001. Now in the chat message, we
have the original user question. What's the status of my transaction ID? We have the assistant
message asking us to provide a transaction ID, and then we have a user message telling the
model my transaction ID is T1001. Let's run our model again with this chat history. Let's take a
look at the response. Now the content of the assistant message is actually empty, but it was
able to return the information in tool calls, telling us the function name is the retrieve payment
status and the arguments we should use for this function. Let's again append the model
response to the chat history. Step three is for the user to execute the function to obtain tool results. Currently, it's the user's
responsibility to execute these functions, and the function execution lies on the user side. Let's
extract some of the function information from the model response. We get the function name
and the function arguments. Now, we can get the function results based on the function name
and the function arguments. We get the result: the status is paid. Let's save this tool message using the ChatMessage method, with the role as tool, the name as the function name, and the content as the function result, and we append it to the chat history.
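In code, step three looks roughly like this (again assuming the course-era client objects and the tools and history defined above):

```python
import json

# The assistant message carries tool_calls instead of text content; keep it in the history.
assistant_message = response.choices[0].message
chat_history.append(assistant_message)

tool_call = assistant_message.tool_calls[0]
function_name = tool_call.function.name                   # "retrieve_payment_status"
function_args = json.loads(tool_call.function.arguments)  # {"transaction_id": "T1001"}

# Execute the function on the user side and record the result as a "tool" message.
function_result = names_to_functions[function_name](**function_args)
chat_history.append(ChatMessage(role="tool", name=function_name, content=function_result))
```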
And let's take a look at the chat history right now. Here, as you can see, we have the assistant message telling
us the result of function tool calls with the function information, and we have the tool results
returning the results of the function. Now, step four is for our model to generate a final answer.
Let's just give our model the entire chat history, and it will return a personalized answer, the
status of your transaction: T1001 is paid. Is there anything else you need help with? So those are the four steps for function calling. Mistral models also support parallel function calling. Feel free to pause the video and change the prompt to see how you can call both functions at the same time. For example, in your user message, if you ask for both the status and the date of transaction ID T1001, let's see how the model responds. Let's run this cell. As you can see here, our
model actually responded with two function calls. One is the retrieve payment status, another
one is the retrieve payment date. Now you can execute both functions to get the desired output.
That's it for function calling. In the next lesson, we'll learn about RAG and how you can use RAG as
a tool for function calling. See you in the next lesson.

Video 5:
In this lesson, you'll practice implementing RAG with Mistral models. Retrieval Augmented
Generation is an AI framework that combines the capabilities of large language models and information retrieval systems. It's useful for answering questions or generating content by leveraging external knowledge. Let's check it out. So why do we need RAG? Large language models can face a lot of challenges. For example, they don't have access to your internal documents, they don't have the most up-to-date information, and they can hallucinate. One of the potential solutions for
these problems is RAG. At a high level, here's how RAG works. When users ask a question
about an internal document or a knowledge base, we retrieve relevant information from the
knowledge base, where all the text embeddings are stored in a vector store. This step is called
retrieval. Then in the prompt, we include both the user query and the relevant information so
that our model can generate output based on the relevant context. This second step is called
generation. In this lesson, let's take a look at how we can do RAG from scratch. Let's first get an article from The Batch. This is the link of the article we're interested in, and we use an HTML parser called Beautiful Soup to find the main text of the article. Next, let's split this document into chunks. It's crucial to do so in a RAG system to be able to more effectively identify and retrieve the most relevant pieces of information. In this example, we simply split our text by character, combining 512 characters into each chunk, and we get 8 chunks. Depending on your specific use cases, it may be necessary to customize or experiment with different chunk sizes. Also, there are various options in terms of how you split the text. You could split by tokens, sentences, HTML headers, and others, depending on your application.
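A minimal sketch of the fetch-and-chunk step (the URL is a placeholder and the text extraction is deliberately crude; the 512-character split mirrors the lesson):

```python
import requests
from bs4 import BeautifulSoup

# Placeholder URL: substitute the actual article from The Batch used in the lesson.
url = "https://www.deeplearning.ai/the-batch/..."
html = requests.get(url).text
text = BeautifulSoup(html, "html.parser").get_text()  # crude: grabs all page text

# Naive character-based chunking, 512 characters per chunk, as in the lesson.
chunk_size = 512
chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
print(len(chunks))
```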
Now, with these eight text chunks, let's create embeddings for each of them. Again, we use a helper function to load
our API key. You can replace this with your own API key outside of the course environment. We
define this get_text_embedding function using the Mistral embeddings API endpoint to get the embedding for a single text chunk. Then we use a list comprehension to get text embeddings for all the text chunks. Let's take a look at how it looks. These resulting text embeddings
are numerical vectors representing the text in the vector space. If we take a look at the length of
the first embedding vector, it returns 1024, which means that our embedding dimension is 1024.
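A sketch of that helper, assuming the course-era mistralai client and the mistral-embed model; it reuses the client created earlier.

```python
import numpy as np

def get_text_embedding(txt):
    # One API call per chunk; fine for 8 chunks, batch the inputs for larger corpora.
    response = client.embeddings(model="mistral-embed", input=txt)
    return response.data[0].embedding

text_embeddings = np.array([get_text_embedding(chunk) for chunk in chunks])
print(text_embeddings.shape)  # (8, 1024) -> the embedding dimension is 1024
```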
Once we get the embeddings, a common practice is to store them in a vector database for
efficient processing and retrieval. There are several vector databases to choose from. In our simple example, we use an open-source vector database, Faiss. With Faiss, we define an instance of the index class with the embedding dimension as the argument. We then add the text embeddings to this indexing structure. When users ask a question, we also need to create embeddings for the question using the same embedding model as before. Here we get the question embedding. Now we can retrieve the text chunks from the vector database that are similar to the question we asked. We can perform a search on the vector database with index.search. This function returns the distances and the indices of the k most similar vectors to the question vector in the vector database. Then, based on the returned indices, we can retrieve the actual relevant text chunks that correspond to those indices. As you can see here, we get two text chunks, because we defined k equal to two to retrieve the two most similar vectors in the vector database.
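A sketch of the indexing and retrieval steps with Faiss, following the lesson's flow (the question string is just an illustrative placeholder):

```python
import faiss
import numpy as np

# Build a flat L2 index sized to the embedding dimension and add the chunk embeddings.
d = text_embeddings.shape[1]            # 1024
index = faiss.IndexFlatL2(d)
index.add(text_embeddings.astype("float32"))

# Embed the user question with the same model, then fetch the k=2 closest chunks.
question = "What is this article about?"
question_embedding = np.array([get_text_embedding(question)], dtype="float32")
distances, indices = index.search(question_embedding, 2)
retrieved_chunks = [chunks[i] for i in indices[0]]
```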
Note that there are a lot of different retrieval strategies. In our example, we used a simple similarity search with embeddings. Depending on your use case, sometimes you
might want to perform metadata filtering first or provide weights to the retrieved documents or
even retrieve a larger parent chunk that the originally retrieved chunks belong to. Finally, we can offer the retrieved text chunks as context information within the prompt. Here's a prompt template where you can include both the retrieved text chunks and the user question in the prompt. Let's again use the mistral helper function we have seen before. With the prompt, we get a response.
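The generation step is just a prompt template around the retrieved chunks; a minimal version (the exact wording of the template is illustrative):

```python
prompt = f"""
Context information is below.
---------------------
{' '.join(retrieved_chunks)}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {question}
Answer:
"""
response = mistral(prompt)
```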
So this is how RAG works from scratch. Feel free to use another article from The Batch, or combine multiple articles and ask questions about them. Also, we just went through a very basic RAG workflow. If you're interested in more advanced RAG strategies, there are several other courses you can learn from. If you're developing a complex application where RAG is one of the tools you can call, or if you have multiple RAG pipelines as multiple tools you can call, then you may consider using RAG as a tool in function calling. Let's take a look at a simple
example here. Let's wrap the RAG logic we defined above in a function; we call it qa_with_context. Now we organize this function into a dictionary called names_to_functions, as we have seen in the previous lesson. This might not look that useful with just one function, but if you have multiple tools, it is very useful to organize them into one dictionary. Now we can align the function spec with the JSON schema to tell our model what this function is about, where the function name is qa_with_context and the required argument is the user question. Now we pass in the user question and the tool to the model. We get the tool call result with the function name as qa_with_context and the argument as our user question. Let's extract the function information from the model response; we get the function name and the function arguments. Then we execute the function to get the function result.
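A sketch of wrapping the RAG steps above into a single callable tool; it reuses the helpers defined earlier in this lesson, and the function and variable names follow the lesson only loosely.

```python
import functools

def qa_with_context(text, question, chunk_size=512):
    # 1. Chunk the document and embed every chunk.
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    embeddings = np.array([get_text_embedding(c) for c in chunks], dtype="float32")

    # 2. Index the chunks and retrieve the two most similar to the question.
    index = faiss.IndexFlatL2(embeddings.shape[1])
    index.add(embeddings)
    q_emb = np.array([get_text_embedding(question)], dtype="float32")
    _, idx = index.search(q_emb, 2)
    context = " ".join(chunks[i] for i in idx[0])

    # 3. Generate the answer from the retrieved context.
    return mistral(f"Context:\n{context}\n\nAnswer the question: {question}")

names_to_functions = {"qa_with_context": functools.partial(qa_with_context, text=text)}
```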
As an exercise, please feel free to write another RAG function that asks questions about another article from The Batch, and provide both of them as tools to
our Mistral model. Just as an exercise, what if we change the user query to write a Python
function to sort the letters in a string? What would happen? It shouldn't use our qa_with_context tool, right, because this question has nothing to do with that tool. So why did this happen? It's because we set tool_choice to any, which forces tool use. Now we change it to auto, which means the model decides whether to use a tool or not. But it still uses the qa_with_context tool. Okay, so maybe this is because our description of the tool is too general. We need to make the tool description more specific. Let's add some details to this description: you answer user questions about AI by retrieving relevant context. Let's run this. It doesn't work. Let's change our description to: you answer user questions about an AI article by retrieving relevant context about the article. So now the description is more specific. When we run this again, as we can see, it returns the Python function we asked for in the content and does not return tool calls. This is exactly what we needed. Of course, if you know that this question is not supposed to use a tool, we can set tool_choice to none here. It will guarantee that we're not going to call any tools or functions. Now, let's try any again. Remember that any forces a function call, and now you can see that even though we changed the function description, it still makes the function call, because our tool_choice is any, which forces function calling. Okay, so the default behavior is auto, and I recommend you use auto for tool_choice. Just a side note: you can also use Mistral to do RAG with other tools like LangChain, LlamaIndex, and Haystack. Check out our documentation to see how it works. In the next lesson, we'll learn how to create simple UIs for the Mistral models with Panel. See you
in the next lesson.

Video 6:
In this lesson, you'll create a simple chat interface to interact with the Mistral models and chat
with an external document. Let's try it out. In this lab, let's create two simple chat interfaces for
interacting with the models and chatting with a document. The first one is a basic chat UI where
you can send a message. For example, write an email to schedule an appointment with my CS
professor to discuss research opportunities. By the way, right now I just want to show you the
end result. We'll walk you through the code in detail in a little bit. Okay, now you can see this
detailed email to a professor requesting an appointment to discuss research opportunities. The
second chatbot is one where we can upload a document, for example The Batch newsletter, and ask questions about this document: what are the ways that AI can reduce emissions in agriculture? Based on The Batch newsletter, we get a good output. In this lesson, we'll take a look at how we can create these two chat interfaces so that you can create your own chat apps with Mistral models. Let's get started. Let's first import the needed packages. The new package we
have here is panel, which is an open-source data exploration and web app framework. We
import panel as pn. We need to run this line, pn.extension, to load customized JavaScript and CSS extensions for Panel applications. Here, we have the mistral function we have seen before. Note that we have these two additional arguments, user and chat_interface, that are not used in the function body, but we need to include them here for the UI. Now let's simply run these four lines of
code and we get this chat interface. We can interact with the model. Pretty cool, right? Okay,
let's take a closer look at the code. We define a chat interface widget with pn.chat.ChatInterface. This widget handles all the UI and the logic of the chatbot. Then we define how the system responds in the callback function, which is the mistral function we defined above. Then we can define the callback user name as Mistral to indicate responses from the Mistral model, right here. As you can see, with only a few lines of code you can get a similar chat interface to interact with our model.
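Here is a minimal sketch of that basic chat UI. It assumes the callback signature with the extra user and chat_interface arguments described above and Panel's ChatInterface widget; exact argument names may differ across Panel versions.

```python
import panel as pn

pn.extension()

def mistral_callback(contents, user, chat_interface):
    # `user` and `chat_interface` are required by the widget but unused here.
    return mistral(contents)   # the helper defined earlier in the course

chat_interface = pn.chat.ChatInterface(
    callback=mistral_callback,
    callback_user="Mistral",   # label shown next to the model's responses
)
chat_interface.servable()
```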
Next, let's see how we can create a chat interface to chat with an external document. We have seen this code before: we get a newsletter from The Batch and save the text of the newsletter as the batch.txt file. Okay, now let's copy and paste some RAG code, just for a short recap. First of all, in the prompt we have the context information, which is the retrieved text chunks based on the user query. Then we have the user query, and we're trying to get an answer. We have a function to get the text embeddings using Mistral's embedding model, and another function to run the Mistral model. We recommend using mistral-large for the best performance on RAG tasks. The main function is this answer_question function. We start with the text we get from the file we upload; we'll talk about this file input widget in a little bit. We split the document into chunks, load the text chunks into a vector database, create embeddings for the question, retrieve
similar chunks from the vector database, and finally generate a response based on the retrieved relevant text chunks. And here is the code for the chat interface. In this chat interface, we first
need to define a file input widget for us to upload files. Let's take a look at this widget, file input,
where we can choose a file and upload a file, like that. Then we need to define the chat
interface widget, where we define how the system responds in the callback function, which is
the answer_question function we just defined. We can then arrange the file input widget in the
header. As you can see here, the first thing we see is the file input widget with some text
descriptions. We can optionally start the chat interface with a system message so that users
know what to expect from this chatbot. As you can see, here is the system message we defined.
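A rough sketch of the document-chat UI: the callback signature and the widget arrangement follow the lesson loosely (the lesson places the file input in the header), and it reuses the RAG helper sketched in the previous lesson; details may vary with Panel versions.

```python
file_input = pn.widgets.FileInput(accept=".txt")

def answer_question(contents, user, chat_interface):
    if file_input.value is None:
        return "Please upload a text file first."
    text = file_input.value.decode("utf-8")
    # Reuse the RAG helper from the previous lesson over the uploaded document.
    return qa_with_context(text=text, question=contents)

chat_ui = pn.chat.ChatInterface(callback=answer_question, callback_user="Mistral")
chat_ui.send(
    "Upload a text file and ask me questions about it!",
    user="System",
    respond=False,       # show a greeting without triggering the callback
)
pn.Column(file_input, chat_ui).servable()
```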
Now let's give it another try. We upload the batch.txt file and ask a question about this file, and we get an answer from Mistral Large. So that's the code for this simple chat UI. As an exercise, feel free to try different articles from The Batch by changing the URL here. Then try to upload a different article, ask questions about it, and see how Mistral Large responds. In the next
lesson, I will conclude the course and share some thoughts about the next steps. See you in the
next lesson.
