This is the LangChain Bot version with document access for Context Augmentation. It provides:

• dialogue memory
• dialogue logging to a file
• document access in the Prompt

CAUTION: large PDF documents tend to slow down the ChatBot and cost more tokens. A
reasonable size is up to 5 pages.
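
One way to respect this guideline is to cap the number of pages before their text is joined into the Prompt. The following is a minimal sketch, not part of the original notebook: limit_pages and MAX_PAGES are hypothetical names, and the page texts are assumed to be strings that were already extracted (for example with page.extract_text()).

```python
MAX_PAGES = 5  # assumption: mirrors the "up to 5 pages" guideline


def limit_pages(page_texts, max_pages=MAX_PAGES):
    """Join the text of at most `max_pages` pages into one string.

    `page_texts` holds one entry per page; None entries (pages with
    no extractable text) are treated as empty strings.
    """
    kept = page_texts[:max_pages]
    return "\n".join(text or "" for text in kept)


# Example with stand-in page texts instead of a real PDF
pages = [f"page {i}" for i in range(1, 8)]
limited_text = limit_pages(pages)
print(limited_text)
```

Pages beyond the limit are simply dropped, so the most important content should come first in the file.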

!pip install python-dotenv
!pip install openai
!pip install langchain
!pip install pdfplumber
1 of 5 4/28/24, 16:23
LangChainWithFiles.ipynb - Colab


import pdfplumber

from google.colab import files

uploaded = files.upload()

# this is where you load your pdf document content into
# a variable that goes into your Prompt
document = ""
with pdfplumber.open("/content/your_filename_here.pdf") as pdf:
    for page in pdf.pages:
        text = page.extract_text(x_tolerance=1)
        # a page without extractable text may yield None; add nothing in that case
        document += text or ""

# this is the LangChain Template structure that allows you to use a third variable
# to include a document into your Prompt
from langchain.prompts import PromptTemplate

prompt = PromptTemplate(
    input_variables=["reco", "history", "human_input"],
    template="""This is your Prompt.
You will describe various aspects of Bot 'personality',
of its task, and how to control the flow of dialogue.

You will refer to the contents of the document by pointing at the contents
of this variable:
{reco}

Current conversation:
{history}
User: {human_input}
Chatbot: """,
)

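# fill in the document now; {history} and {human_input} stay as
# placeholders that the chain fills in on every turn
prompt_formatted_str: str = prompt.format(
    reco=document,
    history="{history}",
    human_input="{human_input}",
)

prompt = PromptTemplate(
    input_variables=["history", "human_input"],
    template=prompt_formatted_str,
)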
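
The two-step pattern above (insert the document once, keep the dialogue placeholders for later) can be illustrated with plain Python string formatting. This is a sketch under stated assumptions and not LangChain's internals: TEMPLATE and fill_document are hypothetical names, and curly braces inside the document are escaped because they would otherwise be read as placeholders in the second step.

```python
TEMPLATE = (
    "Document:\n{reco}\n\n"
    "Current conversation:\n{history}\n"
    "User: {human_input}\n"
    "Chatbot: "
)


def fill_document(template, document_text):
    """Insert the document now; keep {history} and {human_input} for later."""
    # Escape literal braces in the document so that a later .format()
    # call does not mistake them for placeholders.
    safe_text = document_text.replace("{", "{{").replace("}", "}}")
    return template.format(
        reco=safe_text,
        history="{history}",
        human_input="{human_input}",
    )


partial = fill_document(TEMPLATE, "Prices are listed in {EUR}.")
final = partial.format(history="(empty)", human_input="How much is it?")
print(final)
```

The escaping step matters for PDFs that contain braces (code listings, formulas); without it the second formatting pass can fail on an unknown placeholder.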

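import os
from langchain.chat_models import ChatOpenAI

# read the API key from the environment; never hard-code it in a notebook
llm = ChatOpenAI(openai_api_key=os.environ["OPENAI_API_KEY"])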

from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(
    memory_key="history", k=5, return_messages=True)
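
The k=5 setting means that only the five most recent exchanges are fed back into the Prompt. The idea can be sketched with a bounded queue. This illustrates the windowing behaviour only and is not LangChain's implementation; WindowMemory and its methods are hypothetical names.

```python
from collections import deque


class WindowMemory:
    """Keep only the last k (user, bot) exchanges."""

    def __init__(self, k=5):
        self.turns = deque(maxlen=k)  # older turns drop off automatically

    def save(self, user_text, bot_text):
        self.turns.append((user_text, bot_text))

    def history(self):
        """Render the remembered turns as text for the {history} slot."""
        lines = []
        for user_text, bot_text in self.turns:
            lines.append(f"User: {user_text}")
            lines.append(f"Chatbot: {bot_text}")
        return "\n".join(lines)


memory_sketch = WindowMemory(k=2)
for i in range(1, 4):
    memory_sketch.save(f"question {i}", f"answer {i}")
print(memory_sketch.history())  # the first exchange has been dropped
```

A small window keeps token usage bounded, at the price of the Bot forgetting anything said more than k exchanges ago.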

from langchain.chains import LLMChain

chat_llm_chain = LLMChain(llm=llm, prompt=prompt, memory=memory)

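import datetime
import os

# timestamped name so that each session gets its own log file
uniq_filename = "Dialogue" + '_' + datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S") + ".txt"

path = "/content/"
Dfile = open(os.path.join(path, uniq_filename), "a")  # the 'a' means you are adding to the file (append mode)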
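
Dialogue logging then amounts to appending each exchange to that file. A minimal sketch follows, assuming one exchange is written per call; log_turn is a hypothetical helper, and the log is written to a temporary directory here rather than to /content/.

```python
import datetime
import os
import tempfile


def log_turn(log_path, user_text, bot_text):
    """Append one timestamped exchange to the dialogue log."""
    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    # "a" appends, so earlier turns are preserved; the with-block
    # closes the file even if writing fails.
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write(f"[{stamp}] User: {user_text}\n")
        log_file.write(f"[{stamp}] Chatbot: {bot_text}\n")


log_path = os.path.join(tempfile.mkdtemp(), "Dialogue_example.txt")
log_turn(log_path, "Hello", "Hi, how can I help?")
log_turn(log_path, "Bye", "Goodbye!")

with open(log_path, encoding="utf-8") as log_file:
    print(log_file.read())
```

Opening and closing the file on every turn is slightly slower than keeping one handle open, but nothing is lost if the Colab session is interrupted.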

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load pre-trained model and tokenizer

model_name = "gpt2-medium"  # You can use different model sizes
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

# Define the prompt

prompt_text = "ChatGPT: Hello! I'm ChatGPT, your friendly chatbot companion. How can I assist you today?\n"

# Conversation loop
while True:
    # Get user input
    user_input = input("User: ")

    # Check for exit condition
    if user_input.lower() in ["exit", "quit"]:
        print("ChatGPT: Goodbye! Have a great day!")
        break  # Exit the loop

    # Combine user input with prompt
    input_text = prompt_text + user_input + "\n"

    # Tokenize input text
    input_ids = tokenizer.encode(input_text, return_tensors="pt")

    # Generate response
    with torch.no_grad():
        output_ids = model.generate(input_ids, max_length=100, num_return_sequences=1)

    # Decode and print response
    bot_response = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    print("ChatGPT:", bot_response)

User: hlo
The attention mask and the pad token id were not set. As a consequence, you may o
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
ChatGPT: ChatGPT: Hello! I'm ChatGPT, your friendly chatbot companion. How can I

ChatGPT: Hello! I'm ChatGPT, your friendly chatbot companion. How can I assist yo


ChatGPT: Hello! I'm ChatGPT, your friendly chatbot companion. How can I assist yo


ChatGPT: Hello! I'm

User: UK iis the capital of which country
The attention mask and the pad token id were not set. As a consequence, you may o
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
ChatGPT: ChatGPT: Hello! I'm ChatGPT, your friendly chatbot companion. How can I
UK iis the capital of which country
UK: Hello! I'm UK, your friendly chatbot companion. How can I assist you today?
UK: Hello! I'm UK, your friendly chatbot companion. How can I assist you today?
UK: Hello! I'm UK, your friendly chatbot companion. How can I assist you today?
User: exit
ChatGPT: Goodbye! Have a great day!
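
In the transcript above each reply begins by repeating the prompt, because model.generate returns the input tokens followed by the newly generated ones and the whole sequence is decoded. One possible remedy is to decode only the tokens that come after the prompt. The sketch below shows the slicing step on plain lists so that it runs without downloading a model; new_tokens_only is a hypothetical helper, and with real tensors the equivalent slice would be output_ids[0][input_ids.shape[-1]:].

```python
def new_tokens_only(output_ids, prompt_length):
    """Return the generated token ids that follow the prompt."""
    return output_ids[prompt_length:]


# Stand-in token ids: the first four belong to the prompt
prompt_ids = [101, 102, 103, 104]
generated = prompt_ids + [7, 8, 9]

reply_ids = new_tokens_only(generated, len(prompt_ids))
print(reply_ids)  # [7, 8, 9]
```

This removes the echoed prompt only; the repeated greeting lines come from the model continuing its own prompt and would need a different prompt or different generation settings.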

