

LangChainWithFiles.ipynb - Colab https://colab.research.google.com/drive/1e2RzvrZ-1S...

This is the LangChain Bot version with document access for Context Augmentation. It
includes:

• dialogue memory
• dialogue logging to a file
• document access in the Prompt

CAUTION: large PDF documents tend to slow down the ChatBot and cost more tokens. A
reasonable size is up to 5 pages.
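Before injecting a document into the Prompt, it can help to sanity-check the cost. A rough rule of thumb for English text is about four characters per token; this is an approximation, not the exact OpenAI tokenizer:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English text (approximation)."""
    return len(text) // 4

# A 5-page PDF is very roughly 15,000 characters, i.e. ~3,750 tokens.
sample = "x" * 15000
print(estimate_tokens(sample))  # → 3750
```

For an exact count you would use the model's own tokenizer, but this estimate is enough to decide whether a PDF is too large for the Prompt.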

!pip install python-dotenv
!pip install openai
!pip install langchain
Collecting python-dotenv
  Downloading python_dotenv-1.0.1-py3-none-any.whl (19 kB)
Installing collected packages: python-dotenv
Successfully installed python-dotenv-1.0.1
Collecting openai
  Downloading openai-1.23.6-py3-none-any.whl (311 kB)
Collecting httpx<1,>=0.23.0 (from openai)
  Downloading httpx-0.27.0-py3-none-any.whl (75 kB)
Collecting httpcore==1.* (from httpx<1,>=0.23.0->openai)
  Downloading httpcore-1.0.5-py3-none-any.whl (77 kB)
Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx<1,>=0.23.0->openai)
  Downloading h11-0.14.0-py3-none-any.whl (58 kB)
Installing collected packages: h11, httpcore, httpx, openai
Successfully installed h11-0.14.0 httpcore-1.0.5 httpx-0.27.0 openai-1.23.6
Collecting langchain
  Downloading langchain-0.1.16-py3-none-any.whl (817 kB)
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain)
  Downloading dataclasses_json-0.6.4-py3-none-any.whl (28 kB)
Collecting jsonpatch<2.0,>=1.33 (from langchain)
  Downloading jsonpatch-1.33-py2.py3-none-any.whl (12 kB)
Collecting langchain-community<0.1,>=0.0.32 (from langchain)
  Downloading langchain_community-0.0.34-py3-none-any.whl (1.9 MB)
Collecting langchain-core<0.2.0,>=0.1.42 (from langchain)
  Downloading langchain_core-0.1.46-py3-none-any.whl (299 kB)
Collecting langchain-text-splitters<0.1,>=0.0.1 (from langchain)
  Downloading langchain_text_splitters-0.0.1-py3-none-any.whl (21 kB)
Collecting langsmith<0.2.0,>=0.1.17 (from langchain)
  Downloading langsmith-0.1.51-py3-none-any.whl (115 kB)
import os
import openai

from dotenv import load_dotenv, find_dotenv


_ = load_dotenv(find_dotenv()) # read the local .env file
openai.api_key = os.environ["OPENAI_API_KEY"]  # key comes from .env; never hard-code it

import warnings
warnings.filterwarnings('ignore')
warnings.simplefilter('ignore')

from langchain.chat_models import ChatOpenAI


from langchain import OpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
from langchain.memory import ConversationBufferWindowMemory
from langchain.schema import SystemMessage
from langchain.chains import LLMChain
from langchain.prompts import (
    PromptTemplate,
    ChatPromptTemplate,
    StringPromptTemplate,
    MessagesPlaceholder,
    BaseChatPromptTemplate
)

!pip install pdfplumber


Requirement already satisfied: pdfplumber in /usr/local/lib/python3.10/dist-packages


import pdfplumber

from google.colab import files


uploaded = files.upload()

# This is where you load your PDF document content into
# a variable that goes into your Prompt.
#
document = ""
with pdfplumber.open("/content/your_filename_here.pdf") as pdf:
    for page in pdf.pages:
        text = page.extract_text(x_tolerance=1)
        if text:  # extract_text() can return None for empty pages
            print(text)
            document += text
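If the PDF runs longer than expected, you can cap what goes into the Prompt. The budget below is an assumption (roughly 3,000 tokens at ~4 characters per token); adjust it to your model's context window:

```python
MAX_CHARS = 12000  # assumed budget: ~3,000 tokens at ~4 chars/token

def truncate_document(text: str, max_chars: int = MAX_CHARS) -> str:
    """Keep the document within a character budget, cutting at a line break when possible."""
    if len(text) <= max_chars:
        return text
    cut = text.rfind("\n", 0, max_chars)  # prefer cutting at the last full line
    return text[:cut] if cut > 0 else text[:max_chars]

# Example: a 20,000-character document gets cut back to the budget.
doc = "x" * 20000
print(len(truncate_document(doc)))  # → 12000
```

You would apply it as `document = truncate_document(document)` right after the extraction loop.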

# This is the LangChain Template structure that allows you to use a third variable
# to include a document in your Prompt.
#
prompt = PromptTemplate(
    template="""
This is your Prompt.
You will describe various aspects of the Bot's 'personality',
of its task, and how to control the flow of the dialogue.

You will refer to the contents of the document by pointing at the contents
of this variable:
{document}

Current conversation:
{history}

User: {human_input}
Chatbot:
""",
    input_variables=["document", "history", "human_input"]
)

# Bake the document into the template, re-inserting {history} and
# {human_input} literally so the chain can fill them at run time.
prompt_formatted_str: str = prompt.format(
    document=document,
    history="{history}",
    human_input="{human_input}")

prompt = PromptTemplate(
    input_variables=["history", "human_input"],
    template=prompt_formatted_str
)
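The two-step formatting works because the first `format()` call substitutes the literal strings `"{history}"` and `"{human_input}"` back in, leaving those placeholders intact for the chain to fill later. The same trick can be seen with plain Python string formatting:

```python
template = "Document: {document}\nHistory: {history}\nUser: {human_input}"

# Step 1: bake in the document, re-inserting the other placeholders literally.
step1 = template.format(document="the PDF text",
                        history="{history}",
                        human_input="{human_input}")
assert "{history}" in step1 and "{human_input}" in step1  # still unfilled

# Step 2: at run time, the chain fills the remaining placeholders.
final = step1.format(history="(empty)", human_input="hello")
print(final)
```

This avoids re-sending the formatting step on every turn: the document text is fixed in the template once, and only the history and user input vary.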

llm = ChatOpenAI(openai_api_key=os.environ["OPENAI_API_KEY"])

memory = ConversationBufferWindowMemory(
    memory_key="history", k=5, return_messages=True)

chat_llm_chain = LLMChain(
    llm=llm,
    prompt=prompt,
    memory=memory,
    verbose=False)

import datetime

uniq_filename = "Dialogue" + '_' + datetime.datetime.now().isoformat(sep="_")

path = "/content/"
Dfile = open(os.path.join(path, uniq_filename), "a")  # the 'a' means you are appending to the file
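A hypothetical helper (not part of the original notebook) that writes one dialogue turn per line and flushes immediately, so the log survives if the Colab session dies mid-conversation:

```python
import os
import tempfile

def log_turn(log_file, speaker: str, text: str) -> None:
    """Append one dialogue turn to the open log file and flush it to disk."""
    log_file.write(f"{speaker}: {text}\n")
    log_file.flush()

# Example against a temporary file instead of /content/:
path = os.path.join(tempfile.gettempdir(), "Dialogue_example.txt")
with open(path, "a") as f:
    log_turn(f, "User", "hello")
    log_turn(f, "Chatbot", "hi there")
```

In the notebook you would call `log_turn(Dfile, ...)` inside the conversation loop after each exchange.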


import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load pre-trained model and tokenizer
model_name = "gpt2-medium"  # You can use different model sizes depending on your needs
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

# Define the prompt
prompt_text = "ChatGPT: Hello! I'm ChatGPT, your friendly chatbot companion. How can I assist you today?\n"

# Conversation loop
while True:
    # Get user input
    user_input = input("User: ")

    # Check for exit condition
    if user_input.lower() in ["exit", "quit"]:
        print("ChatGPT: Goodbye! Have a great day!")
        break  # Exit the loop

    # Combine user input with prompt
    input_text = prompt_text + user_input + "\n"

    # Tokenize input text
    input_ids = tokenizer.encode(input_text, return_tensors="pt")

    # Generate response
    with torch.no_grad():
        output_ids = model.generate(input_ids, max_length=100, num_return_sequences=1)

    # Decode and print response
    bot_response = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    print("ChatGPT:", bot_response)

User: hlo
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
ChatGPT: ChatGPT: Hello! I'm ChatGPT, your friendly chatbot companion. How can I assist you today?
hlo

ChatGPT: Hello! I'm ChatGPT, your friendly chatbot companion. How can I assist you today?

hlo

ChatGPT: Hello! I'm ChatGPT, your friendly chatbot companion. How can I assist you today?

hlo

ChatGPT: Hello! I'm

User: UK iis the capital of which country
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
ChatGPT: ChatGPT: Hello! I'm ChatGPT, your friendly chatbot companion. How can I assist you today?
UK iis the capital of which country
UK: Hello! I'm UK, your friendly chatbot companion. How can I assist you today?
UK: Hello! I'm UK, your friendly chatbot companion. How can I assist you today?
UK: Hello! I'm UK, your friendly chatbot companion. How can I assist you today?
User: exit
ChatGPT: Goodbye! Have a great day!
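The transcript above shows GPT-2 echoing its own greeting: `model.generate()` returns the prompt tokens followed by the continuation, so decoding `output_ids[0]` in full replays the prompt. Slicing off the prompt length keeps only the newly generated text; a sketch with plain lists standing in for hypothetical token ids:

```python
# generate() output = prompt tokens + newly generated tokens
prompt_ids = [15496, 11, 995]             # hypothetical ids for the prompt
output_ids = prompt_ids + [318, 257, 42]  # what generate() would return

# Keep only the continuation, analogous to:
# tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
continuation = output_ids[len(prompt_ids):]
print(continuation)  # → [318, 257, 42]
```

Applying that slice before `tokenizer.decode` in the loop above would stop the bot from repeating the greeting (and the user's input) in every reply.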
