CS563-NLP-2024 - Assignment 1

Uploaded by

arjun

0% found this document useful (0 votes)

8 views2 pages

Original Title

CS563-NLP-2024_ Assignment 1

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Download as pdf or txt

0% found this document useful (0 votes)

8 views2 pages

CS563-NLP-2024 - Assignment 1

Uploaded by

arjun

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Download as pdf or txt

Jump to Page

You are on page 1of 2

Search inside document

CS 563: Natural Language Processing

Assignment-1: Part-of-Speech Tagging

Deadline: 15 February 2024

● Markings will be based on the correctness and soundness of the outputs.

● Marks will be deducted in case of plagiarism.
● Proper indentation and appropriate comments (if necessary) are mandatory.
● Use of frameworks like scikit-learn etc is not allowed.
● All benchmarks(accuracy etc), answers to questions and supporting examples
should be added in a separate file with the name ‘report’.
● All code needs to be submitted in ‘.py’ format. Even if you code it in ‘.IPYNB’
format, download it in ‘.py’ format and then submit
● You should zip all the required files and name the zip file as:
○ <roll_no>_assignment_<#>.zip, eg. 1501cs11_assignment_01.zip.
● Upload your assignment ( the zip file ) in the following link:
○ CS563-NLP-2024-Assignments

Problem Statement:
● The assignment targets to implement Hidden Markov Model (HMM) to perform
Part-of-Speech (PoS) tagging task

Implementation:

HMM based Model:

● HMM Parameter Estimation

○ Input: Annotated tagged dataset
○ Output: HMM parameters
○ Procedure:
■ Step1: Find states.
■ Step2: Calculate Start probability (π).
■ Step3: Calculate transition probability (A)
■ Step4: Calculate emission probability (B)
● Features for HMM:
○ Train two HMM models based on:
■ First order markov assumption (Bigram) where current word PoS
tag is based on the previous and current words
■ Second order markov assumption (Trigram) where current word
PoS tag is based on the previous two words along with the current
word

Testing:
● After calculating all these parameters use these parameters to tag the test input
sequence using the Viterbi algorithm

Dataset:
● Dataset consists of sentences and each word is tagged with its corresponding
PoS tag
● Brown dataset: Brown_train.txt
● Format of dataset:
○ Each line contains <Word/Tag> (word followed by ‘/’ and tag)
○ Sentences are separated by a new line

Documents to submit:
● Model code
● Perform 5 fold cross-validation on the Training dataset and report both average &
individual fold results (Accuracy, Precision, Recall and F-Score).
● Create a confusion matrix using Python Library.
● Briefly discuss Bigram vs Trigram assumption while training HMMs.
● With some examples (good pairs and bad pairs) why the model is confused and
when it is giving correct results. Analyze and Explain the reason behind it.
● Also, Implement a RNN based model for this task and compare the result of both
RNN and HMM.
● Discuss which model is better? With some justification and analysis when RNN is
better than HMM and when HMM is better than RNN and when both fail and
why?
● Write a report (doc or pdf format) on how you are solving the problems as well as
all the results including model architecture (if any).

For any queries regarding this assignment, contact:

Ramakrishna Appicharla (ramakrishnaappicharla@gmail.com),
Arpan Phukan (arpanphukan@gmail.com),
Sandeep Kumar (sandeep.kumar82945@gmail.com), and,
Aizan Zafar (aizanzafar@gmail.com)

Trial Assignment - Business Analyst (Time Series Modeling)
Document2 pages
Trial Assignment - Business Analyst (Time Series Modeling)
Rathor Sushant
No ratings yet
Task - Data Engineering
Document2 pages
Task - Data Engineering
Last Over
No ratings yet
OpenMPCoursework2
Document5 pages
OpenMPCoursework2
Jaokd
100% (1)
Project3 1
Document4 pages
Project3 1
Jacob Daugherty
0% (2)
Exercise 03 - Siting A Fire Tower in Nebraska
Document48 pages
Exercise 03 - Siting A Fire Tower in Nebraska
Jill Clark
100% (1)
CW Sequence Analysis
Document9 pages
CW Sequence Analysis
shk93359
No ratings yet
cs224n Python Review 20
Document47 pages
cs224n Python Review 20
Jill
No ratings yet
Data Engineering Notes
Document11 pages
Data Engineering Notes
Vijay Kumar
No ratings yet
CVPDL hw3
Document26 pages
CVPDL hw3
評
No ratings yet
Cs294a 2011 Assignment
Document5 pages
Cs294a 2011 Assignment
Jose
No ratings yet
CS663-2024-Executive NLP - Assignment Sentiment Analysis
Document4 pages
CS663-2024-Executive NLP - Assignment Sentiment Analysis
mukesh shah
No ratings yet
A 02
Document2 pages
A 02
dsa
No ratings yet
TransCoder - YDATA Seminar
Document32 pages
TransCoder - YDATA Seminar
Chuy Morales
No ratings yet
LLaMA Ankit - Rawat
Document52 pages
LLaMA Ankit - Rawat
Hadi
No ratings yet
Course Outline HCT423 Design and Analysis of Algorithms
Document6 pages
Course Outline HCT423 Design and Analysis of Algorithms
Marlon Tugwete
No ratings yet
Selection Policy 2019
Document21 pages
Selection Policy 2019
Rashmi Sahoo
No ratings yet
Final Project Description
Document3 pages
Final Project Description
VanlocTran
No ratings yet
CSC 603 - Final Project
Document3 pages
CSC 603 - Final Project
bme.engineer.issa.mansour
No ratings yet
Assignment 3
Document2 pages
Assignment 3
sonu23144
No ratings yet
E1213 PRNN: Assignment 1 - Basic Models: Prof. Prathosh A. P. Submission Deadline: 1st March 2022
Document3 pages
E1213 PRNN: Assignment 1 - Basic Models: Prof. Prathosh A. P. Submission Deadline: 1st March 2022
rishi gupta
No ratings yet
Data Science Product Development Lecture 2
Document30 pages
Data Science Product Development Lecture 2
Daniyal Raza
No ratings yet
Assignment#3
Document2 pages
Assignment#3
Abhinay Upamaka
No ratings yet
ST3189 Assessed Coursework Project 2023-24
Document2 pages
ST3189 Assessed Coursework Project 2023-24
Naeem Ahmed
No ratings yet
CS102 Computer Programming I: Lecture 0: Course Orientation
Document28 pages
CS102 Computer Programming I: Lecture 0: Course Orientation
JELICEL ATE
No ratings yet
ML Hota Assign3
Document4 pages
ML Hota Assign3
f20211088
No ratings yet
Solution Methodology
Document5 pages
Solution Methodology
Arnab Dey
No ratings yet
ML 1 Project
Document2 pages
ML 1 Project
Chaitanya Sanga
No ratings yet
Igcse 8 Programming
Document32 pages
Igcse 8 Programming
Rini Sandeep
No ratings yet
Algorithm Lec1 DR MME
Document46 pages
Algorithm Lec1 DR MME
malkmoh781.mm
No ratings yet
COMP 4650 6490 Assignment 3 2023-v1.1
Document6 pages
COMP 4650 6490 Assignment 3 2023-v1.1
390942959
No ratings yet
Assignment+1 +Regression+This+assignment+is+to+
Document6 pages
Assignment+1 +Regression+This+assignment+is+to+
Daniel Gonzalez
No ratings yet
SPCC Viva Question For Engineering PDF
Document37 pages
SPCC Viva Question For Engineering PDF
Digvijay Devare
No ratings yet
Week 2-3 - Analysis of Algorithms
Document7 pages
Week 2-3 - Analysis of Algorithms
Xuân Dương Vương
No ratings yet
Information and Coding Theory ECE533 - HW2 DesignProblem ECE533 v4.1
Document2 pages
Information and Coding Theory ECE533 - HW2 DesignProblem ECE533 v4.1
Nikesh Bajaj
No ratings yet
457 Final Project
Document7 pages
457 Final Project
an xingyue
No ratings yet
Assignment 3
Document4 pages
Assignment 3
404xenos
No ratings yet
DAA Module-1
Document96 pages
DAA Module-1
Zeha 1
No ratings yet
Generative AI Roadmap
Document36 pages
Generative AI Roadmap
Rushi Khandare
No ratings yet
Practical-02 PC
Document12 pages
Practical-02 PC
SAMINA ATTARI
No ratings yet
hw1 Problem Set
Document8 pages
hw1 Problem Set
Billy bob
No ratings yet
Bigmart Sales Solution Methodology
Document5 pages
Bigmart Sales Solution Methodology
Arnab Dey
No ratings yet
Design and Anlaysis of Algorithm
Document206 pages
Design and Anlaysis of Algorithm
Sandeep Venupure
No ratings yet
Daa Unit1
Document59 pages
Daa Unit1
raja
No ratings yet
Predictive Analytics Exam-June 2021: Exam PA Home Page
Document10 pages
Predictive Analytics Exam-June 2021: Exam PA Home Page
hninyi theint
No ratings yet
Machine Learning: Assignment 04 Text Classification in Scikit-Learn
Document4 pages
Machine Learning: Assignment 04 Text Classification in Scikit-Learn
Pathan
No ratings yet
Week 1
Document133 pages
Week 1
22ucc052
No ratings yet
Parallel Computing Lab Manual PDF
Document51 pages
Parallel Computing Lab Manual PDF
SAMINA ATTARI
No ratings yet
Project
Document11 pages
Project
lumsian
No ratings yet
Ch6-What Are Algorithms
Document15 pages
Ch6-What Are Algorithms
Gamer Mode
No ratings yet
CSE 5311: Design and Analysis of Algorithms Programming Project Topics
Document3 pages
CSE 5311: Design and Analysis of Algorithms Programming Project Topics
Fgv
No ratings yet
N Gram
Document11 pages
N Gram
ASHU K
No ratings yet
Solution Methodology
Document3 pages
Solution Methodology
mariobravok
No ratings yet
CSC 207
Document59 pages
CSC 207
adandongolambertdaniel
No ratings yet
Data Science
Document8 pages
Data Science
malathula00
No ratings yet
Introduction To Programming and Problem Solving: Daniel Chen
Document29 pages
Introduction To Programming and Problem Solving: Daniel Chen
Divya Thakur
No ratings yet
HNDB25 Unit19
Document7 pages
HNDB25 Unit19
Aung Khaing Min
No ratings yet
2021 Homework3 Introduction
Document8 pages
2021 Homework3 Introduction
Ali Zain
No ratings yet
Ram, Pram, and Logp Models
Document72 pages
Ram, Pram, and Logp Models
Ruby Mehta Malhotra
No ratings yet
ESO207 ProgAssign 2 2022
Document7 pages
ESO207 ProgAssign 2 2022
Rishit Bhutra
No ratings yet
Parallel Computing I Homework 7: Conjugated Gradients: Task 1 CG Algorithm (20 Points)
Document1 page
Parallel Computing I Homework 7: Conjugated Gradients: Task 1 CG Algorithm (20 Points)
linchi
No ratings yet
TensorFlow Developer Certificate Exam Practice Tests 2024 Made Easy
From Everand
TensorFlow Developer Certificate Exam Practice Tests 2024 Made Easy
Mr Troy
No ratings yet
Brown Train
Document1,304 pages
Brown Train
arjun
No ratings yet
WWW - Nta.ac - in & WWW - Jeemain.nic.i N: Website
Document5 pages
WWW - Nta.ac - in & WWW - Jeemain.nic.i N: Website
arjun
No ratings yet
Kinematics of Machines PDF
Document133 pages
Kinematics of Machines PDF
arjun
No ratings yet
Fiitjee: Faridabad
Document4 pages
Fiitjee: Faridabad
arjun
No ratings yet
Bonding Atkins
Document29 pages
Bonding Atkins
arjun
No ratings yet
Outgoing Call in Roaming
Document11 pages
Outgoing Call in Roaming
Prince Garg
No ratings yet
Harris Citadel Data Sheet - tcm26-9030
Document2 pages
Harris Citadel Data Sheet - tcm26-9030
Lucho 2000
No ratings yet
8 Jenis Failover Dan Loadbalance
Document32 pages
8 Jenis Failover Dan Loadbalance
Agung
No ratings yet
Simultaneous Linear Equation
Document25 pages
Simultaneous Linear Equation
Dimas Baskoro Heryunanto
0% (1)
B.E. 2019 Pattern Endsem Exam Timetable For May - June-2023
Document20 pages
B.E. 2019 Pattern Endsem Exam Timetable For May - June-2023
Yadav Sanskar
No ratings yet
Syntax File For Re2
Document7 pages
Syntax File For Re2
Alok Singh
No ratings yet
Format: Floating-Point Arithmetic
Document3 pages
Format: Floating-Point Arithmetic
AZKIYA ZAHRA
No ratings yet
Convert SQL Server Results Into JSON
Document1 page
Convert SQL Server Results Into JSON
luca pilotti
No ratings yet
Datacenter Transformation
Document44 pages
Datacenter Transformation
Federico Candia
No ratings yet
7 Commerce
Document29 pages
7 Commerce
Sandeep das
No ratings yet
A Comprehensive Guideline To Trade Ripple (XRP) On Koinpark
Document2 pages
A Comprehensive Guideline To Trade Ripple (XRP) On Koinpark
davebarter3699
No ratings yet
Computer Science Research Paper Topic Ideas
Document5 pages
Computer Science Research Paper Topic Ideas
nnactlvkg
100% (1)
FortiGate 7.4 Administrator Exam Description
Document3 pages
FortiGate 7.4 Administrator Exam Description
peterkarlaguilar
No ratings yet
User Manual For New Firmware of The 3/3.5-Inch Color Screen: Date: April 2013
Document63 pages
User Manual For New Firmware of The 3/3.5-Inch Color Screen: Date: April 2013
Denny De Leon Laurencio
No ratings yet
Contoh Proposal
Document19 pages
Contoh Proposal
riska
No ratings yet
Honeywell Fg1625r Data Sheet
Document2 pages
Honeywell Fg1625r Data Sheet
Alarm Grid Home Security and Alarm Monitoring
No ratings yet
IT Business Alignment Question Paper Ver1.0
Document6 pages
IT Business Alignment Question Paper Ver1.0
Sharma
No ratings yet
Mysql Oem Presentation MyISAM To InnoDB
Document53 pages
Mysql Oem Presentation MyISAM To InnoDB
Alan Teixeira
No ratings yet
Passive Test 3
Document4 pages
Passive Test 3
INGERMAR
No ratings yet
Global Aviation Safety Plan
Document23 pages
Global Aviation Safety Plan
eliasox123
No ratings yet
Cod. 41 Wifi Alarm User Manual
Document35 pages
Cod. 41 Wifi Alarm User Manual
Joel Martinez
No ratings yet
UG 3 Semester Results - Rayalaseema University, Kurnool
Document1 page
UG 3 Semester Results - Rayalaseema University, Kurnool
SURENDRA
No ratings yet
BSNL JTO Sample Paper-3
Document100 pages
BSNL JTO Sample Paper-3
ramesh_balak
No ratings yet
Tunnel Safety
Document77 pages
Tunnel Safety
Tobin Don
0% (1)
Logistic Regression
Document7 pages
Logistic Regression
Shaafici
No ratings yet
Getting Started - Google Maps JavaScript API v3 - Google Developers
Document5 pages
Getting Started - Google Maps JavaScript API v3 - Google Developers
Dibya Ranjan Nayak
No ratings yet
Ciso Workshop 3 Identity and Zero Trust User Access
Document26 pages
Ciso Workshop 3 Identity and Zero Trust User Access
dmomentsbymm
No ratings yet
Battery Chargers: By: Rajinder Hans
Document51 pages
Battery Chargers: By: Rajinder Hans
RAJINDER HANS
100% (3)
Blu-Ray Disc™ / DVD Player: Operating Instructions
Document40 pages
Blu-Ray Disc™ / DVD Player: Operating Instructions
sham9787578027
No ratings yet