Uploaded by Ima

Document Classification

You have been given a stack of documents, some already processed and some not. Your task is to
classify these documents into one of eight categories: [1,2,3,...8]. You look at the specifications
on what a document must contain and are baffled by the jargon. However, you notice that you already
have a large number of documents that have been correctly categorized (training data). You decide to
use Machine Learning on this data in order to categorize the uncategorized documents.

Training Data

To decide which category each document belongs to, base your classification on the categories of
the documents in the "trainingdata.txt" file. This file will be included with your program at runtime.

The file is formatted as follows:

The first line contains the number of lines that will follow.

Each following line contains a number (1-8), which is the category number. The number is followed
by a space and then the space-separated words of the processed document.
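
The format above could be read with a few lines of Python. This is a minimal sketch, not part of the original challenge: the function name parse_training_data and the two-line sample string are assumptions made for illustration, and skipping blank lines is an extra tolerance the statement does not mention.

```python
from typing import List, Tuple


def parse_training_data(text: str) -> List[Tuple[int, List[str]]]:
    """Parse training file contents into (category, words) pairs."""
    lines = text.splitlines()
    count = int(lines[0].strip())        # first line: number of lines that follow
    examples = []
    for line in lines[1:1 + count]:
        parts = line.split()
        if not parts:
            continue                     # assumption: silently skip blank lines
        category = int(parts[0])         # leading number is the category (1-8)
        examples.append((category, parts[1:]))
    return examples


# Hypothetical two-document training file, made up for this example.
sample = "2\n1 this is a document\n4 this is another document\n"
for category, words in parse_training_data(sample):
    print(category, words)
```

In a real submission the text would come from open("trainingdata.txt").read() rather than an inline string.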

Input

The first line of the input contains T, the number of documents. T lines follow, each containing a
series of space-separated words that represent a processed document.

Output

For each document, output a number from 1 to 8: the category you believe the document belongs to.
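
The challenge does not prescribe a particular learning method. One common baseline for this kind of task is a multinomial Naive Bayes classifier with add-one smoothing, sketched below using only the standard library. The function names (train, classify, run) and the toy training and input strings are assumptions for illustration; words never seen in training are simply ignored, which is one of several reasonable choices.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count documents per category and word occurrences per category."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for category, words in examples:
        doc_counts[category] += 1
        word_counts[category].update(words)
        vocab.update(words)
    return doc_counts, word_counts, vocab


def classify(words, model):
    """Return the category with the highest log-probability for the words."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in sorted(doc_counts):
        total_words = sum(word_counts[category].values())
        score = math.log(doc_counts[category] / total_docs)   # log prior
        for word in words:
            if word not in vocab:
                continue                                       # ignore unseen words
            # Add-one (Laplace) smoothing avoids log(0) for rare words.
            count = word_counts[category][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


def run(training_text, input_text):
    """Train on the training text, then label each of the T input documents."""
    train_lines = training_text.splitlines()
    n = int(train_lines[0])
    examples = []
    for line in train_lines[1:1 + n]:
        parts = line.split()
        if parts:
            examples.append((int(parts[0]), parts[1:]))
    model = train(examples)
    input_lines = input_text.splitlines()
    t = int(input_lines[0])
    return [classify(line.split(), model) for line in input_lines[1:1 + t]]


# Toy data invented for this example; real data comes from trainingdata.txt
# and standard input.
training = "2\n1 cheap loans offer\n2 football match score\n"
documents = "2\nloans offer today\nfootball score\n"
print(*run(training, documents), sep="\n")
```

To use this as a submission, training_text would be read from "trainingdata.txt" and input_text from standard input, with one predicted category printed per line.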

Sample Input

3
This is a document
this is another document
documents are seperated by newlines

Sample Output

1
4
8

Scoring

Your score for this challenge will be 100 * (#correctly categorized - #incorrectly categorized) / T.
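
Because incorrect answers are subtracted, this score differs from plain accuracy: getting half the documents right yields 0, and getting all of them wrong yields -100. The sketch below computes the formula under the assumption that every document receives exactly one prediction; the function name challenge_score and the example labels are made up for illustration.

```python
def challenge_score(predicted, actual):
    """Compute 100 * (correct - incorrect) / T for two equal-length label lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need one prediction per document and at least one document")
    total = len(actual)
    correct = sum(p == a for p, a in zip(predicted, actual))
    incorrect = total - correct
    return 100 * (correct - incorrect) / total


# Hypothetical true labels for the three sample documents.
print(challenge_score([1, 4, 8], [1, 4, 8]))   # all correct
print(challenge_score([1, 4, 8], [1, 4, 2]))   # two correct, one incorrect
```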
