
Identify and Extract 

(100 Marks)

Identify and extract scientific entities such as Counts and Units of Measure from a piece of text.

Examples:

Sentences | Extracted Tokens | Entity Type

The patients were divided into 4 groups. | 4 groups | Counts

A total of 110 students participated in the survey. | 110 students | Counts

A "640 by 480 display" has 640 pixels from side to side. | 640 pixels | Unit of Measure

The tube had a volume of 10 fl. oz. | 10 fl. oz. | Unit of Measure

The sample contained 12 mmol/L of glucose. | 12 mmol/L | Unit of Measure

This field of view corresponds to a circle on the Earth's surface having an approximate diameter of 1200 km | 1200 km | Unit of Measure

What do we mean by Counts?

Any entity that can be counted or measured. A few examples: 500 people, 1000 images, and so on.

What do we mean by Unit of measurement?

A unit of measurement is a standard for measuring physical quantities of the same kind. Other
quantities of the same kind can be expressed as multiples of that measurement. For example, meter
is a unit of length, and 10 meters is equal to 10 times the unit meter. There are different systems of
units of measurement like the International System of Units (SI) and the British imperial system.
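
As a rough illustration of spotting a quantity expressed in a unit, the sketch below matches a number followed by one of a few unit spellings. The UNITS list, the UNIT_RE pattern and the helper name find_units are assumptions made for this example, not part of the problem statement; a real solution would need a far larger unit vocabulary.

```python
import re

# Hypothetical, deliberately tiny unit vocabulary (an assumption for this sketch).
# Longer spellings come first so that "cubic foot" is tried before "foot".
UNITS = ["cubic foot", "fl. oz.", "mmol/L", "pixels", "meters", "foot", "km", "Kg", "kg"]

UNIT_RE = re.compile(
    r"(?<![\w.])\d+(?:\.\d+)?\s?(?:"          # a number, optionally followed by one space
    + "|".join(re.escape(u) for u in UNITS)   # one of the known unit spellings
    + r")(?![A-Za-z])"                        # the unit must not run on into a longer word
)

def find_units(text):
    """Return every 'number + unit' substring found in text, in input order."""
    return [m.group(0) for m in UNIT_RE.finditer(text)]

print(find_units("The sample contained 12 mmol/L of glucose."))
# ['12 mmol/L']
print(find_units("Each cube can carry 10Kg and has a volume of 1.2 cubic foot."))
# ['10Kg', '1.2 cubic foot']
```

A lexicon lookup like this is only one possible design; it trades recall for precision, since any unit missing from the list is silently ignored.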

Each scientific entity should have the following properties:

Identified text: The text that has been tagged as a scientific entity

Identified text offset: The start and end character positions of the identified entity, treating the
input text as a zero-indexed character array. In the examples in this document, the start position is
inclusive and the end position is exclusive.

Entity Type: The type of the scientific entity, if it can be recognized.

Example:

The use of cubic containers instead of cylindrical ones has proved to be helpful when transporting
the substance. Each of the cube can carry 10Kg and has a volume of 1.2 cubic foot.

The scientific entities for the given text are given below:
10Kg,[141, 145],Unit of Measure

1.2 cubic foot,[166, 180],Unit of Measure
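
The offsets in this example can be checked by slicing: the start is inclusive and the end is exclusive, so text[start:end] gives back the identified text. The sketch below assumes the two lines of the example text are joined by a single space; the helper name span_of is made up for illustration.

```python
# The example text, assuming its two lines are joined by a single space.
text = (
    "The use of cubic containers instead of cylindrical ones has proved to be "
    "helpful when transporting the substance. Each of the cube can carry 10Kg "
    "and has a volume of 1.2 cubic foot."
)

def span_of(text, entity, start_from=0):
    """Return [start, end) of the first occurrence of entity at or after start_from."""
    start = text.find(entity, start_from)
    if start == -1:
        return None  # entity does not occur in the text
    return [start, start + len(entity)]

print(span_of(text, "10Kg"))            # [141, 145]
print(span_of(text, "1.2 cubic foot"))  # [166, 180]
print(text[141:145])                    # 10Kg
```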

Input Format

The only input is a piece of text, i.e. sentence(s) or paragraph(s) (with or without spaces).

Constraints

1<= |Length_of_string| <=100000

Output Format

Print the three properties of each identified scientific entity, separated by commas (,). If there are
multiple scientific entities, print each one on a separate line. If no entity is found, print NONE.

NOTE: An entity that appears earlier in the input must appear earlier in the output.
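
The output rules above can be sketched as a small formatting helper. The function name format_entities and the (text, start, end, entity_type) tuple layout are assumptions for this example; the bracket style without spaces follows Sample TestCase 1.

```python
def format_entities(entities):
    """Format entities given as (text, start, end, entity_type) tuples.

    Returns one line per entity, ordered by start offset, or "NONE" if the list is empty.
    """
    if not entities:
        return "NONE"
    ordered = sorted(entities, key=lambda e: e[1])  # earlier in the input comes first
    return "\n".join(
        f"{text},[{start},{end}],{etype}" for text, start, end, etype in ordered
    )

print(format_entities([("70 cubes", 32, 40, "Counts")]))  # 70 cubes,[32,40],Counts
print(format_entities([]))                                 # NONE
```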

Sample TestCase 1

Input

The shipping container can pack 70 cubes in one go.

Output

70 cubes,[32,40],Counts

Sample TestCase 2

Input

On March 11, the Fukushima nuclear plant had a severe accident due to the 2011 Tohuku earthquake.

Output

NONE
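
One heuristic that happens to be consistent with both sample test cases is to treat a whole number followed by a lowercase word ending in "s" as a Count, which skips dates such as "March 11," and years such as "2011 Tohuku". This rule, the COUNT_RE pattern and the find_counts helper are assumptions made for illustration, not the reference solution. It would also tag quantities such as "640 pixels" as Counts, so units would have to be checked first.

```python
import re

# Assumed heuristic: a whole number, one space, then a lowercase word ending in "s".
# The lookbehind keeps the match from starting inside a longer number or token.
COUNT_RE = re.compile(r"(?<![\w.,])\d+ [a-z]+s\b")

def find_counts(text):
    """Return (matched_text, start, end) for each match, with an exclusive end offset."""
    return [(m.group(0), m.start(), m.end()) for m in COUNT_RE.finditer(text)]

sample_1 = "The shipping container can pack 70 cubes in one go."
sample_2 = (
    "On March 11, the Fukushima nuclear plant had a severe accident "
    "due to the 2011 Tohuku earthquake."
)

print(find_counts(sample_1))  # [('70 cubes', 32, 40)]
print(find_counts(sample_2))  # []
```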

SOLUTION:

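# Partial solution: pair each whole-number token with the word that follows it.
sent = input()
words = sent.split()

res = []   # whole numbers found in the text
foll = []  # the word that follows each number

i = 0
while i < len(words):
    # Skip a number that is the last token, since it has no following word.
    if words[i].isdigit() and i + 1 < len(words):
        res.append(int(words[i]))
        foll.append(words[i + 1])
    i += 1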
