Confidential

Navarik - Attribute Extraction Process Road Map
First Iteration
Attribute Extraction with Keywords:

A Java program extracts attributes based on keywords and their patterns.
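
The keyword-based step could look roughly like the sketch below. It is only an illustration under assumptions: the class name KeywordExtractor, the sample keywords, and the pattern "keyword, then ':' or '=', then the value up to the end of the line" are all hypothetical and are not taken from the actual program.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of keyword-driven attribute extraction.
class KeywordExtractor {

    // Assumed pattern: the keyword starts a line and is followed by ':' or '=';
    // the rest of the line is taken as the attribute value.
    static Map<String, String> extract(String text, List<String> keywords) {
        Map<String, String> found = new LinkedHashMap<>();
        for (String keyword : keywords) {
            Pattern pattern = Pattern.compile(
                    "(?im)^\\s*" + Pattern.quote(keyword) + "\\s*[:=]\\s*(.+?)\\s*$");
            Matcher matcher = pattern.matcher(text);
            if (matcher.find()) {
                // Only the first occurrence of each keyword is kept.
                found.put(keyword, matcher.group(1));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Sample text and keywords are invented for illustration.
        String document = "Product: Gasoline\nQuantity = 500\nRemarks none";
        System.out.println(extract(document, List.of("Product", "Quantity")));
    }
}
```

Keywords that never appear with a separator (such as "Remarks" in the sample) are simply absent from the result, which is one reason a keyword-only pass can leave coverage gaps.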

Values in table format are handled if the tables are in a standard format, i.e. the first
column is treated as the row header and the first row is treated as the column header;
values are extracted by their x, y coordinates.
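
As a rough sketch of the coordinate lookup described above, assuming the table has already been parsed into a two-dimensional array of cell strings (the class name TableLookup, the method valueAt, and the sample data are hypothetical):

```java
import java.util.Optional;

// Hypothetical sketch: look up a cell by its row header and column header.
class TableLookup {

    // Row 0 holds the column headers and column 0 holds the row headers.
    // x is the column index and y is the row index of the wanted cell.
    static Optional<String> valueAt(String[][] table, String rowHeader, String columnHeader) {
        if (table.length == 0) {
            return Optional.empty();
        }
        int x = -1;
        for (int col = 1; col < table[0].length; col++) {
            if (table[0][col].equalsIgnoreCase(columnHeader)) {
                x = col;
                break;
            }
        }
        int y = -1;
        for (int row = 1; row < table.length; row++) {
            if (table[row].length > 0 && table[row][0].equalsIgnoreCase(rowHeader)) {
                y = row;
                break;
            }
        }
        // A missing header or a short (ragged) row yields no value.
        if (x < 0 || y < 0 || x >= table[y].length) {
            return Optional.empty();
        }
        return Optional.of(table[y][x]);
    }

    public static void main(String[] args) {
        // Sample table invented for illustration.
        String[][] table = {
            {"",        "Min",  "Max"},
            {"Density", "0.72", "0.78"},
            {"Sulphur", "0.0",  "0.1"}
        };
        System.out.println(valueAt(table, "Sulphur", "Max"));
    }
}
```

Tables that do not follow the standard layout (merged cells, headers in other positions) fall outside this sketch, consistent with the restriction stated above.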

Attribute Extraction without Keywords:

A Java program extracts attributes that are adjacent. This process increases
coverage but has a duplication issue.

Values in table format are handled if the tables are in a standard format, i.e. the first
column is treated as the row header and the first row is treated as the column header;
values are extracted by their x, y coordinates.

Second Iteration
Based on analysis of the first-iteration output, attributes can be segregated
by data type so that attribute duplication and wrong values can be eliminated
and quality can be improved.
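
One way the data-type idea might be applied is sketched below. Everything in it is an assumption made for illustration: the class TypedAttributeFilter, the three data types and their regular expressions, and the "first valid value wins" rule for resolving duplicates are not taken from the planned implementation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: filter extracted candidates by expected data type.
class TypedAttributeFilter {

    // Assumed data types; the real attribute variants may need others.
    enum DataType {
        NUMBER(v -> v.matches("-?\\d+(\\.\\d+)?")),
        DATE(v -> v.matches("\\d{4}-\\d{2}-\\d{2}")),
        TEXT(v -> !v.isBlank());

        final Predicate<String> accepts;

        DataType(Predicate<String> accepts) {
            this.accepts = accepts;
        }
    }

    // Each candidate is a {name, value} pair in extraction order.
    static Map<String, String> filter(List<String[]> candidates,
                                      Map<String, DataType> expectedTypes) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String[] candidate : candidates) {
            String name = candidate[0];
            String value = candidate[1].trim();
            DataType type = expectedTypes.get(name);
            if (type == null || !type.accepts.test(value)) {
                continue; // unknown attribute or value of the wrong type
            }
            result.putIfAbsent(name, value); // later duplicates are dropped
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample candidates invented for illustration.
        List<String[]> candidates = List.of(
                new String[] {"Quantity", "N/A"},
                new String[] {"Quantity", "500"},
                new String[] {"Quantity", "700"});
        System.out.println(filter(candidates, Map.of("Quantity", DataType.NUMBER)));
    }
}
```

In the sample, "N/A" is rejected as a wrong value for a numeric attribute and "700" is dropped as a duplicate, which mirrors the two problems the second iteration targets.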
Effort Estimation
Task                                                Estimated Time   Status
Attribute variant collection and its data types *   40 Hrs           Completed
Logic implementation in Java program                24 Hrs           Yet to start
Testing with a sample of 10 documents               16 Hrs           Yet to start
* Attribute collection will be an ongoing process; currently, attributes are collected based on 60 documents.
Thank You!
Copyright 2013, Mobius Knowledge Services. All rights reserved.
This document is provided for information purposes only. This document may not be reproduced or transmitted in
any form or by any means, electronic or mechanical, for any purpose, without our prior written permission. All
logos are trademarks of their respective owners.