
Data Jockey – DJ Tool

Data Jockey Tool


• Tool that transforms incoming patent files in XML format into
standard CXML or JSON files.
Data Jockey Jobs
• Each country's PTO sends patent files in a different XML structure.
• DJ lets users create a separate job for each country and write
country-specific transformation rules in the DJ language.
• Each country therefore has one job containing its own transformation
rules.
• DJ executes the jobs automatically on a periodic basis, performs the
transformations, and stores the output files in an S3 location.
End-to-End Flow
• The Content team receives details of the patent data for a specific country and
creates a mapping document in Excel.
• The mapping document is shared with the Solution Analyst (SA) team, which
analyzes it.
• An SA team member creates and configures a new mapping job in DJ and writes
transformation rules in the DJ language against sample input files.
• The SA member uses the DJ preview feature to iteratively test the output
generated by the transformation rules and fixes any issues found.
• Once all the transformation rules are finalized, the SA member tests the creation
of output files at the configured location by running the job as a process.
• The finalized job is then handed to QA for further testing.
DJ Job
• A DJ job consists of the following sections:

• Source – This is where you set the location of input files.

• Destination – This is where you set the location of output files.

• Rules – This is where SA writes transformation rules in DJ language.

• Logs – This is where SA/QA can view the messages produced when the job is
executed.
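As a rough illustration of how the four sections fit together, a job can be pictured as a small record. This is a Python sketch, not DJ's actual job format: the class name `JobSketch`, its field names, the country codes, and the S3 paths are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class JobSketch:
    """Hypothetical stand-in for a DJ job; not the tool's real data model."""
    country: str
    source: str        # location of the input files
    destination: str   # location of the output files
    rules: str = ""    # transformation rules (DJ language in the real tool)
    logs: List[str] = field(default_factory=list)  # messages from executions

    def record(self, message: str) -> None:
        """Append a message produced during an execution."""
        self.logs.append(message)


# One job per country, each with its own settings (all values are made up).
jobs = {
    code: JobSketch(
        country=code,
        source=f"s3://example-input/{code}/",
        destination=f"s3://example-output/{code}/",
    )
    for code in ("US", "JP")
}
jobs["US"].record("run finished: 0 failed records")
```

Keeping one record per country mirrors the one-job-per-country arrangement described earlier, so a message logged for one job never appears in another job's logs.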
Transformation Rules
• Rules are written in the DJ code editor in the DJ language.
• DJ automatically creates an object model of the input file that is accessible
from the rules.
• SA uses the object model to access the data in XML elements and attributes.
• DJ provides several built-in functions to extract and transform input data
from elements and attributes.
• DJ also provides variables, if conditional statements, and loops.
• DJ also provides an output keyword and dot-notation syntax to generate the
required elements in the CXML or JSON output.
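The DJ language itself is not shown in these slides, so the following Python sketch only mirrors the ideas listed above: an object model over the input XML, access to elements and attributes, a conditional, a loop, and construction of JSON output. The sample XML, the element and attribute names, and the output keys are invented for illustration and do not come from any real PTO format or from DJ.

```python
import json
import xml.etree.ElementTree as ET

# Invented sample input; real PTO files differ per country.
SAMPLE_XML = """
<patent country="XX" doc-number="12345">
  <title lang="en">  Improved widget  </title>
  <applicants>
    <applicant seq="1">Acme Corp</applicant>
    <applicant seq="2">Widget Labs</applicant>
  </applicants>
</patent>
"""


def transform(xml_text: str) -> dict:
    """Map one patent record to a JSON-ready dict (hypothetical mapping)."""
    root = ET.fromstring(xml_text)  # object model of the input
    output = {}

    # Attribute access plus a simple transformation (concatenation).
    output["publicationId"] = root.get("country") + root.get("doc-number")

    # Element access with a conditional: emit the title only if present.
    title = root.find("title")
    if title is not None and title.text:
        output["title"] = title.text.strip()

    # Loop over repeated elements.
    output["applicants"] = []
    for applicant in root.findall("applicants/applicant"):
        output["applicants"].append(
            {"sequence": int(applicant.get("seq")), "name": applicant.text}
        )
    return output


result = transform(SAMPLE_XML)
print(json.dumps(result, indent=2))
```

In DJ the same mapping would presumably be written with the output keyword and dot notation instead of building a dictionary by hand.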
Logs
• Users can view the status of job executions in the History section.
• Users can also view the failed patent records and the corresponding
messages.
• The error messages in the log help the SA and DEV teams resolve
issues in either the rules or DJ itself.
Teams
• The following teams work together to complete the transformation process.
• Content Team
  • Creates XML-to-CXML or XML-to-JSON mappings.
  • Provides key information regarding the transformation rules.
  • Provides sample XML files received from the PTO.
  • Certifies the validity of the structure and contents of DJ output.
• DJ DEV Team
  • Develops the Data Jockey tool and its key features.
  • Resolves any issues reported by the SA or QA teams.
  • Makes enhancements to the DJ tool as suggested by the SA or QA teams.
• SA Team
  • Creates DJ jobs and writes transformation rules based on the mapping document.
  • Unit-tests the rules on sample files for runtime errors and incorrect mappings or transformations.
• QA Team
  • Tests the rules on a wider set of input files and reports issues to the SA / DEV teams.
Input File Structure
• Patent files are usually provided by the PTO as ZIP or TAR files.
• The ZIP/TAR file is stored at an S3 location.
• Each ZIP/TAR file may contain one or more input XML files and sub-folders.
• Each input XML file may contain one or more patent records.
• Each patent record is divided into Logical Units (Sections).
• Each logical unit contains data elements and attributes, which are
transformed into CXML or JSON.
• Each sub-folder may contain more folders and input XML files.
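The nesting described above (archive, optional sub-folders, XML files, patent records) can be walked with Python's standard library. This is a minimal sketch under assumptions: the archive is built in memory rather than fetched from S3, and the `<patents>` and `<patent>` element names and file paths are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_sample_archive() -> bytes:
    """Create a small in-memory ZIP with a top-level file and a sub-folder."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "batch1.xml",
            "<patents><patent id='1'/><patent id='2'/></patents>",
        )
        archive.writestr(
            "2024/week01/batch2.xml",
            "<patents><patent id='3'/></patents>",
        )
        archive.writestr("2024/readme.txt", "not an input file")
    return buffer.getvalue()


def count_records(archive_bytes: bytes) -> dict:
    """Return {XML path inside the archive: number of patent records}."""
    counts = {}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for name in archive.namelist():  # includes files in sub-folders
            if not name.lower().endswith(".xml"):
                continue
            root = ET.fromstring(archive.read(name))
            counts[name] = len(root.findall("patent"))
    return counts


record_counts = count_records(build_sample_archive())
print(record_counts)
```

TAR archives could be handled the same way with the standard `tarfile` module.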
Environments
• DJ is deployed in the following environments:
• DEV – Used by developers to test features and issues.
• INT – Used by SA/QA teams to create and test jobs.
• UAT – Used by SA/QA to test jobs on real-time input files.
• PROD – Live environment used by the prod team.
