Labeling Process Description
1. Target
● Rewrite and optimize the converted English script so that its plot is coherent.
● There is no need to check whether all the content of the original novel has been converted, as long as the script itself is contextually coherent.
● A single annotator should complete a whole novel, which also improves annotation efficiency.
● The number of data items converted from each novel ranges from 150 to 400.
2. Workflow
1. Read the converted script directly, i.e. the JSON version we provided.
2. Wherever the script does not meet the quality requirements listed below, rewrite it according to the original text.
3. Mark the rewritten areas in red.
4. Where an entire paragraph consists of explicit content, the row is marked as explicit (the explicit content column is set to 1). For such rows, in addition to the usual requirements, the plot and details must be fully converted following the original text.
5. When the value of the True_json column is 0, the row does not need any annotation; annotators can skip it.
6. When the value of the True_json column is 1, annotators need to paste the rewritten JSON into https://www.bejson.com/explore/index_new/ to validate the JSON format, then adjust the JSON according to the prompts until the format is completely correct.
7. Each book is an independent Excel file. Different Excel files cannot be merged. Put all the Excel files in the same folder and do not rename them.
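As a rough local pre-check for steps 5 and 6, the sketch below uses Python's standard json module. It does not replace the calibration on bejson.com. The helper names check_json_cell and needs_annotation are hypothetical, and the sketch assumes each cell holds one complete JSON value.

```python
import json


def needs_annotation(row):
    """Rows with True_json == 0 are skipped; rows with True_json == 1 are annotated."""
    return str(row["True_json"]).strip() == "1"


def check_json_cell(cell_text):
    """Return (is_valid, message) for the JSON text stored in one cell."""
    try:
        json.loads(cell_text)
    except json.JSONDecodeError as err:
        # lineno and colno point at the first place the parser gave up.
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, "ok"
```

A cell that fails this check would still need to be fixed by hand; the message only indicates where to start looking.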
3. Column naming instructions for the CSV file
a. index: the row id of the current CSV file (row order). Do not change this column.
b. topth: the book id. Do not change this column.
c. target_role: the first-person target role. Do not change this column.
d. src_text: the original text. Do not change this column.
e. covert_text: the converted script. Do not change this column.
f. covert_text_fix_column: a copy of covert_text. Annotators rewrite in this column; it is the only column that may be modified.
g. detail_flag: a value of 1 means the row includes explicit content and needs to be completely converted, corresponding to Workflow requirement 4. Do not change this column.
h. True_json: corresponds to Workflow requirements 5 and 6. A value of 1 means the JSON format of the row is right; a value of 0 means the JSON format of the row is wrong and the row does not need to be modified. Do not change this column.
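One way to guard against accidental edits to the locked columns is to compare each edited row with its original. This is a minimal sketch, assuming rows are available as dictionaries keyed by the column names above; the function name changed_locked_columns is hypothetical.

```python
# Column names are taken from the list above; only the fix column may change.
EDITABLE_COLUMN = "covert_text_fix_column"
LOCKED_COLUMNS = [
    "index", "topth", "target_role", "src_text",
    "covert_text", "detail_flag", "True_json",
]


def changed_locked_columns(original_row, edited_row):
    """Return the names of locked columns whose values differ between the two rows."""
    return [
        name for name in LOCKED_COLUMNS
        if original_row.get(name) != edited_row.get(name)
    ]
```

An empty result means the annotator touched nothing outside covert_text_fix_column in that row.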
2. The plot is complete and coherent
● Requirement: Compared with the original text, the converted content is complete and coherent in its ideas and has all the major plot points. For example, an unexplained change of scene or the sudden appearance of a character makes the plot incoherent, and such passages need to be rewritten. Pay particular attention to the junction between two JSON segments: the end of one and the beginning of the next need to be coherent.
● How to fix: Revise whatever does not make sense or is incoherent so that the plot is coherent. Based on the story in the original text, write and fill in the missing parts.
3. The plot matches the original text
● Requirement: The plot can be compressed, but only if it remains complete and coherent in accordance with standard 2. However, the retained plot needs to match the original text. For example, if the original text is "walking home", then the converted text cannot be "on the way to school".
● How to fix: Rewrite it correctly.
● Example issue: The first narration does not match the original text.
4. JSON format is correct
● Requirement: Formatting problems include but are not limited to:
○ Incorrect use of quotation marks: within “text”, there shouldn’t be additional quotation marks enclosing dialogue or speech.
○ Character names should not have quotation marks or {}.
● How to fix: Modify the format.
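The two formatting rules in standard 4 can be approximated with a simple heuristic. The sketch below assumes each script entry is a dictionary with "text" and an optional "role" key, as in the Q&A examples; format_problems is a hypothetical helper, and it only catches text that is wrapped in double quotation marks from start to end.

```python
# Double quotation marks (straight and curly) that should not wrap "text".
TEXT_OPENERS = '"“'
TEXT_CLOSERS = '"”'
# Characters that should not appear in a character name.
ROLE_FORBIDDEN = '"“”{}'


def format_problems(entry):
    """Return a list of formatting problems found in one script entry."""
    problems = []
    text = entry.get("text", "").strip()
    if len(text) >= 2 and text[0] in TEXT_OPENERS and text[-1] in TEXT_CLOSERS:
        problems.append("text is wrapped in extra quotation marks")
    role = entry.get("role", "")
    if any(ch in ROLE_FORBIDDEN for ch in role):
        problems.append("role contains quotation marks or braces")
    return problems
```

Quotation marks in the middle of a "text" value are not flagged, so a manual read-through is still needed.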
Examples of labeling
The level of detail in the conversion cannot be uniform: some parts are converted in detail, and some are summarized roughly. The standard for judgment is whether the script is sufficiently coherent. If a plot point or character appears suddenly in the converted script, find the lead-up in the original text and add it at the corresponding position. The following are examples of different levels of detail.
(Example table columns: Original text, Convert text, Clarification)
● Delivery requirements: each book is a single Excel file, and files cannot be merged. Please put all the Excel files in one folder and do not modify the file names.
● Acceptance requirements: each batch of delivery has only two chances of acceptance. After two consecutive failures, the results of the whole book will not be accepted: the book must be given up and not delivered, and it must be reassigned to another person for re-labeling.
● Quest Audit:
○ Quest Quality Assurance Team (“Quest QA”) will thoroughly spot-check the data submitted by your team based on the criteria defined in the instruction doc
○ Accuracy Rate: Quest QA will calculate the Accuracy Rate with the formula:
○ Tips:
■ We recommend submitting before the Initial Delivery Deadline so that you get Quest QA feedback earlier
■ The check is rigorous and we always catch errors. There are severe consequences for inadequate accuracy, so to make sure you receive full payment and can continue working, conduct a strict internal review and spot check before submitting
Payment by Accuracy Rate (A) and Error Rate (E):
● A ≥ 95%, E ≤ 5%
○ Payment Post QA Calculation: $/Data * Total Quantity Accepted
○ Data Acceptance Rule: Full data acceptance
○ Payment Rule: Full payment and proceed to the next batch of data.
● 90% ≤ A < 95%, 5% < E ≤ 10%
○ Payment Post QA Calculation: $/Data * Total Quantity Accepted * (1-2*E)
○ Data Acceptance Rule: Elect to: Option 1. Submit for data acceptance and settle payment with penalty
○ Payment Rule: Pay based on payment rule with penalty
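The two payment tiers listed above can be worked through with a short sketch. The function name payment_after_qa and its parameters are hypothetical, the error rate is passed in as a fraction (0.08 for 8%), and error rates above 10% raise an error because no rule for that range is given here.

```python
def payment_after_qa(price_per_item, accepted_quantity, error_rate):
    """Apply the payment tiers listed above to one batch."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be a fraction between 0 and 1")
    base = price_per_item * accepted_quantity
    if error_rate <= 0.05:
        # A >= 95%: full payment.
        return base
    if error_rate <= 0.10:
        # 90% <= A < 95%: the penalty factor is (1 - 2*E).
        return base * (1 - 2 * error_rate)
    raise ValueError("no payment rule is listed for error rates above 10%")
```

For example, with an assumed price of $2 per item, 100 accepted items, and an 8% error rate, the base amount of $200 is multiplied by (1 - 2 * 0.08) = 0.84, giving $168.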
Note: each batch of delivery has only two chances of acceptance. After two consecutive failures, the results of the whole book will not be accepted: the book must be given up and not delivered, it must be reassigned to another person for re-labeling, and there will be no payment!
Please read the instructions carefully and follow the requirements strictly for labeling and revision!
Here are some Q&As and comments that address points of confusion you may have:
● Q # 1:
○ Original text: “You can't stay here tonight.” I strode out of the room, snatching my Atlas on the way as I
headed for the door. I heard her walking after me and knew she deserved more of an explanation than a flat
refusal. “I need some space, alright? Vampire needs.” I glanced at her and she gave me a pointed look
before shrugging.
○ JSON text:
{
"type": "dialogue",
"role": "Me",
"text": "No, you can't stay here tonight. I need some space, alright? Vampire needs."
},
Comment: The original text contains narration and a description of an action in the middle, but the JSON ignores the action and splices the two separate sentences together. Is this acceptable? If not, what kind of tag should it be under?
Answer: Yes, this is acceptable.
Comment: But if the JSON contains a description of the action (which the dp does not), is that acceptable? If not, which tag should it be under?
Answer: Either narration or action is fine.
● Q # 2:
○ Original text: “Stupidity is dangerous around something like that. What if he starts cutting himself?” I
hissed, running a hand down the back of my neck.
○ JSON text:
{
"type": "dialogue",
"role": "Me",
"text": "What if someone starts cutting themselves with it? That could be disastrous."
},
Comment: If the JSON rephrases the original text, is this acceptable? If not, what kind of tag should it be under?
Answer: It is acceptable if the meaning is correct and the important plot points are not lost.
● Q # 3:
○ Original text: “Good. Your first assignment this week is to embrace your inner Fae. Don't take any shit lying
down. No one's judging you if you lose, but if you don't even try to win, you're not one of us. And if you can't
hack it, get out of my class.” He gestured to the door and one boy actually gathered up his things and
hurried out.
○ JSON text:
{
"type": "narration",
"text": "Orion emphasized the severity of the upcoming challenges and what it means to truly be Fae."
},
{
"type": "dialogue",
"role": "Orion",
"text": "Don't take any shit lying down. If you can't hack it, get out of my class."
},
Comment: The dialogue in the original text is not given in its entirety in the “dialogue” entry; some parts are skipped but summarized by the narration. Is this allowed?
Answer: This is correct.
● Q # 4:
○ JSON text:
{
"type": "dialogue",
"role": "Orion",
"action": "His voice echoes powerfully through the room.",
"text": "DO YOU THINK IT'S APPROPRIATE TO INTERRUPT MY LESSON WITH YOUR INDECISIVENESS!?"
}
Comment: Are descriptions of how a line is delivered (e.g., volume, timbre) acceptable as actions?
Answer: Yes, they are acceptable.
● Q # 5:
○ Original text: “You smell like mint.” Max looked up from his Atlas with a grin.
“Flea dip,” I said, mirroring his smile. “No more scratching. How's your rash?” I couldn't see any sign of it
now but as Max lifted up his shirt, a faint raised line of flesh was revealed across his stomach.
○ JSON text:
{
"type": "dialogue",
"role": "Max",
"action": "Max lifted his shirt to reveal a mark on his stomach.",
"text": "You smell like mint."
},
{
"type": "dialogue",
"role": "Me",
"text": "Flea dip. No more scratching. How's your rash?"
},
Comment: What should be done when an action by the same character happens later in the original text but is attached to that character's earlier dialogue after conversion?
Answer: Please correct it and rewrite.