
Annotation Rules - Rewrite

Labeling process description

1. Target
● Rewrite and optimize the converted English script so that the plot of the script is coherent.
● There is no need to check whether all the content of the original novel text has been converted, as long as the
script itself is contextually coherent.
● One annotator should complete a whole novel, which also improves annotation efficiency.
● The amount of data converted from each novel ranges from 150 to 400.

2. Workflow

It is better to know the general plot of the original text in advance.

1. Read the converted script directly, i.e. the JSON version we provided.
2. Wherever the script does not meet the quality requirements listed below, rewrite it according to the original
text.
3. Mark the rewritten areas in red.
4. Cases where the entire paragraph is explicit content are marked (the explicit content column is labeled 1). In
such cases, in addition to the usual requirements, the plot and details must be fully converted following the
original text.
5. When the value of the True_json column is 0, the row does not need any annotation and annotators can skip it
directly.
6. When the value of the True_json column is 1, annotators need to paste the rewritten JSON into
https://www.bejson.com/explore/index_new/ for JSON format validation and adjust the JSON according to the
prompts until the format is completely correct.
7. Each book is an independent Excel file. Different Excel files cannot be merged. Put all the Excel files in the same
folder and do not rename them.
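
The format check in step 6 can also be run locally before pasting the text into the online tool. The following is a minimal sketch, not part of the official workflow: it assumes each rewritten cell holds a single JSON document, and the function name check_json is hypothetical.

```python
import json


def check_json(text):
    """Return (True, "") if text parses as JSON, else (False, location and reason)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno point at the place where parsing failed
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, ""


good = '[{"type": "narration", "text": "I walked home."}]'
bad = '[{"type": "narration", "text": "I walked home."'  # stops abruptly

print(check_json(good))
print(check_json(bad))
```

A failed parse reports the line and column of the first problem, which is usually where a quotation mark or bracket needs fixing.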

3. Column naming instructions for the csv file

a. index: the row id (row order) in the current csv file; this column must not be changed
b. topth: book id; this column must not be changed
c. target_role: the first-person objective role; this column must not be changed
d. src_text: the original text; this column must not be changed
e. covert_text: the converted script; this column must not be changed
f. covert_text_fix_column: a copy of [Convert text]; annotators rewrite in this column, and it is the only column
that may be modified
g. detail_flag: a value of 1 means the row includes explicit content and needs to be completely converted (see
Workflow requirement 4); this column must not be changed
h. True_json: corresponds to Workflow requirements 5 and 6. A value of 1 means the JSON format of the row is
right; a value of 0 means the JSON format of the row is wrong and the row does not need to be modified; this
column must not be changed
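
Because only covert_text_fix_column may be edited, it can help to compare the delivered file against the original before submission. The sketch below is an illustration under assumptions: the data is available as CSV text with the column names listed above, and the helper names (read_rows, protected_changes, rows_to_annotate) are hypothetical.

```python
import csv
import io

# Column names as listed above; all of these must stay unchanged.
PROTECTED = ["index", "topth", "target_role", "src_text",
             "covert_text", "detail_flag", "True_json"]


def read_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def protected_changes(original, edited):
    """Return (row position, column) pairs where a protected column differs."""
    if len(original) != len(edited):
        raise ValueError("row count changed")
    changes = []
    for pos, (old, new) in enumerate(zip(original, edited)):
        for col in PROTECTED:
            if old.get(col) != new.get(col):
                changes.append((pos, col))
    return changes


def rows_to_annotate(rows):
    """Rows whose True_json is 1; rows with 0 are skipped per the workflow."""
    return [row for row in rows if row.get("True_json") == "1"]
```

An empty result from protected_changes means that, as far as this check can tell, only the editable column was touched.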

Labeling quality check items

Each check item below gives the tag number and standard (requirement), followed by its detailed description, a
non-compliant example where one is given, and the processing method.

1. Conversion is complete
Detailed description: The JSON text is completed normally, not abnormally interrupted in the middle.
Processing method: In this case, you can clearly see that the JSON stops abnormally in the middle, and you need to
continue writing the remaining content after it.

2. The plot is complete and coherent
Detailed description: Compared with the original text, the converted content is complete in meaning, coherent, and
has all the major plot points. For example, an unexplained change of scene or the sudden appearance of characters
makes the plot incoherent, and such places need to be rewritten. Pay particular attention to the junction between
two JSON texts: the end of one and the beginning of the next need to be coherent.
Processing method: Revise whatever does not make sense or is incoherent so that the plot is coherent; based on the
story in the original text, write and fill in the missing parts.

3. The plot matches the original text
Detailed description: The plot can be compressed, but only if it stays complete and coherent in accordance with
standard 2. However, the retained plot needs to match the original text. For example, if the original text is
"walking home", then the converted text cannot be "on the way to school".
Non-compliant example: The first narration does not match the original text.
Processing method: Rewrite it correctly.

4. JSON format is correct
Detailed description: Including but not limited to:
● Incorrect use of quotation marks: within “text”, there shouldn’t be additional quotation marks enclosing
dialogue or speech.
● Character names should not have quotation marks or {}.
The modified content in covert_text_fix_column must be in correct JSON format; make sure it can be parsed in
https://www.bejson.com/
Processing method: Modify the format.

5. Content is correct
Detailed description:
● Only two types are allowed: Narration and Dialogue. There shouldn’t be a third type.
● For “Narration”, the text should be pure narration. Mentioning the actions of characters is acceptable, but
their speech or dialogue shouldn't appear.
● For “Dialogue”, the “role”, “action”, and “text” fields respectively indicate the character's name, the
character's action, and what the character says.
○ “Action” is defined as actions related to the character but not their speech; it is optional.
Processing method: Modify the types.

6. The text corresponds accurately to the respective characters
Detailed description: All content from the various characters in the original text must be mapped to the right
characters in the JSON text.
Non-compliant example: Dialogue attributed to character A in the original text might be mistakenly attributed to
character B in the converted JSON text. Conversion errors are more likely when the original text doesn't explicitly
specify the speaker, particularly when a character speaks multiple sentences consecutively.
Processing method: Modify the character’s name.

7. Maintain the first-person perspective
Detailed description: For the narrator “I” (protagonist), his/her full name should only appear in the dialogues of
other characters; it should not be presented in “narration” or in the dialogue attributed to "I."
Non-compliant example: In the pilot data, Darcy Vega should be “I”.
Processing method: Modify the character’s name.

8. Uniformly use the same character names throughout the entire text
Detailed description: For the same character, the "role" field's character name should be consistent throughout the
entire text. For example, if Lucy Grey is sometimes referred to as Lucy and at other times as Miss Grey, it needs to
be unified as Lucy Grey.
Processing method: Modify the character’s name.

9. Explicit content must be fully converted (explicit content will be marked)
Detailed description: The original text sometimes contains explicit (sexual, violent, etc.) descriptions, and the
converted JSON text may have adjusted these descriptions to avoid explicit content. Slight modifications are
acceptable, but only to the extent that the result remains consistent with the overall plot, without significant
deletion of any information.
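
Parts of check items 4, 5, 7 and 8 are mechanical enough to pre-screen with a script before manual review. The following is a rough sketch under assumptions, not an official checker: it assumes each script entry has already been parsed into a dict with the fields shown in the examples ("type", "role", "action", "text"), compares types case-insensitively, and uses hypothetical names (check_entry, role_names, narrator_full_name).

```python
ALLOWED_TYPES = {"narration", "dialogue"}


def check_entry(entry, narrator_full_name=None):
    """Return a list of problems found in one script entry (a dict)."""
    kind = str(entry.get("type", "")).lower()
    if kind not in ALLOWED_TYPES:
        return [f"unknown type: {entry.get('type')!r}"]

    problems = []
    text = entry.get("text", "")
    if not text:
        problems.append("missing text")

    if kind == "narration":
        # Check 7: the narrator's full name should not appear in narration.
        if narrator_full_name and narrator_full_name in text:
            problems.append("narrator's full name appears in narration")
        return problems

    role = entry.get("role", "")
    if not role:
        problems.append("dialogue has no role")
    # Check 4: character names should not carry quotation marks or {}.
    if any(ch in role for ch in '{}"'):
        problems.append("role contains quotation marks or {}")
    # Check 4: the spoken text should not be wrapped in extra quotation marks.
    if text and text[0] in '"“' and text[-1] in '"”':
        problems.append("text is wrapped in extra quotation marks")
    # Check 7: the narrator should not be listed under their full name.
    if narrator_full_name and role == narrator_full_name:
        problems.append("narrator is listed under their full name")
    return problems


def role_names(entries):
    """Collect the distinct role names so variants can be unified (check 8)."""
    return sorted({e["role"] for e in entries if e.get("role")})
```

A script like this only flags candidates. Whether the plot is coherent and matches the original text (check items 2 and 3) still has to be judged by reading.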

Examples of labeling

The level of detail in the conversion is not uniform: some parts are converted in detail, and some parts are
summarized roughly. The judgment standard is whether the coherence is sufficient. If a plot point or character
appears suddenly in the converted script, find the preceding context in the original text and add it at the
corresponding position. The following are examples of different levels of detail.
Clarifications for each example (table columns: Original text, Convert text, Clarification):

1. The part marked yellow is correct. A large portion of the original text has been compressed into a single
sentence, but it doesn't affect the overall coherence or the plot advancement between the main characters, so it's
correct.
2. The part marked yellow is correct. It has been greatly compressed, but this does not affect the main plot or
coherence.
The part marked red needs to be rewritten. [I was picked up] has not been converted, so the direct appearance of
"as Ryle carries me away" can lead to plot incoherence.
The part in red needs to be added (it was not shown after the conversion).
3. All correct.
4. All correct.
5. The part marked red has been converted into the first narration. This is a good compression that removes many
plot points without affecting the coherence. However, it changes the meaning of the original text, which is not
"As I looked in the mirror".
6. Some internal thoughts are converted into dialogue, but they don't affect the coherence, and what is said as
character dialogue isn't abrupt, so it's correct.
7. Explicit content is identified by a 1 in the corresponding column. This part of the content needs to be converted
completely as in the original text; compression is not acceptable. If there is any inconsistency with the original
text, it needs to be directly added or modified.

Data & Payment Settlement Rules


Acceptance Criteria

● Delivery requirements: each book is a single Excel file and files cannot be merged. Please put all the Excel files
in one folder, and do not modify the Excel file names.

● Acceptance requirements: each batch of delivery has only two chances of acceptance. After two consecutive
failures, the results of the whole book will not be accepted: the book must be given up, counts as not delivered,
and must be re-labeled by another person.

● Quest Audit:
○ Quest Quality Assurance Team (“Quest QA”) will do a thorough spot-checking of the data submitted by
your team based on criteria defined in the instruction doc
○ Accuracy Rate: Quest QA will calculate the Accuracy Rate with the formula:

Accuracy Rate = # of accurate labels / total # of labels spot-checked

○ Tips:
■ You are recommended to submit before the Initial Delivery Deadline, so you can get Quest QA
feedback earlier
■ The checkup is rigorous and we always catch errors. There are severe consequences for inadequate
accuracy, so to make sure you receive full payment and can continue working, it's better to
conduct strict review and spot-checking internally
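
As a small worked example of the Accuracy Rate formula, the sketch below computes the rate for a spot-check. The figures (46 accurate labels out of 50 checked) are made up for illustration, and the function name accuracy_rate is hypothetical.

```python
def accuracy_rate(accurate_labels, labels_checked):
    """Accuracy Rate = accurate labels / total labels spot-checked."""
    if labels_checked <= 0:
        raise ValueError("at least one label must be spot-checked")
    return accurate_labels / labels_checked


rate = accuracy_rate(46, 50)  # illustrative numbers only
print(f"{rate:.0%}")  # prints 92%
```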

Payment & Penalty Summary


Each tier below lists the Evaluation and Payment Calculation fields (Accuracy Rate, Error Rate, Payment
Calculation) followed by the Data Acceptance and Payment Settlement fields (Post QA Data Acceptance Rule,
Payment Rule).

Accuracy Rate (A): A ≥ 95%
Error Rate (E): E ≤ 5%
Payment Calculation: $/Data * Total Quantity Accepted
Post QA Data Acceptance Rule: Full data acceptance
Payment Rule: Full payment and proceed to the next batch of data.

Accuracy Rate (A): 90% ≤ A < 95%
Error Rate (E): 5% < E ≤ 10%
Payment Calculation: $/Data * Total Quantity Accepted * (1-2*E)
Post QA Data Acceptance Rule: Elect to:
Option 1. Submit for data acceptance and settle payment with penalty
Option 2. Revise ONLY QA mistakes before the Revision Deadline to reach 95%+ without penalty
Payment Rule: Pay based on payment rule with penalty

Accuracy Rate (A): 85% ≤ A < 90%
Error Rate (E): 10% < E ≤ 15%
Payment Calculation: No acceptance/payment until the revised version passes [95]% accuracy
Post QA Data Acceptance Rule: Whole dataset revisions before the Revision Deadline to get to 90%+, then proceed
to payment and the next batch of data. Maximum number of revisions is 2.
Payment Rule: Complete revision until hitting 90%+, then proceed to payment. If accuracy is not achieved with
revisions before the deadline, then the team is disqualified and receives no payment.

Accuracy Rate (A): A < 85%
Error Rate (E): E > 15%
Payment Calculation: No payment
Post QA Data Acceptance Rule: No data acceptance
Payment Rule: No payment and disqualification from project; also a red flag on platform.

Note: each batch of delivery has only two chances of acceptance. After two consecutive failures, the results of the
whole book will not be accepted: the book must be given up, counts as not delivered, must be re-labeled by another
person, and receives no payment!
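
The payment tiers can be read as a simple function of the accuracy rate. The sketch below is one interpretation under stated assumptions, not the official settlement logic: it assumes Option 1 (accept the penalty) is chosen in the 90% to 95% tier, takes the error rate as E = 1 - A, and treats any accuracy below 90% as yielding no payment at that accuracy. The function and parameter names are hypothetical.

```python
def payment_due(rate_per_item, quantity_accepted, accuracy):
    """Payment under the tiers above, assuming Option 1 in the 90-95% tier."""
    error = 1.0 - accuracy
    full = rate_per_item * quantity_accepted
    if accuracy >= 0.95:
        # Full data acceptance and full payment.
        return full
    if accuracy >= 0.90:
        # Penalty tier: twice the error rate is deducted.
        return full * (1 - 2 * error)
    # Below 90%: nothing is payable at this accuracy.
    return 0.0


# Illustrative figures only: $0.50 per item, 200 items accepted.
for acc in (0.97, 0.92, 0.88):
    print(acc, round(payment_due(0.5, 200, acc), 2))
```

With these made-up figures, 92% accuracy gives an 8% error rate, so 16% of the full amount is deducted.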

Please read the instructions carefully and follow the requirements strictly for labeling and revision!

Here are some Q&A items and comments that address points you may find confusing:

● Q # 1:
○ Original text: “You can't stay here tonight.” I strode out of the room, snatching my Atlas on the way as I
headed for the door. I heard her walking after me and knew she deserved more of an explanation than a flat
refusal. “I need some space, alright? Vampire needs.” I glanced at her and she gave me a pointed look
before shrugging.
○ JSON Text:
{
"type": "dialogue",
"role": "Me",
"text": "No, you can't stay here tonight. I need some space, alright? Vampire needs."
},
Comment: The original text contains narration and a description of an action in the middle, but the JSON
ignores the action and splices the two separate sentences together. Is this acceptable? If not, what kind of
tag should it be under?
Answer: Yes, this is acceptable.

Comment: But if the JSON contains a description of the action (which the dp does not), is that acceptable? If not,
which tag should it be under?
Answer: Either narration or action is fine.

● Q # 2:
○ Original text: “Stupidity is dangerous around something like that. What if he starts cutting himself?” I
hissed, running a hand down the back of my neck.
○ JSON text:
{
"type": "dialogue",
"role": "Me",
"text": "What if someone starts cutting themselves with it? That could be disastrous."
},
Comment: If the JSON rephrases the original text, is this acceptable? If not, what kind of tag should it
be under?
Answer: It’s acceptable if the meaning is correct and the important plot points are not lost.

● Q # 3:
○ Original text: “Good. Your first assignment this week is to embrace your inner Fae. Don't take any shit lying
down. No one's judging you if you lose, but if you don't even try to win, you're not one of us. And if you can't
hack it, get out of my class.” He gestured to the door and one boy actually gathered up his things and
hurried out.
○ JSON text:
{
"type": "narration",
"text": "Orion emphasized the severity of the upcoming challenges and what it means to truly be Fae."
},
},
{
"type": "dialogue",
"role": "Orion",
"text": "Don't take any shit lying down. If you can't hack it, get out of my class."
},
Comment: The dialogue in the original text is not reproduced in its entirety in the “dialogue”; some parts are
skipped but summarized by the narration. Is this allowed?
Answer: This is correct.

● Q # 4:
○ JSON text:
{
"type": "dialogue",
"role": "Orion",
"action": "His voice echoes powerfully through the room.",
"text": "DO YOU THINK IT'S APPROPRIATE TO INTERRUPT MY LESSON WITH YOUR INDECISIVENESS!?"
}
Comment: Are descriptions of how a line is spoken (e.g., volume, timbre, etc.) acceptable as actions?
Answer: Yes, acceptable.

● Q # 5:
○ Original text: “You smell like mint.” Max looked up from his Atlas with a grin.
“Flea dip,” I said, mirroring his smile. “No more scratching. How's your rash?” I couldn't see any sign of it
now but as Max lifted up his shirt, a faint raised line of flesh was revealed across his stomach.
○ JSON text:
{
"type": "dialogue",
"role": "Max",
"action": "Max lifted his shirt to reveal a mark on his stomach.",
"text": "You smell like mint."
},
{
"type": "dialogue",
"role": "Me",
"text": "Flea dip. No more scratching. How's your rash?"
},

Comment: What should be done when an action that the same person performs at a later line of dialogue is
attached to the earlier dialogue after conversion?
Answer: Please correct it and rewrite.
