Comparative Analysis of Actual Language Usage and Selected Grammar and Orthographical Rules in PH Languages
Roger Stone and Neri Zamora, "Designing an Alphabet for an Unwritten Language," in 1st MLE Conference,
“Reclaiming the Right to Learn in One's Own Language”, Capitol University, Cagayan de Oro, Feb 18-20, 2010.
Spelling Variants: Potential areas of confusion for the Philippine Languages
Compound words (e.g. bahaykubo vs. bahay-kubo)
Morphophonemics – the variation due to the collision of affixes
Assimilation – <d> vs <r> (e.g. nandito vs. narito)
Vowels that are dropped from words during affixation
e.g. maibibili vs. mabibili
The use of the 8 new letters (c f j ñ q v x z)
e.g. taxi vs. taksi
Corpus-Based Analysis of a Language
[Flowchart. Components shown: Start; Collection of Sentences and Phrases; Text Corpus of a given language; Spelling Variant Groups Extractor; Spelling Transformation Rules; Grammar and Orthography Books of a given language; Spelling and Grammar Rules; Spelling Variant Counter; Spelling Variant Group Frequency Counts; End]
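The counting stage named in the flowchart (Spelling Variant Counter producing Spelling Variant Group Frequency Counts) can be illustrated with a minimal Python sketch. The tokenizer pattern, the function name `count_variant_groups`, and the three toy sentences are all assumptions made for illustration; the slides do not describe how the actual counter was implemented.

```python
import re
from collections import Counter

# Assumption: a token is a run of letters, optionally joined by single dashes.
TOKEN = re.compile(r"[^\W\d_]+(?:-[^\W\d_]+)*")


def count_variant_groups(sentences, variant_groups):
    """Count how often each member of each variant group occurs in the corpus."""
    counts = Counter()
    for sentence in sentences:
        # Lower-case tokens so that sentence-initial capitals do not split counts.
        counts.update(token.lower() for token in TOKEN.findall(sentence))
    # A Counter returns 0 for words that never occur.
    return [{variant: counts[variant] for variant in group}
            for group in variant_groups]


# Toy corpus, made up for this example only.
sentences = [
    "Anu-ano ang kuwento mo?",
    "Ano-ano ang kwento niya?",
    "Anu-ano pa?",
]
groups = [("ano-ano", "anu-ano"), ("kuwento", "kwento")]
group_counts = count_variant_groups(sentences, groups)
```

Each dictionary in `group_counts` holds the frequency of every variant in one group, which is the kind of figure reported in parentheses in the results table.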
Corpus Collection
Corpus Miner Software
Filipino/Tagalog
Bantay-Wika project
Cebuano
Sun Star news website (www.sunstar.com.ph)
Ilokano
Tawid News Magasin website (www.tawidnewsmag.com)
Lexicon Pruning
Running lexicons (word lists) were built for each language considered.
Solution: retain only entries made up of letters and the dash ‘-’ symbol.
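The pruning rule above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, "letters" is assumed to mean any Unicode alphabetic character (so that ñ is kept), and the extra requirement of at least one letter is an assumption added to discard a bare dash.

```python
def is_valid_entry(entry):
    """True if the entry consists only of letters and the dash '-' symbol."""
    only_allowed = all(ch.isalpha() or ch == "-" for ch in entry)
    # Assumption: an entry must contain at least one letter.
    has_letter = any(ch.isalpha() for ch in entry)
    return only_allowed and has_letter


def prune_lexicon(entries):
    """Keep only the entries that pass the letters-and-dash filter."""
    return [entry for entry in entries if is_valid_entry(entry)]


raw = ["bahay-kubo", "taxi", "2010", "www.sunstar.com.ph", "anu-ano", "-", "señor"]
pruned = prune_lexicon(raw)
```

Entries containing digits, dots, or other symbols are dropped, while hyphenated forms such as bahay-kubo are retained.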
Spelling Variant Extraction: Levenshtein Edit Operations*

Edit operation    Example
Substitution      anu-ano → ano-ano (tgl)
                  nagmaniho → nagmaneho (ceb)
                  kataltalonan → kataltalunan (ilk)
Deletion          hinantay → inantay (tgl)
                  makaaguwanta → makaagwanta (ceb)
                  pammutbuteng → pamutbuteng (ilk)
Insertion         kolehiala → kolehiyala (tgl)
                  nakakuhag → nakakuhaag (ceb)
                  nagdadakkel → nag-dadakkel (ilk)
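To make the three edit operations concrete, the following Python sketch classifies a pair of words that differ by exactly one Levenshtein edit. The function name `single_edit` and its return format are assumptions made for illustration, and the slides do not state how the original extractor was implemented or which distance threshold it used.

```python
def single_edit(source, target):
    """Classify the one-character edit that turns source into target.

    Returns (operation, removed_char, added_char), or None when the two
    words are identical or differ by more than one edit.
    """
    if source == target or abs(len(source) - len(target)) > 1:
        return None
    # Skip the common prefix; i ends at the first differing position.
    i = 0
    while i < min(len(source), len(target)) and source[i] == target[i]:
        i += 1
    if len(source) == len(target):
        # Same length: one character was replaced, the rest must match.
        if source[i + 1:] == target[i + 1:]:
            return ("substitution", source[i], target[i])
    elif len(source) == len(target) + 1:
        # Source is longer: one character was removed from it.
        if source[i + 1:] == target[i:]:
            return ("deletion", source[i], "")
    else:
        # Target is longer: one character was added.
        if source[i:] == target[i + 1:]:
            return ("insertion", "", target[i])
    return None


examples = [
    ("anu-ano", "ano-ano"),
    ("hinantay", "inantay"),
    ("nagdadakkel", "nag-dadakkel"),
]
operations = [single_edit(a, b) for a, b in examples]
```

Applied to the table's examples, the function reports a substitution of u by o, a deletion of h, and an insertion of the dash.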
Automatic extraction of Spelling Variants
Reference rule books and grammar sketches
Filipino / Tagalog
• Komisyon sa Wikang Filipino. (2001). Alfabeto at Patnubay sa Ispelling ng Wikang Filipino.
Manila.
• UP Sentro ng Wikang Filipino (2008). Gabay sa Ispeling. UP Diliman.
Cebuano
• Michael Tanangkingsing, A functional reference grammar of Cebuano: from a discourse
perspective, Volume 1. LAP Lambert Academic Publishing, 2011.
• E.S. Godin (2007). Mga Batakan sa Panitik sa Binisaya-Sinugbuanon (Rules on Cebuano-
Visayan Spelling). MSU-IIT, Iligan, Philippines.
Ilokano
• Noemi U. Rosal, Pagbasa at Pagsulat sa mga Wika ng Pilipinas (Ilokano). Agsursurotayo nga ag
Ilokano. Quezon City: Sentro ng Wikang Filipino - Diliman Unibersidad ng Pilipinas, 2011.
• S.E. Benosa (2011). An Ilocano Orthography for MTB-MLE. Unpublished Term Project. UP –
Diliman
Results and Analysis
Rule (rulebook-suggested variant)
    Variant 1 (count)        Variant 2 (count)         Agreement with rulebook

<o> vs <u> in reduplication: retain the <o> for the repeated stem (variant 1)
    ano-ano (222)            anu-ano (1254)            X
    sino-sino (94)           sinu-sino (298)           X
    halo-halo (46)           halu-halo (19)            √
    salo-salo (24)           salu-salo (112)           X

<o> vs <u>: the conjugated form of a stem originally ending with <o> is transformed to <u> (variant 2)
    gulohin (0)              guluhin (51)              √
    Tugtogin (0)             Tugtugin (52)             √

<uw> vs <w>: prefer the use of <uw> (variant 1) for academic and professional use
    kuwento (1949)           kwento (850)              √
    eskuwela (56)            eskwela (99)              X
    tuwalya (79)             twalya (6)                √
    kuwalipikasyon (9)       kwalipikasyon (23)        X

Cases covered by rule books
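The agreement column can be reproduced from the counts with a short Python sketch. It assumes that a row agrees with the rulebook when the more frequent variant in the corpus is the one the rule suggests; that reading matches every row in the table, but the slides do not state the criterion explicitly, and treating a tie as disagreement is an added assumption. The name `agrees_with_rulebook` is hypothetical.

```python
# (variant 1, count 1, variant 2, count 2, rulebook-suggested variant number)
ROWS = [
    ("ano-ano", 222, "anu-ano", 1254, 1),
    ("sino-sino", 94, "sinu-sino", 298, 1),
    ("halo-halo", 46, "halu-halo", 19, 1),
    ("salo-salo", 24, "salu-salo", 112, 1),
    ("gulohin", 0, "guluhin", 51, 2),
    ("Tugtogin", 0, "Tugtugin", 52, 2),
    ("kuwento", 1949, "kwento", 850, 1),
    ("eskuwela", 56, "eskwela", 99, 1),
    ("tuwalya", 79, "twalya", 6, 1),
    ("kuwalipikasyon", 9, "kwalipikasyon", 23, 1),
]


def agrees_with_rulebook(count1, count2, suggested):
    """True if the corpus-dominant variant is the rulebook-suggested one."""
    if count1 == count2:
        # Assumption: a tie counts as no agreement (no ties occur in the table).
        return False
    dominant = 1 if count1 > count2 else 2
    return dominant == suggested


results = [agrees_with_rulebook(c1, c2, s) for _, c1, _, c2, s in ROWS]
agreeing = sum(results)
```

On the ten rows listed, this marks five as agreeing with the rulebook, in line with the √ and X marks in the table.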