
BARTPHO: PRE-TRAINED SEQUENCE-TO-SEQUENCE MODELS FOR VIETNAMESE

NGUYEN LUONG TRAN, DUONG MINH LE AND DAT QUOC NGUYEN (VINAI RESEARCH, VIETNAM)

INTRODUCTION

Problems

• Pre-trained sequence-to-sequence models (e.g. BART, T5) obtain SOTA performance on generative NLP tasks.
• For Vietnamese, only multilingual pre-trained seq2seq models (mBART [2], mT5, ...) have been used; there is no existing public monolingual seq2seq model for Vietnamese.
• Monolingual models are preferable, as dedicated language-specific models still outperform multilingual ones. Moreover, in Vietnamese, white space is used not only to mark word boundaries but also to separate the syllables that constitute words, for example:
  – Syllable-level text: “Chúng tôi là những nghiên cứu viên” (“We are researchers”)
  – Word-segmented text: “Chúng_tôi (We) là (are) những nghiên_cứu_viên (researchers)”

Contributions

1. Presenting BARTpho with two versions, BARTpho_syllable and BARTpho_word, the first large-scale monolingual seq2seq models pre-trained for Vietnamese.
2. Showing the effectiveness of BARTpho in comparison with mBART on Vietnamese downstream tasks: text summarization, capitalization and punctuation restoration.
3. Publicly releasing our models at: https://github.com/VinAIResearch/BARTpho

SUMMARIZATION TASK

We formulate the summarization task as a monolingual translation problem and fine-tune our BARTpho and the baseline mBART on the Vietnamese single-document summarization dataset VNDS [4]. We find that this dataset contains duplicate articles, so we filter the duplicates and conduct experiments on both the original and the filtered datasets.

Model                 Filtered validation set      Filtered test set
                      R-1     R-2     R-L          R-1     R-2     R-L     Human
mBART                 60.06   28.69   38.85        60.03   28.51   38.74   21/100
BARTpho_syllable      60.29   29.07   39.02        60.41   29.20   39.22   37/100
BARTpho_word          60.55   29.89   39.73        60.51   29.65   39.75   42/100

Model                 Original validation set      Original test set
                      R-1     R-2     R-L          R-1     R-2     R-L
fastAbs [⋆]           _       _       _            54.52   23.01   37.64
viBERT2viBERT [∗]     _       _       _            59.75   27.29   36.79
PhoBERT2PhoBERT [∗]   _       _       _            60.37   29.12   39.44
mT5 [∗]               _       _       _            58.05   26.76   37.38
mBART                 60.39   29.19   39.18        60.35   29.13   39.21
BARTpho_syllable      60.89   29.98   39.59        60.88   29.90   39.64
BARTpho_word          61.10   30.34   40.05        61.14   30.31   40.15

[∗] and [⋆] denote the best-performing models among those experimented with in previous works [3, 4].
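Under this monolingual-translation formulation, producing a summary at inference time is a single encode-and-generate call. The sketch below uses the Hugging Face transformers library; the checkpoint id vinai/bartpho-word, the generation hyper-parameters, and the assumption that the checkpoint has already been fine-tuned on VNDS (article -> abstract) pairs are all illustrative, not values reported on this poster. Note that BARTpho_word expects word-segmented input (e.g. from a Vietnamese word segmenter such as RDRSegmenter), while BARTpho_syllable takes raw syllable-level text.

```python
# Minimal sketch: Vietnamese abstractive summarization as monolingual translation.
# Checkpoint id and generation settings are illustrative assumptions; in practice
# the model would first be fine-tuned on VNDS (article -> abstract) pairs.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_ID = "vinai/bartpho-word"  # assumed Hugging Face Hub id of the released model

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

# Word-segmented input article (BARTpho_word operates on word-segmented text).
article = "Chúng_tôi là những nghiên_cứu_viên ..."

inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=1024)
summary_ids = model.generate(
    **inputs,
    num_beams=4,        # beam search, a common choice for summarization
    max_length=256,     # assumed cap on summary length
    early_stopping=True,
)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])
```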

CAPITALIZATION AND PUNCTUATION RESTORATION TASK

We follow the sequence-to-sequence approach to evaluate and compare our BARTpho and mBART on the Vietnamese capitalization and punctuation restoration tasks. The dataset used in this experiment was generated automatically from the TED-2020 v1 dataset.

Model                 Capitalization      Punctuation restoration
                                          Comma     Period    Question    Overall
mBART                 91.28               67.26     92.19     85.71       78.71
BARTpho_syllable      91.98               67.95     91.79     88.15       79.09
BARTpho_word          92.41               68.39     92.05     87.82       79.29

BARTPHO PRETRAINING

• Based on BART [1], including two pre-training steps (see the toy sketch below):
  – Corrupting the input text with a noising function
  – Training a seq2seq model to reconstruct the original input text
• Pre-training data: 20GB of Vietnamese texts
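To make the corrupt-and-reconstruct objective concrete, the toy sketch below masks random spans of a Vietnamese sentence in the spirit of BART-style text infilling; the masking rate, span lengths and mask symbol are illustrative assumptions rather than BARTpho's actual noising setup.

```python
import random

# Toy BART-style corruption: replace short random token spans with one <mask> token.
# Masking rate, span lengths and the mask symbol are illustrative assumptions only.
MASK = "<mask>"

def corrupt(tokens, mask_prob=0.3, max_span=3, seed=0):
    rng = random.Random(seed)
    out, i = [], 0
    while i < len(tokens):
        if rng.random() < mask_prob:
            out.append(MASK)               # a whole span is hidden behind a single mask
            i += rng.randint(1, max_span)  # skip 1..max_span original tokens
        else:
            out.append(tokens[i])
            i += 1
    return out

original = "Chúng tôi là những nghiên cứu viên".split()
noisy = corrupt(original)

# Denoising seq2seq training pair:
#   encoder input : the corrupted text  -> " ".join(noisy)
#   decoder target: the original text   -> " ".join(original)
print(" ".join(noisy), "=>", " ".join(original))
```

The pre-trained model is then fine-tuned on downstream seq2seq tasks such as the two reported above.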

REFERENCES

[1] M. Lewis et al. “BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension”. In: ACL. 2020.
[2] Y. Liu et al. “Multilingual Denoising Pre-training for Neural Machine Translation”. In: Transactions of the ACL 8 (2020).
[3] H. Nguyen et al. “VieSum: How Robust Are Transformer-based Models on Vietnamese Summarization?” In: arXiv preprint arXiv:2110.04257v1 (2021).
[4] V.-H. Nguyen et al. “VNDS: A Vietnamese Dataset for Summarization”. In: NICS. 2019.

BARTPHO ARCHITECTURE

• Using the standard sequence-to-sequence Transformer architecture and employing the GeLU activation function
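For a concrete picture of a standard sequence-to-sequence Transformer with GeLU activations, the sketch below instantiates a randomly initialised BART-style encoder-decoder with the transformers library; every hyper-parameter shown (vocabulary size, hidden size, layer and head counts, FFN width) is an illustrative assumption in the spirit of BART [1], not the released BARTpho configuration.

```python
# Minimal sketch of a standard seq2seq Transformer with GeLU activations, built
# with Hugging Face transformers. All sizes below are illustrative assumptions,
# not the released BARTpho configuration.
from transformers import BartConfig, BartForConditionalGeneration

config = BartConfig(
    vocab_size=40000,            # assumed; determined by the syllable/word vocabulary
    d_model=1024,                # hidden size (assumed)
    encoder_layers=12,
    decoder_layers=12,
    encoder_attention_heads=16,
    decoder_attention_heads=16,
    encoder_ffn_dim=4096,
    decoder_ffn_dim=4096,
    activation_function="gelu",  # GeLU activations, as stated above
    max_position_embeddings=1024,
)

model = BartForConditionalGeneration(config)   # randomly initialised encoder-decoder
print(f"~{model.num_parameters() / 1e6:.0f}M parameters")
```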
