
Comparing Selective Masking Methods for Depression Detection in Social Media

Identifying people at risk for depression is a crucial task, and social media
provides an excellent platform for examining the linguistic patterns of depressed
individuals. A significant challenge in depression classification is ensuring
that the prediction model does not depend so heavily on keywords that it fails
when those keywords are absent. One promising approach is selective masking: by
masking important words and asking the model to predict them, the model is
forced to learn the context rather than the keywords. This study
evaluates seven masking techniques, including random masking, log-odds ratio, and
attention scores. It also examines whether the masked words should be predicted
during the pre-training phase or the fine-tuning phase. Finally, six class imbalance
ratios are compared to assess the robustness of the masking selection methods. Key
findings show that selective masking generally outperforms random
masking in terms of classification accuracy, and the most accurate and robust
models are identified. The results also indicate that reconstructing the masked
words during the pre-training phase is more advantageous than doing so during the
fine-tuning phase. Further discussion and implications are provided. This is the
first study to comprehensively compare masking selection methods, and it has broad
implications for depression classification and for NLP in general.

 Reddit Self-reported Depression Diagnosis (RSDD) dataset and Time-RSDD
dataset (

The datasets should be loaded into the OP_datasets folder

Training Approaches
 BERT further pre-train + fine-tune and FURTHER-02- (adapted from
KE-MLM and
 BERT fine-tune with reconstruction objective (adapted
 Standard BERT fine-tune

Selective Masking Methods

1. Random masking random
2. Depression Lexicon deplex (lexicon.txt
3. Log-odds-ratio logodds (from
4. TF-IDF tfidf (adapted from
5. Sum attention sumatt (adapted from
6. Top attention prop
7. Neural Network NN (adapted
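
To illustrate the difference between random masking and keyword-driven selective masking, here is a minimal sketch in plain Python. It is not the code in src: the function names `selective_mask` and `random_mask`, the `max_ratio` parameter, and the whitespace tokenization are assumptions made for this example only. The keyword set stands in for whatever a selection method (lexicon, log-odds ratio, TF-IDF, attention, and so on) produces.

```python
import random

MASK_TOKEN = "[MASK]"  # BERT's mask token


def selective_mask(tokens, keywords, max_ratio=0.15):
    """Mask tokens found in `keywords`, up to `max_ratio` of the sequence.

    Returns (masked_tokens, labels); labels hold the original token at
    masked positions and None elsewhere, i.e. what a model would be asked
    to reconstruct. Hypothetical helper, not the repository's API.
    """
    budget = max(1, int(len(tokens) * max_ratio))
    masked, labels = [], []
    for tok in tokens:
        if budget > 0 and tok.lower() in keywords:
            masked.append(MASK_TOKEN)
            labels.append(tok)
            budget -= 1
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


def random_mask(tokens, ratio=0.15, seed=0):
    """Baseline: mask a uniformly random subset of positions."""
    rng = random.Random(seed)
    n = min(len(tokens), max(1, int(len(tokens) * ratio)))
    positions = set(rng.sample(range(len(tokens)), n))
    masked = [MASK_TOKEN if i in positions else t for i, t in enumerate(tokens)]
    labels = [t if i in positions else None for i, t in enumerate(tokens)]
    return masked, labels


tokens = "i feel hopeless and tired all the time".split()
keywords = {"hopeless", "tired"}  # illustrative keywords only
masked, labels = selective_mask(tokens, keywords, max_ratio=0.3)
print(masked)
```

With the selective variant, the model has to recover the masked keywords from the surrounding context, whereas the random baseline will mostly hide uninformative words.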

get_datasets contains Python script and .ipynb files for extracting, preprocessing, and
creating the dataset objects for training
keyword contains .ipynb files for obtaining the keywords and the resulting keywords
in .txt format
src contains the source code for creating a masked dataset and for training and evaluation
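
As a rough idea of how a keyword list could be obtained with a log-odds-ratio score, the sketch below ranks words by a smoothed log-odds ratio between two groups of posts. It is a simplified illustration, not the notebook code in keyword: the function name `log_odds_keywords`, the add-`alpha` smoothing, and the whitespace tokenization are assumptions, and the formulation used for the logodds method in this repository may differ.

```python
import math
from collections import Counter


def log_odds_keywords(pos_docs, neg_docs, top_k=10, alpha=1.0):
    """Rank words by smoothed log-odds ratio (positive vs. negative group).

    Hypothetical helper for illustration. Assumes the combined vocabulary
    has at least two distinct words, so the odds are always finite.
    """
    pos = Counter(w for d in pos_docs for w in d.lower().split())
    neg = Counter(w for d in neg_docs for w in d.lower().split())
    vocab = set(pos) | set(neg)
    n_pos, n_neg = sum(pos.values()), sum(neg.values())
    v = len(vocab)

    scores = {}
    for w in vocab:
        # Add-alpha smoothed word probabilities in each group
        p = (pos[w] + alpha) / (n_pos + alpha * v)
        q = (neg[w] + alpha) / (n_neg + alpha * v)
        # Difference of log-odds: large when w is typical of pos_docs
        scores[w] = math.log(p / (1 - p)) - math.log(q / (1 - q))
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy posts, invented for this example only
depressed_posts = ["i feel hopeless", "hopeless and tired again"]
control_posts = ["i feel great today", "great game again"]
print(log_odds_keywords(depressed_posts, control_posts, top_k=3))
```

The top-ranked words can then be saved one per line in a .txt file and used as the keyword set for selective masking.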
