This research examines the effects of mobile learning in education.
Based on the newly developed CREED (Checklist for the Rigor of Education-Experiment Designs), which focuses on the internal, construct, and statistical-conclusion validity of research, this study investigated the experimental research designs of mobile-learning studies.

- Experimental design types

This research followed the classic categorization of Campbell and Stanley (1966), dividing experimental designs into (1) pre-experimental designs, which have neither a control group nor a random assignment procedure; (2) quasi-experimental designs, which have a control (contrast) group but no random assignment procedure; and (3) true-experimental designs, which have both a control (contrast) group and a random assignment procedure.

- Methods for determining baseline equivalence

Both true-experimental and quasi-experimental designs may attempt to ensure the baseline equivalence of participants before the intervention. The commonly used methods include the following (Fraenkel, Wallen, & Hyun, 2011):

1. Random assignment without a pretest.
2. Random assignment with a pretest and equating methods.
3. Quasi-experiments with no equating measures, in which the experiment either did not include a pretest or included one but made no adjustments based on it.
4. Quasi-experiments in which a t-test applied to the pretest scores confirmed that there was no significant difference between the two groups.
5. Quasi-experiments that employed analysis of covariance (ANCOVA) as a statistical control.
6. Quasi-experiments that used the gain score as the dependent variable: when comparing the experimental and control groups, the posttest score was replaced by the gain score (posttest score minus pretest score) to adjust for the lack of equivalence at the pretest.
7. Quasi-experiments with a counterbalanced design. Because this design uses the same group of participants in both settings, it avoids the problem of non-equivalence at the pretest.
8.
Quasi-experiments with a factorial design. Confounding variables that could affect the experimental outcome were listed as independent variables, controlled, and examined by the researchers, for example by using student ability as an independent variable of the experimental manipulation and exploring the interaction between the experimental effect and students' ability levels.

- Reliability and validity of measurement tools

This research adopted the minimum standards published by the WWC (2014), categorizing a study as exhibiting internal consistency when it met a Cronbach's α of 0.5 or higher, a test–retest reliability of 0.4 or higher, or an interrater reliability of 0.5 or higher; otherwise the study was categorized as not meeting the minimum standards. Regarding validity, this research likewise adopted the WWC (2014) minimum standards, categorizing the outcome measure of a study as meeting them when it (1) was clearly defined and had a clear interpretation, and (2) measured the construct it was intended to measure; otherwise the study was categorized as not meeting the minimum standards.

- Statistical power

This study used the G*Power 3 statistical software (Faul et al., 2007) to calculate the statistical power of each study based on its coded sample size, statistical methods, and effect sizes. Cohen (1988) proposed that studies should be designed to have an 80% probability of detecting an effect when there is one to be detected, and a statistical power of less than 0.5 means that a claim that an intervention effect is significant or non-significant is little better than guessing; we therefore categorized the statistical power of the reviewed studies into three levels: ≤0.50, 0.51–0.79, and ≥0.80.
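Two of the baseline-equivalence methods listed above, the pretest t-test (method 4) and the gain-score comparison (method 6), can be sketched in a few lines of Python. This is an illustrative example only: the pretest and posttest scores are simulated, and the group sizes and score distributions are invented for the sketch.

```python
# Sketch of baseline-equivalence checks on simulated (not real) data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
pre_exp = rng.normal(70, 10, 30)   # hypothetical pretest scores, experimental group
pre_ctl = rng.normal(70, 10, 30)   # hypothetical pretest scores, control group

# Method 4: t-test on pretest scores; a non-significant p-value is taken
# as evidence that the two groups were equivalent at baseline.
t, p = stats.ttest_ind(pre_exp, pre_ctl)
print(f"pretest t-test: t = {t:.2f}, p = {p:.3f}")

# Method 6: use the gain score (posttest minus pretest) as the dependent
# variable, adjusting for any non-equivalence at the pretest.
post_exp = pre_exp + rng.normal(8, 5, 30)   # hypothetical posttest scores
post_ctl = pre_ctl + rng.normal(3, 5, 30)
t, p = stats.ttest_ind(post_exp - pre_exp, post_ctl - pre_ctl)
print(f"gain-score t-test: t = {t:.2f}, p = {p:.3f}")
```

Note that the pretest t-test only fails to reject equivalence; it does not prove it, which is one reason the checklist distinguishes among these methods.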
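Method 5, ANCOVA as a statistical control, amounts to regressing the posttest score on group membership while covarying out the pretest score. The following is a minimal sketch using statsmodels' formula interface; the data frame, its column names (`pre`, `post`, `group`), and all values are hypothetical.

```python
# Hypothetical ANCOVA sketch: posttest regressed on group, controlling
# for the pretest as a covariate. All data are simulated for illustration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "group": ["exp"] * 30 + ["ctl"] * 30,
    "pre": rng.normal(70, 10, 60),
})
# Simulate posttests with a larger hypothetical gain for the experimental group.
df["post"] = df["pre"] + np.where(df["group"] == "exp", 8, 3) + rng.normal(0, 5, 60)

# OLS with the pretest as covariate; the C(group) coefficient is the
# group effect adjusted for baseline differences.
model = smf.ols("post ~ pre + C(group)", data=df).fit()
print(model.params)
```

The adjusted group coefficient, rather than the raw posttest difference, is what such quasi-experiments report as the intervention effect.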
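The power categorization described above can be reproduced programmatically. The sketch below uses statsmodels' `TTestIndPower` as a stand-in for G*Power 3 (a GUI tool), assuming an independent-samples t-test design; the effect sizes and sample sizes are illustrative, not values coded from the reviewed studies.

```python
# Sketch of the three-level power categorization, assuming an
# independent-samples t-test; TTestIndPower substitutes for G*Power 3.
from statsmodels.stats.power import TTestIndPower

def power_level(effect_size: float, n_per_group: int, alpha: float = 0.05) -> str:
    """Bin a study's statistical power into the three levels used in the review."""
    power = TTestIndPower().power(effect_size, nobs1=n_per_group,
                                  alpha=alpha, ratio=1.0)
    if power <= 0.50:
        return "<=0.50"
    if power < 0.80:
        return "0.51-0.79"
    return ">=0.80"

# Cohen's (1988) benchmark: d = 0.5 requires roughly 64 participants per
# group to reach 80% power at alpha = .05 (two-sided).
print(power_level(0.5, 64))   # → ">=0.80"
print(power_level(0.5, 20))   # underpowered small-sample study
```

For designs other than a two-group t-test (e.g. ANCOVA or factorial ANOVA), the corresponding power classes (`FTestAnovaPower`, etc.) would be needed, mirroring the different test families in G*Power.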