
# T Test, ANOVA

Both the parametric mean comparison tests (independent samples t test, one-way ANOVA, factorial ANOVA) and their non-parametric counterparts (e.g., Mann-Whitney U test, Kruskal-Wallis ANOVA) have been used in learning strategy experiments and group comparisons. For example, in my own work (Cromley et al., 2013), we compared three different types of instruction in diagram comprehension strategies with high school students who had been pre- and posttested on diagram comprehension. We used mixed 2 (between: treatments) x 2 (within: pre- and posttest) ANOVAs to test whether any of the groups showed higher scores at posttest, after accounting for pretest scores. Besides offering good statistical power, the repeated-measures ANOVA has less restrictive assumptions than ANCOVA for analyzing pretest-posttest experiments that compare treatment and comparison groups. For a 2-group comparison, there is no difference in power between an independent-samples t test and an ANOVA.
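The equal-power claim for two groups follows from the fact that the one-way ANOVA F statistic equals the square of the pooled-variance t statistic. The following is a minimal sketch of that identity using only the Python standard library; the scores and the function names `pooled_t` and `one_way_f` are hypothetical and chosen for illustration, not taken from any study cited here.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Independent-samples t statistic, assuming equal variances (pooled)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))


def one_way_f(*groups):
    """One-way ANOVA F: between-groups mean square over within-groups mean square."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical posttest scores for two groups (illustration only)
treatment = [14, 17, 15, 19, 16, 18]
control = [12, 15, 13, 14, 16, 11]

t = pooled_t(treatment, control)
f = one_way_f(treatment, control)
print(f"t = {t:.3f}, t squared = {t * t:.3f}, F = {f:.3f}")
```

Because F and t squared coincide, the two tests yield the same p value with two groups; the identity does not hold for the unequal-variance (Welch) version of the t test.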

One disadvantage of these group comparisons noted by Freed et al. (this volume) is that not all participants may participate or engage in the treatment at the same level, which lowers statistical power. Another disadvantage of ANOVAs is that large sample sizes are needed: a minimum of 20 participants per cell is recommended (e.g., a 2 x 2 design has 4 cells). Non-parametric tests have less power but fewer assumptions, and are an important tool. However, some analyses have non-parametric counterparts only in R (e.g., mixed between- and within-subjects design, Feys, 2016). Finally, if classrooms are assigned to treatments (rather than individual participants being assigned to treatments), then a screening statistic called the Intraclass Correlation Coefficient (ICC) is needed to check whether the assumption of independence of observations is violated. If it is violated and a large number of classes were tested, then multilevel modeling needs to be used; with a smaller number of classes, alternatives to multilevel modeling need to be used (Huang, 2016).
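As a rough illustration of the ICC screening step, the sketch below computes the one-way ANOVA form of the ICC, often written ICC(1), for scores nested in classrooms. It assumes every classroom has the same number of students; the data and the function name `icc1` are hypothetical. Statistical packages offer several ICC variants, so this is one common definition rather than the only one.

```python
from statistics import mean


def icc1(clusters):
    """ICC(1) from one-way ANOVA mean squares; assumes equal cluster sizes."""
    k = len(clusters[0])
    if any(len(c) != k for c in clusters):
        raise ValueError("this sketch assumes every cluster has the same size")
    n_clusters = len(clusters)
    grand_mean = mean([x for c in clusters for x in c])
    ms_between = k * sum((mean(c) - grand_mean) ** 2 for c in clusters) / (n_clusters - 1)
    ms_within = sum((x - mean(c)) ** 2 for c in clusters for x in c) / (n_clusters * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical posttest scores, one inner list per classroom (illustration only)
classrooms = [
    [10, 11, 12, 11],
    [15, 16, 14, 15],
    [20, 19, 21, 20],
]

icc = icc1(classrooms)
print(f"ICC(1) = {icc:.3f}")
```

A value near zero suggests that students in the same classroom are no more alike than students in different classrooms; a clearly positive value, as in this contrived example, signals that the independence assumption is violated.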

## Correlation

With samples of 60 or larger, correlation has ample statistical power to detect r > .35, and a correlation matrix should always be reported with any dataset that has two or more continuous variables. The disadvantage of correlation was also noted by Freed et al.; correlation is simply a measure of association. The non-parametric Spearman rank correlation may be needed if variables are not normally distributed (Hinkle, Wiersma, & Jurs, 1988).
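To make the contrast between the two coefficients concrete, here is a minimal standard-library sketch that computes Pearson's r and Spearman's rank correlation on the same data. The data are hypothetical and deliberately skewed; the helper names `pearson_r`, `average_ranks`, and `spearman_rho` are illustrative, not from any particular package.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y rises with x every time, but one score is extreme
x = [1, 2, 3, 4, 5, 6]
y = [1, 2, 4, 8, 16, 200]

print(f"Pearson r = {pearson_r(x, y):.3f}")
print(f"Spearman rho = {spearman_rho(x, y):.3f}")
```

Because the relationship is perfectly monotonic, the rank-based coefficient reaches its maximum, while Pearson's r is pulled down by the single extreme score.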

## Regression

Linear regression has been frequently used in non-experimental studies, but can also be used in experiments. As noted by Freed et al., learning strategy researchers test effects of antecedents (e.g., strategy instruction) on strategy use (e.g., Yoon & Jo, 2014), and also test effects of strategy use on outcome variables (e.g., history learning; Deekens, Greene, & Lobczowski, 2018). Regression has the advantage over correlation that multiple independent variables can be used simultaneously to explain variance in a single dependent variable (DV). Such variables can be variables of substantive interest or ‘control’ variables that account for variance in the DV that is not of substantive interest (e.g., students of slightly different ages participate, but age might account for some variance in the DV). Independent variables should always be chosen based on theory and prior research; they should not be chosen based on correlations with the outcome variable, and not by the analysis software as in ‘forward’ or ‘backward’ regression.

It is quite straightforward to test for interactions in regression—that is, to test whether an independent variable has a different strength of effect in one group than in another group (often termed moderation). For example, Leutner, Leopold, and Sumfleth (2009) compared strategy instruction in using imagery (only) to strategy instruction in drawing-to-learn (only), to a combination of both, or neither, with 10th grade chemistry students. They found that the drawing strategy by itself had no effect and the imagery strategy alone had no effect, but having students draw and use imagery together actually harmed learning (a significant interaction). Imagery moderated the effect of drawing on comprehension; there was an interaction between drawing strategy instruction and imagery strategy instruction.
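A minimal sketch of how such an interaction is tested in regression: the product of the two dummy-coded predictors is added as a third predictor, and its coefficient is the interaction. The scores below are hypothetical, constructed only to mimic the qualitative pattern just described; they are not data from Leutner et al. (2009). The small solver `ols` is written out so that the example needs only the standard library; in practice a statistics package would be used and would also report standard errors and significance tests.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X)b = X'y."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Hypothetical comprehension scores keyed by (drawing, imagery) instruction
cells = {
    (0, 0): [10, 12, 11],  # neither strategy
    (1, 0): [11, 10, 12],  # drawing only
    (0, 1): [12, 11, 10],  # imagery only
    (1, 1): [7, 8, 6],     # both strategies
}

X, y = [], []
for (drawing, imagery), scores in cells.items():
    for score in scores:
        # Columns: intercept, drawing, imagery, drawing x imagery
        X.append([1, drawing, imagery, drawing * imagery])
        y.append(score)

b0, b_drawing, b_imagery, b_interaction = ols(X, y)
print(f"intercept = {b0:.2f}, drawing = {b_drawing:.2f}, "
      f"imagery = {b_imagery:.2f}, interaction = {b_interaction:.2f}")
```

With dummy coding, the drawing and imagery coefficients are the effects of each strategy when the other is absent, and the interaction coefficient is the additional change in the outcome when both are taught together.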

Linear regression can accommodate any type of predictor variable (continuous or categorical), but the dependent variable must be continuous. If the dependent variable is categorical, then various types of logistic regression are available. For example, Greene and Azevedo (2009) categorized posttest performance into low, medium, and high, and used a form of logistic regression to analyze effects of during-learning monitoring on the posttest category. They found that more metacognitive monitoring during learning was associated with being in a higher category. Note that continuous variables that show a normal distribution should never be re-coded as dichotomous (‘high’ vs. ‘low’) variables, as this causes a severe loss in statistical power.
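The cost of dichotomizing can be illustrated with a small simulation, sketched here under stated assumptions: a normally distributed predictor, an outcome constructed to correlate about .50 with it, and a median split into ‘high’ vs. ‘low’. All variable names and values are hypothetical. For normally distributed variables, a median split is expected to shrink the correlation to roughly 80% of its original size, which is the source of the lost power.

```python
import random
from math import sqrt
from statistics import mean, median


def pearson_r(x, y):
    """Pearson product-moment correlation."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


random.seed(1)  # fixed seed so the simulation is reproducible
n = 5000

# Simulated continuous predictor (e.g., a strategy-use score)
strategy_use = [random.gauss(0, 1) for _ in range(n)]
# Outcome constructed to correlate about .50 with the predictor
outcome = [0.5 * s + sqrt(0.75) * random.gauss(0, 1) for s in strategy_use]

# Median split: re-code the predictor as 1 ('high') or 0 ('low')
cutoff = median(strategy_use)
high_vs_low = [1 if s > cutoff else 0 for s in strategy_use]

r_continuous = pearson_r(strategy_use, outcome)
r_split = pearson_r(high_vs_low, outcome)
print(f"continuous predictor: r = {r_continuous:.3f}")
print(f"median-split predictor: r = {r_split:.3f}")
```

The weaker correlation for the split predictor means that a larger sample would be needed to detect the same underlying relationship.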

Regression does have some disadvantages, in that missing data on any one variable will cause that participant’s data to not be included, measurement error in variables cannot be accounted for, and either antecedents or consequences of learning strategies can be tested, but not both. In addition, if the normality or equal variances assumptions of regression are not met by the data, non-parametric regression is somewhat challenging to learn. As with ANOVA, the ICC needs to be checked and multilevel modeling may be needed.
