Qualitative analyses of process data

The strategy data described by Cho et al. (this volume) can be analyzed qualitatively or quantitatively, and different disciplines have different practices. Cho et al. emphasize that with the right tasks, materials, learners, and prompts, the resulting data can be highly informative. Perhaps a disadvantage of think-aloud research is that there are many steps in the design of data collection and during actual data collection where a researcher can go wrong. For example, researchers should not model or demonstrate a think-aloud, as this is known to influence what learners do and do not verbalize. Likewise, prompts need to be very carefully worded so as not to ask for explanations; otherwise, the task becomes a self-explanation prompt rather than a think-aloud protocol.

Cho et al. similarly emphasize the importance of careful coding (categorizing) of strategies. The many decision points in coding need to be justified and documented, such as finding a balance between emergent coding (bottom up) and pre-specified codes (top down). As with study design, well-coded process data can provide insights that are not possible from other analyses, but there are many pitfalls, and re-coding requires a great deal of added effort. In general, it is wise to do initial coding at a more fine-grained level (even though high inter-rater reliability is harder to achieve as the number of codes grows), because fine-grained codes are easy to collapse later on.
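To make the last point concrete, here is a minimal sketch in Python, using entirely hypothetical strategy codes: collapsing fine-grained codes into coarse categories is a simple mapping, whereas the reverse would require returning to the raw protocols and re-coding.

```python
# Hypothetical fine-grained strategy codes mapped to coarse categories.
fine_to_coarse = {
    "reread_sentence": "rereading",
    "reread_paragraph": "rereading",
    "paraphrase": "elaboration",
    "self_question": "elaboration",
    "underline": "marking",
}

def collapse(codes, mapping):
    """Replace each fine-grained code with its coarse category."""
    return [mapping[c] for c in codes]

segment = ["reread_sentence", "paraphrase", "reread_paragraph"]
print(collapse(segment, fine_to_coarse))
# Collapsing is a one-line mapping; splitting coarse codes back into
# fine-grained ones would require re-coding the original protocols.
```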


The variable-centered quantitative, person-centered quantitative, and qualitative analysis methods reviewed here answer very different kinds of research questions. The variable-centered analyses ask ‘what happens to a variable (on average) when there is a change to another variable?’ The person-centered analyses ask ‘are there subgroups of participants with similar combinations of scores on two or more variables?’ If so, variable-centered analyses are used to compare subgroups on antecedent or consequent variables. The qualitative analyses ask ‘what processes happen during learning and in what sequence(s)?’

Beyond those basic divisions, and given that assumptions of a test are met, the advantages and disadvantages of each method largely revolve around five themes: (1) whether directional effects are tested, (2) the number of observed variables that operationalize each measured construct, (3) the nature and number of dependent variable(s), (4) whether the researcher wants to test for mediation and/or moderation, and (5) statistical power. Note that all of these issues arise in any kind of data analysis, whether strategic processing is involved or not, and whether the research is in education or some other field of study.

For simple association, the chi square test and correlation were discussed; the other tests (t test/ANOVA, regression, SEM) all require stating effects in a certain direction, i.e., specifying independent and dependent variables. For those tests, if variables are switched (a variable that had been mistakenly entered as an independent variable is subsequently entered as a dependent variable), the results will change.
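A small numeric sketch makes this concrete. Using simple OLS on arbitrary illustrative numbers, the slope from regressing y on x differs from the slope from regressing x on y:

```python
# Sketch: switching which variable is treated as dependent changes the
# regression results. The data are arbitrary illustrative numbers.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.0, 2.0, 2.0, 4.0, 4.0])

def ols_slope(indep, dep):
    """Slope from a simple OLS regression of dep on indep."""
    return np.cov(indep, dep, bias=True)[0, 1] / np.var(indep)

b_y_on_x = ols_slope(x, y)  # y entered as the dependent variable
b_x_on_y = ols_slope(y, x)  # the variables switched
print(b_y_on_x, b_x_on_y)   # 0.8 vs. roughly 1.11: different answers
```

The two slopes coincide only when the correlation is perfect (their product equals r squared), so entering a variable on the wrong side of the equation silently changes the estimate.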

If only a single measure is collected for each construct, then chi square, correlation, t test, ANOVA, or regression are the choices. If multiple measures are collected on at least one construct, then the SEM approaches—SEM, GCM, LCA, GMM—give more statistical power (assuming that sample sizes are large enough to permit SEM).

If dependent variable(s) are continuous, then t test, ANOVA, linear regression, or the SEM approaches are the choices. If dependent variable(s) are categorical, then chi square, logistic regression, or the SEM approaches are the choices. If there is only one dependent variable, then t test, ANOVA, or regression are the choices. If there are multiple dependent variables tapping different constructs, then the SEM approaches (SEM, GCM, LCA, GMM) give more statistical power (assuming that sample sizes are large enough to permit SEM).
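The decision rules in the last few paragraphs can be summarized as a small lookup function. This is only a sketch of the rules as stated here, not a complete taxonomy of analyses:

```python
# Sketch: the text's decision rules encoded as a lookup; the labels
# are illustrative names, not an exhaustive list of analyses.
def candidate_tests(dv_scale, n_dvs, multiple_indicators):
    """Return analyses consistent with the rules described in the text."""
    sem_family = ["SEM", "GCM", "LCA", "GMM"]
    if multiple_indicators or n_dvs > 1:
        # Multiple indicators per construct, or multiple dependent
        # variables, call for the SEM approaches (given a large sample).
        return sem_family
    if dv_scale == "continuous":
        return ["t test", "ANOVA", "linear regression"] + sem_family
    if dv_scale == "categorical":
        return ["chi square", "logistic regression"] + sem_family
    raise ValueError("dv_scale must be 'continuous' or 'categorical'")

print(candidate_tests("continuous", 1, False))
print(candidate_tests("categorical", 3, False))
```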

If the researcher wants to test for mediation (A affects B, B affects C, therefore A affects C indirectly via B; B mediates the effect of A on C), path analysis or SEM approaches must be used, as by definition there are multiple dependent variables. If the researcher wants to test for moderation (interactions), then regression, factorial ANOVA, path analysis, and SEM are available.
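As a sketch of simple mediation, the indirect effect of A on C through B can be estimated as the product of two regression slopes: the slope of B on A (a) and the slope of C on B controlling for A (b). The simulated data and path coefficients below are purely illustrative:

```python
# Sketch of simple mediation: indirect effect = a * b. The data are
# simulated, and the coefficients 0.6 and 0.7 are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
n = 500
A = rng.normal(size=n)
B = 0.6 * A + rng.normal(scale=0.5, size=n)            # A -> B path
C = 0.7 * B + 0.1 * A + rng.normal(scale=0.5, size=n)  # B -> C plus direct path

# a: slope of B regressed on A (with an intercept column)
a = np.linalg.lstsq(np.column_stack([np.ones(n), A]), B, rcond=None)[0][1]
# b: slope of C regressed on B, controlling for A
b = np.linalg.lstsq(np.column_stack([np.ones(n), A, B]), C, rcond=None)[0][2]

indirect = a * b  # should land near 0.6 * 0.7 = 0.42
print(round(indirect, 2))
```

In practice this product-of-coefficients estimate would be accompanied by a standard error (e.g., via bootstrapping), which path analysis and SEM software provide.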

Finally, with regard to statistical power, all of the latent variable approaches—SEM, GCM, LCA, and GMM—require larger samples than the other types of analyses. This is mostly due to the large number of factor loadings, paths, correlations, and error terms that are estimated in these models. On the other hand, the use of FIML increases statistical power compared to dropping all participants missing on any variable.
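To see why listwise deletion costs power, consider a sketch in which any participant missing on any variable is dropped; the variables and missingness pattern are invented for illustration:

```python
# Sketch: listwise deletion discards any participant missing on any
# variable, shrinking the analyzed sample. FIML-style estimation instead
# uses all available data. The records below are illustrative.
data = [
    {"pre": 10, "post": 12, "strategy": 3},
    {"pre": 11, "post": None, "strategy": 4},  # missing post only
    {"pre": None, "post": 14, "strategy": 2},  # missing pre only
    {"pre": 9, "post": 10, "strategy": 5},
]

complete_cases = [p for p in data if all(v is not None for v in p.values())]
print(len(data), len(complete_cases))  # 4 participants, but only 2 complete
```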

In summary, the strategy researcher may have more than one option for analyzing strategic processing data, but there are many considerations that must be weighed simultaneously. In practice, screening for assumptions on all independent and dependent variables and then considering the actual sample size(s) obtained will often dictate which specific analysis can be used from among the hypothetical choices. Some kind of data reduction (collapsing codes, creating factor scores, or other composites) might be needed in order to use the non-SEM approaches when SEM approaches had been planned. At the same time, if latent variable analyses are planned (e.g., collecting multiple indicators of a construct, testing mediation), which have many advantages in terms of statistical power and accounting for measurement error, then large sample sizes need to be planned in advance.
