
Path Analysis

As noted by Freed et al., researchers who are interested in both antecedents and consequences of learning strategies can test both simultaneously using path analysis, which is a simplified type of structural equation modeling (SEM). Each effect of one variable on another is called a path, hence path analysis. A set of variables and the paths (or correlations) that connect them is called a model. Path analysis can be used in non-experimental and experimental studies. For example, Cromley and Azevedo (2007) tested effects of background knowledge on strategy use and effects of strategy use on comprehension in a single path model.

There are many advantages of testing a single path model rather than running many regressions: (1) there is more statistical power and a lower risk of Type I error, (2) there are statistics available to capture the overall fit of the model rather than multiple F tests, and (3) both the direct (single path, regression) effect of a variable and its indirect effect via the mediator variable can be tested. Path analysis can also overcome the other disadvantages of regression noted above: participants can be included even if they have missing data, and measurement error in dependent variables can be accounted for. Path analyses can be used to compare the size of effects across different groups (called multi-group modeling), which allows for more flexibility in modeling. Moderation (interactions) can be tested in path analysis as it can in regression.

In most SEM software packages, missing data are handled using Full Information Maximum Likelihood (FIML; confusingly, sometimes simply called ML). FIML uses all data that are present on a pair of variables for the numerator of a relation that is tested (analogous to a regression beta weight), but for the denominator it uses only participants with complete data on all variables in the model. Note that FIML does not involve estimating what the missing data values might be (the latter is termed imputation). Furthermore, if variables are somewhat non-normal, SEM software can address this with Robust Maximum Likelihood (MLR) estimation. The free R package lavaan has made SEM more accessible to researchers, at least in terms of software costs.
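The point that FIML includes participants with missing data without imputing their values can be illustrated with a minimal Python sketch. It assumes two normally distributed variables and evaluates a casewise log-likelihood in which each participant contributes through whichever variables were actually observed. The scores, the parameter values, and the function names are hypothetical and chosen only for illustration; real SEM software also searches for the parameter values that maximize this quantity, which the sketch does not do.

```python
import math

def normal_logpdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def bivariate_logpdf(x, y, mx, my, vx, vy, cxy):
    """Log density of a bivariate normal with variances vx, vy and covariance cxy."""
    det = vx * vy - cxy ** 2
    dx, dy = x - mx, y - my
    quad = (vy * dx * dx - 2 * cxy * dx * dy + vx * dy * dy) / det
    return -0.5 * (2 * math.log(2 * math.pi) + math.log(det) + quad)

def casewise_loglik(rows, mx, my, vx, vy, cxy):
    """Sum each case's log-likelihood using only the values that case has.

    Returns the total log-likelihood and the number of cases that contributed.
    No missing value is filled in at any point.
    """
    total, used = 0.0, 0
    for x, y in rows:
        if x is not None and y is not None:
            total += bivariate_logpdf(x, y, mx, my, vx, vy, cxy)
        elif x is not None:
            total += normal_logpdf(x, mx, vx)   # only x observed
        elif y is not None:
            total += normal_logpdf(y, my, vy)   # only y observed
        else:
            continue                            # nothing observed for this case
        used += 1
    return total, used

# Hypothetical (knowledge, comprehension) scores; None marks a missing value.
rows = [(1.0, 2.0), (2.0, None), (None, 3.5), (3.0, 4.0), (None, None)]

# Hypothetical parameter values at which the likelihood is evaluated.
loglik, n_used = casewise_loglik(rows, mx=2.0, my=3.0, vx=1.0, vy=1.0, cxy=0.5)
n_complete = sum(1 for x, y in rows if x is not None and y is not None)

print(n_used, n_complete)  # 4 cases contribute, versus 2 complete cases
```

In this toy data set, deleting incomplete cases would leave two participants, whereas the casewise likelihood draws on four.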

Path analysis has several disadvantages for the researcher, who has to learn the complexities of SEM and of SEM software (e.g., non-convergence, fit indices, comparing nested models, adding error covariances). Learning the various ‘tips and tricks’ to handle these issues is not a trivial matter, and they are typically learned in coursework. In multi-group modeling, having a group with only a small number of participants can cause problems. In addition, when many variables are included in a path model, the researcher cannot claim that the model is ‘correct,’ merely that it shows adequate fit. Despite the term ‘effect,’ path analysis cannot establish causality. Experiments can establish causality, and any of the analyses reviewed here can be used to analyze data from an experiment. As with ANOVA and regression, the ICC needs to be checked and multilevel path modeling may be needed.
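Checking the ICC (intraclass correlation) can be sketched as follows, assuming the common one-way ANOVA estimator, often written ICC(1), for participants nested in groups such as classrooms. The scores and the function name are hypothetical; the text does not specify which ICC estimator or which threshold a researcher should use.

```python
def icc1(groups):
    """One-way ANOVA estimate of the intraclass correlation, ICC(1).

    `groups` is a list of lists, one inner list of scores per cluster.
    Values near 0 suggest little clustering; larger values suggest that
    scores within a cluster resemble each other.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Effective cluster size, which equals the common size when groups are balanced.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)

    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)

# Hypothetical comprehension scores for students in three classrooms.
classrooms = [[12.0, 14.0, 13.0], [18.0, 17.0, 19.0], [10.0, 11.0, 9.0]]
print(round(icc1(classrooms), 3))
```

A non-trivial ICC indicates that observations within a cluster are not independent, which is the situation in which a multilevel model may be needed.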

Structural Equation Modeling

As noted by Freed et al., SEM has even more advantages over path analysis in terms of removing error variance from measured variables and thus testing relations among more ‘pure’ factors called latent variables. In SEM, sets of measured variables are hypothesized to be driven by a single, unmeasured factor. These factors are then tested for specific, theory-driven influences on each other. Typically the work of creating latent factors, often referred to as a measurement model, is done in the first step. This is an important difference from MANOVA, where a set of dependent variables is combined into one or more composites, and each composite is tested for group differences. In many studies, there is no theoretical or empirical basis for the single composite of all dependent variables involved in MANOVA. This helps explain the popularity of SEM, where latent factors are defined based on a strong theoretical rationale.
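The idea that a set of measured variables is driven by a single unmeasured factor, with error variance separated out, can be sketched in Python. The sketch assumes a one-factor measurement model with standardized loadings, under which the model-implied covariance between two indicators is the product of their loadings times the factor variance, and each indicator's variance adds its own error variance. The loadings and the function name are hypothetical.

```python
def implied_covariance(loadings, error_variances, factor_variance=1.0):
    """Model-implied covariance matrix for a one-factor measurement model.

    Shared variance comes from the latent factor (loading_i * loading_j *
    factor_variance); each indicator's unique error variance is added only
    on the diagonal.
    """
    p = len(loadings)
    sigma = [[loadings[i] * loadings[j] * factor_variance for j in range(p)]
             for i in range(p)]
    for i in range(p):
        sigma[i][i] += error_variances[i]
    return sigma

# Hypothetical standardized loadings for three indicators of one factor.
loadings = [0.8, 0.7, 0.6]
# With standardized indicators, error variance is 1 minus the squared loading.
error_variances = [1 - l ** 2 for l in loadings]

sigma = implied_covariance(loadings, error_variances)
for row in sigma:
    print([round(v, 2) for v in row])
```

Fitting a measurement model amounts to choosing loadings and error variances so that this implied matrix is close to the observed one; the sketch only computes the implied matrix for given values.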

In SEM, testing theory-driven relations among latent factors is done in a second step, sometimes called the structural model. As with regression and path analysis, SEM can be used in non-experimental or experimental research. As with path analysis, moderation (interactions) can be tested in SEM (Marsh, Wen, Nagengast, & Hau, 2012). Like path analysis programs, SEM programs typically use FIML to account for missing data. One example of SEM with strategy use comes from Ahmed and colleagues (2016), who tested Cromley’s model of comprehension using SEM in order to better account for measurement error in each variable and construct. They found that neither a background knowledge factor nor a vocabulary factor had effects on strategy use. Strategy use had small and mostly non-significant direct effects on reading comprehension, except in 7th to 8th grades, and had entirely non-significant indirect effects on comprehension.
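The distinction between direct and indirect effects can be sketched with a simple three-variable mediation model in Python. The sketch assumes standardized variables, a predictor X (for instance background knowledge), a mediator M (strategy use), and an outcome Y (comprehension), and derives the standardized path coefficients from the three correlations. The correlation values are hypothetical and are not taken from any of the studies cited here.

```python
def mediation_paths(r_xm, r_xy, r_my):
    """Standardized paths for X -> M -> Y with a direct X -> Y path.

    a        : path from X to M
    b        : path from M to Y, controlling for X
    direct   : path from X to Y, controlling for M
    indirect : a * b, the effect of X on Y that runs through M
    """
    a = r_xm
    denom = 1 - r_xm ** 2
    b = (r_my - r_xy * r_xm) / denom
    direct = (r_xy - r_my * r_xm) / denom
    indirect = a * b
    return {"a": a, "b": b, "direct": direct,
            "indirect": indirect, "total": direct + indirect}

# Hypothetical correlations among X, M, and Y.
paths = mediation_paths(r_xm=0.5, r_xy=0.4, r_my=0.6)
for name, value in paths.items():
    print(name, round(value, 3))
```

In this saturated model the direct and indirect effects sum to the X-Y correlation, so the decomposition shows how much of that relation runs through the mediator. Testing whether an indirect effect is statistically significant requires standard errors, which this sketch does not compute.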

In addition to the disadvantages noted above for path analysis, SEM requires larger samples than most of the other techniques. Sample sizes for SEM are calculated from the number of parameters to be estimated: loadings from factors onto observed variables, paths and correlations in the model, and error estimates on all outcome variables. In this sense, SEM requires many measured variables, and including more variables in a model is better for power. SEM is considered a large-sample technique; some fit statistics are only suitable for samples of N > 250. As with path analysis, the researcher’s ‘best’ model may not be the ‘best possible’ model. As with ANOVA, regression, and path analysis, the ICC needs to be checked and multilevel SEM may be needed.
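The way a required sample size follows from the number of estimated quantities can be sketched in Python. The sketch tallies loadings, paths, correlations, and error estimates, then multiplies the total by a cases-per-parameter ratio. The ratio is an assumption supplied by the user, not a value given in this chapter; rules of thumb vary, and a formal power analysis may give a different answer. The model counts and function names are hypothetical.

```python
def count_estimated_parameters(n_loadings, n_paths, n_correlations, n_error_estimates):
    """Total number of quantities the model has to estimate."""
    return n_loadings + n_paths + n_correlations + n_error_estimates

def rough_minimum_n(n_parameters, cases_per_parameter):
    """Rough sample-size target from a user-chosen cases-per-parameter ratio.

    The ratio is an assumption, not a recommendation from the source text.
    """
    return n_parameters * cases_per_parameter

# Hypothetical model: two latent factors with three indicators each,
# one path between the factors, no extra correlations, and error
# estimates for six indicators plus the one outcome factor.
n_params = count_estimated_parameters(
    n_loadings=6, n_paths=1, n_correlations=0, n_error_estimates=7)

print(n_params)                                            # 14
print(rough_minimum_n(n_params, cases_per_parameter=10))   # 140 with the assumed ratio
```

Each added loading, path, or correlation raises the parameter count, and with it the sample size needed to estimate the model.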

 