Basic Modeling Framework for Estimating the Partial Correlation

In the absence of nondetectables, we begin with a simple paradigm based on the following two multiple linear regression (MLR) models:

and

where the error terms $e_{y|xc}$ and $e_{x|c}$ are each assumed i.i.d. with mean 0 and homogeneous variances $\sigma^2_{y|xc}$ and $\sigma^2_{x|c}$, respectively. The vector $C$ is a $T$-dimensional set of covariates (e.g., possible confounders), so that models (6.1) and (6.2) characterize the conditional distributions of $Y \mid X, C$ and $X \mid C$. Given that $E(Y \mid C) = E_{X|C}[E(Y \mid X, C)]$ and $\mathrm{Var}(Y \mid C) = E_{X|C}[\mathrm{Var}(Y \mid X, C)] + \mathrm{Var}_{X|C}[E(Y \mid X, C)]$, the above specifications also imply the following MLR model characterizing the distribution of $Y \mid C$:

with i.i.d. homogeneous errors $e_{y|c}$ of mean 0 and variance $\sigma^2_{y|c}$. These specifications (without yet positing further distributional assumptions on the errors in the three models) are sufficient to characterize the partial correlation ($\rho_{yx|c}$) between $Y$ and $X$ conditional on $C$ as follows:

where $\sigma^2_{y|c} = \mathrm{Var}(Y \mid C) = \beta_{T+1}^2 \sigma^2_{x|c} + \sigma^2_{y|xc}$. Equivalently, we can write

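The variance identity for $\sigma^2_{y|c}$ follows in a few steps from the law of total variance. The sketch below assumes that model (6.1) has the usual linear form, with $\beta_{T+1}$ the coefficient on $X$; the symbol $\eta(C)$ for the intercept-plus-covariate part is notation introduced only for this illustration.

```latex
\begin{align*}
E(Y \mid X, C) &= \eta(C) + \beta_{T+1} X,
  \qquad \mathrm{Var}(Y \mid X, C) = \sigma^2_{y|xc} \\
E_{X|C}\left[\mathrm{Var}(Y \mid X, C)\right] &= \sigma^2_{y|xc} \\
\mathrm{Var}_{X|C}\left[E(Y \mid X, C)\right]
  &= \beta_{T+1}^2 \,\mathrm{Var}(X \mid C) = \beta_{T+1}^2 \sigma^2_{x|c} \\
\sigma^2_{y|c} = \mathrm{Var}(Y \mid C)
  &= \beta_{T+1}^2 \sigma^2_{x|c} + \sigma^2_{y|xc} \\
\mathrm{Cov}(Y, X \mid C) &= \beta_{T+1} \sigma^2_{x|c}
  \quad\Longrightarrow\quad
  \frac{\mathrm{Cov}(Y, X \mid C)}{\sigma_{y|c}\,\sigma_{x|c}}
  = \frac{\beta_{T+1}\,\sigma_{x|c}}{\sigma_{y|c}}
\end{align*}
```

Under the same assumption, the square of the last ratio equals $1 - \sigma^2_{y|xc} / \sigma^2_{y|c}$, the proportion of the variance of $Y$ given $C$ that is accounted for by $X$.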
Standard practice (e.g., Kleinbaum et al. 2008; Kutner et al. 2005) dictates estimation of the squared partial correlation in Equation 6.4 based on the error sums of squares from models (6.1) and (6.3), that is,

with the estimator $\hat{\rho}_{yx|c}$ taken as the positive or negative square root of Equation 6.6 to correspond with the sign of the ordinary least squares estimate of $\beta_{T+1}$ from the fit of model (6.1).
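As a rough illustration of this sums-of-squares approach, the following Python sketch fits the model with $X$ and $C$ and the model with $C$ alone by ordinary least squares. It forms the squared partial correlation as the relative reduction in error sum of squares, which is the standard textbook form and is assumed here to match Equation 6.6. The use of NumPy, the function names `ols_fit` and `partial_corr_sse`, and the simulated data are all assumptions made for the example.

```python
import numpy as np


def ols_fit(design, y):
    """Return OLS coefficients and the error sum of squares."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return coef, float(resid @ resid)


def partial_corr_sse(y, x, C):
    """Estimate the partial correlation of y and x given C from error sums of squares."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    C = np.asarray(C, dtype=float).reshape(len(y), -1)

    ones = np.ones((len(y), 1))
    reduced = np.hstack([ones, C])            # Y regressed on C only
    full = np.hstack([reduced, x[:, None]])   # Y regressed on C and X (X is last)

    coef_full, sse_full = ols_fit(full, y)
    _, sse_reduced = ols_fit(reduced, y)

    # Relative reduction in SSE when X is added; clip tiny negative rounding error.
    r2 = max((sse_reduced - sse_full) / sse_reduced, 0.0)

    # The sign follows the OLS estimate of the coefficient on X.
    return float(np.sign(coef_full[-1]) * np.sqrt(r2))


# Simulated example (hypothetical parameter values, T = 2 covariates).
rng = np.random.default_rng(0)
n = 500
C = rng.normal(size=(n, 2))
x = 1.0 + C @ np.array([0.5, -0.3]) + rng.normal(scale=1.0, size=n)
y = 2.0 + C @ np.array([1.0, 0.2]) - 0.8 * x + rng.normal(scale=1.5, size=n)

# Population value under these settings: -0.8 * 1.0 / sqrt(0.64 + 2.25), about -0.47.
print(partial_corr_sse(y, x, C))
```

The same estimate can be obtained by correlating the residuals of Y on C with the residuals of X on C, which gives a convenient numerical check on the implementation.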