A non-technical introduction to path analysis and structural equation modeling II: Heritability studies

In studies of heritability, a field in which path analysis originated, there are no measured variables except the observed focal variable (e.g., height). Path analysis can still be used if we convert the additive model on which any given Analysis of Variance (AOV or ANOVA) is based into an additive model of constructed variables that take the values of the contributions fitted to the first model.

For example, in an agricultural evaluation trial of many varieties replicated one or more times in each of many locations, the AOV model is

Yijk = M + Vi + Lj + VLij + Eijk (eqn. 1)

where Yijk denotes the measured trait y for the ith variety in the jth location and kth replication;
M is a base level for the trait;
Vi is the contribution of the ith variety;
Lj is the contribution of the jth location;
VLij is an additional contribution from the i,jth variety-location combination—in statistical terms, the “variety-location-interaction” contribution; and
Eijk is a noise contribution adding to the trait measurement.

The path model equivalent to equation 1 is
Yx = M + Z1x + Z2x + Z3x + Ex (eqn. 2)

where Y is the measured trait as before and x denotes the replicates;
Z1x = Vi if x is a replicate of variety i, or 0 otherwise;
Z2x = Lj if x is a replicate in location j, or 0 otherwise;
Z3x = VLij if x is a replicate of variety i in location j, or 0 otherwise; and
Ex = Eijk where x is replicate k of variety i in location j.
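
The construction in equation 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the trait values are made up, the trial is balanced (every variety has the same number of replicates in every location), and the contributions are fitted by simple averaging, which is one standard way to fit equation 1 in the balanced case. The variable names are hypothetical.

```python
import numpy as np

# Made-up balanced trial: 2 varieties x 3 locations x 2 replicates.
# y[i, j, k] is the trait for variety i, location j, replicate k.
y = np.array([
    [[10.0, 12.0], [14.0, 15.0], [9.0, 11.0]],
    [[13.0, 13.0], [18.0, 20.0], [12.0, 15.0]],
])

# Fit the contributions of equation 1 by averaging (balanced design assumed).
M = y.mean()                                  # base level
V = y.mean(axis=(1, 2)) - M                   # Vi, one value per variety
L = y.mean(axis=(0, 2)) - M                   # Lj, one value per location
cell = y.mean(axis=2)                         # mean of each variety-location cell
VL = cell - M - V[:, None] - L[None, :]       # VLij, interaction contribution
E = y - cell[:, :, None]                      # Eijk, noise contribution

# Index every replicate x by its variety i and location j.
i_idx, j_idx, _ = np.indices(y.shape)
i_idx, j_idx = i_idx.ravel(), j_idx.ravel()

# Constructed variables of equation 2: one value per replicate x.
Yx = y.ravel()
Z1 = V[i_idx]
Z2 = L[j_idx]
Z3 = VL[i_idx, j_idx]
Ex = E.ravel()

# Equation 2 holds for every replicate.
assert np.allclose(Yx, M + Z1 + Z2 + Z3 + Ex)
```

Each constructed variable simply repeats the fitted contribution for every replicate it applies to, which is what allows an AOV model to be read as a path model.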

The path coefficients are then set equal to the square root of the ratio of the variance of each contribution (Vi, etc.) to the total variance of the trait (Y). Each squared path coefficient is therefore that contribution's share of the total variance, and the equation of complete determination becomes
1 = Sum over w of [var(Zw) / var(Y)] (eqn. 3)
where w denotes the different contributions in the Analysis of Variance model, including the noise contribution.

For the agricultural trial this equation might be written
1 = [var(V) + var(L) + var(VL) + var(E)] / var(Y) (eqn. 4)
where var(V) is the variance of the Vi terms, etc.
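
As a rough check on equations 3 and 4, the following Python sketch simulates a balanced trial, fits the contributions by averaging, and computes each contribution's share of the total variance and the corresponding path coefficient. Everything here is an assumption for illustration: the sample sizes, the standard deviations used in the simulation, and the helper name `decompose` are hypothetical. The shares add up to exactly 1 because, in a balanced design, the fitted contributions are uncorrelated with one another.

```python
import numpy as np

def decompose(y):
    """Variance of Y and of each constructed variable for a balanced trial y[i, j, k]."""
    M = y.mean()
    V = y.mean(axis=(1, 2), keepdims=True) - M    # variety contributions
    L = y.mean(axis=(0, 2), keepdims=True) - M    # location contributions
    cell = y.mean(axis=2, keepdims=True)          # variety-location cell means
    VL = cell - M - V - L                         # interaction contributions
    E = y - cell                                  # noise contributions
    parts = {"V": V, "L": L, "VL": VL, "E": E}
    # Spread each contribution over all replicates, then take its variance
    # (population variance, dividing by the number of replicates).
    variances = {name: np.broadcast_to(z, y.shape).var() for name, z in parts.items()}
    return y.var(), variances

# Simulated trait values; all numbers below are made up.
rng = np.random.default_rng(0)
n_var, n_loc, n_rep = 20, 10, 3
y = (50.0
     + rng.normal(0, 2.0, size=(n_var, 1, 1))          # variety effects
     + rng.normal(0, 3.0, size=(1, n_loc, 1))          # location effects
     + rng.normal(0, 1.0, size=(n_var, n_loc, 1))      # interaction effects
     + rng.normal(0, 1.5, size=(n_var, n_loc, n_rep))) # noise

var_y, variances = decompose(y)
shares = {name: v / var_y for name, v in variances.items()}           # terms of eqn. 4
path_coefficients = {name: np.sqrt(s) for name, s in shares.items()}  # square roots

print(shares)
print(path_coefficients)
assert np.isclose(sum(shares.values()), 1.0)  # equation of complete determination
```

These are variances of the fitted contributions, as in the text; they are not the same thing as variance components estimated from expected mean squares.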

In human studies, var(VL) is ignored or discounted (which is a shortcoming), and equation 4 is expressed as
1 = heritability + shared environmental effect + non-shared environmental effect (eqn. 5)

When the same trait is observed in two relatives, their separate path analyses can be linked in one network and the correlation between the relatives calculated (Lynch & Walsh 1998, 826), provided it is assumed that the same contributions (and path coefficients) apply to both relatives and that their noise contributions are uncorrelated. If we have data on correlations for different kinds of relatives (e.g., identical vs. fraternal twins), we can estimate the relative size of the contributions in equations such as 4 and 5. That’s the crux of heritability studies.
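
To make the twin comparison concrete, here is a minimal Python sketch of one widely used simplification, the classical twin design (often called Falconer's formula). It is not spelled out in the text above and rests on strong assumptions: identical twins share all of the heritable contribution, fraternal twins share half of it, both kinds of twins share the "shared environment" contribution equally, and there are no interaction terms. Under those assumptions the identical-twin correlation is heritability + shared environment, and the fraternal-twin correlation is half the heritability + shared environment. The function name and the two correlations are hypothetical.

```python
def twin_estimates(r_mz, r_dz):
    """Solve the two twin-correlation equations for the terms of equation 5.

    Assumes r_mz = h2 + c2 and r_dz = h2 / 2 + c2 (classical twin design).
    """
    h2 = 2.0 * (r_mz - r_dz)   # heritability
    c2 = 2.0 * r_dz - r_mz     # shared environmental effect
    e2 = 1.0 - r_mz            # non-shared environmental effect
    return h2, c2, e2

# Made-up correlations for identical (MZ) and fraternal (DZ) twins.
h2, c2, e2 = twin_estimates(r_mz=0.8, r_dz=0.5)
print(h2, c2, e2)
```

By construction the three estimates add up to 1, matching equation 5; the estimates are only as good as the assumptions behind the two correlation equations.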
(This post is a second supplement [see previous post] to a series laying out a sequence of basic ideas in thinking like epidemiologists, especially epidemiologists who pay attention to possible social influences on the development and unequal distribution of diseases and behaviors in populations [see first post in series and contribute to open-source curriculum http://bit.ly/EpiContribute].)
Lynch, M. and B. Walsh (1998). Genetics and Analysis of Quantitative Traits. Sunderland, MA, Sinauer.

