Heterogeneity and Data Analysis: Heterogeneity #4, Deviation from the type

Statistical analysis rests on the simplest heterogeneity, namely, variation around a mean.  In this light, I tell education students who will not be taking a statistics course that they should:

Understand the simple chain of thinking below, then enlist or hire a statistician who will use the appropriate recipe for the data at hand.

1. There is a population of individuals. (Population = individuals subject to the same causes of interest.  In addition to these foreground causes, there may also be background, non-manipulatable causes that vary among these individuals.)

2. Variation: For some measurable attribute, the individuals have varying responses to these causes (possibly because of the background causes).

3. You have observations of the measurable attribute for two or more subsets (samples) of the population.

4. Central question of statistical analysis: Are the subsets sufficiently different in their varying responses that you doubt that they are from the one population (i.e., you doubt that they are subject to all the same foreground causes)? Statisticians answer this question with recipes that are variants of one comparison: the difference between the subset averages in relation to the spread around those averages. In the figure below, this comparison means that you are more likely to doubt that subsets A and B are from the same population in the left-hand situation than in the right-hand one.

5. If you doubt that the subsets are from the same population, investigate further, drawing on other knowledge about the subsets. You hope to expose the causes involved and then take action informed by that knowledge about the causes.
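The comparison in step 4 can be sketched in a few lines of Python. This is only an illustration, not the recipe a statistician would necessarily choose: it computes a Welch-style t statistic (difference between averages divided by a standard error built from each subset's spread), and the function name `welch_t` and all four data sets are made up for the example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Difference between subset averages relative to the spread around them."""
    # Standard error combines each subset's sample variance and size.
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Hypothetical observations: in both situations the averages are 10 and 12.
tight_a = [9.8, 9.9, 10.0, 10.1, 10.2]    # small spread around the average
tight_b = [11.8, 11.9, 12.0, 12.1, 12.2]
wide_a = [6.0, 8.0, 10.0, 12.0, 14.0]     # large spread around the average
wide_b = [8.0, 10.0, 12.0, 14.0, 16.0]

t_tight = welch_t(tight_a, tight_b)
t_wide = welch_t(wide_a, wide_b)

print(f"small spread: t = {t_tight:.1f}")
print(f"large spread: t = {t_wide:.1f}")
```

The averages differ by the same amount in both situations, yet the statistic is far larger in magnitude when the spread is small. That is the sense in which you are more likely to doubt that A and B come from the one population when the variation around each average is narrow.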

Variation around a mean is not a strong sense of heterogeneity.  The emphasis above is on the means (the circles) more than the variation (the dashed curves).  Statistical analysis distinguishes types (or decides they are not distinguishable) more than it explores the variation (or error, i.e., deviation from type).  Data amenable to a t-test are, however, open to alternative explorations, as illustrated by the final vignette in this series of posts.

(continuing a series of posts—see first post; see next post)

