Teaching statistical literacy

I produced this sketch of a course to stimulate discussion of a challenge facing professional and interdisciplinary doctoral programs, which always require statistics and/or quantitative methods courses: how to teach those subjects in a way that accommodates the range of prior preparation that students bring into their programs (see previous post). Comments welcome, including “must cover” topics.

It is assumed that some students have a statistical background, but the course emphasizes attention to concepts, cases, and controversies with a view to collaborating thoughtfully with specialists.

Some students may aim for a technical grounding or a refresher; others may pursue technical expertise, but those on the expertise track must also spend time learning to coach fellow students in the first category.

(This sketch builds on my doctoral course on epidemiological thinking and population health and an undergraduate stats course I last taught 25 years ago.)

PHASE A: Getting going on statistical literacy

1a. Introduction: why do we want to be able to deal with statistics?
General scheme of statistics (Phenomenon–Elevation–Observation–Measurement–Data–Pattern–Prediction…..Causes?)

1b. The course as a learning community

Idea: Developing statistical literacy requires collaboration with others (of differing skills and interests) and reflection on personal and professional development.
Students identify personal, intellectual, professional interests in relation to central themes about inequality, pathways of development, social determinants of health, and policy

1c. Reading and learning strategies
Idea: Developing statistical literacy requires establishing our own practices of learning from material we don’t fully grasp at first reading/hearing, practices shaped to complement our own specific interests and work.

Idea (behind requirement that students develop a glossary): Non-specialists need to become comfortable with the fundamental ideas and basic vocabulary of quantitative inquiry in order to converse intelligently with specialists in statistics. (One way to move in that direction is to practice making the ideas accessible to the layperson. Another way is to apply the ideas to a specific area of research and policy and to address any controversies among the ideas.)

2. Phenomena: Exploring the “natural history” of one’s field of inquiry
Idea: Detailed observation (like a naturalist) or detective work–albeit informed by theoretical ideas–may be needed before we can characterize what phenomenon we are studying, what questions we need to ask, and what categories we need for subsequent data collection and analysis.

3. The scope and ideology of statistical analysis
Idea: The uses of statistical analysis are many, but have shifted over time, and are subject to recurrent challenges.

Idea: A key issue that renders statistical analysis ideological in its very foundations is who is assumed to be able to take action—who are the “agents”—and who are the subjects that follow directions given by others.

4. Categories

(Elevation of circumstances, categories and identity, control, resolution) (Measurement, precision and notation)
Idea: Collecting and analyzing data requires categories: Have we omitted relevant categories or mixed different phenomena under one label? What basis do we have for subdividing a continuum into categories? How do we ensure correct diagnosis and assignment to categories? What meaning do we intend to give to data collected in our categories?

5. Associations, Predictions, Causes, and Interventions

Explanation and causality in the real world.
(Co-occurrence, current factors vs. historical, proximate vs. background, internal vs. external, experimentally controlled vs. natural, unitary vs. multifactorial)
Idea: Relationships among associations, predictions, causes, and interventions run through all the cases and controversies in this course. The idea introduced in this session is that statistical analysis has two faces: one from which thinking about associations, predictions, causes, and interventions is allowed to cross-fertilize, and the other from which the distinctions among them are vigorously maintained, as in “Correlation is not causation!” The second face views Randomized Controlled Trials (RCTs) as the “gold standard” for testing treatments or interventions. The first face recognizes that many hypotheses about treatment and other interventions emerge from observational studies and that such studies often provide the only data we have to work with. What shortcomings of observational studies do we need to pay attention to (e.g., systematic sampling errors leading to unmeasured confounders; see next class)?

6. Confounders & conditioning of analyses
Idea: Statistical associations between any two variables generally vary depending on the values taken by other “confounding” variables. We need to take this dependency (or conditionality) into account when using our analyses to make predictions or hypothesize about causes, but how do we decide which variables are relevant and real confounders?
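The dependence of an association on a third variable can be made concrete with a small sketch. The counts below are hypothetical, invented only for illustration, and the names (`counts`, `recovery_rate`, `rate_difference`) are not from any course material. The sketch compares the treated and untreated recovery rates overall and then within each level of an assumed confounder, severity.

```python
# Hypothetical counts, invented for illustration only:
# (number recovered, number of patients) by treatment group and severity.
counts = {
    ("treated", "mild"): (18, 20),
    ("treated", "severe"): (48, 80),
    ("untreated", "mild"): (64, 80),
    ("untreated", "severe"): (10, 20),
}


def recovery_rate(group, severity=None):
    """Recovery rate for a group, optionally conditioned on severity."""
    cells = [
        cell
        for (g, s), cell in counts.items()
        if g == group and (severity is None or s == severity)
    ]
    recovered = sum(r for r, _ in cells)
    total = sum(n for _, n in cells)
    return recovered / total


def rate_difference(severity=None):
    """Treated minus untreated recovery rate, overall or within one stratum."""
    return recovery_rate("treated", severity) - recovery_rate("untreated", severity)


for stratum in (None, "mild", "severe"):
    label = stratum or "all patients"
    print(f"{label}: {rate_difference(stratum):+.2f}")
```

With these invented counts the treated group does worse overall but better within each severity level, because most treated patients are severe cases. Which comparison is the relevant one depends on whether severity is a real confounder, and the arithmetic alone cannot settle that.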

PHASE B: Basic statistical methods

Statistical analysis includes constructing observations (in experiments and in the field), summarizing data (statistics, distributions, correlation), and testing hypotheses and other statistical inference (including goodness of fit).

Concepts and methods will be introduced through lectures, practice classes (on problems of various levels of difficulty) and discussions.

Real cases from the fields of interest to students will be used, and the different interpretations, hidden assumptions, limitations and misuse of statistically derived results will be emphasized.

Students seeking a technical grounding or a refresher will be coached by those seeking technical expertise, who will take time to develop and practice ways to coach or consult for students in the first group. The expertise-seeking students will, at the same time, work through Phase C on their own time, contributing to a guidebook that translates the cases and controversies in Phase C into the terms of Phase B so that the students in the first group can understand them.

1. Population and sample; Classes and frequency

2. Graphical representation
(Exploratory data analysis, histograms & distributions)

3. Statistics
(Central tendency and dispersion; More graphics: box-and-whisker plots)

4. Distributions of statistics

5. Density Curves
(Normal, Chi-squared, fitting curves to histograms)

6. Statistical “machines”
(Probability or box models, central metaphor of statistics)

7&8. Prediction
(Posterior density, confidence intervals)
(Hypothesis testing, comparison of means, decision rules)

9. Association I
(Experimental studies, G-test, goodness-of-fit)

10. Association II
(Observational studies, least squares fit, regression)

11. Correlation and causation

12. Analysis of variance

PHASE C: Extending statistical literacy

Students seeking technical expertise work through the cases and controversies in Phase C, translating the issues into the terms of Phase B so that students seeking a grounding or refresher can understand them and the translators can check their own understanding.

1. Heterogeneity within populations and subgroups
Idea: How people respond to treatment may vary from one subgroup to another–When is this a matter of chance or of undetected additional variables? How do we delineate the boundaries between subgroups?

2. Placing individuals in a multileveled context
Idea: Different or even contradictory associations can be detected at different levels of aggregation (e.g., individual, region, nation), but not all influences can be assigned to properties of the individual—Membership in a larger aggregation can influence outcomes even after conditioning on the attributes of the individuals.
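A toy sketch of the first half of this idea, that associations can point in opposite directions at different levels of aggregation. The data are hypothetical and deliberately tidy, the three "regions" and the helper `pearson` are assumptions made for illustration, and the sketch does not attempt to model the contextual influence of group membership.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical individual-level observations (x values, y values) per region.
regions = {
    "A": ([1, 2, 3], [3, 2, 1]),
    "B": ([4, 5, 6], [6, 5, 4]),
    "C": ([7, 8, 9], [9, 8, 7]),
}

# Level 1: association among individuals within each region.
within = {name: pearson(xs, ys) for name, (xs, ys) in regions.items()}

# Level 2: association among regions, using region means.
region_mean_x = [sum(xs) / len(xs) for xs, _ in regions.values()]
region_mean_y = [sum(ys) / len(ys) for _, ys in regions.values()]
between = pearson(region_mean_x, region_mean_y)

# All individuals pooled, ignoring region membership.
pooled_x = [x for xs, _ in regions.values() for x in xs]
pooled_y = [y for _, ys in regions.values() for y in ys]
pooled = pearson(pooled_x, pooled_y)

print("within regions:", within)
print("between region means:", between)
print("pooled individuals:", pooled)
```

In this constructed example the association is negative within every region yet positive across region means and in the pooled data, so a conclusion drawn at one level need not carry over to another.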

3. Life course studies
Idea: How do we identify and disentangle the biological and social factors that build on each other over the life course from gestation through to old age?

4. Multivariable “structural” models of development
Idea: Just as standard regression models allow prediction of a dependent variable on the basis of independent variables, structural models can allow a sequence of predictive steps from root (“exogenous”) through to highest-level variables. Although this kind of model seems to illuminate issues about factors that build up over the life course, there are strong criticisms of using such models to make claims about causes.
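One very reduced reading of "a sequence of predictive steps" is a chain of two simple least-squares regressions, from an exogenous variable x to an intermediate variable m to an outcome y. The sketch below assumes that reading. The data are hypothetical, and the names (`fit_line`, `predict_y_from_x`) are illustrative rather than taken from any structural-modeling package.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys on xs; returns (intercept, slope)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data: x is exogenous, m is intermediate, y is the outcome.
x = [0, 1, 2, 3, 4, 5]
m = [1.0, 2.9, 5.2, 7.1, 8.8, 11.0]
y = [0.4, 1.6, 2.5, 3.6, 4.3, 5.6]

# Step 1: predict the intermediate variable from the exogenous one.
a_mx, b_mx = fit_line(x, m)
# Step 2: predict the outcome from the intermediate variable.
a_ym, b_ym = fit_line(m, y)

# Product of the two slopes: the predicted change in y per unit change
# in x when the prediction is routed through m.
indirect = b_mx * b_ym


def predict_y_from_x(x_value):
    """Chain the two fitted steps: x -> m -> y."""
    m_hat = a_mx + b_mx * x_value
    return a_ym + b_ym * m_hat


print("slope of m on x:", round(b_mx, 3))
print("slope of y on m:", round(b_ym, 3))
print("chained slope:", round(indirect, 3))
print("predicted y at x = 6:", round(predict_y_from_x(6), 2))
```

The chain yields predictions, but nothing in the fitting establishes that changing x would change y by the chained slope. That gap is where the criticisms of causal claims from such models apply.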

PHASE D: Planning for practice

1. Plans for practice
Students need to define how and when to build on the course to:

  • gain literacy in areas of quantitative methods relevant to their interests, but not addressed or not completed during the semester;
  • create a network of specialists they can consult with after the course is over;
  • employ strategies of reading that allow them to extract take-home lessons from readings even as they skip sections that become too technical;
  • continue to cultivate skills and dispositions of critical thinking and of life-long, cooperative learning facilitated by the resources of the internet. (The use of controversies in Phases A & C follows an idea central to critical thinking, namely, that we understand ideas better by holding them in tension with alternatives.)

2. Taking Stock of Course: Where have we come and what do we need to learn to go further?
Idea: In order to move ahead and continue developing, it is important to take stock of what went well and what needs further work.

