Galton and Biobanks: The data collected limits the questions asked

The data that researchers collect shapes the kinds of patterns they can detect and the hypotheses or predictions they can make.

Galton, a founding father of the analysis of similarity among relatives, recognized that those similarities say nothing on their own to distinguish ‘between the effects of tendencies received at birth, and of those that were imposed by the circumstances of their after lives’ (Galton 1875, 566). However, especially for the traits that concerned him, namely, ‘superior faculties’ or abilities that were ‘exceptionally high’ (1892 [1978], viii), Galton concluded at an early stage of his inquiries that ‘nature prevails enormously over nurture’ (1875, 574). To Galton this was evident in the biographical data he had collected on illustrious men and their kinfolk (1869 [1892, 1978]) and in studies of the life histories of similar and dissimilar twins (1875). His conclusion is not very convincing today. After all, at one point he begs the question by defining the traits he was measuring as those that ‘exclude the effects of education’ (1892 [1978], viii). What remains pertinent, however, is that this conclusion meant he saw no need for data on what we would call environmental or social variables. He could investigate heredity solely through the patterns of similarity among relatives. Conversely, because Galton did not measure any environmental variables, he was able to reach conclusions only about (supposedly) inborn characters.

John Frank (2005), Scientific Director of the Institute of Population and Public Health of the Canadian Institutes for Health Research, has observed an equivalent but more systemic data-determined limitation in this age of genomics. Frank, an epidemiologist, asks what data needs to be collected over the life course of individuals so that researchers in, say, thirty years have the information needed to identify the key risk factors and interactions that account for variation in disease incidence and differential age of onset in a population, and for changing patterns of disease over time. He assumes that ‘diseases and conditions of later life occur in some and not others because of intense interactions between particular genetic constitutions and particular sequence of social and physical environments.’ There is, however, an uneven playing field. Genetic samples are cheap to collect and store and need to be collected only once in a lifetime. Environmental exposures vary over time so that ‘new samples are needed whenever exposure changes,’ are difficult to store, and are ‘getting costlier (as awareness of chemical/physical/biological complexity increases).’ Some epidemiologists have secured resources to follow small cohorts through time and collect a rich array of data on the individuals (e.g., the Southampton Women’s Survey [Inskip et al. 2006]), but the major investments are being made in collecting primarily genetic and disease data for large samples (e.g., the UK Biobank). Epidemiologists such as Frank have warned that analyses of such data will depend on crude estimates of environmental factors and be subject to large errors, uncertainties, and non-replicated findings about genetic influences. In the absence of longitudinal data on environmental exposures, biomedicine has almost no option but to emphasize the effects of genetic factors (but see Davey-Smith and Ebrahim 2007).

Extracted from P. Taylor, “Infrastructure and Scaffolding: Interpretation and Change of Research Involving Human Genetic Information,” Science as Culture, 18(4):435-459, 2009


Davey-Smith, G. and S. Ebrahim (2007). Mendelian randomization: Genetic variants as instruments for strengthening causal inference in observational studies. Biosocial Surveys. M. Weinstein, J. W. Vaupel and K. W. Wachter. (Washington, DC: National Academies Press), 336-366.

Frank, J. (2005). A Tale of (More Than ?) Two Cohorts – from Canada. 3rd International Conference on Developmental Origins of Health and Disease.

Galton, F. (1865). Hereditary talent and character. Macmillan’s Magazine 12: 157-66, 318-327.

Galton, F. (1875). The history of twins, as a criterion of the relative powers of nature and nurture. Fraser’s Magazine 12: 566-576.

Galton, F. (1978). Hereditary Genius. (New York, NY: St. Martin’s Press).

Inskip, H. M., K. M. Godfrey, S. M. Robinson, C. M. Law, D. J. Barker, C. Cooper and SWS Study Group (2006). Cohort profile: The Southampton Women’s Survey. International Journal of Epidemiology 35(1): 42-48.


