Representations of Significance: Visualization in Data-Intensive Biology

Fridolin Gross,
Pierre-Luc Germain

Abstract

This contribution investigates how results based on large, computationally processed data sets (i.e. high-content and/or high-throughput) are presented in scientific publications in molecular biology. In particular, we focus on the relationship between the visualization and the statistical treatment of such data. On the one hand, by their very nature, these data sets appear to necessitate a statistical approach: the measurements are quantitative, and the outcome is usually presented in the form of statistical tests providing evidence for the presence or absence of a particular phenomenon or effect. On the other hand, data are often presented visually: representations based on graphical devices such as heatmaps are used for various purposes in scientific communication. However, it seems that not all disciplines dealing with large and complex digital data make the same use of visual representations. Differences may arise from the specific nature of the investigated phenomena or from the rhetorical and analytic traditions of different fields. Visual representations play (at least) two main roles in data-intensive molecular biology. First, they reveal patterns that were not visible to the 'naked eye' (so to speak). Second, in depicting a pattern, a representation also endeavors to convey an idea of its robustness or significance, in a way that, we argue, is neither completely distinct from nor completely redundant with statistical tests. There are circumstances in which a scientific community deems both a visual representation and a statistical test necessary, and others in which one or the other alone is judged sufficient evidence.
We examine several important cases of this kind from computational and molecular biology, and argue that they (and, more generally, the relationship between data visualization and indications of statistical significance) can yield important insights into the different aims that visualizations and statistical tests might serve in scientific practice and communication, the epistemic culture of a field, and the practical nature of evidence. In particular, we ask if and how the visual and embodied tradition of molecular biology has carried over into the realm of highly computationally processed experiments, and take some preliminary steps in investigating this question.