Live Mathematics on Authorea
A Case for Transparency in Science


Authorea is a collaborative platform for writing in research and education, with a focus on web-first, high quality scientific documents.

We offer a tour through our integration of technologies that turn math-rich papers into transparent, active objects. We currently employ Pandoc and LaTeXML (authoring), MathJax (math rendering and clipboard), D3.js (data visualization), IPython (computation), and Flotchart and Bokeh (interactive plots).

This paper presents the challenges and rewards of integrating active web components for mathematics while preserving backwards compatibility with classic publishing formats. We conclude with an outlook on the upcoming mathematics enhancements on Authorea and a technology wishlist for the coming year.



The motivation behind creating Authorea has been to help streamline academic collaboration in writing any flavor of scientific documentation, notably research papers aimed at passing peer review and getting published in scientific proceedings. While the authorship and submission experience comes first, a close second goal is to increase the openness of the scientific process, using the final publication as a “looking glass” into the practices and data collection that happened “behind the scenes”.

We proceed to motivate why transparent research has superior properties and use “live mathematics” as one example of how Authorea enables it.

The core of the transparency problem is that we are still using the original publishing metaphor for documents, dating back to the innovations of 16th century Galileo Galilei, while simultaneously working on 21st century projects which are potentially large-scale, high-dimensional, multi-author and/or internationally distributed (Goodman 2014). The usual scientific document submitted to academic venues today is still oriented towards the printed page, remains opaque to the underlying data, of which it presents static snapshots, and is constrained by page count and margin sizes, often preventing it from providing sufficient detail of methodology and experimental setup.

This disconnect between experimental results and publications leaves room for unintentional bias and experimental defects to go unnoticed, making it difficult for reviewers to verify the work in question and for follow-up experiments to continue it. Studies have shown that even journals with the highest impact factors are vulnerable to retractions (see Fig. \ref{fig:retractions} for an illustration from (Fang 2011)). In 2015 we also observed a stream of high-profile retractions from some of the highest-ranked journals that illustrate this problem, as tracked and discussed on the website (seen June 2015) of the recent Retraction Watch initiative (Marcus 2011).

\label{fig:retractions}As shown in (Fang 2011):

Correlation between impact factor and retraction index. The 2010 journal impact factor (37) is plotted against the retraction index as a measure of the frequency of retracted articles from 2001 to 2010 (see text for details). Journals analyzed were Cell, EMBO Journal, FEMS Microbiology Letters, Infection and Immunity, Journal of Bacteriology, Journal of Biological Chemistry, Journal of Experimental Medicine, Journal of Immunology, Journal of Infectious Diseases, Journal of Virology, Lancet, Microbial Pathogenesis, Molecular Microbiology, Nature, New England Journal of Medicine, PNAS, and Science.

Facets of Transparency

In contrast, we offer a brief enumeration of the positive effects that transparency of methodology and data has on the scientific process:


Correctly repeating an experiment, or reproducing a proof, while arriving at the same results is foundational for establishing scientific truths. That is only possible for third-party scientists if the process is described in full detail in the original publication. That includes a range of diverse techniques, from experimental protocols and equipment specifications to exact computational methods and programs, as well as mathematical proof steps and derivations.


Building on, as well as improving, results achieved in prior work depends on first being able to reproduce them, and then being able to modify each step with enhancements or customizations relevant to the follow-up experiment. That is only possible if there are no “black box” components in the methodology, i.e. if every step is open to both scrutiny and modification.



While accessibility classically refers to people with disabilities, we use the term “scientific accessibility” in a broader sense. The dissemination of published works can be limited not only by impairments of the reader, but also by a language barrier (both geographic and in terms of the terminology and mathematical notation used in different fields), by a technological barrier (e.g. use of closed, proprietary standards or badly maintained custom tools), and by data blackouts (e.g. disconnect from the underlying datasets summarized by a paper’s figures and tables).


A substantial prerequisite for using a published result as a building block for follow-up work is the ease of access and quality of curation of all referenced materials and datasets. This could be problematic if resources are located behind institutional “paywalls” or restrictive copyright licenses, are too old to be in digital form, or simply remain unavailable for public review due to being considered too minor to be of importance.

Live Mathematics

The vision of “Live Mathematics” is a subset of the feature set captured by the “Active Documents Paradigm” for STEM (Kohlhase 2011). We aim to enhance the transparency of mathematical content by providing the capabilities to attach underlying numerical data, to encode the mathematical properties as targeted programs embedded in the document, and by feeding that active data into in