WORKING DRAFT authorea.com/9460
    FRETBursts: An Open Source Toolkit for Analysis of Freely-Diffusing Single-Molecule FRET

    Abstract

    Single-molecule Förster Resonance Energy Transfer (smFRET) allows probing intermolecular interactions and conformational changes in biomacromolecules, and represents an invaluable tool for studying cellular processes at the molecular scale. smFRET experiments can detect the distance between two fluorescent labels (donor and acceptor) in the 3-10 nm range. In the commonly employed confocal geometry, molecules are free to diffuse in solution. When a molecule traverses the excitation volume, it emits a burst of photons, which can be detected by single-photon avalanche diode (SPAD) detectors. The intensities of donor and acceptor fluorescence can then be related to the distance between the two fluorophores.

    While recent years have seen a growing number of contributions proposing improvements or new techniques in smFRET data analysis, those publications have rarely been accompanied by a software implementation. In particular, despite the widespread application of smFRET, no complete software package for smFRET burst analysis is freely available to date.

    In this paper, we introduce FRETBursts, open source software for the analysis of freely-diffusing smFRET data. FRETBursts allows executing all the fundamental steps of smFRET burst analysis using state-of-the-art as well as novel techniques, while providing an open, robust and well-documented implementation. Therefore, FRETBursts represents an ideal platform for comparison and development of new methods in burst analysis.

    We employ modern software engineering principles in order to minimize bugs and facilitate long-term maintainability. Furthermore, we place a strong focus on reproducibility by relying on Jupyter notebooks for FRETBursts execution. Notebooks are executable documents capturing all the steps of the analysis (including data files, input parameters, and results) and can be easily shared to replicate complete smFRET analyses. Notebooks allow beginners to execute complex workflows and advanced users to customize the analysis for their own needs. By bundling analysis description, code and results in a single document, FRETBursts makes it possible to seamlessly share analysis workflows and results, encourages reproducibility and facilitates collaboration among researchers in the single-molecule community.

    Introduction

    Open Science and Reproducibility

    Over the past 20 years, single-molecule FRET (smFRET) has grown into one of the most useful techniques in single-molecule spectroscopy (Weiss 1999, Hohlbein 2014). While it is possible to extract information on sub-populations using ensemble measurements (e.g. (Lerner 2014, Rahamim 2015)), the unique feature of smFRET is its ability to straightforwardly resolve conformational changes of biomolecules or measure binding-unbinding kinetics in heterogeneous samples (Selvin 2000, Roy 2008, Schuler 2008, Sisamakis 2010, Haran 2012). smFRET measurements on freely diffusing molecules (the focus of this paper) have an additional advantage over measurements performed on immobilized molecules: they allow molecules and processes to be probed without perturbation from surface immobilization or from the additional functionalization needed for surface attachment (Eggeling 1998, Dahan 1999).

    The increasing amount of work using freely-diffusing smFRET has motivated a growing number of theoretical contributions to the specific topic of data analysis (Fries 1998, Eggeling 2001, Zhang 2005, Gopich 2005, Lee 2005, Nir 2006, Antonik 2006, Gopich 2007, Gopich 2008, Camley 2009, Santoso 2010, Torella 2011, Tomov 2012). Despite this profusion of publications, most research groups still rely on their own implementation of a limited number of methods, with very little collaboration or code sharing. To clarify this statement, let us point out that our own group’s past smFRET papers merely mention the use of custom-made software without additional details (Lee 2005, Nir 2006). Even though some of these software tools are made available upon request, or sometimes shared publicly on websites, it remains hard to reproduce and validate results from different groups, let alone build upon them. Additionally, as new methods are proposed in the literature, it is generally difficult to quantify their performance compared to other methods. An independent quantitative assessment would require a complete reimplementation, an effort few groups can afford. As a result, potentially useful analysis improvements are adopted by the community rarely or slowly. In contrast with other established traditions such as sharing protocols and samples, in the domain of scientific software we have relegated ourselves to islands of non-communication.

    From a more general standpoint, the non-availability of the code used to produce scientific results hinders reproducibility, makes it impossible to review and validate the software’s correctness, and prevents improvements and extensions by other scientists. This situation, common in many disciplines, represents a real impediment to scientific progress. Since the pioneering work of the Donoho group in the 1990s (Buckheit 1995), it has become evident that developing and maintaining open source scientific software for reproducible research is a critical requirement of the modern scientific enterprise (Ince 2012, Vihinen 2015).

    Other disciplines have started tackling this issue (Eglen 2016), and even in the single-molecule field a few recent publications have provided software for analysis of surface-immobilized ex