Antonino Ingargiola Add diff files  almost 8 years ago

Commit id: ee446b5c7314979ed3b96dd77cc9ba78e17af759


% Template for PLoS  %DIF LATEXDIFF DIFFERENCE FILE  %DIF DEL full_article_928.tex Tue Jun 28 13:25:24 2016  %DIF ADD full_article_161.tex Thu Jun 30 12:52:03 2016  % Version 3.1 February 2015  %  % To compile to pdf, run:  % latex plos.template  % bibtex plos.template  % latex plos.template  % latex plos.template  % dvipdf plos.template  %  % % % % % % % % % % % % % % % % % % % % % %  %  % -- IMPORTANT NOTE  %  % This template contains comments intended  % to minimize problems and delays during our production  % process. Please follow the template instructions  % whenever possible.  %  % % % % % % % % % % % % % % % % % % % % % % %  %  % Once your paper is accepted for publication,  % PLEASE REMOVE ALL TRACKED CHANGES in this file and leave only  % the final text of your manuscript.  %  % There are no restrictions on package use within the LaTeX files except that  % no packages listed in the template may be deleted.  %  % Please do not include colors or graphics in the text.  %  % Please do not create a heading level below \subsection. For 3rd level headings, use \paragraph*{}.  %  % % % % % % % % % % % % % % % % % % % % % % %  %  % -- FIGURES AND TABLES  %  % Please include tables/figure captions directly after the paragraph where they are first cited in the text.  %  % DO NOT INCLUDE GRAPHICS IN YOUR MANUSCRIPT  % - Figures should be uploaded separately from your manuscript file.  % - Figures generated using LaTeX should be extracted and removed from the PDF before submission.  % - Figures containing multiple panels/subfigures must be combined into one image file before submission.  % For figure citations, please use "Fig." instead of "Figure".  % See http://www.plosone.org/static/figureGuidelines for PLOS figure guidelines.  
%  % Tables should be cell-based and may not contain:  % - tabs/spacing/line breaks within cells to alter layout or alignment  % - vertically-merged cells (no tabular environments within tabular environments, do not use \multirow)  % - colors, shading, or graphic objects  % See http://www.plosone.org/static/figureGuidelines#tables for table guidelines.  %  % For tables that exceed the width of the text column, use the adjustwidth environment as illustrated in the example table in text below.  %  % % % % % % % % % % % % % % % % % % % % % % % %  %  % -- EQUATIONS, MATH SYMBOLS, SUBSCRIPTS, AND SUPERSCRIPTS  %  % IMPORTANT  % Below are a few tips to help format your equations and other special characters according to our specifications. For more tips to help reduce the possibility of formatting errors during conversion, please see our LaTeX guidelines at http://www.plosone.org/static/latexGuidelines  %  % Please be sure to include all portions of an equation in the math environment.  %  % Do not include text that is not math in the math environment. For example, CO2 will be CO\textsubscript{2}.  %  % Please add line breaks to long display equations when possible in order to fit size of the column.  %  % For inline equations, please do not include punctuation (commas, etc) within the math environment unless this is part of the equation.  %  % % % % % % % % % % % % % % % % % % % % % % % %  %  % Please contact [email protected] with any questions.  
%  % % % % % % % % % % % % % % % % % % % % % % % %  \documentclass[10pt,letterpaper]{article}  \usepackage[top=0.85in,left=2.75in,footskip=0.75in]{geometry}  % Use adjustwidth environment to exceed column width (see example table in text)  \usepackage{changepage}  % Use Unicode characters when possible  %\usepackage[utf8]{inputenc}  % textcomp package and marvosym package for additional characters  \usepackage{textcomp,marvosym}  % fixltx2e package for \textsubscript  \usepackage{fixltx2e}  % amsmath and amssymb packages, useful for mathematical formulas and symbols  \usepackage{amsmath,amssymb}  % cite package, to clean up citations in the main text. Do not remove.  \usepackage{cite}  % Use nameref to cite supporting information files (see Supporting Information section for more info)  \usepackage{nameref}  \usepackage{color}  \usepackage[colorlinks=true,  linkcolor=blue,  urlcolor=blue,  citecolor=black]{hyperref}  % line numbers  \usepackage[right]{lineno}  % ligatures disabled  \usepackage{microtype}  \DisableLigatures[f]{encoding = *, family = * }  % rotating package for sideways tables  \usepackage{rotating}  % Remove comment for double spacing  %\usepackage{setspace}  %\doublespacing  \usepackage{graphicx}  \usepackage[space]{grffile}  \usepackage{latexsym}  \usepackage{textcomp}  \usepackage{longtable}  \usepackage{multirow,booktabs}  % You can conditionalize code for latexml or normal latex using this.  
\newif\iflatexml\latexmlfalse  \usepackage[utf8]{inputenc}  \usepackage[ngerman,greek,english]{babel}  %% Neutralize any \includegraphics in the document, as PLOS does not allow figures in the final submission  \makeatletter  \let\orig@includegraphics\includegraphics  \AtBeginDocument{\let\includegraphics\PLOS@ignore}  \newcommand{\PLOS@ignore}[2][]{}  \makeatother  % Text layout  \raggedright  \setlength{\parindent}{0.5cm}  \textwidth 5.25in  \textheight 8.75in  % Bold the 'Figure #' in the caption and separate it from the title/caption with a period  % Captions will be left justified  \usepackage[aboveskip=1pt,labelfont=bf,labelsep=period,justification=raggedright,singlelinecheck=off]{caption}  % Use the PLoS provided BiBTeX style  \bibliographystyle{plos2015}  % Remove brackets from numbering in List of References  \makeatletter  \renewcommand{\@biblabel}[1]{\quad#1.}  \makeatother  % Leave date blank  \date{}  % Header and Footer with logo  \usepackage{lastpage,fancyhdr,graphicx}  \usepackage{epstopdf}  \pagestyle{myheadings}  \pagestyle{fancy}  \fancyhf{}  \makeatletter  \lhead{\orig@includegraphics[width=2.0in]{PLOS-submission.eps}}  \makeatother  \rfoot{\thepage/\pageref{LastPage}}  \renewcommand{\footrule}{\hrule height 2pt \vspace{2mm}}  \fancyheadoffset[L]{2.25in}  \fancyfootoffset[L]{2.25in}  \lfoot{\sf PLOS}  %% Include all macros below  \newcommand{\lorem}{{\bf LOREM}}  \newcommand{\ipsum}{{\bf IPSUM}}  \usepackage{color}  \usepackage{listings}  \lstset{ %  backgroundcolor=\color{white}, % choose the background color  basicstyle=\footnotesize\ttfamily, % size of fonts used for the code  breaklines=true, % automatic line breaking only at whitespace  captionpos=b, % sets the caption-position to bottom  commentstyle=\color{OliveGreen}, % comment style  keywordstyle=\color{blue}, % keyword style  stringstyle=\color{black}, % string literal style  language=Python, % Set your language (you can change the language for each code-block optionally)  frame=l, %  
xleftmargin=\fboxsep, %  xrightmargin=-\fboxsep, %  }  \hyphenation{smFRET}  \hyphenation{FRETBursts}  %% END MACROS SECTION  %DIF PREAMBLE EXTENSION ADDED BY LATEXDIFF  %DIF UNDERLINE PREAMBLE %DIF PREAMBLE  \RequirePackage[normalem]{ulem} %DIF PREAMBLE  \RequirePackage{color}\definecolor{RED}{rgb}{1,0,0}\definecolor{BLUE}{rgb}{0,0,1} %DIF PREAMBLE  \providecommand{\DIFaddtex}[1]{{\protect\color{blue}\uwave{#1}}} %DIF PREAMBLE  \providecommand{\DIFdeltex}[1]{{\protect\color{red}\sout{#1}}} %DIF PREAMBLE  %DIF SAFE PREAMBLE %DIF PREAMBLE  \providecommand{\DIFaddbegin}{} %DIF PREAMBLE  \providecommand{\DIFaddend}{} %DIF PREAMBLE  \providecommand{\DIFdelbegin}{} %DIF PREAMBLE  \providecommand{\DIFdelend}{} %DIF PREAMBLE  %DIF FLOATSAFE PREAMBLE %DIF PREAMBLE  \providecommand{\DIFaddFL}[1]{\DIFadd{#1}} %DIF PREAMBLE  \providecommand{\DIFdelFL}[1]{\DIFdel{#1}} %DIF PREAMBLE  \providecommand{\DIFaddbeginFL}{} %DIF PREAMBLE  \providecommand{\DIFaddendFL}{} %DIF PREAMBLE  \providecommand{\DIFdelbeginFL}{} %DIF PREAMBLE  \providecommand{\DIFdelendFL}{} %DIF PREAMBLE  %DIF END PREAMBLE EXTENSION ADDED BY LATEXDIFF  %DIF PREAMBLE EXTENSION ADDED BY LATEXDIFF  %DIF HYPERREF PREAMBLE %DIF PREAMBLE  \providecommand{\DIFadd}[1]{\texorpdfstring{\DIFaddtex{#1}}{#1}} %DIF PREAMBLE  \providecommand{\DIFdel}[1]{\texorpdfstring{\DIFdeltex{#1}}{}} %DIF PREAMBLE  %DIF END PREAMBLE EXTENSION ADDED BY LATEXDIFF  \begin{document}  \vspace*{0.35in}  % Title must be 250 characters or less.  \begin{flushleft}  {\Large  \textbf\newline{\input{title}}  }  \newline  % Insert author names, affiliations and corresponding author email (do not include titles, positions, or degrees).  \\  Antonino Ingargiola\textsuperscript{1*},  Eitan Lerner\textsuperscript{1},  SangYoon Chung\textsuperscript{1},  Shimon Weiss\textsuperscript{1},  Xavier Michalet\textsuperscript{1},  \\  \bigskip  \textbf{1} Dept. Chemistry and Biochemistry, Univ. 
of California Los Angeles, Los Angeles, CA, USA  \bigskip  % Use the asterisk to denote corresponding authorship and provide email address in note below.  * [email protected]  \end{flushleft}  % Please keep the abstract below 300 words  \section*{Abstract}  Single-molecule Förster Resonance Energy Transfer (smFRET) allows  probing intermolecular interactions and conformational changes in  biomacromolecules, and represents an invaluable tool for studying  cellular processes at the molecular scale. smFRET experiments can  detect the distance between two fluorescent labels (donor and acceptor)  in the 3-10~nm range. In the commonly employed confocal geometry,  molecules are free to diffuse in solution. When a molecule traverses  the excitation volume, it emits a burst of photons, which can be detected  by single-photon avalanche diode (SPAD) detectors. The intensities of  donor and acceptor fluorescence can then be related to the distance  between the two fluorophores.  While recent years have seen a growing number of contributions  proposing improvements or new techniques in smFRET data analysis,  rarely have those publications been accompanied by software implementation.  In particular, despite the widespread application of smFRET, no complete  software package for smFRET burst analysis is freely available to date.  In this paper, we introduce FRETBursts, an open source software  for analysis of freely-diffusing smFRET data.  FRETBursts allows executing all the fundamental steps of smFRET bursts  analysis using state-of-the-art as well as novel techniques,  while providing an open, robust and well-documented implementation.  Therefore, FRETBursts represents an ideal platform for comparison  and development of new methods in burst analysis.  We employ modern software engineering principles in order to  minimize bugs and facilitate long-term maintainability.  
Furthermore, we place a strong focus on reproducibility by relying on  Jupyter notebooks for FRETBursts execution.  Notebooks are executable documents capturing all the steps of the  analysis (including data files, input parameters, and results) and can  be easily shared to replicate complete smFRET analyses.  Notebooks allow beginners to execute complex workflows  and advanced users to customize the analysis for their own needs.  By bundling analysis description, code and results in a single document,  FRETBursts allows seamless sharing of analysis workflows  and results, encourages reproducibility and facilitates collaboration  among researchers in the single-molecule community.  % Please keep the Author Summary between 150 and 200 words  % Use first person. PLOS ONE authors please skip this step.  % Author Summary not valid for PLOS ONE submissions.  %\section*{Author Summary}  \linenumbers  \section*{Introduction}  \subsection*{Open Science and Reproducibility}  Over the past 20 years, single-molecule FRET (smFRET) has grown into one of the most  useful techniques in single-molecule spectroscopy~\cite{Weiss_1999,Hohlbein_2014}.  While it is possible to extract information on sub-populations using ensemble measurements  (e.g.~\cite{Lerner_2014,Rahamim_2015}),  the unique feature of smFRET is its ability to straightforwardly resolve conformational  changes of biomolecules or measure binding-unbinding kinetics in heterogeneous  samples~\cite{Selvin_2000,Roy_2008,Schuler_2008,Sisamakis_2010,Haran_2012}.  smFRET measurements on freely diffusing molecules (the focus of this paper)  have the additional advantage, over measurements performed on immobilized molecules,  of allowing molecules and processes to be probed without perturbation from surface  immobilization or additional functionalization needed for surface  attachment~\cite{Eggeling_1998,Dahan_1999}.  
The increasing amount of work using freely-diffusing smFRET has motivated  a growing number of theoretical contributions to the specific topic of data  analysis~\cite{Fries_1998,Eggeling_2001,Zhang_2005,Gopich_2005,Lee_2005,Nir_2006,Antonik2006,Gopich_2007,Gopich_2008,Camley_2009,Santoso_2010,Torella_2011,Tomov_2012}.  Despite this profusion of publications, most research groups still rely on  their own implementation of a limited number of methods, with very little  collaboration or code sharing.  To clarify this statement, let us point out that our own group's past smFRET papers  merely mention the use of custom-made software without additional details~\cite{Lee_2005,Nir_2006}.  Even though some of these software tools are made available upon request,  or sometimes shared publicly on websites,  it remains hard to reproduce and validate results from different groups,  let alone build upon them.  Additionally, as new methods are proposed in the literature,  it is generally difficult to quantify their performance compared to other methods.  An independent quantitative assessment  would require a complete reimplementation, an effort few groups can afford.  As a result, potentially useful analysis improvements  are either rarely or slowly adopted by the community.  In contrast with other established traditions such as  sharing protocols and samples, in the domain of scientific software,  we have relegated ourselves to islands of non-communication.  From a more general standpoint, the non-availability of the code  used to produce scientific results hinders reproducibility,  makes it impossible to review and validate the software's correctness  and prevents improvements and extensions by other scientists.  This situation, common in many disciplines,  represents a real impediment to scientific progress.  
Since the pioneering work of the Donoho group in the 1990s~\cite{Buckheit_1995},  it has become evident that developing and maintaining open source scientific software  for reproducible research is a critical requirement of the modern  scientific enterprise~\cite{Ince_2012,Vihinen_2015}.  %Peer-reviewed publications describing such software are also necessary~\cite{Pradal_2013},  %although the debate is still open on the most effective model for peer-reviewing this  %class of publications~\cite{Check_Hayden_2013,Check_Hayden_2015}  %(\href{https://software-carpentry.org/blog/2015/04/quality-is-free-getting-there-isnt.html}{Willson 2015})  %(\href{https://www.mozillascience.org/effective-code-review-for-journals}{Mills 2015})  %(\href{http://ivory.idyll.org/blog/2015-we-live-in-a-bubble.html}{Brown 2015} and \href{http://ivory.idyll.org/blog/on-code-review-of-scientific-code.html}{2013}).  Other disciplines have started tackling this issue~\cite{Eglen_2016},  and even in the single-molecule field a few recent publications have provided  software for analysis of surface-immobilized experiments~\cite{McKinney_2006,Bronson_2009,Greenfeld_2012,K_nig_2013,van_de_Meent_2014}.  For freely-diffusing smFRET experiments, although it is common to find mention of  ``code available from the authors upon request'' in publications, there is a dearth  of such open source code, with, to our knowledge, the notable exception of a single  example~\cite{Murphy2014}.  To address this issue, we have developed FRETBursts,  an open source Python package for analysis of freely-diffusing single-molecule FRET measurements.  FRETBursts can be used, inspected and modified by anyone interested in using  state-of-the-art smFRET analysis methods or implementing modifications or completely new techniques.  FRETBursts therefore represents an ideal platform  for quantitative comparison of different methods for smFRET burst analysis.  
Technically, a strong emphasis has been given to the reproducibility of complete analysis  workflows. FRETBursts uses Jupyter Notebooks~\cite{Shen_2014},  interactive and executable documents containing textual narrative, input parameters,  code, and computational results (tables, plots, etc.). A notebook thus captures the various analysis steps  in a document which is easy to share and execute.  To minimize the possibility of bugs being introduced inadvertently~\cite{Soergel_2015},  we employ modern software engineering techniques  such as unit testing and continuous integration~\cite{Wilson_2014,Eglen_2016}.  FRETBursts is hosted on GitHub~\cite{Blischak_2016,Prli__2012},  where users can write comments, report issues or contribute code.  In a related effort, we recently introduced Photon-HDF5~\cite{Ingargiola2016},  an open file format for timestamp-based single-molecule fluorescence  experiments. Another related open source tool is PyBroMo~\cite{Ingargiola_2016},  a freely-diffusing smFRET simulator which produces Photon-HDF5 files that are  directly analyzable with FRETBursts.  Together with all the aforementioned tools, FRETBursts contributes to the growing  ecosystem of open tools for reproducible science in the single-molecule field.  \subsection*{Paper Overview}  This paper is written as an introduction to smFRET burst analysis and  its implementation in FRETBursts.  The aim is to illustrate the specifics and  trade-offs involved in various approaches  in sufficient detail to enable readers  to customize the analysis for their own needs.  After a brief overview of FRETBursts features (section~\nameref{sec:overview}),  we introduce essential concepts and terminology for smFRET burst analysis  (section~\nameref{sec:concepts}).  
In section~\nameref{sec:analysis}, we illustrate the steps involved  in smFRET burst analysis: (i) data loading (section~\nameref{sec:dataload}),  (ii) definition of the excitation alternation periods  (section~\nameref{sec:alternation}), (iii) background correction  (section~\nameref{sec:bg_calc}), (iv) burst search  (section~\nameref{sec:burstsearch}),  (v) burst selection (section~\nameref{sec:burstsel}) and  (vi) FRET histogram fitting (section~\nameref{sec:fretfit}).  As an example  of implementation of an advanced data processing technique,  section~\nameref{sec:bva} walks the reader through implementing  Burst Variance Analysis (BVA)~\cite{Torella_2011}.  Finally, section~\nameref{sec:conclusions} summarizes what we believe  to be the strengths of FRETBursts software.  Throughout this paper,  links to relevant sections of documentation and other web resources  are displayed as ``(link)''.  In order to make the text more legible,  we have concentrated Python-specific details in paragraphs titled  \textit{Python details}. These paragraphs provide deeper insights for readers  already familiar with Python and can be initially skipped by readers who are not.  Finally, note that all commands and figures in this paper can be regenerated  using the accompanying notebooks  (\href{https://github.com/tritemio/fretbursts_paper}{link}).  \section*{FRETBursts Overview}  \label{sec:overview}  \subsection*{Technical Features}  FRETBursts can analyze smFRET measurements  from one or multiple excitation spots~\cite{Ingargiola_2013}. The supported  excitation schemes include single laser, alternating laser excitation (ALEX)  with either CW lasers (μs-ALEX~\cite{Kapanidis_2005})  or pulsed lasers (ns-ALEX~\cite{Laurence_2005} or  pulsed-interleaved excitation (PIE)~\cite{M_ller_2005}).  
The software implements both standard and novel algorithms for smFRET data analysis  including background estimation as a function of time (including background accuracy  metrics), sliding-window burst search~\cite{Eggeling_1998},  dual-channel burst search (DCBS)~\cite{Nir_2006} and  modular burst selection methods based on user-defined criteria  (including a large set of pre-defined selection rules). Novel features include burst size  selection with $\gamma$-corrected burst sizes, burst weighting, and burst search with a  background-dependent threshold (in order to guarantee a minimal signal-to-background  ratio~\cite{Michalet_2012}).  Moreover, FRETBursts provides a large set of fitting options to characterize FRET subpopulations.  In particular, distributions of burst quantities (such as $E$ or $S$) can be assessed  through (1) histogram fitting (with arbitrary model functions),  (2) non-parametric weighted kernel density estimation (KDE), (3) weighted  expectation-maximization (EM), and (4) maximum likelihood fitting using Gaussian models  or Poisson statistics. Finally, FRETBursts includes a large number of  predefined and customizable plot functions which (thanks to the \textit{matplotlib}  graphic library) produce publication-quality plots in a wide range of formats.  Additionally, implementations of population dynamics analysis such  as Burst Variance Analysis (BVA)~\cite{Torella_2011} and two-channel  kernel density distribution estimator (2CDE)~\cite{Tomov_2012}  are available as FRETBursts notebooks  \DIFdelbegin \DIFdel{.  }\DIFdelend \DIFaddbegin \DIFadd{(}\href{http://nbviewer.jupyter.org/github/tritemio/FRETBursts_notebooks/blob/master/notebooks/Example%20-%20Burst%20Variance%20Analysis.ipynb}{BVA link},  \href{http://nbviewer.jupyter.org/github/tritemio/FRETBursts_notebooks/blob/master/notebooks/Example%20-%202CDE%20Method.ipynb}{2CDE link}\DIFadd{).  }\DIFaddend   \subsection*{Software Availability}  FRETBursts is hosted and openly developed on GitHub. 
The FRETBursts homepage  (\href{http://tritemio.github.io/FRETBursts}{link})  contains links to the various resources. \DIFaddbegin \DIFadd{Pre-built packages are provided for  Windows, OS X and Linux. }\DIFaddend Installation instructions  can be found in the Reference Documentation  (\href{http://fretbursts.readthedocs.org/en/latest/getting_started.html}{link}).  A description of FRETBursts execution using Jupyter notebooks is reported  in~\nameref{sec:notebook}. % SI_link  Detailed information on development style, testing strategies and  contribution guidelines is reported in~\nameref{sec:dev}. % SI_link  Finally, to facilitate evaluation and comparison with other software,  we set up an online service that allows executing FRETBursts  without requiring any installation on the user's computer (\href{https://github.com/tritemio/FRETBursts_notebooks#run-online}{link}).  \section*{Architecture and Concepts}  \label{sec:concepts}  In this section, we introduce some general burst analysis concepts  and notations used in FRETBursts.  \subsection*{Photon Streams}  \label{sec:ph_streams}  The raw data collected during an smFRET experiment consists of one or more arrays of  photon timestamps, whose temporal resolution is set by the acquisition hardware,  typically between 10 and 50~ns.  In single-spot measurements, all timestamps are stored in a single array. In multispot  measurements~\cite{Ingargiola_2013}, there are as many timestamp arrays  as excitation spots.  Each array contains timestamps from both donor (D) and acceptor (A) channels.  When alternating excitation lasers are used (ALEX measurements)~\cite{Lee_2005},  a further distinction between photons emitted during the D or A excitation periods can be made.  In FRETBursts, the corresponding sets of photons are called ``photon streams''   and are specified with a \verb|Ph_sel| object  (\href{http://fretbursts.readthedocs.org/en/latest/ph_sel.html}{link}).  
In non-ALEX smFRET data, there are 3 photon streams  (table~\ref{tab:ph_sel_smfret}), while in \DIFaddbegin \DIFadd{two-color }\DIFaddend ALEX data,   there are 5 streams (table~\ref{tab:ph_sel_alex}).  The \verb|Ph_sel| class (\href{http://fretbursts.readthedocs.org/en/latest/ph_sel.html}{link})  allows the specification of any combination of photon streams.  For example, in ALEX measurements, the D-emission during A-excitation stream is  usually ignored because it does not contain any useful signal~\cite{Lee_2005}.  To select all photons except those in this stream, the syntax is  \verb|Ph_sel(Dex='DAem', Aex='Aem')|, which indicates selection of donor  and acceptor photons (\verb|DAem|) during donor excitation (\verb|Dex|) and only acceptor  photons (\verb|Aem|) during acceptor excitation (\verb|Aex|).  \begin{table}  \begin{tabular}{l|l}  Photon selection & code \\  \hline  All-photons & \verb|Ph_sel('all')|\\  D-emission & \verb|Ph_sel(Dex='Dem')|\\  A-emission & \verb|Ph_sel(Dex='Aem')|\\  \end{tabular}  \caption{\label{tab:ph_sel_smfret}Photon selection syntax (non-ALEX)}  \end{table}  \begin{table}  \begin{tabular}{l|l}  Photon selection & code \\  \hline  All-photons & \verb|Ph_sel('all')|\\  D-emission during D-excitation & \verb|Ph_sel(Dex='Dem')|\\  A-emission during D-excitation & \verb|Ph_sel(Dex='Aem')|\\  D-emission during A-excitation & \verb|Ph_sel(Aex='Dem')|\\  A-emission during A-excitation & \verb|Ph_sel(Aex='Aem')|\\  \end{tabular}  \caption{\label{tab:ph_sel_alex}Photon selection syntax (ALEX)}  \end{table}  \subsection*{Background Definitions}  \label{sec:bg_intro}  An estimate of the background rates is needed both to select a proper threshold for  burst search and to correct the raw burst counts by \DIFdelbegin \DIFdel{subtraction of }\DIFdelend \DIFaddbegin \DIFadd{subtracting }\DIFaddend background counts.  
The recorded stream of timestamps is the result of two processes: one characterized  by a high count rate, due to fluorescence photons of single molecules crossing the  excitation volume, and another characterized by a lower count rate, due to ``background  counts'' originating from detector dark counts, afterpulsing, out-of-focus molecules  and sample scattering and/or impurities~\cite{Edman_1996,Gopich_2008}.  The signature of these two types of processes can be  observed in the inter-photon delays distribution (i.e. the waiting times  between two subsequent timestamps) as illustrated in figure~\ref{fig:bg_dist_all}(a).  The ``tail'' of the distribution (a straight line in semi-log scale) corresponds  to exponentially-distributed time-delays, indicating that those counts are generated by a  Poisson process. At short  timescales, the distribution departs from the exponential due to the contribution  of the higher rate process of single molecules traversing the excitation volume.  To estimate the background rate (i.e. the inverse of the exponential time constant),  it is necessary to define a time-delay threshold above which the distribution  can be considered exponential.  Finally, a parameter estimation method needs to be specified, such as Maximum  Likelihood Estimation (MLE) or non-linear least squares curve fitting of  the time-delay histogram (both supported in FRETBursts).  \begin{figure}[h!]  \begin{center}  \includegraphics[width=0.77\columnwidth]{figures/ph_delays_distrib_all/ph_delays_distrib_all}  \caption{\label{fig:bg_dist_all} \textbf{Inter-photon delays fitted with an exponential function.}  Experimental distributions of inter-photon delays (\textit{dots}) and  corresponding fits of the exponential tail (\textit{solid lines}).  (\textit{Panel a}) An example of inter-photon delays distribution (\textit{red dots}) and an exponential fit  of the tail of the distribution (\textit{black line}).  
(\textit{Panel b}) Inter-photon delays distribution and exponential fit for different photon streams as obtained with \texttt{dplot(d, hist\_bg)}. The \textit{dots} represent the experimental histogram for the different photon streams. The \textit{solid lines} represent the corresponding exponential fit of the tail of the distributions. The legend shows abbreviations of the photon streams  and the fitted background rates.%  }  \end{center}  \end{figure}  It is advisable to monitor the background as a function of time  throughout the measurement, in order to account for possible variations.  Experimentally, we found that when the background is not constant,  it usually varies  on time scales of tens of seconds (see figure~\ref{fig:bg_timetrace}).  FRETBursts divides the acquisition into constant-duration time  windows called \textit{background periods} and computes the background rates for  each of these windows (see section~\nameref{sec:bg_calc}).  Note that FRETBursts uses these local background rates also during burst search,  in order to compute time-dependent burst detection thresholds  and for background correction of burst data (see section~\nameref{sec:burstsearch}).  \begin{figure}[h!]  \begin{center}  \includegraphics[width=0.91\columnwidth]{figures/background_timetrace/background_timetrace}  \caption{\label{fig:bg_timetrace} \textbf{Background rates as a function of time.}  Estimated background rate as a function of time for two μs-ALEX measurements.  Different colors represent different photon streams.  (\textit{Panel a}) A measurement performed with a sealed sample chamber  exhibiting a constant background as a function of time.  (\textit{Panel b}) A measurement performed on an unsealed sample exhibiting  significant background variations due to sample evaporation and/or  photobleaching (likely of impurities on the cover-glass).  These plots are produced by the command  \texttt{dplot(d, timetrace\_bg)} after estimation of background.  
Each data point in these figures is computed for a 30~s time window.%  }  \end{center}  \end{figure}  \subsection*{The \texttt{Data} Class}  \label{sec:data_intro}  The \verb|Data| class  (\href{http://fretbursts.readthedocs.org/en/latest/data_class.html}{link})  is the fundamental data container in FRETBursts. It contains the  measurement data and parameters (attributes) as well as several methods  for data analysis (background estimation, burst search, etc.).  All analysis results (bursts data, estimated parameters) are also stored  as \verb|Data| attributes.  There are 3 important ``burst counts'' attributes which contain  the number of photons detected in the donor or the acceptor channel  during donor or acceptor excitation (table~\ref{tab:data_n}).  The attributes in table~\ref{tab:data_n} are background-corrected by default.  Furthermore, \verb|na| is corrected for leakage and direct excitation  (section~\nameref{sec:corrcoeff}) if the corresponding coefficients are specified  (by default they are 0).  There is also a closely related attribute named \verb|nda| for donor photons  during acceptor excitation. \verb|nda| is normally neglected as it only contains  background.  \begin{table}  \begin{tabular}{l p{0.8\columnwidth}}  Name & Description \\  \hline  \verb|nd| & number of photons detected by the donor channel (during donor excitation period in ALEX case)\\  \verb|na| & number of photons detected by the acceptor channel (during donor excitation period in ALEX case)\\  \verb|naa| & number of photons detected by the acceptor channel during acceptor excitation period (present only in ALEX measurements)\\  \end{tabular}  \caption{\label{tab:data_n}\texttt{Data} attribute names and descriptions for burst photon counts in different photon streams.}  \end{table}  \paragraph*{Python details}  Many \verb|Data| attributes are lists of arrays (or scalars) with the length of the lists  equal to the number of excitation spots. 
This means that in  single-spot measurements, an array of burst-data  is accessed by specifying the index as 0, for example \verb|Data.nd[0]|.  \verb|Data| implements a shortcut syntax to access the first element of a list  with an underscore, so that \verb|Data.nd_| is equivalent to  \verb|Data.nd[0]|.  \subsection*{Introduction to Burst Search}  \label{sec:burstsearch_intro}  Identifying single-molecule fluorescence bursts in the stream of photons is  one of the most crucial steps in the analysis of freely-diffusing single-molecule FRET data.  The widely used ``sliding window'' algorithm, introduced by the Seidel group in  1998~\cite{Eggeling_1998,Fries_1998}, involves searching for  $m$ consecutive photons detected during a period shorter than  $\Delta t$. In other words, bursts are regions of the photon stream where the  local rate (computed using $m$ photons) is above a minimum threshold rate.  Since a universal criterion to choose the rate threshold and  the number of photons $m$ is, as of today, lacking, it has become common  practice to manually adjust those parameters for each specific measurement.  \DIFaddbegin \DIFadd{Commonly employed values for $m$ are between 5 and 15 photons.  }\DIFaddend   A more general approach consists in taking into account the background rate of  the specific measurement and in choosing a rate threshold that is $F$ times  larger than the background rate \DIFaddbegin \DIFadd{(typical values for $F$ are between 4 and 9)}\DIFaddend .   This approach ensures that all resulting bursts  have a signal-to-background ratio (SBR) larger than  $(F-1)$~\cite{Michalet_2012}. A consistent criterion for choosing the threshold is  particularly important when comparing different measurements with different background  rates, when the background significantly varies during measurements or in  multi-spot measurements where each spot has a different background rate.  
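As an illustration of the sliding-window criterion with a background-dependent threshold, the following NumPy sketch marks every group of $m$ consecutive photons whose duration is at most $\Delta t = m / (F \times \text{background rate})$ and fuses groups that share photons into a single burst. This is a minimal sketch and not the FRETBursts implementation: the function name \verb|sliding_window_burst_search|, the rate convention ($m$ photons divided by the window duration) and the synthetic timestamps are assumptions made for this example.

```python
import numpy as np


def sliding_window_burst_search(timestamps, m, min_rate):
    """Return a list of (first, last) photon indices, one pair per burst.

    Hypothetical helper for illustration only (not the FRETBursts API).
    A window of m consecutive photons passes when m / duration >= min_rate.
    """
    t = np.asarray(timestamps, dtype=float)
    if t.size < m:
        return []
    # Duration spanned by each group of m consecutive photons.
    durations = t[m - 1:] - t[:t.size - m + 1]
    passing = np.flatnonzero(durations <= m / min_rate)
    bursts = []
    for i in passing:
        first, last = int(i), int(i) + m - 1
        if bursts and first <= bursts[-1][1]:
            # Window shares photons with the previous burst: extend that burst.
            bursts[-1] = (bursts[-1][0], last)
        else:
            bursts.append((first, last))
    return bursts


# Synthetic data (assumed): 1 cps background plus a 10-photon burst at t = 5 s.
background_rate = 1.0   # counts per second
F = 6                   # threshold is F times the background rate
m = 5                   # photons used to compute the local rate
timestamps = np.concatenate([
    np.arange(0.0, 5.0),            # background photons at 0, 1, ..., 4 s
    5.0 + 0.001 * np.arange(10),    # burst: 10 photons spaced by 1 ms
    np.arange(6.0, 10.0),           # background photons at 6, ..., 9 s
])
bursts = sliding_window_burst_search(timestamps, m, F * background_rate)
print(bursts)  # [(5, 14)]
```

With these values the only detected burst spans photon indices 5 to 14, i.e. exactly the ten closely spaced photons. With a time-varying background, \verb|min_rate| would be recomputed for each background period.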
A second important aspect of burst search is the choice of photon stream used
to perform the search.
In most cases, for instance when identifying FRET sub-populations,
the burst search should use all \DIFdelbegin \DIFdel{photons (i.e. APBS). In some }\DIFdelend \DIFaddbegin \DIFadd{the photons, the so-called
all-photon burst search (APBS)~\mbox{%DIFAUXCMD
\cite{Eggeling_1998,Fries_1998,Nir_2006}}%DIFAUXCMD
.
In }\DIFaddend other cases, \DIFaddbegin \DIFadd{for example }\DIFaddend when focusing on
donor-only or \DIFdelbegin \DIFdel{acceptor only }\DIFdelend \DIFaddbegin \DIFadd{acceptor-only }\DIFaddend populations, it is better to perform 
the search using only donor or acceptor signal.
In order to handle the general case and to provide flexibility,
FRETBursts allows performing the burst search on arbitrary selections of photons
(see section~\nameref{sec:ph_streams} for more information on photon stream definitions).
Additionally, Nir~\textit{et al.}~\cite{Nir_2006} proposed \DIFdelbegin \DIFdel{DCBS (``}\DIFdelend \DIFaddbegin \DIFadd{a }\DIFaddend dual-channel 
burst search \DIFdelbegin \DIFdel{'') 
, }\DIFdelend \DIFaddbegin \DIFadd{(DCBS) 
}\DIFaddend which can help mitigate artifacts due to photophysical effects such as blinking.
During DCBS, a search is performed \DIFdelbegin \DIFdel{in parallel }\DIFdelend on two photon streams
and bursts are defined as periods during which both photon streams 
exhibit a rate higher than
the threshold, implementing the equivalent of an AND logic operation.
Conventionally, the term DCBS refers to a burst search where the two photon streams
are (1) all photons during donor excitation (\verb|Ph_sel(Dex='DAem')|) and
(2) acceptor channel photons during acceptor excitation (\verb|Ph_sel(Aex='Aem')|).
In FRETBursts, the user can choose arbitrary photon streams as input, and in general
this kind of search is called an ``AND-gate burst search''.
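The AND logic described above amounts to intersecting the burst intervals
found independently on the two photon streams. The following pure-Python
sketch shows this idea on two hypothetical lists of (start, stop) times; the
function name \verb|and_gate| and the interval values are assumptions for
illustration, and the actual FRETBursts implementation may differ.

\begin{lstlisting}
def and_gate(bursts1, bursts2):
    """Intersect two sorted lists of non-overlapping
    (start, stop) intervals."""
    out = []
    i = j = 0
    while i < len(bursts1) and j < len(bursts2):
        start = max(bursts1[i][0], bursts2[j][0])
        stop = min(bursts1[i][1], bursts2[j][1])
        if start < stop:     # the two intervals overlap
            out.append((start, stop))
        # advance the list whose interval ends first
        if bursts1[i][1] < bursts2[j][1]:
            i += 1
        else:
            j += 1
    return out

# Hypothetical burst intervals (in ms) from two streams
dex_bursts = [(1.0, 3.0), (5.0, 6.0), (8.0, 9.0)]
aex_bursts = [(2.0, 5.5), (7.0, 7.5)]
dcbs = and_gate(dex_bursts, aex_bursts)
print(dcbs)
\end{lstlisting}

\noindent In this example only the periods 2 to 3~ms and 5 to 5.5~ms, where
both streams are above threshold, are retained.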
After burst search, it is necessary to select  bursts, for instance by specifying a minimum number of photons (or burst size). In the most  basic form, this selection can be performed during burst search by discarding  bursts with size smaller than a threshold $L$ \DIFaddbegin \DIFadd{(typically 30 or higher)}\DIFaddend ,   as originally proposed by  Eggeling~\textit{et al.}~\cite{Eggeling_1998}.  This method, however, neglects the effect  of background and $\gamma$ factor on the burst size and can lead to a selection  bias for some channels and/or sub-populations.  For this reason, we suggest performing a burst size selection after background  correction, taking into account the $\gamma$ factor, as discussed in  sections~\nameref{sec:burstsizeweights} and~\nameref{sec:burstsel}.  In special cases, users may choose to replace (or combine)  the burst selection based on burst size  with another criterion such as burst duration or brightness (see section~\nameref{sec:burstsel}).  \subsection*{Corrected Burst Sizes and Weights}  \label{sec:burstsizeweights}  The number of photons detected during a burst --the ``burst size''--  is computed using either all photons, or photons detected  during donor excitation period. To compute the burst size, FRETBursts uses  one of the following formulas:  \begin{equation}  \label{eq:burstsize_dex}  n_{dex} = n_a + \gamma\,n_d  \end{equation}  \begin{equation}  \label{eq:burstsize_allph}  n_t = n_a + \gamma\,n_d + n_{aa}  \end{equation}  \noindent where $n_d$, $n_a$ and $n_{aa}$ are, similarly to the attributes  in table~\ref{tab:data_n}, the background-corrected  burst counts in different channels and excitation periods.  The factor $\gamma$ takes into account  different fluorescence quantum yields of donor and acceptor fluorophores and different  photon detection efficiencies between donor and acceptor detection  channels~\cite{Deniz_1999,Lee_2005}.  
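As a worked example of eq.~\ref{eq:burstsize_dex} and
eq.~\ref{eq:burstsize_allph}, the short sketch below evaluates both burst
sizes for a single burst. The function name \verb|burst_size| and the
numerical values are hypothetical; the code only mirrors the two equations
and is not the \verb|Data| method used by FRETBursts.

\begin{lstlisting}
def burst_size(nd, na, naa=0.0, gamma=1.0, add_naa=False):
    """Burst size from background-corrected counts.

    add_naa=False: n_dex = na + gamma * nd
    add_naa=True:  n_t   = na + gamma * nd + naa
    """
    size = na + gamma * nd
    if add_naa:
        size += naa
    return size

# Hypothetical background-corrected counts for one burst
nd, na, naa = 20.0, 30.0, 25.0
n_dex = burst_size(nd, na, naa, gamma=0.5)
n_t = burst_size(nd, na, naa, gamma=0.5, add_naa=True)
print(n_dex, n_t)
\end{lstlisting}

\noindent With $\gamma = 0.5$ these counts give $n_{dex} = 40$ and $n_t = 65$.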
Eq.~\ref{eq:burstsize_dex} includes counts collected during donor excitation periods only,  while eq.~\ref{eq:burstsize_allph} includes all counts.  Burst sizes computed according to eq.~\ref{eq:burstsize_dex}  or~\ref{eq:burstsize_allph} are called $\gamma$-corrected burst sizes.  The burst search algorithm yields a set of bursts whose sizes  approximately follow an exponential distribution.  Compared to bursts with smaller sizes, bursts with large sizes are less frequent,   but contain more information per-burst (having higher SNR).  Therefore, selecting bursts by size is an important step (see \DIFdelbegin \DIFdel{section~}\DIFdelend \nameref{sec:burstsel}).  A threshold set too low may result in unresolvable sub-populations  because of broadening of FRET peaks and appearance of shot-noise artifacts  in the FRET (and \DIFdelbegin \DIFdel{S}\DIFdelend \DIFaddbegin \DIFadd{$S$}\DIFaddend ) distribution (i.e. spurious narrow peaks due to \DIFdelbegin \DIFdel{E and S }\DIFdelend \DIFaddbegin \DIFadd{$E$ and $S$ }\DIFaddend being  computed as the ratio of small integers).  Conversely, too large a threshold may result in too low a number of bursts  therefore poor representation of the FRET distribution.  Additionally, especially when computing fractions of sub-populations  (e.g. ratio of number of bursts in each sub-population),  it is important to use $\gamma$-corrected burst sizes as selection criterion,  in order to avoid under-representing some FRET sub-populations  due to different quantum yields of donor and acceptor dyes and/or  different photon detection efficiencies of donor and acceptor channels.  \DIFaddbegin \DIFadd{An alternative method to apply the $\gamma$ correction is to randomly  discard a constant fraction of photons chosen randomly from either   the Dem or Aem photon stream~\mbox{%DIFAUXCMD  \cite{Nir_2006}}%DIFAUXCMD  . 
This 
simple method transforms the measurement data in order to
achieve $\gamma=1$, overcoming the issue of selection bias between populations.
This approach also has the advantage of preserving
the binomial distribution of D and A photons in each burst, so that peaks
of FRET populations are easier to model statistically.
The only drawback is that, by discarding a fraction of photons,
this method leads to information loss and therefore to a potential 
decrease in sensitivity and/or accuracy.
}
\DIFaddend A simple way to mitigate the dependence of the FRET distribution on
the burst size selection threshold is weighting bursts proportionally to their size
so that the bursts with largest sizes will have the largest weights.
Using size as weights (instead of any other monotonically increasing function
of size) can be justified by noting that the variance of the burst proximity ratio (PR) is
inversely proportional to the burst size (see~\nameref{sec:burstweights_theory} for details). % SI_link
In general, a weighting scheme is used for building efficient estimators for a population
parameter (e.g. the population FRET efficiency $E_p$).
However, it can also be used to build weighted histograms or Kernel Density
Estimation (KDE) plots which emphasize FRET subpopulation peaks
without excluding small size bursts.
Traditionally, for optimal results when not using weights, the
FRET histogram is manually adjusted by finding an ad-hoc (high)
size-threshold which selects only bursts with the highest size (and thus lowest variance).
Building size-weighted FRET histograms is a simple method to balance
the need to reduce the peak width with the need to include as many bursts
as possible to reduce statistical noise.
As a practical example, by fixing the burst size threshold to a low value (e.g.
10-20 photons)
and using weights, it is possible to build a FRET histogram with well-defined FRET sub-population peaks
without the need to search for an optimal burst-size threshold (\nameref{sec:burstweights_theory}).

\paragraph*{Python details}
FRETBursts has the option to weight bursts using $\gamma$-corrected
burst sizes which optionally include acceptor excitation photons \verb|naa|.
A weight proportional to the burst size is applied by passing the argument
\verb|weights='size'| to histogram or KDE plot functions. The \verb|weights|
keyword can also be passed to fitting functions in order to fit
the weighted E or S distributions (see section~\nameref{sec:fretfit}).
Other weighting functions (for example depending quadratically on the size)
are listed in the \verb|fret_fit.get_weights| documentation
(\href{http://fretbursts.readthedocs.org/en/latest/fret_fit.html#fretbursts.fret_fit.get_weights}{link}).
However, using weights different from the size is not recommended
due to their less efficient use of burst information 
\DIFaddbegin \DIFadd{(}\nameref{sec:burstweights_theory}\DIFadd{)}\DIFaddend .

\section*{smFRET Burst Analysis}
\label{sec:analysis}

\subsection*{Loading the Data}
\label{sec:dataload}

While FRETBursts can load several data file formats,
we encourage users to adopt the recently introduced Photon-HDF5
file format~\cite{Ingargiola2016}.
Photon-HDF5 is an HDF5-based, open format, specifically designed
for freely-diffusing smFRET and
other timestamp-based experiments.
Photon-HDF5 is a self-documented, platform- and language-independent binary format,
which supports compression and allows saving photon data (e.g. timestamps)
and measurement-specific metadata
(e.g. setup and sample information, authors, provenance, etc.).
Moreover, Photon-HDF5 is designed for long-term data preservation and aims
to facilitate data sharing
between different software and research groups.
All example data files provided with FRETBursts use the Photon-HDF5 format.  To load data from a Photon-HDF5 file, we use the function \verb|loader.photon_hdf5|  (\href{http://fretbursts.readthedocs.org/en/latest/loader.html#fretbursts.loader.photon_hdf5}{link}):  \begin{lstlisting}  d = loader.photon_hdf5(filename)  \end{lstlisting}  \noindent  where \verb|filename| is a string containing the file path.  This command loads the measurement data into the variable \verb|d|,  a \verb|Data| object (see section~\nameref{sec:data_intro}).  The same command can load data from a variety of smFRET measurements supported  by the Photon-HDF5 format, taking advantage of the rich metadata included with each file.  For instance, data generated using different excitation schemes such as CW excitation  or pulsed excitation, single-laser vs two alternating lasers, etc.,  or with any number of excitation spots, are automatically recognized and interpreted accordingly.  FRETBursts also supports loading μs-ALEX data stored in .sm files  (a custom binary format used in the Weiss lab) and  ns-ALEX data stored in .spc files (a binary format used by TCSPC Becker \& Hickl acquisition hardware).  Alternatively, these and other formats (such as ht3, a binary format used by PicoQuant hardware)  can be converted into Photon-HDF5 files using phconvert,  a file conversion library and utility for Photon-HDF5  (\href{http://photon-hdf5.github.io/phconvert/}{link}).  More information on loading different file formats  can be found in the \verb|loader| module's documentation  (\href{http://fretbursts.readthedocs.org/en/latest/loader.html}{link}).  \subsection*{Alternation Parameters}  \label{sec:alternation}  For μs-ALEX and ns-ALEX data, Photon-HDF5 normally stores parameters defining  alternation periods corresponding to donor and acceptor laser excitation.  At load time, a user can plot these parameters and change them if deemed necessary.  
In μs-ALEX measurements~\cite{Kapanidis_2004},  CW laser lines are alternated on timescales of the order of 10 to 100~μs.  Plotting an histogram of timestamps modulo the alternation period, it  is possible to identify the donor and acceptor excitation periods (see figure~\ref{fig:altern_hist_double}a).  In ns-ALEX measurements~\cite{Laurence_2005},  pulsed lasers with equal repetition rates are delayed with respect  to one another with typical delays of 10 to 100~ns.  In this case, forming an histogram of TCSPC times (nanotimes) will allow  the definition of periods of fluorescence after excitation  of either the donor or the acceptor (see figure~\ref{fig:altern_hist_double}b).  In both cases, the function  \verb|plot_alternation_hist|  (\href{http://fretbursts.readthedocs.org/en/latest/plots.html#fretbursts.burst_plot.plot_alternation_hist}{link})  will plot the relevant alternation histogram (figure~\ref{fig:altern_hist_double})  using currently selected (or default) values for donor and acceptor excitation periods.  \begin{figure}[h!]  \begin{center}  \includegraphics[width=1\columnwidth]{figures/ALEX_alternation_double/ALEX_alternation_double}  \caption{\label{fig:altern_hist_double}  \textbf{Alternation histograms for μs-ALEX and ns-ALEX measurements.}  Histograms used for the selection/determination  of the alternation periods for two typical smFRET-ALEX experiments.  Distributions of photons detected by donor channel are in \textit{green},  and by acceptor channel in \textit{red}.  The light \textit{green} and \textit{red} shaded areas indicate the donor  and acceptor period definitions.  (a) μs-ALEX alternation histogram, i.e. histogram of timestamps \textit{modulo}  the alternation period for a smFRET measurement.  (b) ns-ALEX TCSPC nanotime histogram for a smFRET measurement.  Both plots have been generated by the same plot function  (\texttt{plot\_alternation\_hist()}).  
Additional information on these specific measurements can be found in the  attached notebook  (\href{http://nbviewer.jupyter.org/github/tritemio/fretbursts_paper/blob/master/notebooks/Figures\%20-\%20ALEX\%20histograms.ipynb}{link}).%  }  \end{center}  \end{figure}  To change the period definitions, we can type:  \begin{lstlisting}  d.add(D_ON=(2100, 3900), A_ON=(100, 1900))  \end{lstlisting}  \DIFaddbegin \noindent \DIFaddend where \verb|D_ON| and \verb|A_ON| are tuples (pairs of numbers) representing  the \textit{start} and \textit{stop} values for D or A excitation periods.  The previous command works for both μs-ALEX and ns-ALEX measurements.  After changing the parameters, a new alternation plot will show the updated  period definitions.  The alternation period definition can be applied to the data  using the function \verb|loader.alex_apply_period|  (\href{http://fretbursts.readthedocs.org/en/latest/loader.html#fretbursts.loader.alex_apply_period}{link}):  \begin{lstlisting}  loader.alex_apply_period(d)  \end{lstlisting}  After this command, \verb|d| will contain only photons inside the defined excitation periods.  If the user needs to update the periods definition, the data file will need to be  reloaded and the steps above repeated as described.  \subsection*{Background Estimation}  \label{sec:bg_calc}  The first step of smFRET analysis involves estimating background rates.  For example, \DIFdelbegin \DIFdel{to compute the background }\DIFdelend \DIFaddbegin \DIFadd{the following command:  }  %DIF > Don't split command on two lines for PLOS  \begin{lstlisting}  d.calc_bg(bg.exp_fit, time_s=30, tail_min_us='auto')  \end{lstlisting}  \noindent \DIFadd{estimates the background rates in windows of 30~s  using the default iterative algorithm for choosing the  fitting threshold (}\nameref{sec:bg_intro}\DIFadd{). 
%DIF > PLOS: remove section and use nameref  Beginner users can simply use the previous command and   proceed to burst search (}\nameref{sec:burstsearch}\DIFadd{). %DIF > PLOS: remove section and use nameref  For more advanced users, this section provides details on   the different background estimation and plotting functions   provided by FRETBursts.  }  \DIFadd{As a start, we show how to estimate the background }\DIFaddend every 30~s,   using a \DIFdelbegin \DIFdel{minimal }\DIFdelend \DIFaddbegin \DIFadd{fixed }\DIFaddend inter-photon delay \DIFdelbegin \DIFdel{fixed }\DIFdelend threshold of 2~ms   \DIFdelbegin \DIFdel{for the all photon streams, the corresponding command is}\DIFdelend \DIFaddbegin \DIFadd{(the same for all the photon streams)}\DIFaddend :  \begin{lstlisting}  d.calc_bg(bg.exp_fit, time_s=30, tail_min_us=2000)  \end{lstlisting}  The first argument (\verb|bg.exp_fit|) is the function used to fit the  background rate for each photon stream (see section~\nameref{sec:bg_intro}).  The function  \verb|bg.exp_fit| estimates the background using a maximum likelihood estimation  (MLE) of the delays distribution.  The second argument, \verb|time_s|, is the duration of the  \textit{background period} (section~\nameref{sec:bg_intro}) and the third, \verb|tail_min_us|,  is the minimum inter-photon delay to use when fitting the distribution to the specified model function.  To use different thresholds for each photon stream we pass a  tuple (i.e. a comma-separated list of values, \href{https://docs.python.org/3.5/tutorial/datastructures.html#tuples-and-sequences}{link}) instead of a scalar.  
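To make the role of the threshold concrete, the sketch below computes the
textbook maximum likelihood estimate of the rate of an exponential
distribution using only the delays above a minimum value, i.e.
$\hat{\lambda} = n / \sum_i (d_i - d_{min})$. The function name
\verb|exp_tail_rate_mle| and the delay values are hypothetical, and we do not
claim that \verb|bg.exp_fit| is implemented exactly in this way.

\begin{lstlisting}
def exp_tail_rate_mle(delays, tail_min):
    """MLE of the exponential rate from the delays >= tail_min.

    For an exponential tail starting at tail_min, the estimate
    is n / sum(d - tail_min) over the n selected delays.
    """
    tail = [d for d in delays if d >= tail_min]
    return len(tail) / sum(d - tail_min for d in tail)

# Hypothetical inter-photon delays (s) and a 2 ms threshold
delays = [0.001, 0.003, 0.004, 0.007]
rate = exp_tail_rate_mle(delays, tail_min=0.002)
print(rate)   # estimated rate in counts per second
\end{lstlisting}

\noindent Here the delay of 1~ms is discarded and the remaining three delays
give an estimated rate of approximately 375~cps.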
The recommended approach, however, is to automate the choice of threshold by passing
\verb|tail_min_us='auto'|, which uses a heuristic algorithm described in the
\textit{Background estimation} section of the μs-ALEX tutorial
(\href{http://nbviewer.jupyter.org/github/tritemio/FRETBursts_notebooks/blob/master/notebooks/FRETBursts%20-%20us-ALEX%20smFRET%20burst%20analysis.ipynb#Background-estimation}{link}).
Finally, it is possible to use a slower but rigorous approach for finding the optimal
threshold as described in~\nameref{sec:bg_opt_th}. % SI_link
FRETBursts provides two kinds of plots to represent the background. One shows the histograms
of inter-photon delays compared to the fitted exponential distribution, as shown in
figure~\ref{fig:bg_dist_all} (see section~\nameref{sec:bg_intro} for details on the inter-photon distribution).
This plot is created with the command:
\begin{lstlisting}
dplot(d, hist_bg, period=0)
\end{lstlisting}
This command reflects the general form of plotting commands in FRETBursts
as described in~\nameref{sec:plotting}. % SI_link
Here we only note that the argument \verb|period| is an integer specifying the background
period to be plotted (when omitted, the default is 0, i.e. the first period).
Figure~\ref{fig:bg_dist_all} allows quick identification of pathological cases where the
background fitting procedure returns unreasonable values.
The second background-related plot represents a timetrace of background rates,
as shown in figure~\ref{fig:bg_timetrace}. This plot allows monitoring background rate variations
occurring during the measurement and is obtained with the command:
\begin{lstlisting}
dplot(d, timetrace_bg)
\end{lstlisting}
Normally, samples should have a fairly constant background rate as a function of time
as in figure~\ref{fig:bg_timetrace}(a). However, sometimes, non-ideal
experimental conditions can yield a time-varying background rate, as illustrated in
figure~\ref{fig:bg_timetrace}(b).
A possible reason for the observed behavior could be buffer evaporation from an open sample  \DIFdelbegin \DIFdel{or poorly }\DIFdelend \DIFaddbegin \DIFadd{(we strongly recommend using a }\DIFaddend sealed   observation chamber \DIFaddbegin \DIFadd{whenever possible)}\DIFaddend . Additionally,  cover-glass impurities can contribute to the background.  These impurities tend to bleach on timescales of minutes resulting in  background variations during the course of the measurement.  \paragraph*{Python details}  The estimated background rates are stored in the \verb|Data| attributes  \verb|bg_dd|, \verb|bg_ad| and \verb|bg_aa|, corresponding to photon  streams \verb|Ph_sel(Dex='Dem')|, \verb|Ph_sel(Dex='Aem')| and \verb|Ph_sel(Aex='Aem')|  respectively.  These attributes are lists of arrays (one array per excitation spot).  The arrays contain the estimated background rates in the different time windows  (background periods).  Additional background fitting functions (e.g. least-square fitting of inter-photon delay  histogram) are available in \verb|bg| namespace  (i.e. the \verb|background| module,  \href{http://fretbursts.readthedocs.org/en/latest/background.html}{link}).  \subsection*{Burst Search}  \label{sec:burstsearch}  %\subsubsection*{Burst Search in FRETBursts}  %\label{sec:burstsearch_code}  Following background estimation, burst search is the next step of  the analysis.  In FRETBursts, a standard burst search using a single photon stream  (see section~\nameref{sec:burstsearch_intro}) is performed by calling the  \verb|Data.burst_search| method  (\href{http://fretbursts.readthedocs.org/en/latest/data_class.html#fretbursts.burstlib.Data.burst_search}{link}).  
For example, the following command:
\begin{lstlisting}
d.burst_search(F=6, m=10, ph_sel=Ph_sel('all'))
\end{lstlisting}
\DIFaddbegin \noindent \DIFaddend performs a burst search on all photons
(\verb|ph_sel=Ph_sel('all')|), with a count rate threshold equal to 6 times the
local background rate (\verb|F=6|), using 10 consecutive photons to compute the
local count rate (\verb|m=10|).
A different photon stream, threshold ($F$) or number of photons $m$ can be selected
by passing different values.
These parameters are a good general-purpose starting point for smFRET analysis,
but they can be adjusted if needed.
Note that the previous burst search does not perform any burst size selection
(however, by definition, the minimum burst size is effectively $m$).
An additional parameter $L$ can be passed to impose a minimum burst
size before any correction.
However, it is recommended to select bursts only after \DIFdelbegin \DIFdel{background corrections
are applied}\DIFdelend \DIFaddbegin \DIFadd{applying background 
corrections}\DIFaddend , as discussed in the next section~\nameref{sec:burstsel}.
It might sometimes be useful to specify a fixed photon-rate threshold, instead
of a threshold depending on the background rate, as in the previous example. In
this case, instead of $F$, the argument \verb|min_rate_cps| can be used to
specify the threshold (in counts-per-second).
For example, a burst search with
a 50~kcps threshold is performed as follows:
\begin{lstlisting}
d.burst_search(min_rate_cps=50e3, m=10,
               ph_sel=Ph_sel('all'))
\end{lstlisting}
Finally, to perform a DCBS burst search (or in general an AND-gate burst search,
see section~\nameref{sec:burstsearch_intro}) we use the function
\verb|burst_search_and_gate|
(\href{http://fretbursts.readthedocs.org/en/latest/plugins.html#fretbursts.burstlib_ext.burst_search_and_gate}{link}),
as illustrated in the following example:
\begin{lstlisting}
d_dcbs = bext.burst_search_and_gate(d, F=6, m=10)
\end{lstlisting}
The last command puts the burst search results in a new copy of the 
\verb|Data| variable \verb|d|
(in this example \DIFdelbegin \DIFdel{, }\DIFdelend the copy is called \verb|d_dcbs|).
Since FRETBursts shares the timestamps and detectors arrays between
different copies of \verb|Data| objects, the memory usage is minimized, even when
several copies are created.

\paragraph*{Python details}
Note that, while \DIFdelbegin %DIFDELCMD < \verb|.burst_search()| %%%
\DIFdelend \DIFaddbegin \verb|d.burst_search()| \DIFaddend is a method of \verb|Data|,
\DIFdelbegin %DIFDELCMD < \verb|burst_search_and_gate| %%%
\DIFdelend \DIFaddbegin \verb|bext.burst_search_and_gate()| \DIFaddend is a function in the \verb|bext| module
taking a \verb|Data| object as a first argument and returning a new
\verb|Data| object.
The function \verb|burst_search_and_gate| accepts optional arguments,
\verb|ph_sel1| and \verb|ph_sel2|, whose default values correspond to the
classical DCBS photon stream selection (see section~\nameref{sec:burstsearch_intro}).
These arguments can be specified to select photon streams different from those used in
a classical DCBS.
The \verb|bext| module (\href{http://fretbursts.readthedocs.org/en/latest/plugins.html}{link})
collects ``plugin'' functions that provide additional algorithms
for processing \verb|Data| objects.
\subsection*{Bursts Corrections}  \label{sec:corrcoeff}  In μs-ALEX, there are 3 important correction parameters: $\gamma$-factor,  donor leakage into the acceptor channel  and acceptor direct excitation by the donor excitation laser~\cite{Lee_2005}.  These corrections can be applied to burst data by simply assigning values  to the respective \verb|Data| attributes:  \begin{lstlisting}  d.gamma = 0.85  d.leakage = 0.15  d.dir_ex = 0.08  \end{lstlisting}  These attributes can be assigned either before or after the burst search. In the  latter case, existing burst data is automatically updated using the new  correction parameters.  These correction factors can be used to display corrected FRET distributions.  However, when the goal is to fit the FRET efficiency of sub-populations,  it is simpler to fit the background-corrected  PR histogram and then correct the population-level PR value (see SI in~\cite{Lee_2005}).  Correcting PR of each population (instead of correcting the data in each burst)  avoids distortion of the FRET distribution and keeps peaks of  static FRET subpopulations closer to the ideal \DIFdelbegin \DIFdel{Binomial }\DIFdelend \DIFaddbegin \DIFadd{binomial }\DIFaddend statistics~\cite{Gopich_2007}.  FRETBursts implements the correction formulas for $E$ and $S$ in the functions  \verb|fretmath.correct_E_gamma_leak_dir| and \verb|fretmath.correct_S|  (\href{http://fretbursts.readthedocs.org/en/latest/fretmath.html}{link}).  A derivation of these correction formulas (using computer-assisted algebra)  can be found online as an interactive notebook (\href{http://nbviewer.jupyter.org/github/tritemio/notebooks/blob/master/Derivation%20of%20FRET%20and%20S%20correction%20formulas.ipynb}{link}).  \subsection*{Burst Selection}  \label{sec:burstsel}  After burst search, it is common to select bursts according to different  criteria. One of the most common is burst size.  
For instance, to select bursts with more than 30 photons detected during the donor excitation
(computed after background correction), we use the following command:
\begin{lstlisting}
ds = d.select_bursts(select_bursts.size, th1=30)
\end{lstlisting}
The previous command creates a new \verb|Data| variable (\verb|ds|) containing
the selected bursts. \verb|th1| defines the lower bound for burst size, while
\verb|th2| defines the upper bound (when not specified, as in the previous example,
the upper bound is $+\infty$).
As before, the new object (\verb|ds|) will share the photon data
arrays with the original object (\verb|d|) in order to minimize the amount
of used memory.
The first argument of \verb|select_bursts|
(\href{http://fretbursts.readthedocs.org/en/latest/data_class.html#burst-selection-methods}{link})
is a Python function implementing the ``selection rule'' (\verb|select_bursts.size| in this example);
all remaining arguments (only \verb|th1| in this case) are parameters of the selection rule.
The \verb|select_bursts| module
(\href{http://fretbursts.readthedocs.org/en/latest/burst_selection.html}{link})
contains numerous built-in selection functions
(\href{http://fretbursts.readthedocs.org/en/latest/burst_selection.html#module-fretbursts.select_bursts}{link}).
For example,
\verb|select_bursts.ES|
is used to select a region on the E-S ALEX histogram, and
\verb|select_bursts.width|
to select bursts based on their duration.
New custom criteria can be readily implemented by defining a new selection function,
which requires only a couple of lines of code (see the
\verb|select_bursts| module's source code for examples,
\href{https://github.com/tritemio/FRETBursts/blob/master/fretbursts/select_bursts.py}{link}).
Finally, different criteria can be combined sequentially.
For example, with the following commands:  \begin{lstlisting}  ds = d.select_bursts(select_bursts.size,  th1=50, th2=200)  dsw = ds.select_bursts(select_bursts.width,  th1=0.5e-3, th2=3e-3)  \end{lstlisting}  \DIFaddbegin \noindent \DIFaddend bursts in \verb|dsw|  will have sizes between 50 and 200 photons, and duration between 0.5 and 3~ms.  \paragraph*{Burst Size Selection}  In the previous section, we selected bursts by size, using only  photons detected in both D and A channels during D excitation (i.e. \DIFdelbegin \DIFdel{Dex }\DIFdelend \DIFaddbegin \DIFadd{D\textsubscript{ex} }\DIFaddend photons),  as in eq.~\ref{eq:burstsize_dex}.  Alternatively, a threshold on the burst size computed including all photons  can be applied by adding $n_{aa}$ to the burst size (see eq.~\ref{eq:burstsize_allph}).  This is achieved  by passing \verb|add_naa=True| to the selection function.  The complete selection command is:  \begin{lstlisting}  ds = d.select_bursts(select_bursts.size,  th1=30, add_naa=True)  \end{lstlisting}  \DIFdelbegin %DIFDELCMD < \noindent %%%  \DIFdelend The result of this selection is plotted in figure~\ref{fig:alex_jointplot}.  When \verb|add_naa| is not specified,  as in the previous section, the default is \verb|add_naa=False|  (i.e. compute size using only \DIFdelbegin \DIFdel{Dex }\DIFdelend \DIFaddbegin \DIFadd{D\textsubscript{ex} }\DIFaddend photons).  \begin{figure}[h!]  \begin{center}  \includegraphics[width=0.7\columnwidth]{figures/alex_jointplot/alex_jointplot}  \caption{\label{fig:alex_jointplot} \textbf{E-S histogram showing FRET, D-only and A-only populations.}  A 2-D ALEX histogram and marginal E and S histograms for a 40-bp dsDNA  with D-A distance of 17 bases (Donor dye: ATTO550, Acceptor dye: ATTO647N).  Bursts are selected with a size-threshold of 30 photons, including \DIFdelbeginFL \DIFdelFL{Aex }\DIFdelendFL \DIFaddbeginFL \DIFaddFL{A\textsubscript{ex} }\DIFaddendFL photons.  The plot is obtained with \texttt{alex\_jointplot(ds)}. 
The 2D E-S distribution plot (joint plot)
is a histogram with hexagonal bins, which reduces binning artifacts (compared to square bins)
and naturally resembles a scatter plot when the burst density is low
\DIFaddbeginFL \DIFaddFL{(see }\nameref{sec:plotting}\DIFaddFL{)}\DIFaddendFL .
Three populations are visible: FRET population (middle), D-only population (top left) and
A-only population (bottom, $S < 0.2$). Compare with figure~\ref{fig:alex_jointplot_fretsel}
where the FRET population has been isolated.%
}
\end{center}
\end{figure}
Another important parameter for defining the burst size is the $\gamma$-factor, i.e.
the imbalance between the donor and the acceptor channel signals. As noted in
section~\nameref{sec:burstsizeweights}, the $\gamma$-factor is
used to compensate for the different fluorescence quantum yields of the D and A
fluorophores as well as the different photon-detection efficiencies of the D and A channels.
When $\gamma$ is significantly different from 1, neglecting its effect on burst size leads to
over-representing (in terms of number of bursts) one FRET population versus the others.
When the $\gamma$ factor is known \DIFaddbegin \DIFadd{(and $\ne 1$)}\DIFaddend , a more unbiased selection of different FRET
populations can be achieved by passing the argument \verb|gamma| to the
selection function:
\begin{lstlisting}
ds = d.select_bursts(select_bursts.size,
                     th1=15, gamma=0.65)
\end{lstlisting}
When not specified, $\gamma=1$ is assumed.
\DIFdelbegin %DIFDELCMD < 
%DIFDELCMD < %%%
\DIFdelend For more details on burst size selection, see the
\verb|select_bursts.size| documentation
(\href{http://fretbursts.readthedocs.org/en/latest/burst_selection.html#fretbursts.select_bursts.size}{link}).
\paragraph*{Python details}  \DIFdelbegin \DIFdel{To }\DIFdelend \DIFaddbegin \DIFadd{The method to }\DIFaddend compute $\gamma$-corrected burst sizes (with   or without addition of \verb|naa|)  \DIFdelbegin \DIFdel{the method }\DIFdelend \DIFaddbegin \DIFadd{is }\DIFaddend \verb|Data.burst_sizes|  (\href{http://fretbursts.readthedocs.org/en/latest/data_class.html#fretbursts.burstlib.Data.burst_sizes}{link})\DIFdelbegin \DIFdel{is used}\DIFdelend .  \paragraph*{Select the FRET Populations}  In smFRET-ALEX experiments, in addition to one or more FRET populations, there are always  donor-only (D-only) and acceptor-only (A-only) populations.  In most cases, these additional populations are not of interest and need to be filtered out.  In principle, using the E-S representation, D-only and A-only bursts  can be excluded by selecting bursts within a range of $S$ values (e.g. S=0.2-0.8).  This approach, however, simply truncates the burst distribution with arbitrary  thresholds and is therefore not recommended for quantitative assessment of FRET  populations.  An alternative approach consists in applying two selection filters sequentially.  First, the A-only population is filtered out  by applying a threshold on the number of photons during D excitation (\DIFdelbegin \DIFdel{Dex}\DIFdelend \DIFaddbegin \DIFadd{D\textsubscript{ex}}\DIFaddend ).  Second, the D-only population is filtered out by applying a threshold on  the number of A photons during A excitation (\DIFdelbegin \DIFdel{AemAex}\DIFdelend \DIFaddbegin \DIFadd{A\textsubscript{ex}A\textsubscript{em}}\DIFaddend ).  The commands for these combined selections are:  \begin{lstlisting}  ds1 = d.select_bursts(select_bursts.size, th1=15)  ds2 = ds1.select_bursts(select_bursts.naa, th1=15)  \end{lstlisting}  Here, \DIFaddbegin \DIFadd{the }\DIFaddend variable \verb|ds2| contains the combined burst selection.  
Figure~\ref{fig:alex_jointplot_fretsel} shows the resulting pure FRET
population obtained with the previous selection.

\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.7\columnwidth]{figures/alex_jointplot_fretsel/alex_jointplot_fretsel}
\caption{\label{fig:alex_jointplot_fretsel}
\textbf{E-S histogram after filtering out D-only and A-only populations.}
2-D ALEX histogram after selection of FRET population
using the composition of two burst selection filters:
(1) selection of bursts with counts in \DIFdelbeginFL \DIFdelFL{Dex }\DIFdelendFL \DIFaddbeginFL \DIFaddFL{D\textsubscript{ex} }\DIFaddendFL stream larger than 15;
(2) selection of bursts with counts in \DIFdelbeginFL \DIFdelFL{AemAex }\DIFdelendFL \DIFaddbeginFL \DIFaddFL{A\textsubscript{ex}A\textsubscript{em} }\DIFaddendFL stream larger than 15.
Compare to figure~\ref{fig:alex_jointplot} where all burst populations
(FRET, D-only and A-only) are reported.%
}
\end{center}
\end{figure}

\subsection*{Population Analysis}
\label{sec:fretfit}

Typically, after burst selection, E or S histograms are fitted to a model.
The FRETBursts \verb|mfit| module allows fitting histograms of burst quantities
(i.e. E or S) with arbitrary models. In this context, a model is an object
specifying a function, the parameters varied during the fit
and optional constraints for these parameters. This concept of model
is taken from \textit{lmfit}~\cite{lmfit}, the underlying library used by
FRETBursts to perform the fits.
Models can be created from arbitrary functions.
\DIFdelbegin \DIFdel{By default,
FRETBursts allows using predefined }\DIFdelend \DIFaddbegin \DIFadd{FRETBursts includes predefined (i.e. built-in) }\DIFaddend models
such as 1 to 3 Gaussian peaks or two Gaussians connected by a \DIFdelbegin \DIFdel{``bridge''.
}\DIFdelend \DIFaddbegin \DIFadd{flat plateau.
The latter is an empirical model that
can be used to more accurately fit the center values of two populations
when the peaks are connected by intermediate-FRET bursts
(for the analytical definition of this function see the documentation,
}\href{http://fretbursts.readthedocs.io/en/latest/mfit.html#fretbursts.mfit.factory_two_gaussians}{link}\DIFadd{).
}\DIFaddend Built-in models are created by calling a corresponding factory function
(\DIFdelbegin \DIFdel{names starting }\DIFdelend \DIFaddbegin \DIFadd{whose names start }\DIFaddend with \verb|mfit.factory_|) which initializes the parameters
with values and constraints suitable for E and S histogram fits
\DIFdelbegin \DIFdel{.
}\DIFdelend (see \textit{Factory Functions} documentation,
\href{http://fretbursts.readthedocs.org/en/latest/mfit.html#model-factory-functions}{link}).
As an example, we \DIFaddbegin \DIFadd{can }\DIFaddend fit the E histogram of bursts in the
\verb|ds| variable with two Gaussian peaks with the following command:

\begin{lstlisting}
bext.bursts_fitter(ds, 'E', binwidth=0.03,
                   model=mfit.factory_two_gaussians())
\end{lstlisting}

Replacing \verb|'E'| with \verb|'S'| will fit the S histogram instead.
The \verb|binwidth| argument specifies the histogram bin width and
the \verb|model| argument defines which model is used for
fitting.
All fitting results (including best-fit values, uncertainties, etc.)
are stored in the \verb|E_fitter| (or \verb|S_fitter|)
attributes of the \verb|Data| variable (named \verb|ds| here).
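For reference, a two-Gaussian model of the kind used in this example can be written as the sum of two normalized peaks. The parametrization below (amplitude $A_i$, center $\mu_i$, width $\sigma_i$) follows the convention of the Gaussian model in lmfit; that the FRETBursts factory function uses exactly this parametrization is an assumption to be checked against the \verb|mfit| documentation:
\begin{equation*}
f(E) = \sum_{i=1}^{2} \frac{A_i}{\sigma_i \sqrt{2\pi}}
\exp\left( -\frac{(E - \mu_i)^2}{2\sigma_i^2} \right)
\end{equation*}
With this form, each $A_i$ is the area under the corresponding peak, so the ratio $A_1/A_2$ estimates the relative weight of the two populations in the fitted histogram.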
To print a comprehensive summary of the fit results, including
uncertainties, reduced $\chi^2$ and correlation between parameters,
\DIFdelbegin \DIFdel{the we }\DIFdelend \DIFaddbegin \DIFadd{we can }\DIFaddend use the following command:

\begin{lstlisting}
fit_res = ds.E_fitter.fit_res[0]
print(fit_res.fit_report())
\end{lstlisting}

Finally, to plot the fitted model together with the FRET histogram,
as shown in figure~\ref{fig:histfit}, we pass the parameter \verb|show_model=True|
to the \verb|hist_fret| function
\DIFdelbegin \DIFdel{as follows
(seesection}\DIFdelend \DIFaddbegin \DIFadd{(see}\DIFaddend ~\nameref{sec:plotting} for an introduction to plotting in FRETBursts):

\begin{lstlisting}
dplot(ds, hist_fret, show_model=True)
\end{lstlisting}

\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.49\columnwidth]{figures/hist_fit/hist_fit}
\caption{\label{fig:histfit} \textbf{FRET histogram fitted with two Gaussians.}
Example of a FRET histogram fitted with a 2-Gaussian model.
After performing the fit (see main text), the plot is generated
with \texttt{dplot(ds, hist\_fret, show\_model=True)}.%
}
\end{center}
\end{figure}

For more examples on fitting burst data and plotting results, refer to the
fitting section of the μs-ALEX notebook (\href{http://nbviewer.jupyter.org/github/tritemio/FRETBursts_notebooks/blob/master/notebooks/FRETBursts%20-%20us-ALEX%20smFRET%20burst%20analysis.ipynb#FRET-fit:-in-depth-example}{link}),
the \textit{Fitting Framework} section of the documentation
(\href{http://fretbursts.readthedocs.org/en/latest/fit.html}{link})
as well as the documentation for the \verb|bursts_fitter| function
(\href{http://fretbursts.readthedocs.org/en/latest/plugins.html#fretbursts.burstlib_ext.bursts_fitter}{link}).

\paragraph*{Python details}
Models returned by FRETBursts's factory functions (\verb|mfit.factory_*|)
are \verb|lmfit.Model| objects (\href{https://lmfit.github.io/lmfit-py/model.html}{link}).
Custom models can be created by calling \verb|lmfit.Model| directly.
When an \verb|lmfit.Model| is fitted, it returns a \verb|ModelResult| object
(\href{https://lmfit.github.io/lmfit-py/model.html#the-modelresult-class}{link}),
which contains all information related to the fit (model, data,
parameters with best values and uncertainties) and useful methods to operate on fit results.
FRETBursts stores one \verb|ModelResult| object per excitation spot in the list
\verb|ds.E_fitter.fit_res|.
For instance, to obtain the reduced $\chi^2$ value of the E histogram fit in a
single-spot measurement \verb|d|, we use the following command:

\begin{lstlisting}
d.E_fitter.fit_res[0].redchi
\end{lstlisting}

Other useful attributes are \verb|aic| and \verb|bic| which contain
\DIFaddbegin \DIFadd{statistics for }\DIFaddend the Akaike information criterion (AIC)\DIFaddbegin \DIFadd{~\mbox{%DIFAUXCMD
\cite{akaike_new_1974}
}%DIFAUXCMD
}\DIFaddend and the Bayesian information criterion (BIC)\DIFaddbegin \DIFadd{~\mbox{%DIFAUXCMD
\cite{schwarz_estimating_1978}}%DIFAUXCMD
}\DIFaddend .
AIC and BIC \DIFdelbegin \DIFdel{allow comparing different models and
selecting the most appropriate for the dataat hand.
}\DIFdelend \DIFaddbegin \DIFadd{are general-purpose statistical criteria for comparing the
suitability of multiple non-nested models according to the data.
By penalizing models with a higher number of parameters, these criteria
strike a balance between the need to achieve a high goodness of fit
and the need to keep the model complexity low to avoid overfitting.
}\DIFaddend
Examples of definition and modification of fit models are provided in
the aforementioned μs-ALEX notebook
(\href{http://nbviewer.jupyter.org/github/tritemio/FRETBursts_notebooks/blob/master/notebooks/FRETBursts%20-%20us-ALEX%20smFRET%20burst%20analysis.ipynb#FRET-fit:-in-depth-example}{link}).
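To make the role of the parameter penalty in AIC and BIC concrete, the sketch below evaluates both criteria from a fit's $\chi^2$, using the least-squares expressions given in the lmfit documentation, $N\ln(\chi^2/N) + 2N_{\mathrm{varys}}$ and $N\ln(\chi^2/N) + \ln(N)\,N_{\mathrm{varys}}$. The three ``fits'' are made-up numbers for illustration, not results from real data.

```python
import math


def aic(chisqr, ndata, nvarys):
    """Akaike information criterion (least-squares form)."""
    return ndata * math.log(chisqr / ndata) + 2 * nvarys


def bic(chisqr, ndata, nvarys):
    """Bayesian information criterion (least-squares form)."""
    return ndata * math.log(chisqr / ndata) + math.log(ndata) * nvarys


# Hypothetical fits of the same 40-bin histogram
one_peak = dict(chisqr=12.0, ndata=40, nvarys=3)
two_peaks = dict(chisqr=4.0, ndata=40, nvarys=6)    # much better fit
marginal = dict(chisqr=11.5, ndata=40, nvarys=6)    # barely better fit

for name, fit in [("one_peak", one_peak), ("two_peaks", two_peaks),
                  ("marginal", marginal)]:
    print(f"{name}: AIC={aic(**fit):.1f}  BIC={bic(**fit):.1f}")
```

Lower values indicate the preferred model. In this example the two-peak fit wins because its $\chi^2$ drops substantially, whereas the ``marginal'' fit, which uses the same three extra parameters for almost no gain in $\chi^2$, scores worse than the single-peak model.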
Users can also refer to the comprehensive lmfit documentation
(\href{http://lmfit.github.io/lmfit-py/}{link}).

\DIFdelbegin \section*{\DIFdel{Implementing Burst Variance Analysis}}
%DIFAUXCMD
\DIFdelend \DIFaddbegin \subsection*{\DIFadd{FRET Dynamics}}
\label{sec:dynamics}
\DIFaddend
\DIFdelbegin %DIFDELCMD < %DIFDELCMD < \label{sec:bva}%%%
%DIFDELCMD < %%%
\DIFdel{In this section, we describe how to implement burst variance analysis (BVA)
as described in~\mbox{%DIFAUXCMD
\cite{Torella_2011}}%DIFAUXCMD
.
FRETBursts provides well-tested, general-purpose functions for timestamps and burst data
manipulation and therefore simplifies implementing custom burst analysis algorithms such as BVA.
}%DIFDELCMD <
%DIFDELCMD < %%%
\subsection*{\DIFdel{BVA Overview}}
%DIFAUXCMD
\DIFdelend Single-molecule FRET histograms show more information than just mean FRET efficiencies.
While in general the presence of several peaks clearly indicates the existence of
multiple subpopulations, a single peak cannot a priori be associated with
a single population defined by a unique FRET efficiency without further analysis\DIFdelbegin \DIFdel{(such as, for instance, shot-noise analysis~\mbox{%DIFAUXCMD
\cite{Nir_2006,Antonik2006}}%DIFAUXCMD
)}\DIFdelend .
\DIFdelbegin \DIFdel{The FRET histogram of a single FRET population
has a minimum width set by shot noise
}\DIFdelend \DIFaddbegin \DIFadd{Shot-noise analysis~\mbox{%DIFAUXCMD
\cite{Nir_2006} }%DIFAUXCMD
or probability
distribution analysis (PDA)~\mbox{%DIFAUXCMD
\cite{Antonik2006,kalinin_probability_2007}
}%DIFAUXCMD
allow computing the minimum width of a static FRET population
}\DIFaddend (i.e. \DIFdelbegin \DIFdel{the width is }\DIFdelend caused by the statistics of discrete photon-detection events).
\DIFdelbegin \DIFdel{FRET distributions broader than the shot noise limit,
can be ascribed to either a static mixture of species with slightly different
FRET efficiencies, or to }\DIFdelend \DIFaddbegin \DIFadd{Typically, several mechanisms
contribute to the broadening of the experimental FRET peak
beyond the shot-noise limit. These include heterogeneities in the sample
resulting in a distribution of Förster radii,
or actual conformational changes giving rise to }\DIFaddend a \DIFdelbegin \DIFdel{specie undergoing dynamic transitions (e.
g. interconversion between multiple states, diffusion in a continuum of conformations, binding-unbinding events, etc.
).
When the single peak of a FRET distribution is wider than predicted from shot-noise, it is not possible to discriminate between the static and dynamic case without further analysis .
}\DIFdelend \DIFaddbegin \DIFadd{distribution
of D-A distances~\mbox{%DIFAUXCMD
\cite{sisamakis_accurate_2010}}%DIFAUXCMD
.
}

\DIFadd{Gopich and Szabo developed an elegant analytical model
for the FRET distribution of $M$ interconverting states
based on superposition of Gaussian peaks~\mbox{%DIFAUXCMD
\cite{gopich_fret_2010}}%DIFAUXCMD
.
Unfortunately, the method is not straightforward to apply to
freely-diffusing data as it requires a special selection
criterion for filtering bursts with quasi-Poisson rates.
Santoso~\mbox{%DIFAUXCMD
\cite{santoso_probing_2009} }%DIFAUXCMD
and Kalinin~\mbox{%DIFAUXCMD
\cite{Kalinin2010}
}%DIFAUXCMD
extended the PDA approach to estimate conversion rates between different
states by comparing FRET histograms as a function of the time-bin size.
In addition, Gopich and Szabo~\mbox{%DIFAUXCMD
\cite{Gopich2009, gopich_theory_2011} }%DIFAUXCMD
developed
a related method to compute conversion rates using
a likelihood function which depends on photon timestamps (overcoming
the time binning and FRET histogramming step and directly applicable
to freely-diffusing data).
For measurements that include lifetime, the multiparameter fluorescence
detection (MFD) method allows identifying dynamics from the deviation
from the linear relation between lifetime and E~\mbox{%DIFAUXCMD
\cite{sisamakis_accurate_2010}}%DIFAUXCMD
.
Hoffmann~\mbox{%DIFAUXCMD
\cite{hoffmann_quantifying_2011} }%DIFAUXCMD
proposed a method
called RASP (recurrence analysis of single particles) to extend
the timescale of detectable kinetics.
RASP computes the probability that two nearby bursts are due to
the same molecule and therefore allows setting a time-threshold
for considering consecutive bursts as the same single-molecule event.
}

\DIFadd{Other interesting approaches include combining smFRET and FCS
for detecting and quantifying kinetics on timescales much shorter
than the diffusion
time~\mbox{%DIFAUXCMD
\cite{laurence_correlation_2007,torres_measuring_2007,nettels_unfolded_2008}}%DIFAUXCMD
.
In addition, Bayes-based methods have been proposed to fit static
populations~\mbox{%DIFAUXCMD
\cite{devore_classic_2012,murphy_bayesian_2014}}%DIFAUXCMD
, or to study dynamics~\mbox{%DIFAUXCMD
\cite{kou_bayesian_2005}}%DIFAUXCMD
.
}

\DIFadd{Finally, two related methods for discriminating between static heterogeneity
and sub-millisecond dynamics are Burst Variance Analysis
(BVA) proposed by Torella~\mbox{%DIFAUXCMD
\cite{Torella_2011} }%DIFAUXCMD
and
kernel density distribution estimator (2CDE) proposed by
Tomov~\mbox{%DIFAUXCMD
\cite{Tomov_2012}}%DIFAUXCMD
. The BVA method is described in the next section.
The 2CDE method, which has been implemented in FRETBursts, computes local
photon rates from timestamps within bursts using
Kernel Density Estimation (KDE)
(FRETBursts includes general-purpose functions
to compute KDE of photon timestamps in the }\verb|phrates| \DIFadd{module
(}\href{http://fretbursts.readthedocs.io/en/latest/phrates.html}{link}\DIFadd{)).
From time variations of local rates it is possible to
detect the occurrence of dynamics. In particular, the 2CDE method
builds, for each burst, a quantity $(E)_D$ (or $(1-E)_A$) which is equal
to the burst average $E$ when no dynamics is present, but is biased
toward a higher (or lower) value in the presence of dynamics. From these
quantities a burst ``estimator''
(called FRET-2CDE) is derived. For a user, the 2CDE method consists
in plotting the 2-D histogram of $E$ versus FRET-2CDE
and assessing the vertical position of the various populations:
populations centered around FRET-2CDE=10 have
no dynamics while populations biased towards higher FRET-2CDE values
have dynamics.
}

\DIFadd{The BVA and 2CDE methods are implemented
in two notebooks included with FRETBursts
(}\href{http://nbviewer.jupyter.org/github/tritemio/FRETBursts_notebooks/blob/master/notebooks/Example%20-%20Burst%20Variance%20Analysis.ipynb}{BVA link},
\href{http://nbviewer.jupyter.org/github/tritemio/FRETBursts_notebooks/blob/master/notebooks/Example%20-%202CDE%20Method.ipynb}{2CDE link}\DIFadd{).
To use them, a user needs to download the relevant notebook
and run the analysis therein.
The other methods mentioned in this section are not currently
implemented in FRETBursts.
However, users can implement their additional favorite method
taking advantage of FRETBursts functions for burst analysis
and timestamps/bursts manipulation.
To facilitate this task, in the next section,
we show how to perform low-level analysis of timestamps and bursts data
by implementing the BVA method from scratch.
An additional example showing how to split bursts in constant time-bins
can be found in the respective FRETBursts notebook
(}\href{http://nbviewer.jupyter.org/github/tritemio/FRETBursts_notebooks/blob/master/notebooks/Example%20-%20Working%20with%20timestamps%20and%20bursts.ipynb}{link}\DIFadd{).
These examples serve as a guide for implementing new methods.
Researchers willing to implement new methods are welcome to ask questions
on GitHub or on the mailing list.
We also encourage sharing any new methods implemented in FRETBursts
for the benefit of the entire community.
}

\section*{\DIFadd{Implementing Burst Variance Analysis}}
\label{sec:bva}

\DIFadd{In this section, we describe how to implement burst variance analysis (BVA)
as described in~\mbox{%DIFAUXCMD
\cite{Torella_2011}}%DIFAUXCMD
.
FRETBursts provides well-tested, general-purpose functions for timestamps and burst data
manipulation and therefore simplifies implementing custom burst analysis algorithms such as BVA.
}

\subsection*{\DIFadd{BVA Overview}}

\DIFaddend The BVA method has been developed to \DIFdelbegin \DIFdel{address this issue, namely identifying }\DIFdelend \DIFaddbegin \DIFadd{identify }\DIFaddend the presence of dynamics
in FRET distributions~\cite{Torella_2011},
and has been successfully applied to identify biomolecular processes with
dynamics on the millisecond time-scale~\cite{Torella_2011, Robb_2013}.
The basic idea behind BVA is to subdivide bursts into contiguous burst chunks (sub-bursts)
comprising a fixed number $n$ of photons,
and to compare the empirical variance of acceptor counts of all sub-bursts in a burst
with the theoretical shot-noise-limited variance.
An empirical variance of sub-bursts larger than the shot-noise-limited value indicates
the presence of dynamics. Since the estimation of the sub-burst variance is affected
by uncertainty, BVA provides an indication of a higher or lower probability
of observing dynamics.
In a FRET (sub-)population originating from a single static FRET efficiency,
the sub-burst acceptor counts $n_a$ can be modeled as a binomial-distributed random variable
$N_a \sim \operatorname{B}(n, E_p)$, where $n$ is the number of photons in each sub-burst and
$E_p$ is the estimated population proximity-ratio (PR).
Note that we can use the PR because, regardless of the molecular FRET efficiency,
the detected counts are partitioned between donor and acceptor channels according to
a binomial distribution with success probability equal to the PR.
The only approximation made here is neglecting the presence of background
(a reasonable approximation since the background counts are in general a
very small fraction of the total counts).
We refer the interested reader to~\cite{Torella_2011} for further discussion.

If $N_a$ follows a binomial distribution, the random variable $E_{\textrm{sub}} = N_a/n$
has the standard deviation reported in eq.~\ref{eq:binom_std}.
\begin{equation}
\label{eq:binom_std}
\operatorname{Std}(E_{\textrm{sub}}) = \left( \frac{E_p\,(1 - E_p)}{n} \right)^{1/2}
\end{equation}
BVA consists of four steps: 1) dividing bursts into consecutive sub-bursts
containing a constant number of consecutive photons~\textit{n}, 2) computing the PR
of each sub-burst, 3) calculating the empirical standard deviation ($s_E$) of sub-burst
PR in each burst, and 4) comparing $s_E$ to the expected standard deviation
of a shot-noise-limited distribution~(eq.~\ref{eq:binom_std}).

If, as in figure~\ref{fig:bva_static}, the observed FRET efficiency distribution
originates from a static mixture of sub-populations (of different
non-interconverting molecules) characterized by distinct FRET efficiencies,
$s_E$ of each burst is only affected by shot-noise and will follow the expected
standard deviation curve based on eq.~\ref{eq:binom_std}.
Conversely, if the observed distribution originates from biomolecules belonging to a single species,
which interconverts between different FRET sub-populations (over times comparable to the diffusion
time), as in figure~\ref{fig:bva_dynamic}, $s_E$ of each burst will be larger than the expected
shot-noise-limited standard deviation, and will be located above the shot-noise standard
deviation curve (right panel of figure~\ref{fig:bva_dynamic}).

\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.98\columnwidth]{figures/ALEX_BVA_static/ALEX_BVA_static}
\caption{\label{fig:bva_static} \textbf{BVA distribution for a static mixture sample.}
The left panel shows the E-S histogram for a mixture of single stranded DNA (20dT) and double stranded DNA (20dT-20dA) molecules in 200 mM MgCl$_2$. The right panel shows the corresponding BVA plot. Since both 20dT and 20dT-20dA are stable and have no dynamics, the BVA plot shows $s_E$ peaks lying on the static standard deviation curve (\textit{red curve}).%
}
\end{center}
\end{figure}

\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.98\columnwidth]{figures/ALEX_BVA_dynamic/ALEX_BVA_dynamic}
\caption{\label{fig:bva_dynamic} \textbf{BVA distribution for a hairpin sample undergoing dynamics.}
The left panel shows the E-S histogram for a single stranded DNA sample ($A_{31}$-TA, see~\cite{Tsukanov_2013}), designed to form a transient hairpin in 400 mM NaCl. The right panel shows the corresponding BVA plot. Since the transition between hairpin and open structure causes a significant change in FRET efficiency, $s_E$ lies largely above the static standard deviation curve (\textit{red curve}).%
}
\end{center}
\end{figure}

\subsection*{BVA Implementation}
The following paragraphs describe the low-level details involved in implementing the BVA using FRETBursts.
The main goal is to illustrate a real-world example of accessing and manipulating timestamps and burst data.
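Before turning to the FRETBursts-specific details, the principle summarized in the BVA overview can be checked on synthetic data using only the Python standard library. The sketch below is a toy model and not the FRETBursts implementation: it assumes that a ``dynamic'' molecule picks one of two PR values at random for each sub-burst, and the function names \verb|shot_noise_std| and \verb|simulate_burst| are hypothetical.

```python
import math
import random
import statistics


def shot_noise_std(E_p, n):
    """Shot-noise-limited std of the sub-burst PR (binomial model)."""
    return math.sqrt(E_p * (1 - E_p) / n)


def simulate_burst(E_states, n=7, n_sub=20, rng=random):
    """Return (mean PR, s_E) for one simulated burst.

    Toy-model assumption: for each sub-burst of n photons the molecule
    is in a state drawn at random from E_states.
    """
    E_sub = []
    for _ in range(n_sub):
        E_p = rng.choice(E_states)
        # Each photon is detected in the acceptor channel with prob. E_p
        n_a = sum(rng.random() < E_p for _ in range(n))
        E_sub.append(n_a / n)
    # pstdev is the population std, like the default of numpy.std
    return statistics.mean(E_sub), statistics.pstdev(E_sub)


rng = random.Random(0)  # fixed seed for reproducibility
static = [simulate_burst([0.5], rng=rng) for _ in range(500)]
dynamic = [simulate_burst([0.2, 0.8], rng=rng) for _ in range(500)]

limit = shot_noise_std(0.5, n=7)
mean_sE_static = statistics.mean(s for _, s in static)
mean_sE_dynamic = statistics.mean(s for _, s in dynamic)
print(f"shot-noise limit at E=0.5: {limit:.3f}")
print(f"mean s_E, static sample:   {mean_sE_static:.3f}")
print(f"mean s_E, dynamic sample:  {mean_sE_dynamic:.3f}")
```

Both simulated samples have an average PR close to 0.5, but only the static one has $s_E$ close to the shot-noise limit; the interconverting sample lies well above it, which is the signature BVA looks for.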
For a ready-to-use BVA implementation users can refer to the corresponding notebook included with FRETBursts
(\href{http://nbviewer.jupyter.org/github/tritemio/FRETBursts_notebooks/blob/master/notebooks/Example%20-%20Burst%20Variance%20Analysis.ipynb}{link}).

\paragraph*{Python details}
For BVA implementation, two photon streams are needed: all-photons during donor excitation (\DIFdelbegin \DIFdel{Dex}\DIFdelend \DIFaddbegin \DIFadd{D\textsubscript{ex}}\DIFaddend )
and acceptor photons during donor excitation (\DIFdelbegin \DIFdel{DexAem}\DIFdelend \DIFaddbegin \DIFadd{D\textsubscript{ex}A\textsubscript{em}}\DIFaddend ).
These photon stream selections are obtained by computing boolean masks as follows
(see\DIFdelbegin \DIFdel{section}\DIFdelend ~\nameref{sec:burststimes}):

\begin{lstlisting}
Dex_mask = ds.get_ph_mask(ph_sel=Ph_sel(Dex='DAem'))
DexAem_mask = ds.get_ph_mask(ph_sel=Ph_sel(Dex='Aem'))
DexAem_mask_d = DexAem_mask[Dex_mask]
\end{lstlisting}

Here, the first two variables (\verb|Dex_mask| and \verb|DexAem_mask|)
select photons from the all-photons timestamps array,
while \verb|DexAem_mask_d| selects A-emitted photons from the
array of photons emitted during D-excitation. As shown below,
the latter is needed to count acceptor photons in burst chunks.

Next, we need to express bursts start-stop data as indexes of the D-excitation photon stream
(by default burst start-stop indexes refer to all-photons timestamps array):

\begin{lstlisting}
ph_d = ds_FRET.get_ph_times(ph_sel=Ph_sel(Dex='DAem'))
bursts = ds_FRET.mburst[0]
bursts_d = bursts.recompute_index_reduce(ph_d)
\end{lstlisting}

Here, \verb|ph_d| contains the \DIFdelbegin \DIFdel{Dex }\DIFdelend \DIFaddbegin \DIFadd{D\textsubscript{ex} }\DIFaddend timestamps, \verb|bursts| the original burst data and
\verb|bursts_d| the burst data with start-stop indexes relative to \verb|ph_d|.
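The logic of these two steps (restricting a mask to a sub-stream and re-expressing burst indexes relative to that sub-stream) can be reproduced on a tiny made-up photon stream with plain Python. This is only a sketch of the idea: the helper \verb|reindex| is hypothetical and uses a binary search on timestamps, which is not necessarily how \verb|recompute_index_reduce| is implemented.

```python
from bisect import bisect_left, bisect_right
from itertools import compress

# Made-up all-photons stream (timestamps in arbitrary units)
times = [10, 12, 15, 18, 21, 22, 25, 30]
Dex_mask = [True, True, False, True, True, False, True, True]
DexAem_mask = [False, True, False, False, True, False, True, False]

# Pure-Python equivalent of the NumPy boolean indexing
# times[Dex_mask] and DexAem_mask[Dex_mask]
ph_d = list(compress(times, Dex_mask))
DexAem_mask_d = list(compress(DexAem_mask, Dex_mask))


def reindex(istart, istop, times, sub_times):
    """Map inclusive (istart, istop) indexes of `times` to `sub_times`.

    Hypothetical helper: looks up the burst start/stop timestamps in
    the sorted reduced stream with a binary search.
    """
    new_start = bisect_left(sub_times, times[istart])
    new_stop = bisect_right(sub_times, times[istop]) - 1
    return new_start, new_stop


# A burst spanning all-photons indexes 1..6 (inclusive)
istart_d, istop_d = reindex(1, 6, times, ph_d)
n_acceptor = sum(DexAem_mask_d[istart_d:istop_d + 1])
n_dex = istop_d - istart_d + 1
print(istart_d, istop_d, n_acceptor, n_dex)
```

Once the indexes refer to the D-excitation stream, slicing \verb|DexAem_mask_d| and summing gives the acceptor counts of any chunk of consecutive D-excitation photons, which is exactly the operation needed for the sub-bursts in BVA.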
Finally, with the previous variables at hand, the BVA algorithm
can be easily implemented by computing the $s_E$ quantity for each burst:

\begin{lstlisting}
import numpy as np

n = 7                 # photons per sub-burst
E_sub_std = []        # s_E of each burst
for burst in bursts_d:
    E_sub = []        # PR of each sub-burst
    startlist = range(burst.istart, burst.istop + 2 - n, n)
    stoplist = [i + n for i in startlist]
    for start, stop in zip(startlist, stoplist):
        A_D = DexAem_mask_d[start:stop].sum()
        E = A_D / n
        E_sub.append(E)
    E_sub_std.append(np.std(E_sub))
\end{lstlisting}

Here, \verb|n| is the BVA parameter defining the number of photons in each burst chunk.
The outer loop iterates through bursts, while the inner loop iterates through sub-bursts.
The variables \verb|startlist| and \verb|stoplist| are the lists of start-stop indexes for
all sub-bursts in the current burst.
In the inner loop, \verb|A_D| and \verb|E| contain the number of acceptor photons and
FRET efficiency for the current sub-burst. Finally, for each burst, the standard deviation
of \verb|E| is appended to the list \verb|E_sub_std|.

By plotting the 2D distribution of $s_E$ (i.e. \verb|E_sub_std|) versus the average (uncorrected) E
we obtain the BVA plots of figures~\ref{fig:bva_static} and~\ref{fig:bva_dynamic}.

\section*{Conclusions}
\label{sec:conclusions}

FRETBursts is an open source and openly developed (see~\nameref{sec:dev}) implementation % SI_link
of established smFRET burst analysis methods
made available to the single-molecule community.
It implements several novel concepts which improve the analysis results, such as
time-dependent background estimation, background-dependent burst search threshold,
burst weighting and $\gamma$-corrected burst size selection.
More importantly, FRETBursts provides a library of thoroughly-tested functions
for timestamps and burst manipulation, making it an ideal platform for
developing and comparing new analytical techniques.
We envision FRETBursts both as a state-of-the-art burst analysis
software as well as a platform for development and assessment of novel algorithms.
To underpin this envisioned role, FRETBursts is developed following modern
software engineering practices, such as the DRY principle
(\href{http://en.wikipedia.org/wiki/Don\%27t_repeat_yourself}{link})
to reduce duplication and the KISS principle
(\href{http://en.wikipedia.org/wiki/KISS_principle}{link})
to reduce over-engineering. Furthermore, to minimize the number of software errors~\cite{Merali_2010,Soergel_2015},
we employ defensive programming~\cite{Prli__2012} which includes code readability,
unit and regression testing and continuous integration~\cite{Eglen_2016}.
Finally, since FRETBursts is open source, any scientist can inspect the source code,
fix errors, and adapt it to her own needs.
We believe that, in the single-molecule community,
standard open source software implementations, such as FRETBursts, can enhance
reliability and reproducibility of analysis and promote a faster adoption of novel methods,
while reducing the duplication of efforts among different groups.

\section*{Acknowledgments}
We thank Dr. Eyal Nir and Dr. Toma Tomov for support in the implementation of the 2CDE method \DIFdelbegin \DIFdel{.
}\DIFdelend \DIFaddbegin \DIFadd{and Dr. Achilles Kapanidis and Dr. Nicole Robb for providing
experimental data for testing the BVA implementation.
}\DIFaddend This work was supported by National Institutes of Health (NIH)
grants R01-GM95904 and R01-GM069709. Dr. Weiss discloses equity in
Nesher Technologies and intellectual property used in the research
reported here. The work at UCLA was conducted in Dr. Weiss's Laboratory.

\section*{Supporting Information}

\paragraph*{S1 Appendix.}
\label{sec:notebook}
{\bf Notebook Workflow.} A description of the notebook workflow used by FRETBursts.
\paragraph*{S2 Appendix.}  \label{sec:dev}  {\bf Development and Contributions.} A description of development philosophy and techniques  as well as how to contribute to the FRETBursts project.  \paragraph*{S3 Appendix.}  \label{sec:burststimes}  {\bf Timestamps and Burst Data.} General concepts of how timestamps and  bursts data are stored and handled in FRETBursts.  \paragraph*{S4 Appendix.}  \label{sec:plotting}  {\bf Plotting \texttt{Data}.} A description of the syntax used to perform  plots in FRETBursts \DIFaddbegin \DIFadd{and of the 2-D hexagonal-bin histogram used in E-S plots}\DIFaddend .  \paragraph*{S5 Appendix.}  \label{sec:bg_opt_th}  {\bf Background Estimation With Optimal Threshold.} A description of  the algorithm used by FRETBursts to compute the  optimal threshold for background estimation.  \paragraph*{S6 Appendix.}  \label{sec:burstweights_theory}  {\bf Burst Weights.} Theory underpinning the choice of using burst size  as weights for FRET estimation.  \nolinenumbers  \bibliography{bibliography/converted_to_latex.bib%  }  \end{document}           

%  % % % % % % % % % % % % % % % % % % % % % % % %  \documentclass[10pt,letterpaper]{article}  \usepackage[top=0.85in,left=2.75in,footskip=0.75in]{geometry}  % Use adjustwidth environment to exceed column width (see example table in text)  \usepackage{changepage}  % Use Unicode characters when possible  %\usepackage[utf8]{inputenc}  % textcomp package and marvosym package for additional characters  \usepackage{textcomp,marvosym}  % fixltx2e package for \textsubscript  \usepackage{fixltx2e}  % amsmath and amssymb packages, useful for mathematical formulas and symbols  \usepackage{amsmath,amssymb}  % cite package, to clean up citations in the main text. Do not remove.  \usepackage{cite}  % Use nameref to cite supporting information files (see Supporting Information section for more info)  \usepackage{nameref}  \usepackage{color}  \usepackage[colorlinks=true,  linkcolor=blue,  urlcolor=blue,  citecolor=black]{hyperref}  % line numbers  \usepackage[right]{lineno}  % ligatures disabled  \usepackage{microtype}  \DisableLigatures[f]{encoding = *, family = * }  % rotating package for sideways tables  \usepackage{rotating}  % Remove comment for double spacing  %\usepackage{setspace}  %\doublespacing  \usepackage{graphicx}  \usepackage[space]{grffile}  \usepackage{latexsym}  \usepackage{textcomp}  \usepackage{longtable}  \usepackage{multirow,booktabs}  % You can conditionalize code for latexml or normal latex using this.  
\newif\iflatexml\latexmlfalse  \usepackage[utf8]{inputenc}  \usepackage[ngerman,greek,english]{babel}  %% Neutralize any \includegraphics in the document, as PLOS does not allow figures in the final submission  \makeatletter  \let\orig@includegraphics\includegraphics  \AtBeginDocument{\let\includegraphics\PLOS@ignore}  \newcommand{\PLOS@ignore}[2][]{}  \makeatother  % Text layout  \raggedright  \setlength{\parindent}{0.5cm}  \textwidth 5.25in  \textheight 8.75in  % Bold the 'Figure #' in the caption and separate it from the title/caption with a period  % Captions will be left justified  \usepackage[aboveskip=1pt,labelfont=bf,labelsep=period,justification=raggedright,singlelinecheck=off]{caption}  % Use the PLoS provided BiBTeX style  \bibliographystyle{plos2015}  % Remove brackets from numbering in List of References  \makeatletter  \renewcommand{\@biblabel}[1]{\quad#1.}  \makeatother  % Leave date blank  \date{}  % Header and Footer with logo  \usepackage{lastpage,fancyhdr,graphicx}  \usepackage{epstopdf}  \pagestyle{myheadings}  \pagestyle{fancy}  \fancyhf{}  \makeatletter  \lhead{\orig@includegraphics[width=2.0in]{PLOS-submission.eps}}  \makeatother  \rfoot{\thepage/\pageref{LastPage}}  \renewcommand{\footrule}{\hrule height 2pt \vspace{2mm}}  \fancyheadoffset[L]{2.25in}  \fancyfootoffset[L]{2.25in}  \lfoot{\sf PLOS}  %% Include all macros below  \newcommand{\lorem}{{\bf LOREM}}  \newcommand{\ipsum}{{\bf IPSUM}}  \usepackage{color}  \usepackage{listings}  \lstset{ %  backgroundcolor=\color{white}, % choose the background color  basicstyle=\footnotesize\ttfamily, % size of fonts used for the code  breaklines=true, % automatic line breaking only at whitespace  captionpos=b, % sets the caption-position to bottom  commentstyle=\color{OliveGreen}, % comment style  keywordstyle=\color{blue}, % keyword style  stringstyle=\color{black}, % string literal style  language=Python, % Set your language (you can change the language for each code-block optionally)  frame=l, %  
xleftmargin=\fboxsep, %  xrightmargin=-\fboxsep, %  }  \hyphenation{smFRET}  \hyphenation{FRETBursts}  %% END MACROS SECTION  %DIF PREAMBLE EXTENSION ADDED BY LATEXDIFF  %DIF UNDERLINE PREAMBLE %DIF PREAMBLE  \RequirePackage[normalem]{ulem} %DIF PREAMBLE  \RequirePackage{color}\definecolor{RED}{rgb}{1,0,0}\definecolor{BLUE}{rgb}{0,0,1} %DIF PREAMBLE  \providecommand{\DIFaddtex}[1]{{\protect\color{blue}\uwave{#1}}} %DIF PREAMBLE  \providecommand{\DIFdeltex}[1]{{\protect\color{red}\sout{#1}}} %DIF PREAMBLE  %DIF SAFE PREAMBLE %DIF PREAMBLE  \providecommand{\DIFaddbegin}{} %DIF PREAMBLE  \providecommand{\DIFaddend}{} %DIF PREAMBLE  \providecommand{\DIFdelbegin}{} %DIF PREAMBLE  \providecommand{\DIFdelend}{} %DIF PREAMBLE  %DIF FLOATSAFE PREAMBLE %DIF PREAMBLE  \providecommand{\DIFaddFL}[1]{\DIFadd{#1}} %DIF PREAMBLE  \providecommand{\DIFdelFL}[1]{\DIFdel{#1}} %DIF PREAMBLE  \providecommand{\DIFaddbeginFL}{} %DIF PREAMBLE  \providecommand{\DIFaddendFL}{} %DIF PREAMBLE  \providecommand{\DIFdelbeginFL}{} %DIF PREAMBLE  \providecommand{\DIFdelendFL}{} %DIF PREAMBLE  %DIF END PREAMBLE EXTENSION ADDED BY LATEXDIFF  %DIF PREAMBLE EXTENSION ADDED BY LATEXDIFF  %DIF HYPERREF PREAMBLE %DIF PREAMBLE  \providecommand{\DIFadd}[1]{\texorpdfstring{\DIFaddtex{#1}}{#1}} %DIF PREAMBLE  \providecommand{\DIFdel}[1]{\texorpdfstring{\DIFdeltex{#1}}{}} %DIF PREAMBLE  %DIF END PREAMBLE EXTENSION ADDED BY LATEXDIFF  \begin{document}  \vspace*{0.35in}  % Title must be 250 characters or less.  \begin{flushleft}  {\Large  \textbf\newline{\input{title}}  }  \newline  % Insert author names, affiliations and corresponding author email (do not include titles, positions, or degrees).  \\  Antonino Ingargiola\textsuperscript{1*},  Eitan Lerner\textsuperscript{1},  SangYoon Chung\textsuperscript{1},  Shimon Weiss\textsuperscript{1},  Xavier Michalet\textsuperscript{1},  \\  \bigskip  \textbf{1} Dept. Chemistry and Biochemistry, Univ. 
of California Los Angeles, Los Angeles, CA, USA  \bigskip  % Use the asterisk to denote corresponding authorship and provide email address in note below.  * [email protected]  \end{flushleft}  % Please keep the abstract below 300 words  \section*{Abstract}  Single-molecule Förster Resonance Energy Transfer (smFRET) allows  probing intermolecular interactions and conformational changes in  biomacromolecules, and represents an invaluable tool for studying  cellular processes at the molecular scale. smFRET experiments can  detect the distance between two fluorescent labels (donor and acceptor)  in the 3-10~nm range. In the commonly employed confocal geometry,  molecules are free to diffuse in solution. When a molecule traverses  the excitation volume, it emits a burst of photons, which can be detected  by single-photon avalanche diode (SPAD) detectors. The intensities of  donor and acceptor fluorescence can then be related to the distance  between the two fluorophores.  While recent years have seen a growing number of contributions  proposing improvements or new techniques in smFRET data analysis,  rarely have those publications been accompanied by software implementation.  In particular, despite the widespread application of smFRET, no complete  software package for smFRET burst analysis is freely available to date.  In this paper, we introduce FRETBursts, an open source software  for analysis of freely-diffusing smFRET data.  FRETBursts allows executing all the fundamental steps of smFRET bursts  analysis using state-of-the-art as well as novel techniques,  while providing an open, robust and well-documented implementation.  Therefore, FRETBursts represents an ideal platform for comparison  and development of new methods in burst analysis.  We employ modern software engineering principles in order to  minimize bugs and facilitate long-term maintainability.  
Furthermore, we place a strong focus on reproducibility by relying on  Jupyter notebooks for FRETBursts execution.  Notebooks are executable documents capturing all the steps of the  analysis (including data files, input parameters, and results) and can  be easily shared to replicate complete smFRET analyses.  Notebooks allow beginners to execute complex workflows  and advanced users to customize the analysis for their own needs.  By bundling analysis description, code and results in a single document,  FRETBursts allows seamless sharing of analysis workflows  and results, encourages reproducibility and facilitates collaboration  among researchers in the single-molecule community.  % Please keep the Author Summary between 150 and 200 words  % Use first person. PLOS ONE authors please skip this step.  % Author Summary not valid for PLOS ONE submissions.  %\section*{Author Summary}  \linenumbers  \section*{Introduction}  \subsection*{Open Science and Reproducibility}  Over the past 20 years, single-molecule FRET (smFRET) has grown into one of the most  useful techniques in single-molecule spectroscopy~\cite{Weiss_1999,Hohlbein_2014}.  While it is possible to extract information on sub-populations using ensemble measurements  (e.g.~\cite{Lerner_2014,Rahamim_2015}),  the unique feature of smFRET is its ability to straightforwardly resolve conformational  changes of biomolecules or measure binding-unbinding kinetics in heterogeneous  samples~\cite{Selvin_2000,Roy_2008,Schuler_2008,Sisamakis_2010,Haran_2012}.  smFRET measurements on freely diffusing molecules (the focus of this paper)  have the additional advantage, over measurements performed on immobilized molecules,  of allowing molecules and processes to be probed without perturbation from surface  immobilization or additional functionalization needed for surface  attachment~\cite{Eggeling_1998,Dahan_1999}.  
The increasing amount of work using freely-diffusing smFRET has motivated  a growing number of theoretical contributions to the specific topic of data  analysis~\cite{Fries_1998,Eggeling_2001,Zhang_2005,Gopich_2005,Lee_2005,Nir_2006,Antonik2006,Gopich_2007,Gopich_2008,Camley_2009,Santoso_2010,Torella_2011,Tomov_2012}.  Despite this profusion of publications, most research groups still rely on  their own implementation of a limited number of methods, with very little  collaboration or code sharing.  To clarify this statement, let us point out that our own group's past smFRET papers  merely mention the use of custom-made software without additional details~\cite{Lee_2005,Nir_2006}.  Even though some of these software tools are made available upon request,  or sometimes shared publicly on websites,  it remains hard to reproduce and validate results from different groups,  let alone build upon them.  Additionally, as new methods are proposed in the literature,  it is generally difficult to quantify their performance compared to other methods.  An independent quantitative assessment  would require a complete reimplementation, an effort few groups can afford.  As a result, potentially useful analysis improvements  are either rarely or slowly adopted by the community.  In contrast with other established traditions such as  sharing protocols and samples, in the domain of scientific software,  we have relegated ourselves to islands of non-communication.  From a more general standpoint, the non-availability of the code  used to produce scientific results hinders reproducibility,  makes it impossible to review and validate the software's correctness  and prevents improvements and extensions by other scientists.  This situation, common in many disciplines,  represents a real impediment to scientific progress.  
Since the pioneering work of the Donoho group in the 90's~\cite{Buckheit_1995},  it has become evident that developing and maintaining open source scientific software  for reproducible research is a critical requirement of the modern  scientific enterprise~\cite{Ince_2012,Vihinen_2015}.  %Peer-reviewed publications describing such software are also necessary~\cite{Pradal_2013},  %although the debate is still open on the most effective model for peer-reviewing this  %class of publications~\cite{Check_Hayden_2013,Check_Hayden_2015}  %(\href{https://software-carpentry.org/blog/2015/04/quality-is-free-getting-there-isnt.html}{Willson 2015})  %(\href{https://www.mozillascience.org/effective-code-review-for-journals}{Mills 2015})  %(\href{http://ivory.idyll.org/blog/2015-we-live-in-a-bubble.html}{Brown 2015} and \href{http://ivory.idyll.org/blog/on-code-review-of-scientific-code.html}{2013}).  Other disciplines have started tackling this issue~\cite{Eglen_2016},  and even in the single-molecule field a few recent publications have provided  software for analysis of surface-immobilized experiments~\cite{McKinney_2006,Bronson_2009,Greenfeld_2012,K_nig_2013,van_de_Meent_2014}.  For freely-diffusing smFRET experiments, although it is common to find mention of  ``code available from the authors upon request'' in publications, there is a dearth  of such open source code, with, to our knowledge, the notable exception of a single  example~\cite{Murphy2014}.  To address this issue, we have developed FRETBursts,  an open source Python software for analysis of freely-diffusing single-molecule FRET measurements.  FRETBursts can be used, inspected and modified by anyone interested in using  state-of-the-art smFRET analysis methods or implementing modifications or completely new techniques.  FRETBursts therefore represents an ideal platform  for quantitative comparison of different methods for smFRET burst analysis.  
Technically, a strong emphasis has been given to the reproducibility of complete analysis  workflows. FRETBursts uses Jupyter Notebooks~\cite{Shen_2014},  interactive and executable documents containing textual narrative, input parameters,  code, and computational results (tables, plots, etc.). A notebook thus captures the various analysis steps  in a document which is easy to share and execute.  To minimize the possibility of bugs being introduced inadvertently~\cite{Soergel_2015},  we employ modern software engineering techniques  such as unit testing and continuous integration~\cite{Wilson_2014,Eglen_2016}.  FRETBursts is hosted on GitHub~\cite{Blischak_2016,Prli__2012},  where users can write comments, report issues or contribute code.  In a related effort, we recently introduced Photon-HDF5~\cite{Ingargiola2016},  an open file format for timestamp-based single-molecule fluorescence  experiments. Another related open source tool is PyBroMo~\cite{Ingargiola_2016},  a freely-diffusing smFRET simulator which produces Photon-HDF5 files that are  directly analyzable with FRETBursts.  Together with all the aforementioned tools, FRETBursts contributes to the growing  ecosystem of open tools for reproducible science in the single-molecule field.  \subsection*{Paper Overview}  This paper is written as an introduction to smFRET burst analysis and  its implementation in FRETBursts.  The aim is to illustrate the specificities and  trade-offs involved in various approaches  in sufficient detail to enable readers  to customize the analysis for their own needs.  After a brief overview of FRETBursts features (section~\nameref{sec:overview}),  we introduce essential concepts and terminology for smFRET burst analysis  (section~\nameref{sec:concepts}).  
In section~\nameref{sec:analysis}, we illustrate the steps involved  in smFRET burst analysis: (i) data loading (section~\nameref{sec:dataload}),  (ii) definition of the excitation alternation periods  (section~\nameref{sec:alternation}), (iii) background correction  (section~\nameref{sec:bg_calc}), (iv) burst search  (section~\nameref{sec:burstsearch}),  (v) burst selection (section~\nameref{sec:burstsel}) and  (vi) FRET histogram fitting (section~\nameref{sec:fretfit}).  As an example  of the implementation of an advanced data processing technique,  section~\nameref{sec:bva} walks the reader through implementing  Burst Variance Analysis (BVA)~\cite{Torella_2011}.  Finally, section~\nameref{sec:conclusions} summarizes what we believe  to be the strengths of the FRETBursts software.  Throughout this paper,  links to relevant sections of documentation and other web resources  are displayed as ``(link)''.  In order to make the text more legible,  we have concentrated Python-specific details in paragraphs titled  \textit{Python details}. These paragraphs provide deeper insights for readers  already familiar with Python and can be initially skipped by readers who are not.  Finally, note that all commands and figures in this paper can be regenerated  using the accompanying notebooks  (\href{https://github.com/tritemio/fretbursts_paper}{link}).  \section*{FRETBursts Overview}  \label{sec:overview}  \subsection*{Technical Features}  FRETBursts can analyze smFRET measurements  from one or multiple excitation spots~\cite{Ingargiola_2013}. The supported  excitation schemes include single laser, alternating laser excitation (ALEX)  with either CW lasers (μs-ALEX~\cite{Kapanidis_2005})  or pulsed lasers (ns-ALEX~\cite{Laurence_2005} or  pulsed-interleaved excitation (PIE)~\cite{M_ller_2005}).  
The software implements both standard and novel algorithms for smFRET data analysis  including background estimation as a function of time (including background accuracy  metrics), sliding-window burst search~\cite{Eggeling_1998},  dual-channel burst search (DCBS)~\cite{Nir_2006} and  modular burst selection methods based on user-defined criteria  (including a large set of pre-defined selection rules). Novel features include burst size  selection with $\gamma$-corrected burst sizes, burst weighting and burst search with a  background-dependent threshold (in order to guarantee a minimal signal-to-background  ratio~\cite{Michalet_2012}).  Moreover, FRETBursts provides a large set of fitting options to characterize FRET subpopulations.  In particular, distributions of burst quantities (such as $E$ or $S$) can be assessed  through (1) histogram fitting (with arbitrary model functions),  (2) non-parametric weighted kernel density estimation (KDE), (3) weighted  expectation-maximization (EM) and (4) maximum likelihood fitting using Gaussian models  or Poisson statistics. Finally, FRETBursts includes a large number of  predefined and customizable plot functions which (thanks to the \textit{matplotlib}  graphic library) produce publication-quality plots in a wide range of formats.  Additionally, implementations of population dynamics analysis such  as Burst Variance Analysis (BVA)~\cite{Torella_2011} and two-channel  kernel density distribution estimator (2CDE)~\cite{Tomov_2012}  are available as FRETBursts notebooks  \DIFdelbegin \DIFdel{.  }\DIFdelend \DIFaddbegin \DIFadd{(}\href{http://nbviewer.jupyter.org/github/tritemio/FRETBursts_notebooks/blob/master/notebooks/Example%20-%20Burst%20Variance%20Analysis.ipynb}{BVA link},  \href{http://nbviewer.jupyter.org/github/tritemio/FRETBursts_notebooks/blob/master/notebooks/Example%20-%202CDE%20Method.ipynb}{2CDE link}\DIFadd{).  }\DIFaddend   \subsection*{Software Availability}  FRETBursts is hosted and openly developed on GitHub. 
The FRETBursts homepage  (\href{http://tritemio.github.io/FRETBursts}{link})  contains links to the various resources. \DIFaddbegin \DIFadd{Pre-built packages are provided for  Windows, OS X and Linux. }\DIFaddend Installation instructions  can be found in the Reference Documentation  (\href{http://fretbursts.readthedocs.org/en/latest/getting_started.html}{link}).  A description of FRETBursts execution using Jupyter notebooks is reported  in~\nameref{sec:notebook}. % SI_link  Detailed information on development style, testing strategies and  contribution guidelines is reported in~\nameref{sec:dev}. % SI_link  Finally, to facilitate evaluation and comparison with other software,  we set up an online service that allows executing FRETBursts  without requiring any installation on the user's computer (\href{https://github.com/tritemio/FRETBursts_notebooks#run-online}{link}).  \section*{Architecture and Concepts}  \label{sec:concepts}  In this section, we introduce some general burst analysis concepts  and notations used in FRETBursts.  \subsection*{Photon Streams}  \label{sec:ph_streams}  The raw data collected during a smFRET experiment consists of one or more arrays of  photon timestamps, whose temporal resolution is set by the acquisition hardware,  typically between 10 and 50~ns.  In single-spot measurements, all timestamps are stored in a single array. In multispot  measurements~\cite{Ingargiola_2013}, there are as many timestamp arrays  as excitation spots.  Each array contains timestamps from both donor (D) and acceptor (A) channels.  When alternating excitation lasers are used (ALEX measurements)~\cite{Lee_2005},  a further distinction between photons emitted during the D or A excitation periods can be made.  In FRETBursts, the corresponding sets of photons are called ``photon streams''   and are specified with a \verb|Ph_sel| object  (\href{http://fretbursts.readthedocs.org/en/latest/ph_sel.html}{link}).  
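To make the notion of photon streams concrete, the following minimal sketch (plain Python, not the FRETBursts implementation) sorts the timestamps of a single-spot ALEX measurement into the four combinations of excitation period and emission channel. The variable names \verb|timestamps|, \verb|is_acceptor| and \verb|is_dex|, as well as all numerical values, are hypothetical and chosen only for illustration.

```python
# Hypothetical single-spot ALEX data (all values are made up for illustration).
timestamps = [10, 25, 31, 47, 52, 68, 73, 90]  # photon arrival times in clock ticks
# True if the photon was detected in the acceptor channel, False for the donor channel.
is_acceptor = [False, True, False, True, True, False, True, True]
# True if the photon arrived during a donor-excitation period.
is_dex = [True, True, True, False, False, False, True, False]

# The four base photon streams combine an excitation period (Dex or Aex)
# with an emission channel (Dem or Aem).
streams = {"DexDem": [], "DexAem": [], "AexDem": [], "AexAem": []}
for t, acc, dex in zip(timestamps, is_acceptor, is_dex):
    name = ("Dex" if dex else "Aex") + ("Aem" if acc else "Dem")
    streams[name].append(t)

# Every photon belongs to exactly one base stream.
n_total = sum(len(ts) for ts in streams.values())
```

Any other photon selection, including the ``all photons'' stream, can be obtained as a union of these base streams.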
In non-ALEX smFRET data, there are 3 photon streams  (table~\ref{tab:ph_sel_smfret}), while in \DIFaddbegin \DIFadd{two-color }\DIFaddend ALEX data,   there are 5 streams (table~\ref{tab:ph_sel_alex}).  The \verb|Ph_sel| class (\href{http://fretbursts.readthedocs.org/en/latest/ph_sel.html}{link})  allows the specification of any combination of photon streams.  For example, in ALEX measurements, the D-emission during A-excitation stream is  usually ignored because it does not contain any useful signal~\cite{Lee_2005}.  To select all photons except those in this stream, the syntax is  \verb|Ph_sel(Dex='DAem', Aex='Aem')|, which indicates selection of donor  and acceptor photons (\verb|DAem|) during donor excitation (\verb|Dex|) and only acceptor  photons (\verb|Aem|) during acceptor excitation (\verb|Aex|).  \begin{table}  \begin{tabular}{l|l}  Photon selection & code \\  \hline  All-photons & \verb|Ph_sel('all')|\\  D-emission & \verb|Ph_sel(Dex='Dem')|\\  A-emission & \verb|Ph_sel(Dex='Aem')|\\  \end{tabular}  \caption{\label{tab:ph_sel_smfret}Photon selection syntax (non-ALEX)}  \end{table}  \begin{table}  \begin{tabular}{l|l}  Photon selection & code \\  \hline  All-photons & \verb|Ph_sel('all')|\\  D-emission during D-excitation & \verb|Ph_sel(Dex='Dem')|\\  A-emission during D-excitation & \verb|Ph_sel(Dex='Aem')|\\  D-emission during A-excitation & \verb|Ph_sel(Aex='Dem')|\\  A-emission during A-excitation & \verb|Ph_sel(Aex='Aem')|\\  \end{tabular}  \caption{\label{tab:ph_sel_alex}Photon selection syntax (ALEX)}  \end{table}  \subsection*{Background Definitions}  \label{sec:bg_intro}  An estimation of the background rates is needed to both select a proper threshold for  burst search, and to correct the raw burst counts by \DIFdelbegin \DIFdel{subtraction of }\DIFdelend \DIFaddbegin \DIFadd{subtracting }\DIFaddend background counts.  
The recorded stream of timestamps is the result of two processes: one characterized  by a high count rate, due to fluorescence photons of single molecules crossing the  excitation volume, and another characterized by a lower count rate, due to ``background  counts'' originating from detector dark counts, afterpulsing, out-of-focus molecules  and sample scattering and/or impurities~\cite{Edman_1996,Gopich_2008}.  The signature of these two types of processes can be  observed in the inter-photon delays distribution (i.e. the waiting times  between two subsequent timestamps) as illustrated in figure~\ref{fig:bg_dist_all}(a).  The ``tail'' of the distribution (a straight line in semi-log scale) corresponds  to exponentially-distributed time-delays, indicating that those counts are generated by a  Poisson process. At short  timescales, the distribution departs from the exponential due to the contribution  of the higher rate process of single molecules traversing the excitation volume.  To estimate the background rate (i.e. the inverse of the exponential time constant),  it is necessary to define a time-delay threshold above which the distribution  can be considered exponential.  Finally, a parameter estimation method needs to be specified, such as Maximum  Likelihood Estimation (MLE) or non-linear least squares curve fitting of  the time-delay histogram (both supported in FRETBursts).  \begin{figure}[h!]  \begin{center}  \includegraphics[width=0.77\columnwidth]{figures/ph_delays_distrib_all/ph_delays_distrib_all}  \caption{\label{fig:bg_dist_all} \textbf{Inter-photon delays fitted with an exponential function.}  Experimental distributions of inter-photon delays (\textit{dots}) and  corresponding fits of the exponential tail (\textit{solid lines}).  (\textit{Panel a}) An example of inter-photon delays distribution (\textit{red dots}) and an exponential fit  of the tail of the distribution (\textit{black line}).  
(\textit{Panel b}) Inter-photon delays distribution and exponential fit for different photon streams as obtained with \texttt{dplot(d, hist\_bg)}. The \textit{dots} represent the experimental histogram for the different photon streams. The \textit{solid lines} represent the corresponding exponential fit of the tail of the distributions. The legend shows abbreviations of the photon streams  and the fitted background rates.%  }  \end{center}  \end{figure}  It is advisable to monitor the background as a function of time  throughout the measurement, in order to account for possible variations.  Experimentally, we found that when the background is not constant,  it usually varies  on time scales of tens of seconds (see figure~\ref{fig:bg_timetrace}).  FRETBursts divides the acquisition into constant-duration time  windows called \textit{background periods} and computes the background rates for  each of these windows (see section~\nameref{sec:bg_calc}).  Note that FRETBursts also uses these local background rates during burst search,  in order to compute time-dependent burst detection thresholds  and for background correction of burst data (see section~\nameref{sec:burstsearch}).  \begin{figure}[h!]  \begin{center}  \includegraphics[width=0.91\columnwidth]{figures/background_timetrace/background_timetrace}  \caption{\label{fig:bg_timetrace} \textbf{Background rates as a function of time.}  Estimated background rate as a function of time for two μs-ALEX measurements.  Different colors represent different photon streams.  (\textit{Panel a}) A measurement performed with a sealed sample chamber  exhibiting a constant background as a function of time.  (\textit{Panel b}) A measurement performed on an unsealed sample exhibiting  significant background variations due to sample evaporation and/or  photobleaching (likely of impurities on the cover-glass).  These plots are produced by the command  \texttt{dplot(d, timetrace\_bg)} after estimation of background.  
Each data point in these figures is computed for a 30~s time window.%  }  \end{center}  \end{figure}  \subsection*{The \texttt{Data} Class}  \label{sec:data_intro}  The \verb|Data| class  (\href{http://fretbursts.readthedocs.org/en/latest/data_class.html}{link})  is the fundamental data container in FRETBursts. It contains the  measurement data and parameters (attributes) as well as several methods  for data analysis (background estimation, burst search, etc.).  All analysis results (burst data, estimated parameters) are also stored  as \verb|Data| attributes.  There are 3 important ``burst counts'' attributes which contain  the number of photons detected in the donor or the acceptor channel  during donor or acceptor excitation (table~\ref{tab:data_n}).  The attributes in table~\ref{tab:data_n} are background-corrected by default.  Furthermore, \verb|na| is corrected for leakage and direct excitation  (section~\nameref{sec:corrcoeff}) if the corresponding coefficients are specified  (by default they are 0).  There is also a closely related attribute named \verb|nda| for donor photons  during acceptor excitation. \verb|nda| is normally neglected as it only contains  background.  \begin{table}  \begin{tabular}{l p{0.8\columnwidth}}  Name & Description \\  \hline  \verb|nd| & number of photons detected by the donor channel (during donor excitation period in ALEX case)\\  \verb|na| & number of photons detected by the acceptor channel (during donor excitation period in ALEX case)\\  \verb|naa| & number of photons detected by the acceptor channel during acceptor excitation period (present only in ALEX measurements)\\  \end{tabular}  \caption{\label{tab:data_n}\texttt{Data} attribute names and descriptions for burst photon counts in different photon streams.}  \end{table}  \paragraph*{Python details}  Many \verb|Data| attributes are lists of arrays (or scalars) with the length of the lists  equal to the number of excitation spots. 
This means that in  single-spot measurements, an array of burst data  is accessed by specifying the index as 0, for example \verb|Data.nd[0]|.  \verb|Data| implements a shortcut syntax to access the first element of a list  with an underscore, so that \verb|Data.nd_| is equivalent to \verb|Data.nd[0]|.  \subsection*{Introduction to Burst Search}  \label{sec:burstsearch_intro}  Identifying single-molecule fluorescence bursts in the stream of photons is  one of the most crucial steps in the analysis of freely-diffusing single-molecule FRET data.  The widely used ``sliding window'' algorithm, introduced by the Seidel group in  1998~\cite{Eggeling_1998,Fries_1998}, involves searching for  $m$ consecutive photons detected during a period shorter than  $\Delta t$. In other words, bursts are regions of the photon stream where the  local rate (computed using $m$ photons) is above a minimum threshold rate.  Since a universal criterion to choose the rate threshold and  the number of photons $m$ is, as of today, lacking, it has become common  practice to manually adjust those parameters for each specific measurement.  \DIFaddbegin \DIFadd{Commonly employed values for $m$ are between 5 and 15 photons.  }\DIFaddend   A more general approach consists in taking into account the background rate of  the specific measurement and in choosing a rate threshold that is $F$ times  larger than the background rate \DIFaddbegin \DIFadd{(typical values for $F$ are between 4 and 9)}\DIFaddend .   Since the signal rate is the total rate minus the background rate,  this approach ensures that all resulting bursts  have a signal-to-background ratio (SBR) larger than  $(F-1)$~\cite{Michalet_2012}. A consistent criterion for choosing the threshold is  particularly important when comparing different measurements with different background  rates, when the background significantly varies during measurements or in  multi-spot measurements where each spot has a different background rate.  
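The sliding-window criterion and the background-dependent threshold described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the FRETBursts implementation: the function name \verb|sliding_window_search|, the timestamps and the background rate are hypothetical, and the threshold rate is approximated as $m/\Delta t$, so that $\Delta t = m/(F\,b)$ for a background rate $b$.

```python
def sliding_window_search(timestamps, m, delta_t):
    """Return (start, stop) times of bursts found in sorted timestamps.

    A window of m consecutive photons qualifies when it spans less than
    delta_t. Consecutive qualifying windows are merged into one burst.
    """
    bursts = []
    start = stop = None  # indices of first and last photon of the open burst
    for i in range(len(timestamps) - m + 1):
        if timestamps[i + m - 1] - timestamps[i] < delta_t:
            if start is None:
                start = i
            stop = i + m - 1
        elif start is not None:
            # The window no longer qualifies: close the current burst.
            bursts.append((timestamps[start], timestamps[stop]))
            start = None
    if start is not None:
        bursts.append((timestamps[start], timestamps[stop]))
    return bursts


# Hypothetical example: timestamps in microseconds, background of 1000 counts/s.
bg_rate = 1000e-6          # background rate in counts per microsecond
m, F = 3, 6                # photons per window, threshold factor
delta_t = m / (F * bg_rate)  # about 500 microseconds

timestamps = [0, 10000, 20000, 20100, 20200, 20300, 30000, 40000]
bursts = sliding_window_search(timestamps, m, delta_t)
```

With these made-up values, only the four closely spaced photons between 20000 and 20300~μs satisfy the criterion. Using a local background rate for each background period simply amounts to recomputing \verb|delta_t| for each period.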
A second important aspect of burst search is the choice of photon stream used  to perform the search.  In most cases, for instance when identifying FRET sub-populations,  the burst search should use all \DIFdelbegin \DIFdel{photons (i.e. APBS). In some }\DIFdelend \DIFaddbegin \DIFadd{the photons, the so-called  all-photon burst search (APBS)~\mbox{%DIFAUXCMD  \cite{Eggeling_1998,Fries_1998,Nir_2006}}%DIFAUXCMD  .  In }\DIFaddend other cases, \DIFaddbegin \DIFadd{for example }\DIFaddend when focusing on  donor-only or \DIFdelbegin \DIFdel{acceptor only }\DIFdelend \DIFaddbegin \DIFadd{acceptor-only }\DIFaddend populations, it is better to perform   the search using only donor or acceptor signal.  In order to handle the general case and to provide flexibility,  FRETBursts allows performing the burst search on arbitrary selections of photons  (see section~\nameref{sec:ph_streams} for more information on photon stream definitions).  Additionally, Nir~\textit{et al.}~\cite{Nir_2006} proposed \DIFdelbegin \DIFdel{DCBS (``}\DIFdelend \DIFaddbegin \DIFadd{a }\DIFaddend dual-channel   burst search \DIFdelbegin \DIFdel{'')   , }\DIFdelend \DIFaddbegin \DIFadd{(DCBS)   }\DIFaddend which can help mitigate artifacts due to photophysical effects such as blinking.  During DCBS, a search is performed \DIFdelbegin \DIFdel{in parallel }\DIFdelend on two photon streams  and bursts are defined as periods during which both photon streams   exhibit a rate higher than  the threshold, implementing the equivalent of an AND logic operation.  Conventionally, the term DCBS refers to a burst search where the two photon streams  are (1) all photons during donor excitation (\verb|Ph_sel(Dex='DAem')|) and  (2) acceptor channel photons during acceptor excitation (\verb|Ph_sel(Aex='Aem')|).  In FRETBursts, the user can choose arbitrary photon streams as input, and in general  this kind of search is called an ``AND-gate burst search''.  
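The AND logic operation can be illustrated with a short plain-Python sketch that intersects the bursts found independently on two photon streams. The function name \verb|and_gate| and the burst intervals are hypothetical, and this is only one possible way to express the operation, not the FRETBursts implementation.

```python
def and_gate(bursts_a, bursts_b):
    """Intersect two sorted lists of non-overlapping (start, stop) bursts.

    The result contains the time intervals during which both input
    streams are simultaneously inside a burst.
    """
    result = []
    i = j = 0
    while i < len(bursts_a) and j < len(bursts_b):
        start = max(bursts_a[i][0], bursts_b[j][0])
        stop = min(bursts_a[i][1], bursts_b[j][1])
        if start < stop:
            result.append((start, stop))
        # Advance the list whose current burst ends first.
        if bursts_a[i][1] < bursts_b[j][1]:
            i += 1
        else:
            j += 1
    return result


# Hypothetical burst intervals (in clock ticks) from two separate searches:
# one on all photons during donor excitation, one on A-emission during A-excitation.
dex_bursts = [(100, 400), (900, 1200), (2000, 2300)]
aex_bursts = [(150, 450), (1000, 1100), (3000, 3300)]
dcbs_bursts = and_gate(dex_bursts, aex_bursts)
```

In this made-up example the third burst of each stream has no counterpart in the other stream and is therefore discarded, as expected for molecules lacking an active donor or acceptor.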
After burst search, it is necessary to select  bursts, for instance by specifying a minimum number of photons (or burst size). In the most  basic form, this selection can be performed during burst search by discarding  bursts with size smaller than a threshold $L$ \DIFaddbegin \DIFadd{(typically 30 or higher)}\DIFaddend ,   as originally proposed by  Eggeling~\textit{et al.}~\cite{Eggeling_1998}.  This method, however, neglects the effect  of background and $\gamma$ factor on the burst size and can lead to a selection  bias for some channels and/or sub-populations.  For this reason, we suggest performing a burst size selection after background  correction, taking into account the $\gamma$ factor, as discussed in  sections~\nameref{sec:burstsizeweights} and~\nameref{sec:burstsel}.  In special cases, users may choose to replace (or combine)  the burst selection based on burst size  with another criterion such as burst duration or brightness (see section~\nameref{sec:burstsel}).  \subsection*{Corrected Burst Sizes and Weights}  \label{sec:burstsizeweights}  The number of photons detected during a burst --the ``burst size''--  is computed using either all photons, or photons detected  during donor excitation period. To compute the burst size, FRETBursts uses  one of the following formulas:  \begin{equation}  \label{eq:burstsize_dex}  n_{dex} = n_a + \gamma\,n_d  \end{equation}  \begin{equation}  \label{eq:burstsize_allph}  n_t = n_a + \gamma\,n_d + n_{aa}  \end{equation}  \noindent where $n_d$, $n_a$ and $n_{aa}$ are, similarly to the attributes  in table~\ref{tab:data_n}, the background-corrected  burst counts in different channels and excitation periods.  The factor $\gamma$ takes into account  different fluorescence quantum yields of donor and acceptor fluorophores and different  photon detection efficiencies between donor and acceptor detection  channels~\cite{Deniz_1999,Lee_2005}.  
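As a worked illustration of eq.~\ref{eq:burstsize_dex} and eq.~\ref{eq:burstsize_allph}, the following plain-Python sketch computes the two burst sizes and shows how a threshold on uncorrected sizes can treat two bursts differently even though their $\gamma$-corrected sizes are equal. The function name \verb|burst_size|, the counts and the value of $\gamma$ are hypothetical and chosen only for illustration.

```python
def burst_size(nd, na, naa=0.0, gamma=1.0, add_naa=False):
    """Burst size from background-corrected counts.

    Computes na + gamma*nd and, when add_naa is True, also adds naa.
    """
    size = na + gamma * nd
    if add_naa:
        size += naa
    return size


gamma = 0.5  # made-up value

# Made-up background-corrected counts for a low-FRET and a high-FRET burst.
low_fret = dict(nd=60.0, na=10.0, naa=25.0)
high_fret = dict(nd=10.0, na=35.0, naa=25.0)

raw_low = burst_size(**low_fret)                 # gamma = 1, i.e. uncorrected
raw_high = burst_size(**high_fret)
corr_low = burst_size(gamma=gamma, **low_fret)   # donor-excitation counts only
corr_high = burst_size(gamma=gamma, **high_fret)
total_high = burst_size(gamma=gamma, add_naa=True, **high_fret)  # all counts
```

With these numbers the uncorrected sizes are 70 and 45 photons, so a threshold of 50 on the uncorrected size would keep only the low-FRET burst, whereas both bursts have the same $\gamma$-corrected size of 40.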
Eq.~\ref{eq:burstsize_dex} includes counts collected during donor excitation periods only,  while eq.~\ref{eq:burstsize_allph} includes all counts.  Burst sizes computed according to eq.~\ref{eq:burstsize_dex}  or~\ref{eq:burstsize_allph} are called $\gamma$-corrected burst sizes.  The burst search algorithm yields a set of bursts whose sizes  approximately follow an exponential distribution.  Compared to bursts with smaller sizes, bursts with large sizes are less frequent,   but contain more information per burst (having a higher signal-to-noise ratio, SNR).  Therefore, selecting bursts by size is an important step (see \DIFdelbegin \DIFdel{section~}\DIFdelend \nameref{sec:burstsel}).  A threshold set too low may result in unresolvable sub-populations  because of broadening of FRET peaks and appearance of shot-noise artifacts  in the FRET (and \DIFdelbegin \DIFdel{S}\DIFdelend \DIFaddbegin \DIFadd{$S$}\DIFaddend ) distribution (i.e. spurious narrow peaks due to \DIFdelbegin \DIFdel{E and S }\DIFdelend \DIFaddbegin \DIFadd{$E$ and $S$ }\DIFaddend being  computed as the ratio of small integers).  Conversely, too large a threshold may result in too low a number of bursts  and therefore in a poor representation of the FRET distribution.  Additionally, especially when computing fractions of sub-populations  (e.g. ratio of number of bursts in each sub-population),  it is important to use $\gamma$-corrected burst sizes as the selection criterion,  in order to avoid under-representing some FRET sub-populations  due to different quantum yields of donor and acceptor dyes and/or  different photon detection efficiencies of donor and acceptor channels.  \DIFaddbegin \DIFadd{An alternative method to apply the $\gamma$ correction is to  discard a constant fraction of photons chosen randomly from either   the Dem or Aem photon stream~\mbox{%DIFAUXCMD  \cite{Nir_2006}}%DIFAUXCMD  . 
This simple method transforms the measurement data in order to
achieve $\gamma=1$, overcoming the issue of selection bias between populations.
This approach also has the advantage of preserving
the binomial distribution of D and A photons in each burst, so that peaks
of FRET populations are easier to model statistically.
The only drawback is that, by discarding a fraction of photons,
this method leads to information loss and therefore to a potential
decrease in sensitivity and/or accuracy.
}

\DIFaddend A simple way to mitigate the dependence of the FRET distribution on
the burst size selection threshold is to weight bursts proportionally to their size,
so that the largest bursts have the largest weights.
Using the size as weight (instead of any other monotonically increasing function
of size) is justified by noting that the variance of the burst proximity ratio (PR) is
inversely proportional to the burst size (see~\nameref{sec:burstweights_theory} for details). % SI_link
In general, a weighting scheme is used for building efficient estimators of a population
parameter (e.g. the population FRET efficiency $E_p$).
However, it can also be used to build weighted histograms or Kernel Density
Estimation (KDE) plots which emphasize the peaks of FRET sub-populations
without excluding small bursts.
Traditionally, when not using weights, the
FRET histogram is manually optimized by finding an ad hoc (high)
size threshold which selects only the largest bursts (and thus those with the lowest variance).
Building size-weighted FRET histograms is a simple way to balance
the need to reduce the peak width against the need to include as many bursts
as possible to reduce statistical noise.
As a practical example, by fixing the burst size threshold to a low value (e.g.
10--20 photons)
and using weights, it is possible to build a FRET histogram with well-defined peaks for the FRET sub-populations
without the need to search for an optimal burst-size threshold (\nameref{sec:burstweights_theory}).

\paragraph*{Python details}
FRETBursts has the option to weight bursts using $\gamma$-corrected
burst sizes which optionally include acceptor excitation photons \verb|naa|.
A weight proportional to the burst size is applied by passing the argument
\verb|weights='size'| to histogram or KDE plot functions. The \verb|weights|
keyword can also be passed to fitting functions in order to fit
the weighted E or S distributions (see section~\nameref{sec:fretfit}).
Other weighting functions (for example depending quadratically on the size)
are listed in the \verb|fret_fit.get_weights| documentation
(\href{http://fretbursts.readthedocs.org/en/latest/fret_fit.html#fretbursts.fret_fit.get_weights}{link}).
However, using weights different from the size is not recommended
due to their less efficient use of burst information
\DIFaddbegin \DIFadd{(}\nameref{sec:burstweights_theory}\DIFadd{)}\DIFaddend .

\section*{smFRET Burst Analysis}
\label{sec:analysis}

\subsection*{Loading the Data}
\label{sec:dataload}
While FRETBursts can load several data file formats,
we encourage users to adopt the recently introduced Photon-HDF5
file format~\cite{Ingargiola2016}.
Photon-HDF5 is an HDF5-based, open format, specifically designed
for freely-diffusing smFRET and
other timestamp-based experiments.
Photon-HDF5 is a self-documented, platform- and language-independent binary format,
which supports compression and allows saving photon data (e.g. timestamps)
and measurement-specific metadata
(e.g. setup and sample information, authors, provenance, etc.).
Moreover, Photon-HDF5 is designed for long-term data preservation and aims
to facilitate data sharing
between different software and research groups.
All example data files provided with FRETBursts use the Photon-HDF5 format.  To load data from a Photon-HDF5 file, we use the function \verb|loader.photon_hdf5|  (\href{http://fretbursts.readthedocs.org/en/latest/loader.html#fretbursts.loader.photon_hdf5}{link}):  \begin{lstlisting}  d = loader.photon_hdf5(filename)  \end{lstlisting}  \noindent  where \verb|filename| is a string containing the file path.  This command loads the measurement data into the variable \verb|d|,  a \verb|Data| object (see section~\nameref{sec:data_intro}).  The same command can load data from a variety of smFRET measurements supported  by the Photon-HDF5 format, taking advantage of the rich metadata included with each file.  For instance, data generated using different excitation schemes such as CW excitation  or pulsed excitation, single-laser vs two alternating lasers, etc.,  or with any number of excitation spots, are automatically recognized and interpreted accordingly.  FRETBursts also supports loading μs-ALEX data stored in .sm files  (a custom binary format used in the Weiss lab) and  ns-ALEX data stored in .spc files (a binary format used by TCSPC Becker \& Hickl acquisition hardware).  Alternatively, these and other formats (such as ht3, a binary format used by PicoQuant hardware)  can be converted into Photon-HDF5 files using phconvert,  a file conversion library and utility for Photon-HDF5  (\href{http://photon-hdf5.github.io/phconvert/}{link}).  More information on loading different file formats  can be found in the \verb|loader| module's documentation  (\href{http://fretbursts.readthedocs.org/en/latest/loader.html}{link}).  \subsection*{Alternation Parameters}  \label{sec:alternation}  For μs-ALEX and ns-ALEX data, Photon-HDF5 normally stores parameters defining  alternation periods corresponding to donor and acceptor laser excitation.  At load time, a user can plot these parameters and change them if deemed necessary.  
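
The role of the alternation parameters can be illustrated with a small sketch:
each photon is assigned to the donor or acceptor excitation period according
to its timestamp taken modulo the alternation period.
The function name \verb|excitation_period|, the alternation period of 4000
timestamp units and the period boundaries are assumptions made for this example.
The sketch does not reproduce FRETBursts internals and, in particular,
does not handle periods that wrap around the end of the alternation cycle.
\begin{lstlisting}
def excitation_period(timestamp, alex_period, D_ON, A_ON):
    """Return 'D', 'A' or None for a single photon timestamp."""
    phase = timestamp % alex_period
    if D_ON[0] <= phase < D_ON[1]:
        return 'D'
    if A_ON[0] <= phase < A_ON[1]:
        return 'A'
    return None  # photon falls outside both excitation periods

# Hypothetical values, all in timestamp units
alex_period = 4000
D_ON, A_ON = (2100, 3900), (100, 1900)
timestamps = [150, 2500, 4050, 7000, 9990]
labels = [excitation_period(t, alex_period, D_ON, A_ON)
          for t in timestamps]
\end{lstlisting}
Photons labeled \verb|None| fall in the transition regions between the two
excitation periods.
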
In μs-ALEX measurements~\cite{Kapanidis_2004},
CW laser lines are alternated on timescales of the order of 10 to 100~μs.
By plotting a histogram of timestamps modulo the alternation period, it
is possible to identify the donor and acceptor excitation periods (see figure~\ref{fig:altern_hist_double}a).
In ns-ALEX measurements~\cite{Laurence_2005},
pulsed lasers with equal repetition rates are delayed with respect
to one another with typical delays of 10 to 100~ns.
In this case, a histogram of TCSPC times (nanotimes) allows
defining the periods of fluorescence after excitation
of either the donor or the acceptor (see figure~\ref{fig:altern_hist_double}b).
In both cases, the function
\verb|plot_alternation_hist|
(\href{http://fretbursts.readthedocs.org/en/latest/plots.html#fretbursts.burst_plot.plot_alternation_hist}{link})
will plot the relevant alternation histogram (figure~\ref{fig:altern_hist_double})
using the currently selected (or default) values for donor and acceptor excitation periods.

\begin{figure}[h!]
\begin{center}
\includegraphics[width=1\columnwidth]{figures/ALEX_alternation_double/ALEX_alternation_double}
\caption{\label{fig:altern_hist_double}
\textbf{Alternation histograms for μs-ALEX and ns-ALEX measurements.}
Histograms used for the selection/determination
of the alternation periods for two typical smFRET-ALEX experiments.
Distributions of photons detected by the donor channel are in \textit{green},
and by the acceptor channel in \textit{red}.
The light \textit{green} and \textit{red} shaded areas indicate the donor
and acceptor period definitions.
(a) μs-ALEX alternation histogram, i.e. histogram of timestamps \textit{modulo}
the alternation period for a smFRET measurement.
(b) ns-ALEX TCSPC nanotime histogram for a smFRET measurement.
Both plots have been generated by the same plot function
(\texttt{plot\_alternation\_hist()}).
Additional information on these specific measurements can be found in the  attached notebook  (\href{http://nbviewer.jupyter.org/github/tritemio/fretbursts_paper/blob/master/notebooks/Figures\%20-\%20ALEX\%20histograms.ipynb}{link}).%  }  \end{center}  \end{figure}  To change the period definitions, we can type:  \begin{lstlisting}  d.add(D_ON=(2100, 3900), A_ON=(100, 1900))  \end{lstlisting}  \DIFaddbegin \noindent \DIFaddend where \verb|D_ON| and \verb|A_ON| are tuples (pairs of numbers) representing  the \textit{start} and \textit{stop} values for D or A excitation periods.  The previous command works for both μs-ALEX and ns-ALEX measurements.  After changing the parameters, a new alternation plot will show the updated  period definitions.  The alternation period definition can be applied to the data  using the function \verb|loader.alex_apply_period|  (\href{http://fretbursts.readthedocs.org/en/latest/loader.html#fretbursts.loader.alex_apply_period}{link}):  \begin{lstlisting}  loader.alex_apply_period(d)  \end{lstlisting}  After this command, \verb|d| will contain only photons inside the defined excitation periods.  If the user needs to update the periods definition, the data file will need to be  reloaded and the steps above repeated as described.  \subsection*{Background Estimation}  \label{sec:bg_calc}  The first step of smFRET analysis involves estimating background rates.  For example, \DIFdelbegin \DIFdel{to compute the background }\DIFdelend \DIFaddbegin \DIFadd{the following command:  }  %DIF > Don't split command on two lines for PLOS  \begin{lstlisting}  d.calc_bg(bg.exp_fit, time_s=30, tail_min_us='auto')  \end{lstlisting}  \noindent \DIFadd{estimates the background rates in windows of 30~s  using the default iterative algorithm for choosing the  fitting threshold (}\nameref{sec:bg_intro}\DIFadd{). 
%DIF > PLOS: remove section and use nameref  Beginner users can simply use the previous command and   proceed to burst search (}\nameref{sec:burstsearch}\DIFadd{). %DIF > PLOS: remove section and use nameref  For more advanced users, this section provides details on   the different background estimation and plotting functions   provided by FRETBursts.  }  \DIFadd{As a start, we show how to estimate the background }\DIFaddend every 30~s,   using a \DIFdelbegin \DIFdel{minimal }\DIFdelend \DIFaddbegin \DIFadd{fixed }\DIFaddend inter-photon delay \DIFdelbegin \DIFdel{fixed }\DIFdelend threshold of 2~ms   \DIFdelbegin \DIFdel{for the all photon streams, the corresponding command is}\DIFdelend \DIFaddbegin \DIFadd{(the same for all the photon streams)}\DIFaddend :  \begin{lstlisting}  d.calc_bg(bg.exp_fit, time_s=30, tail_min_us=2000)  \end{lstlisting}  The first argument (\verb|bg.exp_fit|) is the function used to fit the  background rate for each photon stream (see section~\nameref{sec:bg_intro}).  The function  \verb|bg.exp_fit| estimates the background using a maximum likelihood estimation  (MLE) of the delays distribution.  The second argument, \verb|time_s|, is the duration of the  \textit{background period} (section~\nameref{sec:bg_intro}) and the third, \verb|tail_min_us|,  is the minimum inter-photon delay to use when fitting the distribution to the specified model function.  To use different thresholds for each photon stream we pass a  tuple (i.e. a comma-separated list of values, \href{https://docs.python.org/3.5/tutorial/datastructures.html#tuples-and-sequences}{link}) instead of a scalar.  
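
As a rough illustration of the tail fit described above, the following sketch
applies the maximum likelihood estimator of an exponential rate to simulated
inter-photon delays above a fixed threshold.
The function name \verb|exp_tail_rate| and the simulated values are hypothetical,
delays are expressed in seconds rather than microseconds, and the simulated data
contain pure background (no bursts); this is not the code used by \verb|bg.exp_fit|.
\begin{lstlisting}
import random

def exp_tail_rate(delays, tail_min):
    """MLE of the exponential rate using delays >= tail_min.

    For an exponential distribution, the delays in excess of
    tail_min are again exponential with the same rate, so the
    estimator is the inverse of their mean.
    """
    tail = [d - tail_min for d in delays if d >= tail_min]
    return len(tail) / sum(tail)

random.seed(1)
true_rate = 2000.0  # simulated background rate (counts/s)
delays = [random.expovariate(true_rate) for _ in range(50000)]
rate = exp_tail_rate(delays, tail_min=500e-6)  # 500 us
\end{lstlisting}
In real measurements, the threshold serves to exclude the short delays
originating from photons emitted within bursts, which would otherwise bias the estimated rate upwards.
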
The recommended approach, however, is to automate the choice of threshold by passing
\verb|tail_min_us='auto'|, which uses a heuristic algorithm described in the
\textit{Background estimation} section of the μs-ALEX tutorial
(\href{http://nbviewer.jupyter.org/github/tritemio/FRETBursts_notebooks/blob/master/notebooks/FRETBursts%20-%20us-ALEX%20smFRET%20burst%20analysis.ipynb#Background-estimation}{link}).
Finally, it is possible to use a slower but rigorous approach for finding the optimal
threshold as described in~\nameref{sec:bg_opt_th}. % SI_link

FRETBursts provides two kinds of plots to represent the background. One shows the histograms
of inter-photon delays compared to the fitted exponential distribution, as shown in
figure~\ref{fig:bg_dist_all} (see section~\nameref{sec:bg_intro} for details on the inter-photon delay distribution).
This plot is created with the command:
\begin{lstlisting}
dplot(d, hist_bg, period=0)
\end{lstlisting}
This command reflects the general form of plotting commands in FRETBursts
as described in~\nameref{sec:plotting}. % SI_link
Here we only note that the argument \verb|period| is an integer specifying the background
period to be plotted (when omitted, the default is 0, i.e. the first period).
Figure~\ref{fig:bg_dist_all} allows quick identification of pathological cases where the
background fitting procedure returns unreasonable values.

The second background-related plot represents a timetrace of background rates,
as shown in figure~\ref{fig:bg_timetrace}. This plot allows monitoring background rate variations
occurring during the measurement and is obtained with the command:
\begin{lstlisting}
dplot(d, timetrace_bg)
\end{lstlisting}
Normally, samples should have a fairly constant background rate as a function of time
as in figure~\ref{fig:bg_timetrace}(a). However, non-ideal
experimental conditions can sometimes yield a time-varying background rate, as illustrated in
figure~\ref{fig:bg_timetrace}(b).
A possible reason for the observed behavior could be buffer evaporation from an open sample  \DIFdelbegin \DIFdel{or poorly }\DIFdelend \DIFaddbegin \DIFadd{(we strongly recommend using a }\DIFaddend sealed   observation chamber \DIFaddbegin \DIFadd{whenever possible)}\DIFaddend . Additionally,  cover-glass impurities can contribute to the background.  These impurities tend to bleach on timescales of minutes resulting in  background variations during the course of the measurement.  \paragraph*{Python details}  The estimated background rates are stored in the \verb|Data| attributes  \verb|bg_dd|, \verb|bg_ad| and \verb|bg_aa|, corresponding to photon  streams \verb|Ph_sel(Dex='Dem')|, \verb|Ph_sel(Dex='Aem')| and \verb|Ph_sel(Aex='Aem')|  respectively.  These attributes are lists of arrays (one array per excitation spot).  The arrays contain the estimated background rates in the different time windows  (background periods).  Additional background fitting functions (e.g. least-square fitting of inter-photon delay  histogram) are available in \verb|bg| namespace  (i.e. the \verb|background| module,  \href{http://fretbursts.readthedocs.org/en/latest/background.html}{link}).  \subsection*{Burst Search}  \label{sec:burstsearch}  %\subsubsection*{Burst Search in FRETBursts}  %\label{sec:burstsearch_code}  Following background estimation, burst search is the next step of  the analysis.  In FRETBursts, a standard burst search using a single photon stream  (see section~\nameref{sec:burstsearch_intro}) is performed by calling the  \verb|Data.burst_search| method  (\href{http://fretbursts.readthedocs.org/en/latest/data_class.html#fretbursts.burstlib.Data.burst_search}{link}).  
For example, the following command:
\begin{lstlisting}
d.burst_search(F=6, m=10, ph_sel=Ph_sel('all'))
\end{lstlisting}
\DIFaddbegin \noindent \DIFaddend performs a burst search on all photons
(\verb|ph_sel=Ph_sel('all')|), with a count rate threshold equal to 6 times the
local background rate (\verb|F=6|), using 10 consecutive photons to compute the
local count rate (\verb|m=10|).
A different photon stream, threshold ($F$) or number of photons $m$ can be selected
by passing different values.
These parameters are a good general-purpose starting point for smFRET analysis,
but they can be adjusted if needed.

Note that the previous burst search does not perform any burst size selection
(however, by definition, the minimum burst size is effectively $m$).
An additional parameter $L$ can be passed to impose a minimum burst
size before any correction.
However, it is recommended to select bursts only after \DIFdelbegin \DIFdel{background corrections
are applied}\DIFdelend \DIFaddbegin \DIFadd{applying background
corrections}\DIFaddend , as discussed in section~\nameref{sec:burstsel}.

It might sometimes be useful to specify a fixed photon-rate threshold, instead
of a threshold depending on the background rate, as in the previous example. In
this case, instead of $F$, the argument \verb|min_rate_cps| can be used to
specify the threshold (in counts-per-second).
For example, a burst search with
a 50~kcps threshold is performed as follows:
\begin{lstlisting}
d.burst_search(min_rate_cps=50e3, m=10,
               ph_sel=Ph_sel('all'))
\end{lstlisting}
Finally, to perform a DCBS burst search (or in general an AND gate burst search,
see section~\nameref{sec:burstsearch_intro}) we use the function
\verb|burst_search_and_gate|
(\href{http://fretbursts.readthedocs.org/en/latest/plugins.html#fretbursts.burstlib_ext.burst_search_and_gate}{link}),
as illustrated in the following example:
\begin{lstlisting}
d_dcbs = bext.burst_search_and_gate(d, F=6, m=10)
\end{lstlisting}
The last command puts the burst search results in a new copy of the
\verb|Data| variable \verb|d|
(in this example \DIFdelbegin \DIFdel{, }\DIFdelend the copy is called \verb|d_dcbs|).
Since FRETBursts shares the timestamps and detectors arrays between
different copies of \verb|Data| objects, the memory usage is minimized, even when
several copies are created.

\paragraph*{Python details}
Note that, while \DIFdelbegin %DIFDELCMD < \verb|.burst_search()| %%%
\DIFdelend \DIFaddbegin \verb|d.burst_search()| \DIFaddend is a method of \verb|Data|,
\DIFdelbegin %DIFDELCMD < \verb|burst_search_and_gate| %%%
\DIFdelend \DIFaddbegin \verb|bext.burst_search_and_gate()| \DIFaddend is a function in the \verb|bext| module
taking a \verb|Data| object as a first argument and returning a new
\verb|Data| object.
The function \verb|burst_search_and_gate| accepts optional arguments,
\verb|ph_sel1| and \verb|ph_sel2|, whose default values correspond to the
classical DCBS photon stream selection (see section~\nameref{sec:burstsearch_intro}).
These arguments can be specified to select photon streams different from those used in
a classical DCBS.
The \verb|bext| module (\href{http://fretbursts.readthedocs.org/en/latest/plugins.html}{link})
collects ``plugin'' functions that provide additional algorithms
for processing \verb|Data| objects.
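
The sliding-window criterion underlying the burst searches above can be
sketched in a few lines of plain Python.
The function below (a hypothetical \verb|burst_search|, unrelated to the
\verb|Data.burst_search| method) flags every window of $m$ consecutive photons
whose rate reaches a fixed threshold and merges consecutive flagged windows
into a single burst.
The rate definition ($m$ divided by the window duration) and the test data
are simplifying assumptions and may differ from the FRETBursts implementation.
\begin{lstlisting}
def burst_search(times, m, min_rate):
    """Return (istart, istop) photon-index pairs of bursts.

    A window of m consecutive photons is above threshold when
    m / (times[i + m - 1] - times[i]) >= min_rate. Consecutive
    above-threshold windows are merged into one burst.
    """
    max_span = m / min_rate
    bursts = []
    in_burst = False
    for i in range(len(times) - m + 1):
        if times[i + m - 1] - times[i] <= max_span:
            if not in_burst:
                istart, in_burst = i, True
            istop = i + m - 1
        elif in_burst:
            bursts.append((istart, istop))
            in_burst = False
    if in_burst:
        bursts.append((istart, istop))
    return bursts

# Hypothetical timestamps (in us): sparse background photons
# surrounding 20 closely spaced photons.
times = (list(range(0, 10000, 1000))
         + list(range(10000, 10200, 10))
         + list(range(11000, 21000, 1000)))
# 0.05 counts/us corresponds to 50 kcps
bursts = burst_search(times, m=10, min_rate=0.05)
\end{lstlisting}
In this example a single burst is found, spanning the 20 closely spaced photons.
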
\subsection*{Bursts Corrections}  \label{sec:corrcoeff}  In μs-ALEX, there are 3 important correction parameters: $\gamma$-factor,  donor leakage into the acceptor channel  and acceptor direct excitation by the donor excitation laser~\cite{Lee_2005}.  These corrections can be applied to burst data by simply assigning values  to the respective \verb|Data| attributes:  \begin{lstlisting}  d.gamma = 0.85  d.leakage = 0.15  d.dir_ex = 0.08  \end{lstlisting}  These attributes can be assigned either before or after the burst search. In the  latter case, existing burst data is automatically updated using the new  correction parameters.  These correction factors can be used to display corrected FRET distributions.  However, when the goal is to fit the FRET efficiency of sub-populations,  it is simpler to fit the background-corrected  PR histogram and then correct the population-level PR value (see SI in~\cite{Lee_2005}).  Correcting PR of each population (instead of correcting the data in each burst)  avoids distortion of the FRET distribution and keeps peaks of  static FRET subpopulations closer to the ideal \DIFdelbegin \DIFdel{Binomial }\DIFdelend \DIFaddbegin \DIFadd{binomial }\DIFaddend statistics~\cite{Gopich_2007}.  FRETBursts implements the correction formulas for $E$ and $S$ in the functions  \verb|fretmath.correct_E_gamma_leak_dir| and \verb|fretmath.correct_S|  (\href{http://fretbursts.readthedocs.org/en/latest/fretmath.html}{link}).  A derivation of these correction formulas (using computer-assisted algebra)  can be found online as an interactive notebook (\href{http://nbviewer.jupyter.org/github/tritemio/notebooks/blob/master/Derivation%20of%20FRET%20and%20S%20correction%20formulas.ipynb}{link}).  \subsection*{Burst Selection}  \label{sec:burstsel}  After burst search, it is common to select bursts according to different  criteria. One of the most common is burst size.  
For instance, to select bursts with more than 30 photons detected during the donor excitation
(computed after background correction), we use the following command:
\begin{lstlisting}
ds = d.select_bursts(select_bursts.size, th1=30)
\end{lstlisting}
The previous command creates a new \verb|Data| variable (\verb|ds|) containing
the selected bursts. \verb|th1| defines the lower bound for burst size, while
\verb|th2| defines the upper bound (when not specified, as in the previous example,
the upper bound is $+\infty$).
As before, the new object (\verb|ds|) will share the photon data
arrays with the original object (\verb|d|) in order to minimize
memory usage.

The first argument of \verb|select_bursts|
(\href{http://fretbursts.readthedocs.org/en/latest/data_class.html#burst-selection-methods}{link})
is a Python function implementing the ``selection rule'' (\verb|select_bursts.size| in this example);
all remaining arguments (only \verb|th1| in this case) are parameters of the selection rule.
The \verb|select_bursts| module
(\href{http://fretbursts.readthedocs.org/en/latest/burst_selection.html}{link})
contains numerous built-in selection functions
(\href{http://fretbursts.readthedocs.org/en/latest/burst_selection.html#module-fretbursts.select_bursts}{link}).
For example,
\verb|select_bursts.ES|
is used to select a region on the E-S ALEX histogram, and
\verb|select_bursts.width|
to select bursts based on their duration.
New custom criteria can be readily implemented by defining a new selection function,
which requires only a couple of lines of code (see the
\verb|select_bursts| module's source code for examples,
\href{https://github.com/tritemio/FRETBursts/blob/master/fretbursts/select_bursts.py}{link}).
Finally, different criteria can be combined sequentially.
For example, with the following commands:  \begin{lstlisting}  ds = d.select_bursts(select_bursts.size,  th1=50, th2=200)  dsw = ds.select_bursts(select_bursts.width,  th1=0.5e-3, th2=3e-3)  \end{lstlisting}  \DIFaddbegin \noindent \DIFaddend bursts in \verb|dsw|  will have sizes between 50 and 200 photons, and duration between 0.5 and 3~ms.  \paragraph*{Burst Size Selection}  In the previous section, we selected bursts by size, using only  photons detected in both D and A channels during D excitation (i.e. \DIFdelbegin \DIFdel{Dex }\DIFdelend \DIFaddbegin \DIFadd{D\textsubscript{ex} }\DIFaddend photons),  as in eq.~\ref{eq:burstsize_dex}.  Alternatively, a threshold on the burst size computed including all photons  can be applied by adding $n_{aa}$ to the burst size (see eq.~\ref{eq:burstsize_allph}).  This is achieved  by passing \verb|add_naa=True| to the selection function.  The complete selection command is:  \begin{lstlisting}  ds = d.select_bursts(select_bursts.size,  th1=30, add_naa=True)  \end{lstlisting}  \DIFdelbegin %DIFDELCMD < \noindent %%%  \DIFdelend The result of this selection is plotted in figure~\ref{fig:alex_jointplot}.  When \verb|add_naa| is not specified,  as in the previous section, the default is \verb|add_naa=False|  (i.e. compute size using only \DIFdelbegin \DIFdel{Dex }\DIFdelend \DIFaddbegin \DIFadd{D\textsubscript{ex} }\DIFaddend photons).  \begin{figure}[h!]  \begin{center}  \includegraphics[width=0.7\columnwidth]{figures/alex_jointplot/alex_jointplot}  \caption{\label{fig:alex_jointplot} \textbf{E-S histogram showing FRET, D-only and A-only populations.}  A 2-D ALEX histogram and marginal E and S histograms for a 40-bp dsDNA  with D-A distance of 17 bases (Donor dye: ATTO550, Acceptor dye: ATTO647N).  Bursts are selected with a size-threshold of 30 photons, including \DIFdelbeginFL \DIFdelFL{Aex }\DIFdelendFL \DIFaddbeginFL \DIFaddFL{A\textsubscript{ex} }\DIFaddendFL photons.  The plot is obtained with \texttt{alex\_jointplot(ds)}. 
The 2D E-S distribution plot (joint plot)
is a histogram with hexagonal bins, which reduces binning artifacts (compared to square bins)
and naturally resembles a scatter plot when the burst density is low
\DIFaddbeginFL \DIFaddFL{(see }\nameref{sec:plotting}\DIFaddFL{)}\DIFaddendFL .
Three populations are visible: FRET population (middle), D-only population (top left) and
A-only population (bottom, $S < 0.2$). Compare with figure~\ref{fig:alex_jointplot_fretsel}
where the FRET population has been isolated.%
}
\end{center}
\end{figure}

Another important parameter for defining the burst size is the $\gamma$-factor, i.e.
the imbalance between the donor and the acceptor channel signals. As noted in
section~\nameref{sec:burstsizeweights}, the $\gamma$-factor is
used to compensate for the different fluorescence quantum yields of the D and A
fluorophores as well as the different photon-detection efficiencies of the D and A channels.
When $\gamma$ is significantly different from 1, neglecting its effect on burst size leads to
over-representing (in terms of number of bursts) one FRET population versus the others.

When the $\gamma$ factor is known \DIFaddbegin \DIFadd{(and $\ne 1$)}\DIFaddend , a less biased selection of different FRET
populations can be achieved by passing the argument \verb|gamma| to the
selection function:
\begin{lstlisting}
ds = d.select_bursts(select_bursts.size,
                     th1=15, gamma=0.65)
\end{lstlisting}
When not specified, $\gamma=1$ is assumed.
\DIFdelbegin %DIFDELCMD <
%DIFDELCMD < %%%
\DIFdelend For more details on burst size selection, see the
\verb|select_bursts.size| documentation
(\href{http://fretbursts.readthedocs.org/en/latest/burst_selection.html#fretbursts.select_bursts.size}{link}).
\paragraph*{Python details}  \DIFdelbegin \DIFdel{To }\DIFdelend \DIFaddbegin \DIFadd{The method to }\DIFaddend compute $\gamma$-corrected burst sizes (with   or without addition of \verb|naa|)  \DIFdelbegin \DIFdel{the method }\DIFdelend \DIFaddbegin \DIFadd{is }\DIFaddend \verb|Data.burst_sizes|  (\href{http://fretbursts.readthedocs.org/en/latest/data_class.html#fretbursts.burstlib.Data.burst_sizes}{link})\DIFdelbegin \DIFdel{is used}\DIFdelend .  \paragraph*{Select the FRET Populations}  In smFRET-ALEX experiments, in addition to one or more FRET populations, there are always  donor-only (D-only) and acceptor-only (A-only) populations.  In most cases, these additional populations are not of interest and need to be filtered out.  In principle, using the E-S representation, D-only and A-only bursts  can be excluded by selecting bursts within a range of $S$ values (e.g. S=0.2-0.8).  This approach, however, simply truncates the burst distribution with arbitrary  thresholds and is therefore not recommended for quantitative assessment of FRET  populations.  An alternative approach consists in applying two selection filters sequentially.  First, the A-only population is filtered out  by applying a threshold on the number of photons during D excitation (\DIFdelbegin \DIFdel{Dex}\DIFdelend \DIFaddbegin \DIFadd{D\textsubscript{ex}}\DIFaddend ).  Second, the D-only population is filtered out by applying a threshold on  the number of A photons during A excitation (\DIFdelbegin \DIFdel{AemAex}\DIFdelend \DIFaddbegin \DIFadd{A\textsubscript{ex}A\textsubscript{em}}\DIFaddend ).  The commands for these combined selections are:  \begin{lstlisting}  ds1 = d.select_bursts(select_bursts.size, th1=15)  ds2 = ds1.select_bursts(select_bursts.naa, th1=15)  \end{lstlisting}  Here, \DIFaddbegin \DIFadd{the }\DIFaddend variable \verb|ds2| contains the combined burst selection.  
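
The logic of the two sequential filters can be illustrated with the following
self-contained sketch, which operates on a plain list of bursts instead of a
\verb|Data| object.
The helper names (\verb|select|, \verb|size_dex|, \verb|naa|), the use of
``greater than or equal'' comparisons and the burst counts are assumptions made
for illustration only and do not correspond to the FRETBursts API.
\begin{lstlisting}
# Hypothetical background-corrected counts, one dict per burst
bursts = [
    {'nd': 40.0, 'na': 35.0, 'naa': 50.0},  # FRET burst
    {'nd': 60.0, 'na': 5.0, 'naa': 2.0},    # D-only burst
    {'nd': 1.0, 'na': 3.0, 'naa': 70.0},    # A-only burst
]

def select(bursts, rule, **params):
    """Keep the bursts for which rule(burst, **params) is true."""
    return [b for b in bursts if rule(b, **params)]

def size_dex(burst, th1, gamma=1.0):
    """True if the Dex size (n_a + gamma * n_d) is >= th1."""
    return burst['na'] + gamma * burst['nd'] >= th1

def naa(burst, th1):
    """True if the A-emission counts during Aex are >= th1."""
    return burst['naa'] >= th1

s1 = select(bursts, size_dex, th1=15)  # drops the A-only burst
s2 = select(s1, naa, th1=15)           # drops the D-only burst
\end{lstlisting}
After the two steps, only the burst with signal in both the donor-excitation
and the acceptor-excitation streams is retained.
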
Figure~\ref{fig:alex_jointplot_fretsel} shows the resulting pure FRET  population obtained with the previous selection.  \begin{figure}[h!]  \begin{center}  \includegraphics[width=0.7\columnwidth]{figures/alex_jointplot_fretsel/alex_jointplot_fretsel}  \caption{\label{fig:alex_jointplot_fretsel}  \textbf{E-S histogram after filtering out D-only and A-only populations.}  2-D ALEX histogram after selection of FRET population  using the composition of two burst selection filters:  (1) selection of bursts with counts in \DIFdelbeginFL \DIFdelFL{Dex }\DIFdelendFL \DIFaddbeginFL \DIFaddFL{D\textsubscript{ex} }\DIFaddendFL stream larger than 15;  (2) selection of bursts with counts in \DIFdelbeginFL \DIFdelFL{AemAex }\DIFdelendFL \DIFaddbeginFL \DIFaddFL{A\textsubscript{ex}A\textsubscript{em} }\DIFaddendFL stream larger than 15.  Compare to figure~\ref{fig:alex_jointplot} where all burst populations  (FRET, D-only and A-only) are reported.%  }  \end{center}  \end{figure}  \subsection*{Population Analysis}  \label{sec:fretfit}  Typically, after bursts selection, E or S histograms are fitted to a model.  FRETBursts \verb|mfit| module allows fitting histograms of bursts quantities  (i.e. E or S) with arbitrary models. In this context, a model is an object  specifying a function, the parameters varied during the fit  and optional constraints for these parameters. This concept of model  is taken from \textit{lmfit}~\cite{lmfit}, the underlying library used by  FRETBursts to perform the fits.  Models can be created from arbitrary functions.   \DIFdelbegin \DIFdel{By default,  FRETBursts allows using predefined }\DIFdelend \DIFaddbegin \DIFadd{FRETBursts includes predefined (i.e. built-in) }\DIFaddend models   such as 1 to 3 Gaussian peaks or 2-Gaussian connected by a \DIFdelbegin \DIFdel{``bridge''.  }\DIFdelend \DIFaddbegin \DIFadd{flat plateau.  
The latter is an empirical model that  can be used to more accurately fit the center values of two populations  when the peaks are connected by intermediate-FRET bursts  (for the analytical definition of this function see the documentation,   }\href{http://fretbursts.readthedocs.io/en/latest/mfit.html#fretbursts.mfit.factory_two_gaussians}{link}\DIFadd{).  }\DIFaddend Built-in models are created by calling a corresponding factory function  (\DIFdelbegin \DIFdel{names starting }\DIFdelend \DIFaddbegin \DIFadd{whose names start }\DIFaddend with \verb|mfit.factory_|) which initializes the parameters  with values and constraints suitable for E and S histograms fits  \DIFdelbegin \DIFdel{.  }\DIFdelend (see \textit{Factory Functions} documentation,  \href{http://fretbursts.readthedocs.org/en/latest/mfit.html#model-factory-functions}{link}).  As an example, we \DIFaddbegin \DIFadd{can }\DIFaddend fit the E histogram of bursts in the  \verb|ds| variable with two Gaussian peaks with the following command:  \begin{lstlisting}  bext.bursts_fitter(ds, 'E', binwidth=0.03,  model=mfit.factory_two_gaussians())  \end{lstlisting}  Changing \verb|'E'| with \verb|'S'| will fit the S histogram instead.  The \verb|binwidth| argument specifies the histogram bin width and  the \verb|model| argument defines which model shall be used for  fitting.  All fitting results (including best fit values, uncertainties, etc...),  are stored in the \verb|E_fitter| (or \verb|S_fitter|)  attributes of the \verb|Data| variable (named \verb|ds| here).  
To print a comprehensive summary of the fit results, including  uncertainties, reduced $\chi^2$ and correlation between parameters,  \DIFdelbegin \DIFdel{the we }\DIFdelend \DIFaddbegin \DIFadd{we can }\DIFaddend use the following command:  \begin{lstlisting}  fit_res = ds.E_fitter.fit_res[0]  print(fit_res.fit_report())  \end{lstlisting}  Finally, to plot the fitted model together with the FRET histogram,  as shown in figure~\ref{fig:histfit}, we pass the parameter \verb|show_model=True|  to the \verb|hist_fret| function   \DIFdelbegin \DIFdel{as follows  (seesection}\DIFdelend \DIFaddbegin \DIFadd{(see}\DIFaddend ~\nameref{sec:plotting} for an introduction to plotting in FRETBursts):  \begin{lstlisting}  dplot(ds, hist_fret, show_model=True)  \end{lstlisting}  \begin{figure}[h!]  \begin{center}  \includegraphics[width=0.49\columnwidth]{figures/hist_fit/hist_fit}  \caption{\label{fig:histfit} \textbf{FRET histogram fitted with two Gaussians.}  Example of a FRET histogram fitted with a 2-Gaussian model.  After performing the fit (see main text), the plot is generated  with \texttt{dplot(ds, hist\_fret, show\_model=True)}.%  }  \end{center}  \end{figure}  For more examples on fitting bursts data and plotting results, refer to the  fitting section of the μs-ALEX notebook (\href{http://nbviewer.jupyter.org/github/tritemio/FRETBursts_notebooks/blob/master/notebooks/FRETBursts%20-%20us-ALEX%20smFRET%20burst%20analysis.ipynb#FRET-fit:-in-depth-example}{link}),  the \textit{Fitting Framework} section of the documentation  (\href{http://fretbursts.readthedocs.org/en/latest/fit.html}{link})  as well as the documentation for \verb|bursts_fitter| function  (\href{http://fretbursts.readthedocs.org/en/latest/plugins.html#fretbursts.burstlib_ext.bursts_fitter}{link}).  \paragraph*{Python details}  Models returned by FRETBursts's factory functions (\verb|mfit.factory_*|)  are \verb|lmfit.Model| objects (\href{https://lmfit.github.io/lmfit-py/model.html}{link}).  
Custom models can be created by calling \verb|lmfit.Model| directly.
When an \verb|lmfit.Model| is fitted, it returns a \verb|ModelResult| object
(\href{https://lmfit.github.io/lmfit-py/model.html#the-modelresult-class}{link}),
which contains all information related to the fit (model, data,
parameters with best values and uncertainties) and useful methods to operate on fit results.
FRETBursts stores one \verb|ModelResult| object per excitation spot in the list
\verb|ds.E_fitter.fit_res|.
For instance, to obtain the reduced $\chi^2$ value of the E histogram fit in a
single-spot measurement \verb|d|, we use the following command:

\begin{lstlisting}
d.E_fitter.fit_res[0].redchi
\end{lstlisting}

Other useful attributes are \verb|aic| and \verb|bic|, which contain
\DIFaddbegin \DIFadd{statistics for }\DIFaddend the Akaike information criterion (AIC)\DIFaddbegin \DIFadd{~\mbox{%DIFAUXCMD
\cite{akaike_new_1974}
}%DIFAUXCMD
}\DIFaddend and the Bayesian information criterion (BIC)\DIFaddbegin \DIFadd{~\mbox{%DIFAUXCMD
\cite{schwarz_estimating_1978}}%DIFAUXCMD
}\DIFaddend .
AIC and BIC \DIFdelbegin \DIFdel{allow comparing different models and
selecting the most appropriate for the dataat hand.
}\DIFdelend \DIFaddbegin \DIFadd{are general-purpose statistical criteria for comparing the
suitability of multiple non-nested models according to the data.
By penalizing models with a higher number of parameters, these criteria
strike a balance between the need to achieve a high goodness of fit
and the need to keep the model complexity low to avoid overfitting.
}\DIFaddend
Examples of definition and modification of fit models are provided in
the aforementioned μs-ALEX notebook
(\href{http://nbviewer.jupyter.org/github/tritemio/FRETBursts_notebooks/blob/master/notebooks/FRETBursts%20-%20us-ALEX%20smFRET%20burst%20analysis.ipynb#FRET-fit:-in-depth-example}{link}).
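
To make the role of the parameter penalty concrete, the following self-contained
sketch compares a one-peak and a two-peak model on synthetic data.
It does not use lmfit or FRETBursts: the helper \verb|aic_bic|, the fixed peak centers
and widths, and the noise level are illustrative assumptions.
The criteria are computed with the common least-squares expressions
$N\ln(\mathrm{RSS}/N) + 2k$ (AIC) and $N\ln(\mathrm{RSS}/N) + k\ln N$ (BIC),
where $N$ is the number of data points, $k$ the number of free parameters and
RSS the residual sum of squares.

\begin{lstlisting}
import numpy as np

def gaussian(x, center, sigma):
    """Unit-amplitude Gaussian peak."""
    return np.exp(-0.5 * ((x - center) / sigma) ** 2)

def aic_bic(residuals, n_params):
    """Least-squares AIC and BIC (hypothetical helper, not a library call)."""
    n = residuals.size
    rss = np.sum(residuals ** 2)
    base = n * np.log(rss / n)
    return base + 2 * n_params, base + np.log(n) * n_params

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 40)  # assumed histogram bin centers
y = 3 * gaussian(x, 0.2, 0.08) + 5 * gaussian(x, 0.75, 0.08)
y = y + rng.normal(0, 0.1, x.size)  # synthetic noise, not real data

# Only the amplitudes are free; centers and widths are held fixed
# so that each fit reduces to linear least squares.
basis_one = np.column_stack([gaussian(x, 0.5, 0.3)])
basis_two = np.column_stack([gaussian(x, 0.2, 0.08),
                             gaussian(x, 0.75, 0.08)])

scores = {}
for name, basis in [('one peak', basis_one), ('two peaks', basis_two)]:
    amplitudes = np.linalg.lstsq(basis, y, rcond=None)[0]
    scores[name] = aic_bic(y - basis @ amplitudes, basis.shape[1])
    print(name, scores[name])
\end{lstlisting}

In this sketch the model with the lower AIC (or BIC) is preferred; the extra
parameter of the two-peak model is justified only if it reduces the residuals
by more than the penalty term.
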
Users can also refer to the comprehensive lmfit documentation
(\href{http://lmfit.github.io/lmfit-py/}{link}).

\DIFaddbegin \subsection*{\DIFadd{FRET Dynamics}}
\label{sec:dynamics}
\DIFaddend

\DIFaddbegin
\DIFadd{Single-molecule FRET histograms contain more information than just mean FRET efficiencies.
While in general the presence of several peaks clearly indicates the existence of
multiple subpopulations, a single peak cannot a priori be associated with
a single population defined by a unique FRET efficiency without further analysis (such as, for instance, shot-noise analysis~\mbox{%DIFAUXCMD
\cite{Nir_2006,Antonik2006}}%DIFAUXCMD
).}\DIFaddend  \DIFaddbegin \DIFadd{Shot-noise analysis~\mbox{%DIFAUXCMD
\cite{Nir_2006} }%DIFAUXCMD
or probability
distribution analysis (PDA)~\mbox{%DIFAUXCMD
\cite{Antonik2006,kalinin_probability_2007}
}%DIFAUXCMD
make it possible to compute the minimum width of a static FRET population
(i.e. the width caused by the statistics of discrete photon-detection events).
Typically, several mechanisms
contribute to the broadening of the experimental FRET peak
beyond the shot-noise limit. These include heterogeneities in the sample
resulting in a distribution of Förster radii,
or actual conformational changes giving rise to a distribution
of D-A distances~\mbox{%DIFAUXCMD
\cite{sisamakis_accurate_2010}}%DIFAUXCMD
.
}

\DIFadd{Gopich and Szabo developed an elegant analytical model
for the FRET distribution of $M$ interconverting states
based on a superposition of Gaussian peaks~\mbox{%DIFAUXCMD
\cite{gopich_fret_2010}}%DIFAUXCMD
.
Unfortunately, the method is not straightforward to apply to
freely-diffusing data as it requires a special selection
criterion for filtering bursts with quasi-Poisson rates.
Santoso~\mbox{%DIFAUXCMD
\cite{santoso_probing_2009} }%DIFAUXCMD
and Kalinin~\mbox{%DIFAUXCMD
\cite{Kalinin2010}
}%DIFAUXCMD
extended the PDA approach to estimate conversion rates between different
states by comparing FRET histograms as a function of the time-bin size.
In addition, Gopich and Szabo~\mbox{%DIFAUXCMD
\cite{Gopich2009, gopich_theory_2011} }%DIFAUXCMD
developed
a related method to compute conversion rates using
a likelihood function which depends on photon timestamps (avoiding
the time-binning and FRET-histogramming steps and directly applicable
to freely-diffusing data).
For measurements that include lifetime, the multiparameter fluorescence
detection (MFD) method allows identifying dynamics from the deviation
from the linear relation between lifetime and E~\mbox{%DIFAUXCMD
\cite{sisamakis_accurate_2010}}%DIFAUXCMD
.
Hoffmann~\mbox{%DIFAUXCMD
\cite{hoffmann_quantifying_2011} }%DIFAUXCMD
proposed a method
called RASP (recurrence analysis of single particles) to extend
the timescale of detectable kinetics.
RASP computes the probability that two nearby bursts are due to
the same molecule, which allows setting a time threshold
for considering consecutive bursts as the same single-molecule event.
}

\DIFadd{Other interesting approaches include combining smFRET and FCS
for detecting and quantifying kinetics on timescales much shorter
than the diffusion
time~\mbox{%DIFAUXCMD
\cite{laurence_correlation_2007,torres_measuring_2007,nettels_unfolded_2008}}%DIFAUXCMD
.
In addition, Bayesian methods have been proposed to fit static
populations~\mbox{%DIFAUXCMD
\cite{devore_classic_2012,murphy_bayesian_2014}}%DIFAUXCMD
, or to study dynamics~\mbox{%DIFAUXCMD
\cite{kou_bayesian_2005}}%DIFAUXCMD
.
}

\DIFadd{Finally, two related methods for discriminating between static heterogeneity
and sub-millisecond dynamics are Burst Variance Analysis
(BVA) proposed by Torella~\mbox{%DIFAUXCMD
\cite{Torella_2011} }%DIFAUXCMD
and the
kernel density distribution estimator (2CDE) proposed by
Tomov~\mbox{%DIFAUXCMD
\cite{Tomov_2012}}%DIFAUXCMD
. The BVA method is described in the next section.
The 2CDE method, which has been implemented in FRETBursts, computes local
photon rates from timestamps within bursts using
Kernel Density Estimation (KDE)
(FRETBursts includes general-purpose functions
to compute KDE of photon timestamps in the }\verb|phrates| \DIFadd{module
(}\href{http://fretbursts.readthedocs.io/en/latest/phrates.html}{link}\DIFadd{)).
From time variations of the local rates it is possible to
detect the occurrence of dynamics. In particular, the 2CDE method
builds, for each burst, a quantity $(E)_D$ (or $(1-E)_A$) which is equal
to the burst average $E$ when no dynamics is present, but is biased
toward a higher (or lower) value in the presence of dynamics. From these
quantities a burst ``estimator''
(called FRET-2CDE) is derived. For the user, the 2CDE method consists
in plotting the 2-D histogram of $E$ versus FRET-2CDE
and assessing the vertical position of the various populations:
populations centered around FRET-2CDE=10 have
no dynamics, while populations biased towards higher FRET-2CDE values
have dynamics.
}

\DIFadd{The BVA and 2CDE methods are implemented
in two notebooks included with FRETBursts
(}\href{http://nbviewer.jupyter.org/github/tritemio/FRETBursts_notebooks/blob/master/notebooks/Example%20-%20Burst%20Variance%20Analysis.ipynb}{BVA link},
\href{http://nbviewer.jupyter.org/github/tritemio/FRETBursts_notebooks/blob/master/notebooks/Example%20-%202CDE%20Method.ipynb}{2CDE link}\DIFadd{).
To use them, a user needs to download the relevant notebook
and run the analysis therein.
The other methods mentioned in this section are not currently
implemented in FRETBursts.
However, users can implement additional methods of their choice by
taking advantage of FRETBursts functions for burst analysis
and timestamps/bursts manipulation.
To facilitate this task, in the next section,
we show how to perform low-level analysis of timestamps and bursts data
by implementing the BVA method from scratch.
An additional example showing how to split bursts in constant time-bins
can be found in the respective FRETBursts notebook
(}\href{http://nbviewer.jupyter.org/github/tritemio/FRETBursts_notebooks/blob/master/notebooks/Example%20-%20Working%20with%20timestamps%20and%20bursts.ipynb}{link}\DIFadd{).
These examples serve as a guide for implementing new methods.
Researchers willing to implement new methods are welcome to ask questions
on GitHub or on the mailing list.
We also encourage sharing any new methods implemented in FRETBursts
for the benefit of the entire community.
}

\section*{Implementing Burst Variance Analysis}
\label{sec:bva}

In this section, we describe how to implement burst variance analysis (BVA)
as described in~\cite{Torella_2011}.
FRETBursts provides well-tested, general-purpose functions for timestamps and burst data
manipulation and therefore simplifies implementing custom burst analysis algorithms such as BVA.

\subsection*{BVA Overview}

\DIFdelbegin \DIFdel{Single-molecule FRET histograms show more information than just mean FRET efficiencies.
While in general the presence of several peaks clearly indicates the existence of
multiple subpopulations, a single peak cannot a priori be associated with
a single population defined by a unique FRET efficiency without further analysis
(such as, for instance, shot-noise analysis~\mbox{\cite{Nir_2006,Antonik2006}}).}
\DIFdel{The FRET histogram of a single FRET population has a minimum width set by shot noise
(i.e.
the width is caused by the statistics of discrete photon-detection events).
FRET distributions broader than the shot noise limit,
can be ascribed to either a static mixture of species with slightly different FRET efficiencies,
or to a specie undergoing dynamic transitions (e.g. interconversion between multiple states,
diffusion in a continuum of conformations, binding-unbinding events, etc.).
When the single peak of a FRET distribution is wider than predicted from shot-noise,
it is not possible to discriminate between the static and dynamic case without further analysis.}\DIFdelend  The BVA method has been developed to \DIFdelbegin \DIFdel{address this issue, namely identifying }\DIFdelend \DIFaddbegin \DIFadd{identify }\DIFaddend the presence of dynamics
in FRET distributions~\cite{Torella_2011},
and has been successfully applied to identify biomolecular processes with
dynamics on the millisecond time-scale~\cite{Torella_2011, Robb_2013}.

The basic idea behind BVA is to subdivide bursts into contiguous burst chunks (sub-bursts)
comprising a fixed number $n$ of photons,
and to compare the empirical variance of acceptor counts of all sub-bursts in a burst
with the theoretical shot-noise-limited variance.
An empirical variance of sub-bursts larger than the shot-noise-limited value indicates
the presence of dynamics. Since the estimation of the sub-burst variance is affected
by uncertainty, BVA provides an indication of a higher or lower probability
of observing dynamics.

In a FRET (sub-)population originating from a single static FRET efficiency,
the sub-burst acceptor counts $n_a$ can be modeled as a binomial-distributed random variable
$N_a \sim \operatorname{B}(n, E_p)$, where $n$ is the number of photons in each sub-burst and
$E_p$ is the estimated population proximity-ratio (PR).
Note that we can use the PR because, regardless of the molecular FRET efficiency,  the detected counts are partitioned between donor and acceptor channels according to  a binomial distribution with success probability equal to the PR.  The only approximation done here is neglecting the presence of background  (a reasonable approximation since the backgrounds counts are in general a  very small fraction of the total counts).  We refer the interested reader to~\cite{Torella_2011} for further discussion.  If $N_a$ follows a binomial distribution, the random variable $E_{\textrm{sub}} = N_a/n$,  has a standard deviation reported in eq.~\ref{eq:binom_std}.  \begin{equation}  \label{eq:binom_std}  \operatorname{Std}(E_{\textrm{sub}}) = \left( \frac{E_p\,(1 - E_p)}{n} \right)^{1/2}  \end{equation}  BVA analysis consists of four steps: 1) dividing bursts into consecutive sub-bursts  containing a constant number of consecutive photons~\textit{n}, 2) computing the PR  of each sub-burst, 3) calculating the empirical standard deviation ($s_E$) of sub-bursts  PR in each burst, and 4) comparing $s_E$ to the expected standard deviation  of a shot-noise-limited distribution~(eq.~\ref{eq:binom_std}).  If, as in figure~\ref{fig:bva_static}, the observed FRET efficiency distribution  originates from a static mixture of sub-populations (of different  non-interconverting molecules) characterized by distinct FRET efficiencies,  $s_E$ of each burst is only affected by shot-noise and will follow the expected  standard deviation curve based on eq.~\ref{eq:binom_std}.  
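
As a numerical illustration of the static case, the following sketch
(which is not part of FRETBursts; the values of \verb|n|, \verb|E_p| and the
number of simulated sub-bursts are arbitrary assumptions) draws binomial
acceptor counts and compares the empirical standard deviation of
$E_{\textrm{sub}}$ with eq.~\ref{eq:binom_std}:

\begin{lstlisting}
import numpy as np

rng = np.random.default_rng(1)
n = 7             # photons per sub-burst
E_p = 0.4         # assumed static proximity ratio
num_sub = 200000  # number of simulated sub-bursts

# Acceptor counts of each sub-burst for a static population
N_a = rng.binomial(n, E_p, size=num_sub)
E_sub = N_a / n

std_empirical = E_sub.std()
std_expected = np.sqrt(E_p * (1 - E_p) / n)
print(std_empirical, std_expected)
\end{lstlisting}

With a large number of simulated sub-bursts the two values should agree closely,
which is the behavior expected for bursts lying on the static curve.
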
Conversely, if the observed distribution originates from biomolecules belonging to a single species,
which interconverts between different FRET sub-populations (over times comparable to the diffusion
time), as in figure~\ref{fig:bva_dynamic}, $s_E$ of each burst will be larger than the expected
shot-noise-limited standard deviation, and will be located above the shot-noise standard
deviation curve (right panel of figure~\ref{fig:bva_dynamic}).

\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.98\columnwidth]{figures/ALEX_BVA_static/ALEX_BVA_static}
\caption{\label{fig:bva_static} \textbf{BVA distribution for a static mixture sample.}
The left panel shows the E-S histogram for a mixture of single stranded DNA (20dT) and double stranded DNA (20dT-20dA) molecules in 200 mM MgCl$_2$. The right panel shows the corresponding BVA plot. Since both 20dT and 20dT-20dA are stable and have no dynamics, the BVA plot shows $s_E$ peaks lying on the static standard deviation curve (\textit{red curve}).%
}
\end{center}
\end{figure}

\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.98\columnwidth]{figures/ALEX_BVA_dynamic/ALEX_BVA_dynamic}
\caption{\label{fig:bva_dynamic} \textbf{BVA distribution for a hairpin sample undergoing dynamics.}
The left panel shows the E-S histogram for a single stranded DNA sample ($A_{31}$-TA, see~\cite{Tsukanov_2013}), designed to form a transient hairpin in 400 mM NaCl. The right panel shows the corresponding BVA plot. Since the transition between hairpin and open structure causes a significant change in FRET efficiency, $s_E$ lies largely above the static standard deviation curve (\textit{red curve}).%
}
\end{center}
\end{figure}

\subsection*{BVA Implementation}

The following paragraphs describe the low-level details involved in implementing BVA using FRETBursts.
The main goal is to illustrate a real-world example of accessing and manipulating timestamps and burst data.
For a ready-to-use BVA implementation users can refer to the corresponding notebook included with FRETBursts
(\href{http://nbviewer.jupyter.org/github/tritemio/FRETBursts_notebooks/blob/master/notebooks/Example%20-%20Burst%20Variance%20Analysis.ipynb}{link}).

\paragraph*{Python details}
For BVA implementation, two photon streams are needed: all photons during donor excitation (\DIFdelbegin \DIFdel{Dex}\DIFdelend \DIFaddbegin \DIFadd{D\textsubscript{ex}}\DIFaddend )
and acceptor photons during donor excitation (\DIFdelbegin \DIFdel{DexAem}\DIFdelend \DIFaddbegin \DIFadd{D\textsubscript{ex}A\textsubscript{em}}\DIFaddend ).
These photon stream selections are obtained by computing boolean masks as follows
(see\DIFdelbegin \DIFdel{section}\DIFdelend ~\nameref{sec:burststimes}):

\begin{lstlisting}
Dex_mask = ds.get_ph_mask(ph_sel=Ph_sel(Dex='DAem'))
DexAem_mask = ds.get_ph_mask(ph_sel=Ph_sel(Dex='Aem'))
DexAem_mask_d = DexAem_mask[Dex_mask]
\end{lstlisting}

Here, the first two variables (\verb|Dex_mask| and \verb|DexAem_mask|)
select photons from the all-photons timestamps array,
while \verb|DexAem_mask_d| selects A-emitted photons from the
array of photons emitted during D-excitation. As shown below,
the latter is needed to count acceptor photons in burst chunks.

Next, we need to express bursts start-stop data as indexes of the D-excitation photon stream
(by default, burst start-stop indexes refer to the all-photons timestamps array):

\begin{lstlisting}
ph_d = ds_FRET.get_ph_times(ph_sel=Ph_sel(Dex='DAem'))
bursts = ds_FRET.mburst[0]
bursts_d = bursts.recompute_index_reduce(ph_d)
\end{lstlisting}

Here, \verb|ph_d| contains the \DIFdelbegin \DIFdel{Dex }\DIFdelend \DIFaddbegin \DIFadd{D\textsubscript{ex} }\DIFaddend timestamps, \verb|bursts| the original burst data and
\verb|bursts_d| the burst data with start-stop indexes relative to \verb|ph_d|.
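
The composition of boolean masks used for \verb|DexAem_mask_d| can be checked
on a toy example. The sketch below uses plain NumPy with made-up flags for eight
photons; the array names ending in \verb|_toy| are hypothetical and are not
FRETBursts variables.

\begin{lstlisting}
import numpy as np

# Made-up per-photon flags for 8 photons (not real data)
Dex_toy = np.array([1, 0, 1, 1, 0, 1, 1, 0], dtype=bool)
DexAem_toy = np.array([1, 0, 0, 1, 0, 0, 1, 0], dtype=bool)

# Keep only the entries belonging to D-excitation photons
DexAem_toy_d = DexAem_toy[Dex_toy]

# The result is indexed like the D-excitation stream, so a slice
# of it counts acceptor photons within a chunk of that stream
print(DexAem_toy_d.size, DexAem_toy_d[0:3].sum())
\end{lstlisting}

The indexed mask has one element per D-excitation photon, which is why burst
start-stop indexes must refer to the same reduced photon stream before slicing.
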
Finally, with the previous variables at hand, the BVA algorithm
can be easily implemented by computing the $s_E$ quantity for each burst:

\begin{lstlisting}
n = 7
E_sub_std = []
for burst in bursts_d:
    E_sub = []
    # start-stop indexes of consecutive n-photon sub-bursts
    startlist = range(burst.istart, burst.istop + 2 - n, n)
    stoplist = [i + n for i in startlist]
    for start, stop in zip(startlist, stoplist):
        # acceptor photons in the current sub-burst
        A_D = DexAem_mask_d[start:stop].sum()
        E = A_D / n
        E_sub.append(E)
    # standard deviation of sub-burst E within the burst
    E_sub_std.append(np.std(E_sub))
\end{lstlisting}

Here, \verb|n| is the BVA parameter defining the number of photons in each burst chunk.
The outer loop iterates through bursts, while the inner loop iterates through sub-bursts.
The variables \verb|startlist| and \verb|stoplist| are the lists of start-stop indexes for
all sub-bursts in the current burst.
In the inner loop, \verb|A_D| and \verb|E| contain the number of acceptor photons and
the (uncorrected) FRET efficiency for the current sub-burst. Finally, for each burst, the standard deviation
of \verb|E| is appended to the list \verb|E_sub_std|.

By plotting the 2D distribution of $s_E$ (i.e. \verb|E_sub_std|) versus the average (uncorrected) E
we obtain the BVA plots of figures~\ref{fig:bva_static} and~\ref{fig:bva_dynamic}.

\section*{Conclusions}
\label{sec:conclusions}

FRETBursts is an open source and openly developed (see~\nameref{sec:dev}) implementation % SI_link
of established smFRET burst analysis methods
made available to the single-molecule community.
It implements several novel concepts which improve the analysis results, such as
time-dependent background estimation, background-dependent burst search threshold,
burst weighting and $\gamma$-corrected burst size selection.
More importantly, FRETBursts provides a library of thoroughly-tested functions
for timestamps and burst manipulation, making it an ideal platform for
developing and comparing new analytical techniques.
We envision FRETBursts both as a state-of-the-art burst analysis
software and as a platform for development and assessment of novel algorithms.
To underpin this envisioned role, FRETBursts is developed following modern
software engineering practices, such as the DRY principle
(\href{http://en.wikipedia.org/wiki/Don\%27t_repeat_yourself}{link})
to reduce duplication and the KISS principle
(\href{http://en.wikipedia.org/wiki/KISS_principle}{link})
to reduce over-engineering. Furthermore, to minimize the number of software errors~\cite{Merali_2010,Soergel_2015},
we employ defensive programming~\cite{Prli__2012} which includes code readability,
unit and regression testing and continuous integration~\cite{Eglen_2016}.
Finally, being open source, any scientist can inspect the source code,
fix errors, and adapt it to her own needs.

We believe that, in the single-molecule community,
standard open source software implementations, such as FRETBursts, can enhance
reliability and reproducibility of analysis and promote a faster adoption of novel methods,
while reducing the duplication of efforts among different groups.

\section*{Acknowledgments}

We thank Dr. Eyal Nir and Dr. Toma Tomov for support in the implementation of the 2CDE method \DIFdelbegin \DIFdel{.
}\DIFdelend \DIFaddbegin \DIFadd{and Dr. Achilles Kapanidis and Dr. Nicole Robb for providing
experimental data for testing the BVA implementation.
}\DIFaddend This work was supported by National Institutes of Health (NIH)
grants R01-GM95904 and R01-GM069709. Dr. Weiss discloses equity in
Nesher Technologies and intellectual property used in the research
reported here. The work at UCLA was conducted in Dr. Weiss's Laboratory.

\section*{Supporting Information}

\paragraph*{S1 Appendix.}
\label{sec:notebook}
{\bf Notebook Workflow.} A description of the notebook workflow used by FRETBursts.
\paragraph*{S2 Appendix.}  \label{sec:dev}  {\bf Development and Contributions.} A description of development philosophy and techniques  as well as how to contribute to the FRETBursts project.  \paragraph*{S3 Appendix.}  \label{sec:burststimes}  {\bf Timestamps and Burst Data.} General concepts of how timestamps and  bursts data are stored and handled in FRETBursts.  \paragraph*{S4 Appendix.}  \label{sec:plotting}  {\bf Plotting \texttt{Data}.} A description of the syntax used to perform  plots in FRETBursts \DIFaddbegin \DIFadd{and of the 2-D hexagonal-bin histogram used in E-S plots}\DIFaddend .  \paragraph*{S5 Appendix.}  \label{sec:bg_opt_th}  {\bf Background Estimation With Optimal Threshold.} A description of  the algorithm used by FRETBursts to compute the  optimal threshold for background estimation.  \paragraph*{S6 Appendix.}  \label{sec:burstweights_theory}  {\bf Burst Weights.} Theory underpinning the choice of using burst size  as weights for FRET estimation.  \nolinenumbers  \bibliography{bibliography/converted_to_latex.bib%  }  \end{document}