diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..098507a
--- /dev/null
+++ b/.gitignore
...
*.aux
*.pdf
*.bbl
*.blg
*.log
*.out
*.gz
diff --git a/Background estimation.tex b/Background estimation.tex
index d4f551a..5c35229 100644
--- a/Background estimation.tex
+++ b/Background estimation.tex
...
Finally, it is
possible to use a heuristic estimation of the threshold using
\verb|tail_min_us='auto'|. For more details refer to the
\href{http://fretbursts.readthedocs.org/en/latest/data_class.html#fretbursts.burstlib.Data.calc_bg}{\texttt{calc\_bg} documentation}.
\subsubsection{Error metric and optimal threshold}
The functions that fit the background also return an estimate of the quality of fit, computed as the distance between the empirical
\href{http://en.wikipedia.org/wiki/Cumulative_distribution_function}{cumulative distribution function} (CDF) and the fitted CDF. Two different distance metrics can be returned. The first is the
\href{http://en.wikipedia.org/wiki/Kolmogorov\%E2\%80\%93Smirnov_test}{Kolmogorov-Smirnov} statistic (the maximum of the difference between the empirical and the fitted CDF) and the second is the
\href{http://en.wikipedia.org/wiki/Cram\%C3\%A9r\%E2\%80\%93von_Mises_criterion}{Cram\'er-von Mises} statistic, corresponding to the integral of the squared residuals (see the code \href{https://github.com/tritemio/FRETBursts/blob/master/fretbursts/background.py#L40}{here}).
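As a sketch of the two metrics just described, write $F_n(t)$ for the empirical CDF of the delays above the threshold and $F(t)$ for the fitted exponential CDF (both symbols are introduced here for illustration). The statistics then take the form below. The normalization and weighting actually used by FRETBursts may differ, so the linked code remains the reference.
\begin{equation}
D_\mathrm{KS} = \sup_t \left| F_n(t) - F(t) \right|,
\qquad
D_\mathrm{CM} = \int \left[ F_n(t) - F(t) \right]^2 \, dt
\end{equation}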
In principle, one can find the threshold as the value that minimizes the error metric. This approach is implemented by the function
\href{http://fretbursts.readthedocs.org/en/latest/plugins.html#fretbursts.burstlib_ext.calc_bg_brute}{calc\_bg\_brute} in the \href{http://fretbursts.readthedocs.org/en/latest/plugins.html}{burstlib\_ext module}. For more information see this notebook[TODO].
diff --git a/Burst search.tex b/Burst search.tex
index 21d4e91..98cb3eb 100644
--- a/Burst search.tex
+++ b/Burst search.tex
...
\subsubsection{Introduction to burst search}
\label{sec:burstsearch_intro}
After background estimation, the burst search is the next fundamental step of
the analysis. The core ``sliding window'' algorithm, proposed by
Eggeling~\textit{et al.} in 1998~\cite{Eggeling_1998}, involves searching for
bursts of photons
in which $m$ consecutive photons are contained within a minimal time period
$\Delta t$. In other words, bursts are portions of the photon stream where the
local rate (computed using $m$ photons) is above a minimal rate chosen as a
threshold. Eggeling did not provide any criteria on how to choose the rate
threshold and the number of photons $m$, and therefore it has become common
practice to manually tweak those parameters for each specific measurement.
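The sliding-window idea can be sketched in a few lines of plain Python. This is a minimal illustration and not the FRETBursts implementation: the function name \verb|sliding_window_bursts| is hypothetical, and the local rate is assumed to be $m/\Delta t$ (other conventions exist).

```python
def sliding_window_bursts(times, m, min_rate):
    """Return (start, stop) photon-index pairs of bursts.

    `times` is a sorted sequence of timestamps in seconds. A window of m
    consecutive photons is above threshold when it spans at most
    dt = m / min_rate (assumed rate convention). Consecutive
    above-threshold windows are merged into one burst.
    """
    dt = m / min_rate
    bursts = []
    in_burst = False
    for i in range(len(times) - m + 1):
        if times[i + m - 1] - times[i] <= dt:
            if not in_burst:
                start = i          # first photon of the first fast window
                in_burst = True
            stop = i + m - 1       # last photon of the latest fast window
        elif in_burst:
            bursts.append((start, stop))
            in_burst = False
    if in_burst:
        bursts.append((start, stop))
    return bursts


# Sparse background photons around a dense train of 20 photons (1 ms apart).
times = [0.0, 1.0, 2.0] + [3.0 + 0.001 * k for k in range(20)] + [4.0, 5.0, 6.0]
print(sliding_window_bursts(times, m=5, min_rate=1000))
```

The returned indices refer to positions in \verb|times|, so the burst size is \verb|stop - start + 1| photons.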
A more general approach consists in taking into account the background rate of
the specific measurement and in choosing a rate threshold that is $F$ times
larger than the background rate. This approach ensures that the resulting bursts
all have a signal-to-background ratio (SBR) larger than
$(F-1)$~\cite{Michalet_2012}. A consistent criterion to choose the threshold is
very important when comparing different measurements with different background
rates, when the background significantly changes during the measurements or in
multi-spot measurements where each spot has a different background rate.
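The bound on the SBR follows in one step. Writing $b$ for the background rate and $r$ for the total count rate inside a burst (symbols introduced here for illustration), the threshold condition $r \ge F\,b$ gives
\begin{equation}
\mathrm{SBR} = \frac{r - b}{b} \ge \frac{F\,b - b}{b} = F - 1 .
\end{equation}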
A second important aspect of burst search is which photon stream is processed.
Sometimes, for instance when identifying FRET populations, one would like to
apply the burst search to all the photons. Other times, when focusing on
donor-only or acceptor-only populations, it is better to use only the donor or
acceptor signal. In general one would like to be able to apply the burst search
to an arbitrary selection of photons. In FRETBursts this can be achieved by
passing the appropriate \verb|Ph_sel| object to the burst search method (see
section~\ref{sec:ph_streams} for more info on photon stream definitions).
Finally, Nir~\textit{et al.}~\cite{Nir_2006} proposed a dual-channel burst
search (DCBS) that allows one to mitigate (to some extent) artifacts due to
photo-physical effects such as blinking. In this case a search is performed
independently on two photon streams and bursts are marked only when both photon
streams exhibit a rate higher than the threshold,
implementing a kind of AND-gate logic.
Usually, the term DCBS refers to a burst search where the two photon streams
are all the photons
during donor excitation (\verb|Ph_sel(Dex='DAem')|) and acceptor channel photons
during acceptor
excitation (\verb|Ph_sel(Aex='Aem')|).
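The AND-gate step can be pictured as an intersection of time intervals. The sketch below assumes each independent search has already produced a sorted list of non-overlapping (start, stop) times; the function name \verb|and_gate| is hypothetical and this is not the FRETBursts implementation.

```python
def and_gate(bursts1, bursts2):
    """Intersect two sorted lists of non-overlapping (start, stop) intervals.

    A time range is kept only where both photon streams are "in burst".
    """
    out = []
    i = j = 0
    while i < len(bursts1) and j < len(bursts2):
        start = max(bursts1[i][0], bursts2[j][0])
        stop = min(bursts1[i][1], bursts2[j][1])
        if start < stop:
            out.append((start, stop))
        # Advance whichever interval ends first.
        if bursts1[i][1] < bursts2[j][1]:
            i += 1
        else:
            j += 1
    return out


dex_bursts = [(0.0, 1.0), (2.0, 3.0)]   # e.g. from the Dex stream
aex_bursts = [(0.5, 2.5), (4.0, 5.0)]   # e.g. from the Aex stream
print(and_gate(dex_bursts, aex_bursts))
```

A burst present in only one stream, as happens when one dye blinks, produces no overlap and is therefore dropped.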
After the first level of burst search is performed, it is important to select
bursts according to their number of photons (burst size). In its most
rudimentary form, this selection can be performed during burst search by
discarding all the bursts
with size lower than a threshold $L$. This method, however, neglects the effect
of background and gamma factor on the burst size and can lead to a selection
bias of certain channels or of certain sub-populations.
For this reason we advocate performing a burst size selection after background
correction and taking into account the gamma factor, as illustrated in
section~\ref{sec:burstsel}.
\subsubsection{Burst search in FRETBursts}
\label{sec:burstsearch_code}
In FRETBursts the standard burst search is performed by calling the
\href{http://fretbursts.readthedocs.org/en/latest/data_class.html#fretbursts.burstlib.Data.burst_search}{\texttt{burst\_search} method}:
\begin{verbatim}
d.burst_search(F=6, m=10, ph_sel=Ph_sel('all'))
\end{verbatim}
The previous command performs a burst search on all photons
(\verb|ph_sel=Ph_sel('all')|), with a minimum rate 6 times larger than the
background rate (\verb|F=6|) and using 10 consecutive photons to compute the
local rate (\verb|m=10|).
A different photon selection, threshold ($F$) or number of photons for rate
computation ($m$) can be selected by passing different values. The values used
above are generally a good starting point for smFRET analysis but can be
tweaked in specific cases.
Note that in the previous burst search no burst size selection is performed
(i.e. the minimum burst size is $m$).
An additional
parameter $L$ can be passed to apply a threshold on the raw burst
size (before any correction).
We suggest, however, performing a more accurate burst size selection as shown in
section~\ref{sec:burstsel}.
In $\mu$s-ALEX there are 3 important correction parameters: gamma factor, spectral
leakage and
acceptor direct excitation~\cite{Lee_2005}. These corrections can be applied by
simply setting the respective
Data attributes:
\begin{verbatim}
...
d.dir_ex = 0.08
\end{verbatim}
These attributes can be assigned either before or after the burst search. In the
latter case, the burst data is
automatically updated using the newly assigned correction values.
Sometimes it may be useful to specify a fixed threshold, instead
of a threshold derived from the background rate as in the previous example. In
this case, instead of $F$ we can use the argument \verb|min_rate_cps|, which
specifies a threshold in Hertz. For example, a burst search with a 50~kHz
threshold can be performed as follows:
\begin{verbatim}
d.burst_search(min_rate_cps=50e3, m=10, ph_sel=Ph_sel('all'))
\end{verbatim}
Finally, to perform a DCBS burst search (or in general an AND-gate burst search,
see section~\ref{sec:burstsearch_intro}) the plugin function
\href{http://fretbursts.readthedocs.org/en/latest/plugins.html#fretbursts.burstlib_ext.burst_search_and_gate}{\texttt{burst\_search\_and\_gate}}
can be used as in the following example:
\begin{verbatim}
d_dcbs = bext.burst_search_and_gate(d, F=6, m=10)
\end{verbatim}
Note that in this case a new \verb|Data| variable is returned (\verb|d_dcbs|)
containing all the data and the results of the new burst search. In order to
save RAM, FRETBursts shares the timestamps and detectors arrays between
different copies of a \verb|Data| object (for example \verb|d| and
\verb|d_dcbs|), while all the other data (including background and burst data)
is copied.
The function \verb|burst_search_and_gate| accepts additional arguments
\verb|ph_sel1| and \verb|ph_sel2|
used to specify different photon streams. The default values
(\verb|ph_sel1 = Ph_sel(Dex='DAem')| and \verb|ph_sel2 = Ph_sel(Aex='Aem')|)
correspond to the classical DCBS
(see section~\ref{sec:burstsearch_intro}).
diff --git a/Burst selection.tex b/Burst selection.tex
index 25f0c48..db7d878 100644
--- a/Burst selection.tex
+++ b/Burst selection.tex
...
\subsection{Burst selection}
\label{sec:burstsel}
After burst search it is common to select bursts according to different
criteria, among which one of the most common is the burst size.
For example, to select bursts with more than 100 photons (after background
correction) detected during the donor excitation periods we can write:
\begin{verbatim}
ds = Sel(d, select_bursts.size, th1=100)
\end{verbatim}
In the previous command a new Data variable (\verb|ds|) containing the selected
bursts is created. As mentioned before the new object will share the photon data
arrays with the original object (\verb|d|) in order to minimize the RAM
consumption.
Looking at the previous command, we notice that the
\href{http://fretbursts.readthedocs.org/en/latest/burst_selection.html#fretbursts.burstlib.Sel}{\texttt{Sel}}
function requires a ``selection criterion'' (a Python function) as second
argument; all the remaining arguments are passed to the selection function. The
module \verb|select_bursts| contains numerous built-in selection functions, for
example to
select a region on the E-S ALEX histogram (\verb|select_bursts.ES|),
to select bursts based on their duration (\verb|select_bursts.width|) and so on.
New criteria can be easily
implemented by defining a new selection function, usually not longer than a
couple of lines (see the
\href{https://github.com/tritemio/FRETBursts/blob/master/fretbursts/select_bursts.py}{\texttt{select\_bursts} module} for several examples).
Finally note that different criteria can be combined by applying them
in sequence. For example with the following commands
...
dsw = Sel(ds, select_bursts.width, th1=0.5e-3, th2=3e-3)
\end{verbatim}
the variable \verb|dsw| will contain all the bursts with sizes between 50 and
200 photons, with duration between 0.5 and 3~ms.
\subsubsection{Burst size selection}
In the previous section we used a definition of ``burst size'' as the total number
of detected counts in the donor and in the acceptor channel during donor
excitation periods.
We can modify the selection command in order to also include photons detected in
the acceptor channel during acceptor excitation periods. This is achieved by
passing the boolean flag \verb|add_naa=True| to the selection function as
follows:
\begin{verbatim}
ds = Sel(d, select_bursts.size, th1=100, add_naa=True)
\end{verbatim}
Another important parameter in defining the burst size is the gamma factor, i.e.
the imbalance between the donor and the acceptor channels. The gamma factor is
used to correct for the different quantum yields of the D and A fluorophores and
the different photon-detection efficiencies of the D and A channels.
Neglecting the effect of the gamma factor on the burst size leads to a biased
burst selection, especially if $\gamma$ differs significantly from 1.
To include the effect of $\gamma$ on the burst size and obtain a ``fair'' burst
selection (i.e. a selection that does not favor high or low FRET states) we
need to pass the argument \verb|gamma| (or \verb|gamma1|) as in the following
example:
\begin{verbatim}
ds = Sel(d, select_bursts.size, th1=100, gamma=0.65)
\end{verbatim}
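As an illustration of how $\gamma$ can enter the burst size, one common convention scales the donor counts by $\gamma$ before summing. This form is an assumption made here for illustration and not a statement of what \verb|select_bursts.size| computes. In particular, the text above does not say how \verb|gamma| and \verb|gamma1| differ, so the linked documentation remains the reference. With $n_d$ and $n_a$ the background-corrected donor and acceptor counts during donor excitation:
\begin{equation}
n_\gamma = \gamma\, n_d + n_a
\end{equation}
With $\gamma = 1$ this reduces to the size $n_d + n_a$ used in the previous examples.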
For more information on burst size selection refer to the
\href{http://fretbursts.readthedocs.org/en/latest/burst_selection.html#fretbursts.select_bursts.size}{\texttt{select\_bursts.size} documentation}.
diff --git a/Concepts.tex b/Concepts.tex
index b89c590..b4197c9 100644
--- a/Concepts.tex
+++ b/Concepts.tex
...
\section{Architecture and concepts}
In this section we introduce some general concepts and naming conventions in
FRETBursts.
\subsection{Photon streams}
\label{sec:ph_streams}
The fundamental data at the core of smFRET experiments is the array of photon
arrival timestamps, with a resolution of the order of 10~ns. In single-spot
measurements, all the timestamps are stored in a single array. In multi-spot
measurements we have as many timestamp arrays as the number of excitation
spots.
Each array of timestamps contains timestamps from both the donor (D) and the
acceptor channel (A). In ALEX measurements, we can further differentiate between
photons emitted during D and A excitation periods. In FRETBursts the different
selections of photons/timestamps are called ``photon streams'' and they are
specified with a
\href{http://fretbursts.readthedocs.org/en/latest/ph_sel.html}{\texttt{Ph\_sel}
object}. In non-ALEX smFRET data we have 3 base photon streams
(table~\ref{tab:ph_sel_smfret}), while in ALEX data we have 5 base photon
streams (table~\ref{tab:ph_sel_alex}).
The
\href{http://fretbursts.readthedocs.org/en/latest/ph_sel.html}{\texttt{Ph\_sel}
class} allows one to express any combination of photon streams.
For example, in ALEX measurements, the D-emission during A-excitation stream is
usually excluded because it does not contain any useful signal~\cite{Lee_2005}.
To indicate all but the photons in this photon stream we write
\verb|Ph_sel(Dex='DAem', Aex='Aem')|, which literally means \textit{select donor
and acceptor photons (DAem) during donor excitation (Dex) and only acceptor
photons (Aem) during acceptor excitation (Aex)}.
\begin{table}
\begin{tabular}{l|l}
...
\subsection{Background definitions}
\label{sec:bg_intro}
Even when no molecule is crossing the excitation volume, there are “background
counts” due to detector dark counts, out-of-focus molecules and sample
scattering and/or auto-fluorescence. Figure~\ref{fig:bgdist} shows the typical
distribution of timestamp delays (i.e. the waiting times between two subsequent
timestamps) in a smFRET measurement. The “tail” of the distribution (a line in
semi-log scale) corresponds to exponentially-distributed delays, indicating that
those counts are generated by a
\href{http://en.wikipedia.org/wiki/Poisson_process}{Poisson process}. At short
timescales, the distribution departs from the exponential due to the bursts of
photons from diffusing single-molecules (the signal). To estimate the background
rate, (i.e. the exponential time constant) we need to select a minimal timestamp
delay threshold above which the distribution can be considered exponential. We
also need to chose a fitting method, for example the Maximum Likelihood
Estimation (MLE) or a curve fit of the histogram via non-linear least squares
(NLSQ).
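For the MLE option, the estimator has a closed form: for exponentially distributed delays observed only above a threshold, the maximum-likelihood rate is the inverse of the mean excess over that threshold. The sketch below shows this textbook result on synthetic data. The function name \verb|bg_rate_mle| is hypothetical and the code is not taken from FRETBursts.

```python
import random


def bg_rate_mle(delays, tail_min):
    """Estimate the rate (in Hz) of the exponential tail of `delays`.

    Only delays above `tail_min` (seconds) are used. For an exponential
    distribution truncated from below, the maximum-likelihood rate is the
    inverse of the mean excess over the threshold.
    """
    excess = [d - tail_min for d in delays if d > tail_min]
    if not excess:
        raise ValueError("no delays above the threshold")
    return len(excess) / sum(excess)


# Synthetic check: pure Poisson background at 1 kHz, threshold at 0.5 ms.
rng = random.Random(0)
delays = [rng.expovariate(1000.0) for _ in range(20000)]
rate = bg_rate_mle(delays, tail_min=0.5e-3)
print("estimated background rate: %.0f Hz" % rate)
```

Because the exponential distribution is memoryless, the estimate does not depend on the threshold as long as only background delays remain above it. With real data the threshold must be large enough to exclude the short delays produced by bursts.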
Both burst search and burst correction require background rates for all the
different photon streams. Furthermore, we want to estimate the background
periodically (every few seconds) because it can vary during the measurement on
time scales of tens of seconds. FRETBursts splits the data into uniform time
slices called \textit{background periods} and computes the background rates for
each of these slices (see section~\ref{sec:bg_calc}). The slicing in background
periods is also used during burst search to compute a background-dependent
threshold and to apply the burst correction (section~\ref{sec:burstsearch}).
\subsection{The \texttt{Data} class}
\label{sec:data_intro}
The
\href{http://fretbursts.readthedocs.org/en/latest/data_class.html}{\texttt{Data}
class} is the fundamental data container in FRETBursts. It contains the
measurement data and provides several methods for data analysis (background
estimation, burst search, etc.). It also stores all the analysis results
(bursts data, estimated parameters).
All the arrays in Data are contained in lists whose length is equal to the
number of excitation spots. This means that for single-spot measurements all the
arrays are wrapped in 1-element lists. For example, the bursts data field
\verb|Data.mbursts| will be a 1-element list and \verb|Data.mbursts[0]| will be
the actual numpy array of burst data. \verb|Data| implements a shortcut syntax
that allows accessing
\verb|Data.mbursts[0]| as \verb|Data.mbursts_| (valid for all the fields).
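One way such a trailing-underscore shortcut can be built in Python is with \verb|__getattr__|, which is called only when normal attribute lookup fails. The class below is a hypothetical toy written for illustration. It is not the \verb|Data| class and makes no claim about how FRETBursts implements the shortcut.

```python
class ToyData:
    """Hypothetical container holding one list entry per excitation spot."""

    def __init__(self, **fields):
        for name, per_spot in fields.items():
            setattr(self, name, per_spot)

    def __getattr__(self, name):
        # Reached only when normal lookup fails: map 'field_' to field[0].
        if name.endswith('_') and not name.startswith('_'):
            return getattr(self, name[:-1])[0]
        raise AttributeError(name)


# Single-spot measurement: every field is a 1-element list.
d = ToyData(nd=[[10, 20, 30]], na=[[5, 15, 25]])
print(d.nd[0] is d.nd_)   # both names refer to the same object
```

Keeping every field as a per-spot list lets the same analysis code loop over spots in both single-spot and multi-spot measurements, while the shortcut removes the \verb|[0]| noise in the single-spot case.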
As an example the following are some important burst-data fields:
\begin{itemize}
\item \verb|nd|: number of photons detected through the donor channel (during
donor excitation), after correction
\item \verb|na|: number of photons detected through the acceptor channel (during
donor excitation), after correction
\item \verb|naa|: number of photons detected through the acceptor channel during
acceptor excitation, after correction
\end{itemize}
\subsection{Plotting ``Data''}
FRETBursts uses matplotlib~\cite{2096e2a4-8f50-4519-bfb3-f796da201630} to
provide a wide range of built-in plot functions for \verb|Data| objects. The
plot syntax is the same for both single and multi-spot measurements. Almost all
the plot commands are called through the wrapper function \verb|dplot|. For
example, to plot a timetrace of the photon data we type:
\begin{verbatim}
dplot(d, timetrace)
\end{verbatim}
The function \verb|dplot| is the generic plot function that creates the figure
and handles details common to all the plotting functions (e.g. the title).
\verb|d| is the \verb|Data| variable and \verb|timetrace| is the actual plot
function that operates on a single channel. In multi-spot measurements
\verb|dplot| creates one subplot for each spot and calls \verb|timetrace| for
each channel.
All the built-in plot functions that can be passed to \verb|dplot| are defined
in the \verb|burst_plot| module. When importing fretbursts all the plot
functions are also imported. To make plot functions easy to find through
auto-completion, all the plot function names start with a prefix indicating the
plot type. The prefixes are: \verb|timetrace| for binned timetraces
of photon data, \verb|ratetrace| for rates of photons as a function of time
(non-binned), \verb|hist| for functions plotting histograms and \verb|scatter| for
scatter plots.
Additional plots can be manually created directly with matplotlib.
Usually plots are displayed inline in the notebook. However, a few plot functions
such as \verb|timetrace| and \verb|hist2d_alex| have interactive features that
can be enabled when using the QT4 backend, which opens the plot in an external
window. It is possible to switch backend from inline to QT and vice versa using
the IPython commands \verb|%matplotlib qt|
and \verb|%matplotlib inline|. For example, after switching to the QT4 backend
the following command:
\begin{verbatim}
dplot(d, timetrace, scroll=True)
\end{verbatim}
opens a new window with a timetrace plot and a horizontal scrollbar for quick
``scrolling'' throughout the measurement.
Similarly, calling the \verb|hist2d_alex| function with the QT4 backend allows
selecting an area on the E-S histogram using the mouse.
\begin{verbatim}
dplot(ds, hist2d_alex, gui_sel=True)
\end{verbatim}
The values that identify the region are printed and can be copied and pasted as
arguments for the burst selection function \verb|select_bursts.ES| (see
section~\ref{sec:burstsel}).
diff --git a/Introduction.tex b/Introduction.tex
index 49c5d80..c70cdcd 100644
--- a/Introduction.tex
+++ b/Introduction.tex
...
\subsection{smFRET and burst analysis}
FRETBursts is a Python package for burst analysis of confocal single-molecule
FRET
(smFRET) data.
\textit{Expand abstract to introduce smFRET and what is burst analysis}.
\subsection{Installing FRETBursts}
FRETBursts is a standard Python package that requires the ``scipy stack'', a set
of core
scientific Python packages.
The ``scipy stack'' is easily installed through a free scientific Python
distribution such as Continuum Anaconda, although some users may prefer another
distribution or a manual installation.
FRETBursts can be installed through the standard Python package manager (pip)
with
the command \texttt{pip install fretbursts}. Alternatively the latest
development version can be installed from GitHub.
For more information on different installation methods see the
\href{http://fretbursts.readthedocs.org/en/latest/installation.html}{FRETBursts
documentation}.
\subsection{Executing FRETBursts}
In general, we suggest importing FRETBursts with the expression:
...
>>> import fretbursts as fb
\end{verbatim}
which makes all the FRETBursts functions available with a concise \verb|fb.|
prefix. In this article, however, we assume that FRETBursts is imported with the
shortcut form:
\begin{verbatim}
>>> from fretbursts import *
\end{verbatim}
which allows skipping the \verb|fb.| prefix and also imports some common numeric
libraries (numpy and matplotlib.pyplot imported as \verb|np| and \verb|plt|
respectively).
Furthermore, we encourage using FRETBursts through the IPython Notebook
environment. All the FRETBursts tutorials are IPython notebook documents and,
indeed, a quick way to start a new analysis is to copy a pre-existing FRETBursts
notebook and modify it.
The ``notebook workflow''~\cite{Shen_2014} has the advantage of automatically
recording all the analysis steps including
data file names, software versions, analysis details and the full output
(figures, tables, etc.). The full, reproducible analysis becomes a document
diff --git a/main.tex b/main.tex
new file mode 100644
index 0000000..852af17
--- /dev/null
+++ b/main.tex
...
\input{preamble}
\usepackage{hyperref}
\usepackage{listings}
\bibliographystyle{plain}
\author{Antonino Ingargiola}
\title{\input{title}}
\begin{document}
\maketitle
\input{"figures/ph_delays_distrib1/caption"}
\begin{abstract}
\input{Abstract.tex}
\end{abstract}
\input{Introduction.tex}
\input{Concepts.tex}
\input{"Loading data"}
\input{"Background estimation"}
\begin{figure}
\includegraphics{"figures/ph_delays_distrib1/ph_delays_distrib1"}
\caption[]{\input{"figures/ph_delays_distrib1/caption"}}
\end{figure}
\input{"Burst search.tex"}
\input{"Burst selection"}
\input{Fitting}
\input{Conclusions}
\bibliography{bibliography/biblio}
\end{document}