A nested sampling code for targeted searches for continuous gravitational waves from pulsars

Abstract

This document will detail a code to perform parameter estimation and model selection in targeted searches for continuous gravitational waves from known pulsars. We will describe the general workings of the code along with characterising it on simulated data and real LIGO S5 data. We will also show how it performs compared to a previous MCMC and grid-based approach to signal parameter estimation. An appendix will detail how to run the code in a variety of cases.

One of the primary methods used in searches for gravitational waves from known pulsars is based on two stages: a heterodyne stage that pre-processes the calibrated gravitational wave detector data by removing a signal’s expected intrinsic phase evolution (based on the observed solution from electromagnetic observations), whilst also low-pass filtering and down-sampling (via averaging) the heterodyned data (Dupuis et al., 2005); and a parameter estimation stage that takes the processed data and uses it to estimate posterior probability distributions of the unknown signal parameters (Abbott et al., 2010). This method has been variously called the Time Domain Method, the Heterodyne Method, the Bayesian Method, the Glasgow Method, or some combination of these names. Up to now this method has only been used to set upper limits on signal amplitudes, with no explicit criterion for deciding on signal detection or significance. A further limitation was that the MCMC sampler used was not designed to be efficient and required a lot of tuning. Nor could it straightforwardly search the data over small ranges in the phase evolution parameters (e.g. as required if the gravitational wave signal does not well match the known pulsar’s phase evolution). For these reasons the parameter estimation code has been re-written to use the nested sampling algorithm (Skilling, 2006), in particular a method based on that described in Veitch et al. (2010). This allows us to evaluate the evidence, or marginal likelihood, for the signal model and compare it with the evidences for other models (e.g. that the data are just Gaussian noise), whilst also producing posterior probability distributions for the signal parameters.

This code makes use of the nested sampling algorithm provided within the LALInference software library (Veitch et al., 2015). As such this document will not give a detailed description of how that algorithm works, but will provide some information on the specific proposals that can be used within the algorithm. This method has previously been briefly described in Pitkin et al. (2012).

The code can also be used to perform parameter estimation and model selection for signals for any generic metric theory of gravity, however its use for this will be discussed in detail in a separate paper. When searching over parameters that vary the phase evolution there is potential to speed up the code through efficient likelihood evaluation via the reduced order quadrature method, but again that will be discussed in more detail in a future paper.

The code calculates the Bayesian evidence, or marginal likelihood, for a particular signal model. The evidence, \(\mathcal{Z}\), for a given model, \(M\), defined by a set of parameters, \(\vec{\theta}\), is given by

\begin{equation} \label{eq:evidence}\mathcal{Z}_{M}=p(d|M,I)=\int_{\vec{\theta}}p(d|\vec{\theta},M,I)p(\vec{\theta}|M,I){\rm d}\vec{\theta},\\ \end{equation}where \(p(d|\vec{\theta},M,I)\) is the likelihood function for the data \(d\) given the model and its set of defining parameters and \(p(\vec{\theta}|M,I)\) is the prior probability distribution on \(\vec{\theta}\). During nested sampling this integral is evaluated by transforming it into the one dimensional sum

\begin{equation} \label{eq:nestedsampev}\mathcal{Z}_{M}=\sum_{i=1}^{N}L_{i}w_{i},\\ \end{equation}where \(L_{i}\) is the likelihood and \(w_{i}=p(\vec{\theta}|M,I)\Delta\vec{\theta}\) is the “prior weight” (i.e. the prior mass occupied by point \(i\)).
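As an illustration, this sum must in practice be evaluated in log space, since the \(L_{i}\) typically span many orders of magnitude. A minimal sketch (the function name is ours, purely illustrative, and not part of the code being described) is:

```python
import numpy as np

def log_evidence(log_likelihoods, log_weights):
    """Evaluate log(Z) = log(sum_i L_i w_i) via the log-sum-exp
    trick to avoid numerical under/overflow."""
    log_terms = np.asarray(log_likelihoods) + np.asarray(log_weights)
    m = log_terms.max()
    # factor out the largest term before exponentiating
    return m + np.log(np.sum(np.exp(log_terms - m)))
```

For example, two points each with unit likelihood and prior weight 0.5 give \(\mathcal{Z}=1\), i.e. \(\log\mathcal{Z}=0\).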

By default the signal model evidence is compared to the evidence that the data consist of Gaussian noise, forming an odds ratio for the two models

\begin{equation} \label{eq:oddsratio}\mathcal{O}=\frac{p(d|M,I)}{p(d|\text{noise},I)}\frac{p(M|I)}{p(\text{noise}|I)}=\frac{\mathcal{Z}_{M}}{\mathcal{Z}_{\text{noise}}},\\ \end{equation}where on the right hand side we have explicitly set the prior odds for the two models to \(p(M|I)/p(\text{noise}|I)=1\).

Other than this evidence value and odds ratio, the code will also output all of the samples accumulated during the nested sampling process. These samples will not be drawn from the posterior probability distribution as in an MCMC method, but they can be resampled to generate a subset of samples that are drawn from the posterior distribution. This resampling (performed using the lalapps_nest2pos python script within lalapps, https://www.lsc-group.phys.uwm.edu/daswg/projects/lalsuite.html) uses the value of \(L_{i}w_{i}\) for each point, normalised by \((L_{i}w_{i})_{\rm max}\), to give values proportional to the posterior probability, and then accepts each point with a probability given by its value, i.e. point \(i\) is accepted with probability

\begin{equation} P_{{\rm accept},i}=\frac{L_{i}w_{i}}{(L_{i}w_{i})_{\rm max}}.\\ \end{equation}
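This rejection step can be sketched as follows; this is a minimal stand-in, not the actual lalapps_nest2pos implementation, and the function name and array-based interface are invented for illustration:

```python
import numpy as np

def resample_to_posterior(samples, log_L, log_w, rng=None):
    """Rejection-sample nested-sampling points to posterior draws:
    accept point i with probability L_i w_i / (L w)_max."""
    rng = np.random.default_rng() if rng is None else rng
    log_p = np.asarray(log_L) + np.asarray(log_w)
    log_p -= log_p.max()  # normalise so the best point has acceptance prob 1
    accept = rng.uniform(size=len(log_p)) < np.exp(log_p)
    return samples[accept]
```

Points with the maximum \(L_{i}w_{i}\) are always kept, while points with negligible posterior weight are almost always discarded.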

In our case the data \(d\) in the above equations is heterodyned, low-pass filtered and down-sampled data for a detector, or set of detectors, each giving a single data stream or two data streams depending on whether the heterodyne was performed at one or both potential signal frequencies near the rotation frequency and/or twice the rotation frequency. Here we will use \(\mathbf{B}\) to represent a vector of these heterodyned data values, for which a single time sample is often referred to using the jargon term \(B_{k}\) (“B of k”), although we will not consistently use \(k\) as the time index for the data.

Here we will describe the core parts of the code defining the signal model and the probability functions used for the inference. These assume that the data consist of calibrated, narrow-band, complex heterodyned time series. These time series can come from different detectors and/or different heterodyne frequencies. For example, one could have a pair of time series from each of the LIGO Hanford (H1) and LIGO Livingston (L1) detectors, with one of each pair produced with a heterodyne at twice the rotation frequency of a known pulsar and the other produced with a heterodyne at the rotation frequency.

Our code assumes that the rotational phase evolution of a pulsar is defined by the Taylor expansion in phase

\begin{equation} \phi(t)=2\pi\left(fT+\frac{1}{2}\dot{f}T^{2}+\frac{1}{6}\ddot{f}T^{3}+\ldots\right)\\ \end{equation}where \(T\) is the time corrected to an inertial reference frame (the solar system barycentre for isolated pulsars, or the binary system barycentre for pulsars in binaries), and the \(f\)’s give the rotation frequency and its time derivatives. Here \(T=t+\tau(t)-t_{0}\), where \(t\) is the time at a detector, \(\tau(t)\) is the time-dependent correction to the inertial frame and \(t_{0}\) is the epoch. In practice the code can currently accept frequency derivatives up to the ninth order. We assume the calibrated detector data \(d(t)\) has been heterodyned such that
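A direct implementation of this Taylor-expansion phase model might look like the following sketch; the real code uses LALSuite barycentring routines to compute \(\tau(t)\), and the function and argument names here are invented:

```python
import numpy as np
from math import factorial

def rotational_phase(t, tau, t0, f_derivs):
    """Rotational phase phi(t) in radians from the Taylor expansion,
    where T = t + tau - t0 is the barycentred time and
    f_derivs = [f, fdot, fddot, ...] are the frequency derivatives."""
    T = t + tau - t0
    cycles = sum(f / factorial(k + 1) * T ** (k + 1)
                 for k, f in enumerate(f_derivs))
    return 2.0 * np.pi * cycles
```

For instance, a constant frequency \(f=1\) Hz over \(T=1\) s gives one full cycle, \(\phi=2\pi\).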

\begin{equation} B^{\prime}(t)=d(t)e^{-i\Phi_{i,\rm het}(t)},\\ \end{equation}where \(\Phi_{i,\rm het}(t)\) is the phase evolution for a given data stream \(i\), and we produce \(B(t)\) by low-pass filtering and averaging values of \(B^{\prime}(t)\).
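The heterodyne-and-average step can be sketched as below, with simple box-car averaging standing in for the low-pass filter and averaging of the real pipeline; the function name and interface are illustrative only:

```python
import numpy as np

def heterodyne(d, phi_het, navg):
    """Form B'(t) = d(t) * exp(-i * Phi_het(t)), then down-sample by
    averaging navg consecutive samples (a crude stand-in for the
    low-pass filter plus averaging used in practice)."""
    Bprime = np.asarray(d) * np.exp(-1j * np.asarray(phi_het))
    n = (len(Bprime) // navg) * navg  # drop any incomplete final block
    return Bprime[:n].reshape(-1, navg).mean(axis=1)
```

If the heterodyne phase exactly matches the signal phase, the heterodyned data reduce to a constant (DC) complex value, which survives the averaging unattenuated.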

Under the standard assumption that the general theory of relativity (GR) is correct, the code uses the form of the signal model defined in Jones (2015), which, after heterodyning and assuming low-pass filtering, gives a signal at a pulsar’s rotation frequency (where \(\Phi_{1,{\rm het}}(t)=\phi(t)\)) of

\begin{equation} \label{eq:hf}h_{f}(t)=e^{i\Delta\phi_{1}(t)}\left(-\frac{C_{21}}{4}F_{+}(\psi,t)\sin{\iota}\cos{\iota}e^{i\Phi_{21}^{C}}+i\frac{C_{21}}{4}F_{\times}(\psi,t)\sin{\iota}e^{i\Phi_{21}^{C}}\right)\\ \end{equation}and at twice the pulsar’s rotation frequency (where \(\Phi_{2,{\rm het}}(t)=2\phi(t)\)) of

\begin{equation} \label{eq:h2f}h_{2f}(t)=e^{i\Delta\phi_{2}(t)}\left(-\frac{C_{22}}{2}F_{+}(\psi,t)[1+\cos^{2}\iota]e^{i\Phi_{22}^{C}}+iC_{22}F_{\times}(\psi,t)\cos{\iota}e^{i\Phi_{22}^{C}}\right).\\ \end{equation}The \(F_{+}\) and \(F_{\times}\) values are the detector dependent antenna patterns, which are a function of the detector position, source sky position and source polarisation angle \(\psi\). The \(C_{21}\), \(C_{22}\), \(\Phi_{21}^{C}\) and \(\Phi_{22}^{C}\) values are convenient ways of representing the waveform in terms of an amplitude and phase of the signal for the \(l=2\), \(m=1\) harmonic and \(l=m=2\) harmonic respectively. The \(\Delta\phi(t)\) values represent any time dependent phase deviation between the phase used in the heterodyne and the true signal phase (which does not necessarily have to precisely follow the true rotational phase), so \(\Delta\phi_{1}(t)=(\phi_{1,{\rm true}}(t)-\Phi_{1,{\rm het}}(t))\) and \(\Delta\phi_{2}(t)=(\phi_{2,{\rm true}}(t)-\Phi_{2,{\rm het}}(t))\).
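Evaluating Eq. \ref{eq:h2f} for the \(l=m=2\) harmonic is then a short complex-arithmetic exercise. In the sketch below the antenna patterns \(F_{+}\) and \(F_{\times}\) are assumed precomputed for each sample time (the real code computes them from the detector and source geometry), and the function name is invented:

```python
import numpy as np

def h2f(C22, Phi22, cosiota, Fplus, Fcross, dphi2=0.0):
    """Complex signal model at twice the rotation frequency:
    h_2f = exp(i*dphi2) * ( -(C22/2) F+ (1 + cos^2 iota)
                            + i C22 Fx cos iota ) * exp(i*Phi22)."""
    amp = (-0.5 * C22 * Fplus * (1.0 + cosiota ** 2)
           + 1j * C22 * Fcross * cosiota)
    return np.exp(1j * dphi2) * amp * np.exp(1j * Phi22)
```

Note that for a linearly polarised source (\(\cos\iota=0\)) only the \(F_{+}\) term survives.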

To calculate the \(\Delta\phi\) values using up to the \((n-1)^{\rm th}\) frequency derivative, whilst avoiding numerical overflow issues when dealing with large phases, the following equation is used

\begin{equation} \Delta\phi_{j}(t)=2\pi\sum_{k=1}^{n}\left(\frac{\left(f^{(k-1)}_{j,{\rm true}}-f^{(k-1)}_{j,{\rm het}}\right)}{k!}(t+\delta t_{\rm het})^{k}+\frac{f^{(k-1)}_{j,{\rm true}}}{k!}\sum_{i=0}^{k-1}\binom{k}{i}(\delta t_{\rm true}-\delta t_{\rm het})^{k-i}(t+\delta t_{\rm het})^{i}\right),\\ \end{equation}where \(f^{(n)}\) is the \(n^{\rm th}\) frequency derivative, and \(\delta t\) is the combination of any solar system barycentring and binary system barycentring time delays.
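This expansion is algebraically identical to differencing the two phase models directly, \(\phi_{\rm true}(t+\delta t_{\rm true})-\phi_{\rm het}(t+\delta t_{\rm het})\), but groups terms so that small differences are never obtained by subtracting two very large phases. A sketch (invented function name; frequency derivatives passed as lists \([f,\dot{f},\ldots]\)) is:

```python
import numpy as np
from math import comb, factorial

def delta_phi(t, f_true, f_het, dt_true, dt_het):
    """Phase difference between the true and heterodyne phase models,
    expanded term by term to avoid differencing two large phases."""
    T = t + dt_het
    ddt = dt_true - dt_het
    total = 0.0
    for k in range(1, len(f_true) + 1):
        # term from the frequency-derivative offsets
        term = (f_true[k - 1] - f_het[k - 1]) / factorial(k) * T ** k
        # binomial expansion of the barycentring-delay offset
        term += (f_true[k - 1] / factorial(k)
                 * sum(comb(k, i) * ddt ** (k - i) * T ** i
                       for i in range(k)))
        total += term
    return 2.0 * np.pi * total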

By default the code assumes emission just from the \(l=m=2\) mode, i.e. there is only a signal at twice the rotation frequency. In this case \(C_{22}\) and \(\Phi_{22}^{C}\) can be related to the more familiar physical \(h_{0}\) and \(\phi_{0}\) values via \(h_{0}=-2C_{22}\) (where the minus sign maintains consistency with the form given in Jaranowski et al., 1998) and \(\phi_{0}=\Phi_{22}^{C}/2\). For the more general case the relations between the waveform amplitude and phase parameters and physical source parameters are given in Jones (2015). In previous searches we have generally assumed that the heterodyne tracks the true signal phase perfectly, so that \(\Delta\phi_{i}(t)=0\). In such cases the only time-varying components of the signal are the antenna pattern functions, which allows great speed increases in the signal generation and likelihood calculations.

Our code can make use of two different likelihood functions. The default likelihood function is a Student’s t-likelihood function in which it is assumed that the standard deviation of the noise in the data is unknown, and can therefore be marginalised over. A Gaussian likelihood function can also be used, for which the code can either take in estimates of the noise standard deviation at each data point, or calculates these internally based on stationary stretches of data. For the Student’s t-likelihood function, and if calculating noise standard deviations for the Gaussian likelihood function internally, the code needs to break up the data into chunks that have (roughly) the same distribution. The method for doing this is given in Section \ref{sec:splitting}.
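As a rough illustration of the Student’s t-likelihood, the sketch below evaluates, for complex heterodyned data split into stationary chunks, the standard marginalised form in which each chunk of length \(m\) contributes a factor \(\propto(m-1)!\,\left(\sum_k|B_k-y_k|^2\right)^{-m}\) (see Dupuis et al., 2005); model-independent constants are dropped, the exact normalisation used by the code is not reproduced here, and the function name and interface are invented:

```python
import numpy as np
from math import lgamma

def log_studentst_like(B, y, chunk_lengths):
    """Student's t log-likelihood for complex heterodyned data B given
    model y, with the unknown noise level marginalised over
    independently in each stationary chunk. Constants that do not
    depend on the model are dropped."""
    logL = 0.0
    start = 0
    for m in chunk_lengths:
        ssq = np.sum(np.abs(B[start:start + m] - y[start:start + m]) ** 2)
        logL += lgamma(m) - m * np.log(ssq)  # log[(m-1)! * ssq^{-m}]
        start += m
    return logL
```

As expected, a model closer to the data yields a larger log-likelihood than a poorer-fitting one.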

A full derivation of the Student’s t-likelihood function (see e.g. Dupuis et al., 2005) is given in Appendix \ref{ap
