Background estimation

\label{sect:background}

Two processes dominate the background. The first is a Drell-Yan process of the form

\[Z \rightarrow \mu^+ \mu^- b \overline{b}\]

and will later be referred to as \(dy\).

The second-largest background (referred to as \(t\overline{t}\)) is a two-step decay in which a \(t\) and a \(\overline{t}\) each decay to the corresponding \(b\) quark and a \(W\) boson, which subsequently decays to a muon and its neutrino:

\[t \overline{t} \rightarrow W^+ W^- b \overline{b} \rightarrow \mu^+ \mu^- b \overline{b} \nu \overline{\nu} *\]

While these two contribute the most to the background, six other processes have signatures that can resemble the target decay once fakes and mismeasurements are taken into account. Those processes are:

\[\begin{split} WZ & \rightarrow 2 \ell 2 q *\\ & \rightarrow 3 \ell \nu *\\ WW & \rightarrow 2 \ell 2 \nu *\\ ZZ & \rightarrow 4 \ell \\ & \rightarrow 2 \ell 2 \nu \\ & \rightarrow 2 \ell 2 q \end{split}\]

Of these eight backgrounds, the processes marked with a \(*\) exhibit a 2-to-1 ratio of \(e \mu\) to \(\mu \mu\) events in the final state. Using this, the number of \(\mu \mu\) events in data that come from these background processes can be estimated from the total number of \(e \mu\) events in data. However, due to experimental variations, we do not expect \(\frac{N_{e \mu}}{N_{\mu \mu}} = 2\) exactly. For this reason, the actual value used is computed in the \(t\overline{t}\)-dominated control region as the ratio of \(e \mu\) events to \(\mu \mu\) events in data, after subtracting the events from non-2-to-1 backgrounds. This ratio was found to be \(R_{2:1} = 2.01 \pm 0.033\).
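The ratio measurement described above can be sketched as follows. This is a minimal illustration, not the analysis code: the function name \texttt{ratio\_2to1}, its arguments, and the example counts are hypothetical, and it assumes that the non-2-to-1 contribution is taken from Monte Carlo and that only the Poisson uncertainty on the data counts is propagated.

```python
import math


def ratio_2to1(n_emu_data, n_mumu_data, n_emu_non21=0.0, n_mumu_non21=0.0):
    """Estimate R_{2:1} in a ttbar-dominated control region.

    n_emu_data, n_mumu_data   : observed e-mu and mu-mu counts in data
    n_emu_non21, n_mumu_non21 : expected non-2-to-1 counts to subtract
                                (assumed here to come from Monte Carlo)
    Returns (ratio, uncertainty).
    """
    num = n_emu_data - n_emu_non21
    den = n_mumu_data - n_mumu_non21
    if num <= 0 or den <= 0:
        raise ValueError("subtracted counts must be positive")
    ratio = num / den
    # Assumption: Poisson errors on the data counts only; the uncertainty
    # on the subtracted Monte Carlo expectation is neglected in this sketch.
    rel_err = math.sqrt(n_emu_data / num**2 + n_mumu_data / den**2)
    return ratio, ratio * rel_err


# Hypothetical counts, chosen only to illustrate the call.
r, r_err = ratio_2to1(2000, 1100, n_mumu_non21=100)
print(f"R_2:1 = {r:.3f} +/- {r_err:.3f}")
```

A full treatment would also propagate the uncertainty on the subtracted Monte Carlo yields, which this sketch leaves out for brevity.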

For those processes that do not exhibit the 2-to-1 ratio, the expected number of events is taken from the Monte Carlo samples discussed in section \ref{sect:eventSelection}. Since the Monte Carlo simulations cannot be assumed to be perfect, the estimates for the non-2-to-1 processes were scaled by the ratio, in the \(dy\)-dominated control region, of non-2-to-1 events in data to those in Monte Carlo, i.e. \(SF = \frac{N_{data} - N_{2:1}}{N_{non-2:1}}\), where each \(N\) is a number of \(\mu \mu\) events in the \(dy\)-dominated control region: \(N_{data}\) is the count in data, \(N_{2:1}\) is the contribution from the 2-to-1 processes, and \(N_{non-2:1}\) is the Monte Carlo expectation for the non-2-to-1 processes.
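As a small worked example of the scale factor, the sketch below evaluates \(SF\) from three counts. The function name and the numbers are hypothetical. It also assumes, as one plausible reading of the text, that \(N_{2:1}\) in the \(dy\)-dominated control region is obtained from the \(e \mu\) count in that region divided by \(R_{2:1}\).

```python
def scale_factor(n_data, n_2to1, n_non2to1_mc):
    """SF = (N_data - N_2:1) / N_non-2:1 for mu-mu events in the
    dy-dominated control region.

    n_data       : observed mu-mu events in data
    n_2to1       : estimated mu-mu events from 2-to-1 processes
    n_non2to1_mc : Monte Carlo expectation for non-2-to-1 processes
    """
    if n_non2to1_mc <= 0:
        raise ValueError("Monte Carlo expectation must be positive")
    return (n_data - n_2to1) / n_non2to1_mc


# Hypothetical inputs. Assumption: the 2-to-1 contribution is taken as
# (e-mu events in the control region) / R_2:1.
d_emu_cr = 100.0
r_2to1 = 2.0
n_2to1 = d_emu_cr / r_2to1

sf = scale_factor(n_data=1050.0, n_2to1=n_2to1, n_non2to1_mc=800.0)
print(f"SF = {sf:.3f}")
```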

While these numbers, calculated for each mass point, varied slightly, a \(\chi^2\) test \cite{Adke_1994} gave a 47\% confidence that they vary around the mean value in the 200--450 GeV region and a 96\% confidence in the region at 500 GeV and above. Although the confidence level for the first region appears low, large variations in these numbers have only very minor effects, and these minutiae can be considered negligible. Therefore weighted averages of the calculated values in the 200--450 GeV region and in the region at 500 GeV and above were used in making our final prediction.
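The averaging and consistency check can be illustrated with a short sketch. It assumes inverse-variance weights and a \(\chi^2\) computed about the weighted mean with \(n-1\) degrees of freedom, which is one common form of such a test and is not necessarily the exact procedure of \cite{Adke_1994}. The function names and input values are hypothetical.

```python
import math


def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its uncertainty."""
    weights = [1.0 / s**2 for s in sigmas]
    wsum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / wsum
    return mean, 1.0 / math.sqrt(wsum)


def chi2_about_mean(values, sigmas):
    """Chi-square of the values about their weighted mean.

    Returns (chi2, ndf) with ndf = n - 1, since the mean is
    estimated from the same points.
    """
    mean, _ = weighted_mean(values, sigmas)
    chi2 = sum(((v - mean) / s) ** 2 for v, s in zip(values, sigmas))
    return chi2, len(values) - 1


# Hypothetical per-mass-point values and uncertainties.
vals = [1.02, 0.97, 1.05, 0.99]
errs = [0.04, 0.05, 0.04, 0.06]

mean, mean_err = weighted_mean(vals, errs)
chi2, ndf = chi2_about_mean(vals, errs)
print(f"mean = {mean:.3f} +/- {mean_err:.3f}, chi2/ndf = {chi2:.2f}/{ndf}")
```

Converting the \(\chi^2\) and number of degrees of freedom into a probability requires the \(\chi^2\) survival function, available for example as \texttt{scipy.stats.chi2.sf(chi2, ndf)} if SciPy is installed.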

The final expression for the total predicted count is as follows:

\[\mathrm{Total\ Pred} = \frac{D_{e \mu}}{R_{2:1}} + SF \times N_{non-2:1}\]

where \(R_{2:1}\) and \(SF\) are defined as above, \(D_{e \mu}\) is the number of \(e \mu\) events in data, and \(N_{non-2:1}\) is the number of non-2-to-1 \(\mu \mu\) events in Monte Carlo.
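For concreteness, the total prediction can be evaluated as in the sketch below. The function name and all input values are hypothetical and serve only to show how the two terms combine.

```python
def total_prediction(d_emu, r_2to1, sf, n_non2to1_mc):
    """Total predicted mu-mu count.

    d_emu        : e-mu events observed in data
    r_2to1       : measured e-mu to mu-mu ratio, R_{2:1}
    sf           : data/Monte Carlo scale factor, SF
    n_non2to1_mc : non-2-to-1 mu-mu events expected from Monte Carlo
    """
    if r_2to1 <= 0:
        raise ValueError("R_2:1 must be positive")
    from_2to1 = d_emu / r_2to1          # data-driven 2-to-1 backgrounds
    from_non2to1 = sf * n_non2to1_mc    # scaled Monte Carlo backgrounds
    return from_2to1 + from_non2to1


# Hypothetical inputs for illustration only.
pred = total_prediction(d_emu=402.0, r_2to1=2.01, sf=1.25, n_non2to1_mc=80.0)
print(f"Total Pred = {pred:.1f}")
```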