Authorea

#Methodology

##Matched filter detection
Matched filter earthquake detection relies on waveform cross correlation of a known earthquake signal or signals with continuously recorded seismic data. It has been shown to be especially useful in noisy areas where more traditional, amplitude-based techniques fail \cite{Gibbons_2006} \cite{Shelly_2007}. This makes it particularly useful in areas of geothermal power generation where large numbers of highly similar, low-magnitude earthquakes can occur and which are usually subject to high levels of anthropogenic noise associated with the power generation process.

The 637 events taken from the filtered GNS catalog were used as 'template events' for this study (Figure \ref{Figure1}). Each template waveform was one second long, starting 0.1 seconds before the P-pick on each channel for which a pick was made. Waveforms were filtered from 1.0 to 20.0 Hz after confirming that most events in the dataset contained significant amounts of energy within that pass band. Continuous seismic data were processed in an identical manner to the templates prior to matched filtering.

To generate detections, event templates were cross correlated with continuous data at a rate of 50 samples per second. At each sample, the cross correlation coefficients for each channel of data were summed to create the network detection statistic. Whenever the detection statistic exceeded a threshold value, which in this case was defined as the mean absolute deviation (MAD) of the detection statistic multiplied by 8 (as suggested by \cite{Shelly_2007}), a detection was recorded.

##NLLoc location of detections
After removing duplicate detections, picks for each of the newly detected events were made based upon cross correlation of the detecting template with the continuous data at the time of the detection. Template waveforms were correlated with network data over a 0.2 second window centered on the detection time. Picks were recorded at the time corresponding to the highest correlation value within that window. Time errors were then assigned to each pick based upon its correlation value with the template. Through visual review of a random sample of 100 events we assigned timing uncertainties of 0.01 seconds to picks with correlation values of greater than 0.90, 0.05 s for values of between 0.70 and 0.90, 0.10 seconds for values between 0.50 and 0.70, 0.30 s for values between 0.30 and 0.50 and 0.50 seconds for all other picks. We kept all events with more than 7 picks and discarded those with fewer. From an original total of nearly 30,000 detections, we were left with 15,471 detections after removing duplicates and discarding events with an insufficient number of picks. These events were then located with the nonlinear location program NonLinLoc \cite{Lomax_2014} using a 1-D model for Rotokawa used by former VUW MSc student Zara Rawlinson (\cite{rawlinson2011microseismicity} pp. 77).

##HypoDD relocation of events
As a final step, the entire catalog was relocated using the double-difference relocation program HypoDD \cite{Waldhauser_2000} with differential pick times generated using a cross correlation routine within the EQcorrscan Python package \cite{Chamberlain_2014}. Due to the size of the catalog, events were split into separate Rotokawa and Ngatamariki catalogs before being relocated with HypoDD.

##Computing
For this work we relied heavily on the seismic processing Python package Obspy \cite{Krischer_2015}. Matched filter detection was done using an open-source Python package called EQcorrscan \cite{Chamberlain_2014}, which allowed for a scalable, multi-parallel approach to the matched filtering routine. The package's so-called "embarassingly parallel" approach allowed us to make use of the PAN computing cluster at the University of Auckland through the New Zealand eScience Infrastructure (NeSI), which cut the processing time from months to a matter of a few days.