Casey Law edited untitled.tex  about 10 years ago

Commit id: e215b9b04059ba33d030a40299a35f5f8495c62e


% scalable anomaly detection on massive data streams
% wavefront
\vspace{-1cm}
\section{Introduction}

\section{Science}

An exciting new class of fast radio transients is the ``fast radio burst'' (FRB; Thornton et al. 2013, Science, 341, 53). FRBs were discovered in all-sky pulsar surveys by single-dish telescopes. Their dispersion is an order of magnitude larger than expected from the Galaxy and is consistent with propagation through the intergalactic medium (IGM) from distances up to $z\sim1$. While little is known about FRBs, if they lie at cosmological distances, their dispersion can be used to measure the baryonic mass of the IGM. Beyond using FRBs as probes, understanding the origin of FRBs may have relevance to gamma-ray bursts and sources of gravitational waves. Nearer to our own Galaxy, pulsar surveys have discovered the ``rotating radio transient'' (RRAT; McLaughlin et al. 2006, Nature, 439, 817), a spinning neutron star that sporadically pulses. While a few dozen RRATs are now known, it is unclear whether they are tied to extreme objects like magnetars or are simply ordinary pulsars that emit bright pulses detectable individually. The first pulsar was recently detected in Andromeda (Rubio-Herrera et al. 2013, MNRAS, 428, 2857). Much as with FRBs, the dispersion of radio transients at that distance is highly sensitive to baryons in the outer fringes (the ``halo'') of the Milky Way and M31. Roughly 50\% of baryons in the local universe have not been directly detected, and fast radio transients may help solve this ``missing baryon problem''. Much closer to Earth, we know that Jupiter emits intense radio bursts that make it the brightest astronomical object at low radio frequencies. Coronal mass ejections, much like those seen on the Sun, also drive fast, coherent radio flares. These processes could be used to measure magnetism and plasma properties of other stars and should profoundly affect the habitability of orbiting exoplanets.
Both of these mechanisms should be detectable as subsecond transients.

\section{Real-Time Detection as Solution to Big Data Challenge}

The technical requirements for our radio transient searches are extreme, but are becoming more common in astronomy (e.g., see plans for the SKA and LSST). Lessons learned from our project will have increasing relevance to scientists working to solve the ``needle in a haystack'' problem. Currently, we are recording data to disk at a rate of 1 TB hour$^{-1}$ and processing it on compute clusters near the VLA, at Los Alamos National Lab, and at NERSC. The internet is too slow to transport the 1 TB hour$^{-1}$ data stream, so we ship disks to our computing centers. This approach is complex and time consuming, so it is not sustainable as a normal observing mode or in the large campaigns needed to find many fast radio transients. We believe a sustainable solution will use real-time transient detection.

I am interested in thinking about how real-time processing can help solve the challenges of big data. By bringing computational support closer to the telescope, real-time detection eases the data distribution problem and makes it possible to decide whether a given segment of data is worth saving or not. In some applications, measuring all information about a transient candidate is substantially more demanding than simply detecting it, and the difference between the two can be critical for extreme data rate applications. Once a transient candidate is detected, the data associated with the candidate can be saved for more detailed analysis, while data we know to be uninteresting can be ignored. This kind of ``data triage'' is routinely employed in the particle physics community, where a well-defined theory predicts the interactions of a particle with the detector, but not elsewhere. A detailed theory is critical to define what the \emph{absence} of a detection means. The rise of big data will require this kind of focus to avoid being overwhelmed.
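
As a rough illustration of the triage idea described above, the following is a minimal Python sketch and not our actual pipeline. It first converts the 1 TB hour$^{-1}$ recording rate into the sustained network rate needed to move the stream in real time (assuming decimal terabytes), then keeps only those data segments whose peak exceeds a signal-to-noise threshold. The function names, the robust noise estimate based on the median absolute deviation, and the threshold of 6 are all assumptions chosen for illustration.

```python
import statistics

TB = 1e12  # bytes per terabyte (assumption: decimal units)


def required_gbps(tb_per_hour):
    """Sustained rate in Gbit/s needed to transport a stream in real time."""
    return tb_per_hour * TB * 8 / 3600 / 1e9


def peak_snr(segment):
    """Peak positive excursion of a segment in units of a robust noise estimate."""
    med = statistics.median(segment)
    # Median absolute deviation, scaled to match a Gaussian standard deviation.
    mad = statistics.median([abs(x - med) for x in segment])
    sigma = 1.4826 * mad
    if sigma == 0:
        return 0.0
    return (max(segment) - med) / sigma


def triage(segments, threshold=6.0):
    """Yield (index, snr) only for segments worth saving (hypothetical criterion)."""
    for i, seg in enumerate(segments):
        snr = peak_snr(seg)
        if snr >= threshold:
            yield i, snr


if __name__ == "__main__":
    # 1 TB per hour corresponds to roughly 2.2 Gbit/s sustained.
    print(f"{required_gbps(1.0):.2f} Gbit/s")
```

In this sketch only the indices of segments that pass the threshold are returned, so everything else could be discarded before it is ever written to disk. A real search would also need to dedisperse the data and account for interference, which is omitted here.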