Heather Campbell edited sectionClassificatio.tex  over 10 years ago

Commit id: 309b563cd71708e3d8d41f7b2cf5c275fa232a49


\section{Classification} \label{class}

Gaia is predicted to detect 44 million transits per day, which corresponds to $\sim$150--800 GByte/day of data. Within this huge volume of data we expect hundreds to thousands of potentially interesting astrophysical triggers per day (real variables and moving objects). This rate precludes visual classification of such a rich data stream, so automated methods that are fast, repeatable and tuneable are essential.

The Gaia alerts classification pipeline uses random forest classification. The random forest will use all the information available. Its features will include: light curve photometry (gradient, amplitude, historic rms, magnitude, SNR, transit rms), low-resolution spectra (flux versus wavelength, colours, SSCs, SpTy), auxiliary information (neighbour star, shape pars, motion pars, coords, crowding, calibration offset, correlations, QC pars) and crossmatch environment (near known star mags, near known star cols, near known variable class, near galaxy, near galaxy redshift, circumnuclear).

To build up a sufficient sample of classification labels to train the random forest classifier (e.g. \cite{Ofek_Cenko_Butler_et_al__2012}\cite{Bloom}), we aim to observe $\sim$500 homogeneous, high-quality spectra in the first year of the mission, spread across the broad classes of transient phenomena (AGN/cnSN/TDE, SN/Novae, VarStar-CV, VarStar-Misc, VarStar-Periodic).

The light curve classification utilises the flux gradient of the transient object: the Gaia observations, taken with a 106.5-minute cadence, are used to indicate the type of object. The low-resolution (BP/RP) spectra provide far more information to aid classification \cite{Blagorodnova} and provide a robust class for most objects, at $>$19 mag, when the classifier is fully trained on representative data. In addition, the transient object will be cross-matched with archival catalogues, for example SDSS, 2MASS, HST and VISTA.
This will help remove known variable-star contaminants and provide environmental information for the transient events, e.g. whether there is a host galaxy associated with the source and, if so, its type and magnitude.
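
As a rough illustration of the two quantitative ingredients of this section, and not a description of the pipeline's actual implementation, the light curve gradient between two consecutive transits and the random forest decision can be written as below. The symbols $m_i$, $t_i$, $T$, $h_k$ and $\mathbf{x}$ are notation introduced here for illustration only. Expressing the gradient in magnitudes rather than fluxes is an assumption, and the majority-vote form is the standard random forest rule rather than a confirmed detail of the Gaia alerts classifier.

\[
g = \frac{m_2 - m_1}{t_2 - t_1}, \qquad t_2 - t_1 \simeq 106.5\,\mathrm{min},
\]
\[
P(c \mid \mathbf{x}) = \frac{1}{T}\sum_{k=1}^{T} \mathbf{1}\left[h_k(\mathbf{x}) = c\right], \qquad \hat{c} = \arg\max_{c} P(c \mid \mathbf{x}).
\]

Here $g$ is the gradient feature, $\mathbf{x}$ is the feature vector of a candidate (photometric, spectral, auxiliary and crossmatch features), $h_k(\mathbf{x})$ is the class predicted by the $k$-th of $T$ trees, and $\mathbf{1}[\cdot]$ equals one when its argument is true and zero otherwise. Under these assumptions the vote fraction $P(c \mid \mathbf{x})$ could serve as a tuneable confidence threshold for deciding which candidates to publish.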