Authorea

Heather Campbell edited sectionClassificatio.tex about 10 years ago

Commit id: 9d71fb54fe89e6971b419d1d426a16fcef0b949c

\section{Classification} \label{class}

Gaia is predicted to detect 44 million transits per day, corresponding to $\sim$150--800 GByte/day of data. Within this volume we expect hundreds to thousands of potentially interesting astrophysical triggers per day (real variables and moving objects). A data stream this rich precludes visual classification, so automated methods that are fast, repeatable and tuneable are essential.

The Gaia alerts classification pipeline uses random forest classification. The random forest will use all the information available. Its features will include: light curve photometry (gradient, amplitude, historic rms, magnitude, SNR, transit rms); low-resolution spectra (flux versus $\lambda$, colours, SSCs, SpTy); auxiliary information (neighbour star, shape parameters, motion parameters, coordinates, crowding, calibration offset, correlations, QC parameters); and crossmatch environment (near known star magnitudes, near known star colours, near known variable class, near galaxy, near galaxy redshift, redshift, circumnuclear).

To build up a sufficient sample of classification labels to train the random forest classifier (e.g. \citet{Ofek_Cenko_Butler_et_al__2012}), we aim to observe $\sim$500 homogeneous, high-quality spectra in the first year of the mission, spread across each broad class of transient phenomena (AGN/cnSN/TDE, SN/Novae, VarStar-CV, VarStar-Misc, VarStar-Periodic).
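As an illustration only, the following minimal sketch shows how a random forest could be trained on a labelled sample of about 500 objects and then used to assign class probabilities to new alerts. It is not the Gaia alerts pipeline: it assumes the \texttt{scikit-learn} implementation (\texttt{RandomForestClassifier}), uses only the light curve features named above with hypothetical column names, and replaces real measurements with synthetic Gaussian data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Broad transient classes named in the text.
CLASSES = ["AGN/cnSN/TDE", "SN/Novae", "VarStar-CV",
           "VarStar-Misc", "VarStar-Periodic"]

# Hypothetical column names for the light curve features listed in the text.
FEATURES = ["gradient", "amplitude", "historic_rms",
            "magnitude", "snr", "transit_rms"]

rng = np.random.default_rng(0)
n_per_class = 100  # assumption: ~500 labelled objects, split evenly

# Synthetic placeholder data (NOT real Gaia measurements): each class is
# drawn from a unit-variance Gaussian whose mean depends on the class index.
X = np.vstack([
    rng.normal(loc=k, scale=1.0, size=(n_per_class, len(FEATURES)))
    for k in range(len(CLASSES))
])
y = np.repeat(np.arange(len(CLASSES)), n_per_class)

# Hold out a quarter of the labelled sample to estimate performance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Fixing random_state makes the forest repeatable; n_estimators is tuneable.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

# Per-object class probabilities, shape (n_test, n_classes).
proba = clf.predict_proba(X_test)

# Fraction of held-out objects assigned to the correct class.
accuracy = clf.score(X_test, y_test)

# Relative importance of each feature, normalised to sum to 1.
importances = dict(zip(FEATURES, clf.feature_importances_))
```

In this sketch the class probabilities in \texttt{proba} could be thresholded to decide which alerts to follow up, and \texttt{importances} indicates which features drive the classification; both uses are suggestions rather than a description of the actual pipeline.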