Melanie edited Analysis Plan.tex  about 8 years ago

Commit id: 5bbd89f77656adbc5a3437a47efe7bae9eb85396


       

\item new decision tree

Morphological information in GZH was obtained from volunteers answering a series of questions on the Galaxy Zoo website (http://zoo3.galaxyzoo.org/). The probability that a galaxy has a given morphological feature is assumed to be proportional to the \emph{vote fraction} for the corresponding answer, defined as the number of votes for that answer divided by the total number of votes for the question; e.g., the probability that a galaxy has a bar is proportional to the vote fraction $p_{bar} = \frac{Number~of~users~to~answer~``bar"}{Number~of~users~to~answer~bar~question}$. The uncertainty in the vote fraction therefore depends on the number of users who answer a given question. GZH requires a minimum of 40 users to classify each galaxy, but because of the decision-tree structure of the questions, not all 40 users answer every question. The advantage of this system is efficiency; e.g., it is neither useful nor time-effective to ask a user whether a galaxy they have already classified as elliptical has spiral arms. The disadvantage is that there is no minimum number of votes for higher-tier questions. This is a problem particularly for galaxies that are difficult to classify at the first question, which separates smooth/spheroidal galaxies from disks. For these galaxies it is less clear whether questions about disk-related features are relevant, because there is greater uncertainty that the galaxy was correctly classified at the previous question.

For GZH2 we will implement a new system that retains the advantage of the GZH system, minimizing the time required of users and thereby maximizing crowdsourcing efficiency, while avoiding its disadvantage by retiring galaxies only once enough data have been obtained for \emph{each question} in the tree.
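
As a rough illustration of how the vote-fraction uncertainty scales with the number of votes, one can treat each vote as an independent Bernoulli trial. This is an assumption made here for illustration only, not a model adopted in this plan, and the symbols $N_{bar}$ and $\sigma_{p_{bar}}$ are introduced just for this sketch. With $N_{bar}$ the number of users who answered the bar question, the binomial standard error of the vote fraction is approximately
\[
\sigma_{p_{bar}} \approx \sqrt{\frac{p_{bar}\,(1 - p_{bar})}{N_{bar}}} .
\]
For example, with $p_{bar} = 0.5$, a question answered by all 40 users gives $\sigma_{p_{bar}} \approx 0.08$, whereas a higher-tier question reached by only 10 users gives $\sigma_{p_{bar}} \approx 0.16$. Under this assumption, halving the uncertainty requires four times as many votes for that question.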
\begin{itemize}
\item past decision tree: same as GZ2 \citep{Willett2013}
\item advantage of decision tree over GZ1: reduces wasted time by asking only questions relevant to the previous answer