Awaiting Activation edited untitled.tex  about 8 years ago

Commit id: 99abfbd1a6624ab1f4b969c0f5aafa9cfca87f2d


\section{Overview}
Beta regression is used when the response $y$ lies strictly between 0 and 1, that is, $y\in(0,1)$. Common examples include the fraction of employees participating in a company 401k plan or any response that is a percentage. Linear regression has often been applied to such data because of its simplicity, but the data violate key assumptions. First, responses bounded by 0 and 1 are not normally distributed. Second, such data are usually heteroscedastic: they show more variation around the mean, and the variance shrinks as we approach the boundaries 0 and 1. Also, the distributions are usually asymmetric, so Gaussian-based approaches are often inaccurate. Instead, beta regression assumes that the response follows a \textit{beta distribution}. The beta density is usually given by
\begin{equation}
f(y;p,q) = \frac{\Gamma(p+q)}{\Gamma(p)\Gamma(q)}\,y^{p-1}(1-y)^{q-1}, \quad 0 < y < 1,
\end{equation}
where $p > 0$ and $q > 0$ are shape parameters and $\Gamma(\cdot)$ is the gamma function.
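
As a worked illustration of the heteroscedasticity described in this section, the mean and variance of a beta-distributed response can be written in terms of $p$ and $q$. The symbols $\mu$ and $\phi$ are introduced here only for illustration and are not part of the original notation:
\begin{equation}
\mathrm{E}(y) = \frac{p}{p+q} = \mu, \qquad \mathrm{Var}(y) = \frac{pq}{(p+q)^2(p+q+1)} = \frac{\mu(1-\mu)}{1+\phi}, \qquad \phi = p+q.
\end{equation}
For a fixed $\phi$, the variance is largest at $\mu = 1/2$ and tends to zero as $\mu$ approaches 0 or 1. For instance, with $\phi = 10$, taking $p = q = 5$ gives $\mu = 0.5$ and $\mathrm{Var}(y) = 0.25/11 \approx 0.0227$, whereas $p = 2$, $q = 8$ gives $\mu = 0.2$ and $\mathrm{Var}(y) = 0.16/11 \approx 0.0145$. The variance therefore depends on the mean, which a constant-variance linear model cannot represent.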