# deb-bose-MATH5960-assign1

01a

Let $$\theta$$ be the true proportion of people over age 40 in my community who have hypertension. I assume $$\theta\sim Beta(\alpha,\beta)$$. To encode my prior belief about $$\theta$$ through its point estimate (the prior mean), let’s assume $$\hat{\theta}=0.6$$. For the $$Beta$$ distribution, the mean is

\begin{align} \int\theta p(\theta)d\theta=\frac{\alpha}{\alpha+\beta}=0.6\\ \end{align}

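Let’s choose $$\alpha=6,\beta=4$$, which gives a prior mean of $$\frac{6}{6+4}=0.6$$. Once survey data $$y$$ are observed, the posterior follows from Bayes' rule:

$$\begin{split}\displaystyle p(\theta|y)&\displaystyle\propto p(y|\theta)p(\theta)\end{split}$$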

01b

Now, after the initial survey (a set of Bernoulli trials, so the count of cases follows a binomial distribution), I need to update my belief about $$\theta$$. The $$Beta$$ prior is conjugate to the binomial likelihood, so the posterior is also a $$Beta$$ distribution: $$\theta|y\sim Beta(\alpha+y,\beta+n-y)$$. This follows from

$$\begin{split}\displaystyle p(\theta|y)&\displaystyle\propto p(y|\theta)p(\theta)\\ &\displaystyle\propto\theta^{y}(1-\theta)^{n-y}\theta^{\alpha-1}(1-\theta)^{\beta-1}\end{split}\\$$

where $$n$$ = total number of people selected for the survey and $$y$$ = total number of hypertensive people observed. The point estimate is the posterior mean:

\begin{align} \overline{\theta}_{post}=\int\theta p(\theta|y)d\theta=E(\theta|y)=\frac{\alpha+y}{\alpha+\beta+n}\\ \end{align}
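One way to see how the prior and the data trade off is to rewrite the posterior mean as a weighted average of the prior mean and the sample proportion. This is a standard rearrangement, added here as a sketch rather than as part of the original derivation:

\begin{align} \frac{\alpha+y}{\alpha+\beta+n}=\frac{\alpha+\beta}{\alpha+\beta+n}\cdot\frac{\alpha}{\alpha+\beta}+\frac{n}{\alpha+\beta+n}\cdot\frac{y}{n} \end{align}

The weight on the prior mean is $$\frac{\alpha+\beta}{\alpha+\beta+n}$$, so $$\alpha+\beta$$ acts like a prior sample size that competes with the survey size $$n$$.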

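Plugging in $$y=4,n=5,\alpha=6,\beta=4$$ gives $$\overline{\theta}_{post}=\frac{10}{15}\approx 0.67$$, a relative increase of about $$11\%$$ over my prior estimate of $$0.6$$. With only 5 observations against a prior worth $$\alpha+\beta=10$$ pseudo-observations, the prior dominates.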

01c

Similarly, plugging in $$y=400,n=1000,\alpha=6,\beta=4$$ gives $$\overline{\theta}_{post}=\frac{406}{1010}\approx 0.40$$, a 33% reduction relative to my prior estimate of $$0.6$$. Here the data dominate the prior.
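The two updates in 01b and 01c can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `beta_posterior` and `beta_mean` are illustrative and do not come from any library.

```python
from fractions import Fraction


def beta_posterior(alpha, beta, y, n):
    """Posterior Beta parameters after y successes in n Bernoulli trials."""
    return alpha + y, beta + n - y


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution, kept exact as a fraction."""
    return Fraction(alpha, alpha + beta)


prior = (6, 4)  # Beta(6, 4), prior mean 0.6

for y, n in [(4, 5), (400, 1000)]:
    a_post, b_post = beta_posterior(*prior, y, n)
    mean = beta_mean(a_post, b_post)
    print(f"y={y}, n={n}: Beta({a_post}, {b_post}), "
          f"posterior mean = {float(mean):.4f}")

# y=4, n=5: Beta(10, 5), posterior mean = 0.6667
# y=400, n=1000: Beta(406, 604), posterior mean = 0.4020
```

Exact fractions are used so the only rounding happens when the result is printed.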

So with small data sets, an incorrect prior belief can persist in the posterior or even be reinforced. The more data we gather, the closer our belief about $$\theta$$ gets to the true value.
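To illustrate how the influence of the prior shrinks as the survey grows, here is a small sketch under a hypothetical assumption: the observed proportion is held at exactly 40% while $$n$$ increases. The survey sizes and the function name `posterior_mean` are chosen for illustration only.

```python
def posterior_mean(alpha, beta, y, n):
    """Posterior mean of theta under a Beta(alpha, beta) prior."""
    return (alpha + y) / (alpha + beta + n)


alpha, beta = 6, 4  # prior from 01a, prior mean 0.6

for n in (5, 50, 500, 5000):
    y = (2 * n) // 5  # hypothetical: exactly 40% of respondents hypertensive
    prior_weight = (alpha + beta) / (alpha + beta + n)
    print(f"n={n:>4}: prior weight = {prior_weight:.3f}, "
          f"posterior mean = {posterior_mean(alpha, beta, y, n):.4f}")
```

Under this assumption the posterior mean moves from the prior mean of 0.6 toward the observed proportion of 0.4 as the prior weight falls toward zero.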