\documentclass[10pt]{article}
\usepackage{fullpage}
\usepackage{setspace}
\usepackage{parskip}
\usepackage{titlesec}
\usepackage[section]{placeins}
\usepackage{xcolor}
\usepackage{breakcites}
\usepackage{lineno}
\usepackage{hyphenat}
\PassOptionsToPackage{hyphens}{url}
\usepackage[colorlinks = true,
linkcolor = blue,
urlcolor = blue,
citecolor = blue,
anchorcolor = blue]{hyperref}
\usepackage{etoolbox}
\makeatletter
\patchcmd\@combinedblfloats{\box\@outputbox}{\unvbox\@outputbox}{}{%
\errmessage{\noexpand\@combinedblfloats could not be patched}%
}%
\makeatother
\usepackage[round]{natbib}
\let\cite\citep
\renewenvironment{abstract}
{{\bfseries\noindent{\abstractname}\par\nobreak}\footnotesize}
{\bigskip}
\titlespacing{\section}{0pt}{*3}{*1}
\titlespacing{\subsection}{0pt}{*2}{*0.5}
\titlespacing{\subsubsection}{0pt}{*1.5}{0pt}
\usepackage{authblk}
\usepackage{graphicx}
\usepackage[space]{grffile}
\usepackage{latexsym}
\usepackage{textcomp}
\usepackage{longtable}
\usepackage{tabulary}
\usepackage{booktabs,array,multirow}
\usepackage{amsfonts,amsmath,amssymb}
\providecommand\citet{\cite}
\providecommand\citep{\cite}
\providecommand\citealt{\cite}
% You can conditionalize code for latexml or normal latex using this.
\newif\iflatexml\latexmlfalse
\providecommand{\tightlist}{\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}%
\AtBeginDocument{\DeclareGraphicsExtensions{.pdf,.PDF,.eps,.EPS,.png,.PNG,.tif,.TIF,.jpg,.JPG,.jpeg,.JPEG}}
\usepackage[utf8]{inputenc}
\usepackage[english]{babel}
\begin{document}
\title{GLMM methods paper: Making the most of your {[}biodiversity;
ecological{]} data: Advantages of mixed effects/hierarchical models for
the analysis of complex data}
\author[1]{Nathan Brouwer}%
\affil[1]{National Aviary}%
\vspace{-1em}
\date{\today}
\begingroup
\let\center\flushleft
\let\endcenter\endflushleft
\maketitle
\endgroup
\sloppy
\textbf{Alternative titles:}
\par\null
* Making the most of your data: Advantages of mixed effects/hierarchical
models for the analysis of biodiversity monitoring and other complex
data~
* ?
\par\null
\textbf{GROUP x: Your data probably demand it}
\textbf{Advantage 1: Mixed effects models are typically the correct
model for your data}
Poster: ``Account for `pseudoreplication'{}''
\par\null
\textbf{GROUP 2: Make use of all your hard-earned data}
\textbf{Advantage x: Improve inference for individual species by
leveraging information across groups/sites/species (partial pooling,
etc.)}
Poster: ``Make use of all of your hard-earned data''
-Lloyd case study: modeling rare species
-Faaborg case study
-Costa Rica
\par\null
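The partial-pooling idea can be sketched numerically. Below is a minimal, stand-alone illustration (Python/numpy as a stand-in for the R tools discussed later; the site counts, visit numbers, and variances are all invented): data-rich sites keep estimates close to their own means, while data-poor ``rare'' sites are shrunk toward the overall mean instead of being discarded.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical monitoring data: counts of one species at five sites.
# Three sites are data-rich; two (the "rare" cases) have very few visits.
n_visits = np.array([30, 30, 25, 4, 3])
true_means = [10.0, 12.0, 9.0, 11.0, 10.5]
data = [rng.normal(m, 4.0, n) for m, n in zip(true_means, n_visits)]

# No pooling: each site estimated from its own data alone.
no_pool = np.array([d.mean() for d in data])

# Complete pooling: a single grand mean for all sites.
grand = np.concatenate(data).mean()

# Partial pooling (empirical-Bayes flavour): shrink each site mean toward
# the grand mean, with data-poor sites shrunk hardest.
sigma2 = np.mean([d.var(ddof=1) for d in data])       # within-site variance
tau2 = max(no_pool.var(ddof=1) - sigma2 / n_visits.mean(), 0.01)  # between-site
weights = tau2 / (tau2 + sigma2 / n_visits)
partial = weights * no_pool + (1 - weights) * grand

for i in range(len(n_visits)):
    print(f"site {i}: n={n_visits[i]:2d}  "
          f"no-pool={no_pool[i]:5.2f}  partial-pool={partial[i]:5.2f}")
```

A mixed model (roughly, \texttt{lmer(count \~{} (1|site))}) does this precision weighting automatically and propagates the uncertainty; the sketch only shows the direction of the effect.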
Avoid data exclusion rules (made figure for this for poster)
\par\null
\textbf{Potential problem: Does including rare groups introduce bias?}
Poster: ``Minimal bias due to inclusion rules''
\par\null
\textbf{Advantage x: Improve inference across species by using
multi-level modeling}
-Poster: ``Directly model group-level variation''
-trait modeling (Costa Rica; Mencia? Faaborg?)
\par\null
\textbf{Advantage z: Using all your data can increase power (vs. end
point analysis)}
\par\null
\textbf{Advantage c: Use variance partitioning to better understand your
data}
-can I characterize time-series variation using variance components?
Which time series is more variable (Lloyd, Aceitillar, Faaborg)?
-which time period is more variable - the early 2000s or more recent?
-Crone?
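As a concrete sketch of what variance partitioning buys you (a Python/numpy stand-in for the varcomp-style analysis suggested above; the numbers of sites, years, and variances are invented): simulate yearly counts with known between-site and within-site (year-to-year) variance, then recover the components with one-way ANOVA method-of-moments estimators.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated yearly counts for 12 sites over 20 years: persistent site
# differences (sd 3) plus year-to-year noise within sites (sd 2).
n_sites, n_years = 12, 20
site_effects = rng.normal(0, 3.0, n_sites)
y = site_effects[:, None] + rng.normal(0, 2.0, (n_sites, n_years))

# One-way ANOVA method-of-moments variance components.
site_means = y.mean(axis=1)
msw = y.var(axis=1, ddof=1).mean()         # within-site mean square
msb = n_years * site_means.var(ddof=1)     # between-site mean square
sigma2_within = msw
sigma2_between = max((msb - msw) / n_years, 0.0)

icc = sigma2_between / (sigma2_between + sigma2_within)
print(f"between-site variance: {sigma2_between:.2f} (true 9.0)")
print(f"within-site variance:  {sigma2_within:.2f} (true 4.0)")
print(f"proportion of variance between sites (ICC): {icc:.2f}")
```

A mixed model reports exactly these components as fitted quantities, which is what lets us ask which time series, or which time period, is more variable.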
\par\null\par\null
\textbf{GROUP X: Don't throw the baby out with the bathwater}
\textbf{Advantage x: They preserve the direct biological meaning of the
data (vs. remedial measures)}
\textbf{Advantage z: Say Ciao to Bonferroni}
\textbf{Advantage q: Smooth out pesky ``outliers''!}
\par\null\par\null
\textbf{GROUP X: Keeping your nose clean}
\textbf{Advantage z: Avoiding the garden of forking paths / data
dredging}
-simulation of how likely you are to get a significant trend when you
study x species
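Even before any simulation, the arithmetic makes the point: with no true trends at all, testing each of $k$ species separately at $\alpha = 0.05$ makes a ``significant'' trend nearly inevitable as $k$ grows.

```python
# Family-wise error rate when k independent species are each tested
# at alpha = 0.05 and no species has a true trend.
alpha = 0.05
for k in (1, 5, 10, 20, 50):
    fwer = 1 - (1 - alpha) ** k    # P(at least one false positive)
    print(f"{k:3d} species -> P(at least one 'significant' trend) = {fwer:.2f}")
# With 20 species the chance of a spurious trend is already about 64%.
```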
\par\null\par\null
\par\null
Tying in to other fields
-deer exclosure studies: studies that model focal species
\par\null\par\null
\par\null\par\null\par\null\par\null\par\null\par\null
\textbf{Advantage 1: Mixed effects models are typically the correct
model for your data}
Unless you have done completely randomized assignment of your treatments
and taken only one measurement per study subject, there is a good chance
you either need a mixed effects model or need to remediate the situation
(Murtaugh).~ Identifying the appropriate structure of a random effects
model can be tricky, especially with observational data, and we fairly
commonly encounter mis-specified models both while reviewing papers and
in the published literature.~ Remediation of the problem often involves
including nuisance parameters that are not of focal interest, such as
including site or year as a factor in a model (series of papers in
Oecologia/Oikos on specifying ANOVAs for multi-site experiments?).~
Remediation can also involve taking the average of subsamples or
randomly selecting a single observation (example from a meta-analysis I
read \ldots{} I think it was in Biological Reviews \ldots{} on parasites
- a controversial Scandinavian researcher or some other Europeans - they
randomly chose one effect size per study when multiple effect sizes were
reported).~ In some cases remediation can simplify the structure or
interpretation of the model (see also the note in Bolker's book
chapter).~ Much of this paper serves as an argument for the advantages
of mixed models relative to these remedial measures.
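A quick simulation (Python/scipy; sample sizes and variances are invented for illustration) shows why mis-specification matters: when treatments are applied at the site level but subsamples are analyzed as independent replicates, the false-positive rate balloons, while the remedial step of averaging to the site level restores it.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def one_trial(n_sites=4, n_sub=10, sd_site=1.0, sd_sub=0.5):
    """Two treatments, each applied to whole sites, with subsamples
    inside each site. There is NO true treatment effect, so every
    'significant' result is a false positive."""
    a = rng.normal(0, sd_site, n_sites)[:, None] \
        + rng.normal(0, sd_sub, (n_sites, n_sub))
    b = rng.normal(0, sd_site, n_sites)[:, None] \
        + rng.normal(0, sd_sub, (n_sites, n_sub))
    # Naive: treat every subsample as an independent replicate.
    p_naive = stats.ttest_ind(a.ravel(), b.ravel()).pvalue
    # Remediated: average subsamples and test at the site level.
    p_site = stats.ttest_ind(a.mean(axis=1), b.mean(axis=1)).pvalue
    return p_naive < 0.05, p_site < 0.05

results = np.array([one_trial() for _ in range(2000)])
print(f"false-positive rate, subsamples as replicates: {results[:, 0].mean():.2f}")
print(f"false-positive rate, site means:               {results[:, 1].mean():.2f}")
```

A mixed model with a site-level random effect reaches the same honest error rate while retaining the subsample-level data.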
Papers on mis-specification:
Schmidt-Catran \& Fairbrother 2015. The Random Effects in Multilevel
Models: Getting Them Wrong and Getting Them Right.
\par\null
Jensen et al 2017.~ Experimental design matters for statistical
analysis: how to handle blocking.~ Pest Management Science.
\href{http://onlinelibrary.wiley.com/doi/10.1002/ps.4773/full}{http://onlinelibrary.wiley.com/doi/10.1002/ps.4773/full}
\par\null
Nakagawa - nested by design?
Mis-specification results in pseudoreplication.
\par\null
\textbf{Forstmeier et al 2016. Detecting and avoiding likely
false-positive findings~--~a practical guide}
\textbf{Table 1: ``Non-independence of data points (e.g. related
individuals, temporal and spatial autocorrelation) ''}
\begin{itemize}
\tightlist
\item
\textbf{Test for non-independence, autocorrelation}
\item
\textbf{Fit grouping variables as random effects (intercepts, slopes,
space, time, pedigrees)}
\item
\textbf{Run analysis at the level where independence is met}
\item
\textbf{Balance experiments for confounding effects}
\end{itemize}
\textbf{Example ``(a)'': }\emph{\textbf{Pseudoreplication at the
individual level}}
Papers on meta-analyses that do not use random effects models? That
treat within-study replicates as independent?
\par\null\par\null
\textbf{\emph{Published Examples}}
\par\null
\textbf{Lloyd et al 2015 bird monitoring}
\textbf{Burns and Steer 2006 Ibis. Dominance rank influences food
hoarding in New Zealand Robins \emph{Petroica australis}}
\textbf{Binomial data: treating things as independent trials when they
aren't}
Paper(s) in behavioral journals on sex ratios!
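The sex-ratio case can be sketched with a beta-binomial simulation (Python/numpy; clutch sizes and dispersion parameters are invented): because chicks in the same clutch are correlated, the naive binomial standard error, which counts every chick as an independent trial, understates the real uncertainty.

```python
import numpy as np

rng = np.random.default_rng(11)

# Sex ratios: chicks within a clutch are not independent trials.
# Simulate clutches whose underlying proportion of males varies
# from clutch to clutch (a beta-binomial setup).
n_clutches, clutch_size = 200, 6
p_clutch = rng.beta(4, 4, n_clutches)     # mean 0.5, clutch-to-clutch spread
males = rng.binomial(clutch_size, p_clutch)

n_chicks = n_clutches * clutch_size
p_hat = males.sum() / n_chicks

# Naive SE: pretends every chick is an independent Bernoulli trial.
se_naive = np.sqrt(p_hat * (1 - p_hat) / n_chicks)

# Clutch-level SE: treats the clutch, not the chick, as the unit.
props = males / clutch_size
se_clutch = props.std(ddof=1) / np.sqrt(n_clutches)

print(f"p_hat = {p_hat:.3f}")
print(f"naive SE (chicks as trials): {se_naive:.4f}")
print(f"clutch-level SE:             {se_clutch:.4f}")
```

A binomial GLMM with a clutch-level random effect captures this overdispersion directly rather than forcing the analyst to collapse to clutch proportions.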
\textbf{Advantage 2: They preserve the direct biological meaning of the
data (vs. remedial measures)}
Murtaugh's approach - does it favor just doing significance testing?~
Does/How does taking the average of a set of subsamples change the
interpretation of the data?
Multilevel models are favored by quantitative ecologists because we are
interested in producing estimates of precise quantities and their
variances at relevant ecological scales, e.g. individual-level vital
rates and population trends.
\par\null
\textbf{\emph{Examples}}
Any examples of remedial measures gone bad - meta analysis throwing away
information
Others?
Poisson or binomial data - if you average it you are converting it to
normal (or are tempted to)?
Two-stage modeling: linear regression, then regression on the regression
slopes; this changes the interpretation and also ignores the correlation
structure; example for Analysis of Biological Data.
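A small simulation of the two-stage approach (Python/numpy; all sizes and variances invented): per-site trend slopes vary enormously in precision when series lengths differ, so averaging them with equal weight wastes information, whereas a precision-weighted combination, which is roughly what a mixed model does internally, is much more stable.

```python
import numpy as np

rng = np.random.default_rng(3)

def two_stage(n_sites=8, true_slope=-0.5, noise_sd=2.0):
    """Stage 1: fit a trend per site. Stage 2: combine the site slopes.
    Sites differ a lot in how many years they were surveyed."""
    slopes, ses = [], []
    for _ in range(n_sites):
        t = np.arange(rng.integers(5, 25))        # 5-24 years of data
        y = true_slope * t + rng.normal(0, noise_sd, t.size)
        coef, cov = np.polyfit(t, y, 1, cov=True)
        slopes.append(coef[0])
        ses.append(np.sqrt(cov[0, 0]))
    slopes, ses = np.array(slopes), np.array(ses)
    naive = slopes.mean()                         # equal weight per site
    w = 1.0 / ses**2
    weighted = (w * slopes).sum() / w.sum()       # precision-weighted
    return naive, weighted

reps = np.array([two_stage() for _ in range(500)])
print("spread (sd) of equal-weight two-stage estimate:", round(reps[:, 0].std(), 3))
print("spread (sd) of precision-weighted estimate:    ", round(reps[:, 1].std(), 3))
```

And even the precision-weighted version still treats the first-stage slopes as data, discarding the within-site correlation structure that a mixed model would carry through.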
\par\null\par\null\par\null\par\null
\par\null
Limitations and issues
-interpretation for split-plot designs
-difficulty of convergence and diagnosis
-data dredging your random effects
-hypothesis testing
-going Bayes (not actually a problem)
-data intensive
-power analysis is harder
-AIC is trickier
-random slopes models are really data intensive
-for variance partitioning you need \ldots{}?
-estimating sigma requires more data (\textgreater{}5, 8, 20 groups?)
\par\null\par\null\par\null\par\null
\textbf{Glossary}
power analysis
variance components
variance partitioning
Bayesian ANOVA
random intercepts
random slopes
crossed random effect: cross-classified
bootstrapping
parametric bootstrapping
split-plot
paired t-test
Mixed model / Mixed effects model
random effects model
Mixture model
Hierarchical model
Multilevel model
Bayesian (Multi-level) Model
Repeated measures model
repeated-measures ANOVA (rmANOVA)
generalized linear model (GLM)
Generalized linear mixed model (GLMM)
Generalized estimating equation (GEE)
autocorrelation (spatial, temporal)
crossed random effects
two-stage modeling
meta-analysis
lme4
nlme
lmer
glmer
lme
BUGS
WinBUGS
OpenBUGS
JAGS
Stan
rstanarm
\par\null\par\null\par\null\par\null\par\null
\selectlanguage{english}
\FloatBarrier
\end{document}