deletions | additions
diff --git a/introduction.md b/introduction.md
index cce4335..d756945 100644
--- a/introduction.md
+++ b/introduction.md
...
- If a site name is given, try to locate the site using a
combination of Google and Google Maps
- If latitude and longitude are given in the paper, search by lat
and lon, which will return all sites within
\(\pm 1\) degree lat and lon.
- If an existing site is plausibly the same site as the one
mentioned in the paper, it will be necessary to check other
papers linked to the existing site.
...
To add a new covariate, go to the [new
covariate](http://www.betydb.org/covariates/new) page.
### Extracting information from tables and graphs
1. Identify the data that is associated with each treatment
*note:* If the experiment has many factors, the paper may not report the mean and statistics for each treatment. Often, the reported data will reflect the results of more than one treatment (for example, if there was no effect of the treatment on the quantity of interest). In some cases it will be possible to obtain the values for each treatment, e.g. if there are _n-1_ values and _n_ treatments. If this is not the case, the treatment names and definitions should be changed to indicate the data reflect the results of more than one experimental treatment.
2. Enter the mean value of the trait
3. Enter the `statname`, `stat`, and number of replicates, `n`
associated with the mean
* `stat` is the value of the `statname` (i.e. `statname` might be
’standard deviation’ (SD) and the `stat` is the numerical value
of the statistic)
* Always measure size of error bar from the mean to the end of an
error bar. This is the value when presented as ( _X_ ± _SE_) or
_X(SE)_ and may be found in a table or on a graph.
* Sometimes CI and LSD are presented as the entire range from the
lower to the upper end of the confidence interval. In this case,
take 1/2 of the interval representing the distance from the mean
to the upper or lower bound
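The two error-bar rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the data entry workflow itself; the function names and the numbers are hypothetical.

```python
def half_interval(lower, upper):
    """Half of a full interval, for a CI or LSD drawn from lower to upper bound."""
    return abs(upper - lower) / 2.0


def error_from_mean(mean, bar_end):
    """Distance from the mean to one end of an error bar."""
    return abs(bar_end - mean)


# Hypothetical values read from a graph: mean 10.0, interval drawn from 8.5 to 11.5
stat_from_range = half_interval(8.5, 11.5)
stat_from_bar = error_from_mean(10.0, 11.5)
print(stat_from_range, stat_from_bar)
```

Either function gives the value to enter as `stat`; which one applies depends on how the paper draws the interval.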
#### Extracting Data using R
To extract data from a jpg file in R using the digitize package:
1. Save image as a `*.jpg` file
2. Open R
3. Change the directory that R is using to the one where the image is
4. Use R code below to extract data, display it, and save it in a `csv`
file (steps below)
5. Upload the csv to the project file in Google Spreadsheets, or open it in
Excel/OpenOffice and copy/paste into a Google spreadsheet
#### Extracting Data From a Figure using GetData
1. Open PDF in Adobe Reader.
2. Zoom in on the figure
3. Choose `Tools` → `Select and Zoom`
4. Open Paint
5. Paste Picture
6. Save as `authorYYYYabc_figX.jpg`
7. Open Get Data
8. `File` → `Open` to open the figure
9. Select button with two arrows (fourth from left)
10. Follow instructions to select x min, x max, y min and y max. If the
x-axis has a categorical variable, it does not matter what values
you use for x min and x max.
11. Make sure to set the correct values for the max and min of each
axis, and indicate if the axis is log-scaled
12. Select the target button (seventh from left)
13. Click over center of desired data points and error bars
14. Copy data to a Google spreadsheet. See [Google Spreadsheets] (#Section 3).
15. Calculate SE as the distance between the error bar upper bound and
the mean (absolute value of difference between the two points)
### Adding a PFT, Species, and Cultivar
Plant functional types (PFTs) are used to group plants for statistical
...
## Converting Units and Adjustment to Temperature
Convert the root respiration data reported in George et al. (where
O\(_2\) was measured in µL) to units of mass.
In the appendix table, George 2003 reports the range of root respiration
rates, converted to
\(15°C\) and standard units:
\([11.26, 22.52] \frac{\mathrm{nmol CO}_2}{\mathrm{g}\ \mathrm{s}}\)
In the original publication Allen (1969), root respiration was measured
at
\(27°C\). The values can be found in [Table 3] (#Table 3) and [Figure 2] (#Figure 2). The
data include a minimum (Group 2 Brunswick, NJ plants) and a maximum
(Group 3 Newbery, South Carolina), which I assume are the ones used by
George 2003:
\([27.2, 56.2] \frac{\mu\mathrm{L}\ \mathrm{O}_2}{10\mathrm{mg}\ \mathrm{h}}\)
Transform the George 2003 measurements back to the measurement temperature
using a rearrangement of equation 1 from George, the standardized
temperature of
\(15°C\) stated in the George table legend,
\(Q_{10} = 2.075\) from George 2003, and the measurement temperature of
\(27°C\) reported by Allen 1969:
\(R_T = R_{15}\exp[\ln(Q_{10})(T-15)/10]\)
\([11.26, 22.52] * exp(log(2.075)*(27 - 15)/10)\)
Now we have the values that we would have expected to find in the Allen
paper, except that the units need to be converted back to the original:
\([27.03,54.07]\ \mathrm{nmol CO}_2\ \mathrm{g}^{-1}\mathrm{s}^{-1}\)
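The temperature adjustment can be checked with a short Python sketch. It uses the Q10 value, reference temperature, and measurement temperature quoted in the text; the function name `adjust_temperature` is only illustrative.

```python
import math


def adjust_temperature(rate_ref, q10, t_meas, t_ref=15.0):
    """Scale a rate from the reference temperature t_ref to t_meas using Q10."""
    return rate_ref * math.exp(math.log(q10) * (t_meas - t_ref) / 10.0)


# Range standardized to 15 C, in nmol CO2 g^-1 s^-1
rates_15 = [11.26, 22.52]

# Back-transform to the 27 C measurement temperature
rates_27 = [adjust_temperature(r, q10=2.075, t_meas=27.0) for r in rates_15]
print([round(r, 2) for r in rates_27])
```

The scaling factor is about 2.4, so the adjusted range lands near 27 and 54 nmol CO2 g^-1 s^-1.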
#### Required constants
- \(1\ \mathrm{mol}\ \mathrm{O}_2 = 1\ \mathrm{mol}\ \mathrm{CO}_2\)
since respiration is
\(\mathrm{CH}_2\mathrm{O} + \mathrm{O}_2 \to \mathrm{CO}_2 + \mathrm{H}_2\mathrm{O}\)
- Density of
\(\mathrm{O}_2\) at
\(27^\circ C\):
\(\frac{7.69 \times 10^5\ \mu\mathrm{L}\ \mathrm{O}_2}{\mathrm{g}\ \mathrm{O}_2}\)
first assume that Allen converted to sea level pressure (101 kPa),
although maybe they were measured at elevation (Allen may have
worked at \~ 900 kPa near Brevard, NC)
- Molar mass of
\(\mathrm{O}_2\):
\(\frac{32\mathrm{g}\ \mathrm{O}_2}{\mathrm{mol}}\)
- Treat 10mg, which is in the unit of root mass used by Allen, as a
unit of measurement for simplicity
Now convert
\([27.03,54.07]\ \mathrm{nmol CO}_2\ \mathrm{g}^{-1}\mathrm{s}^{-1}\) to
units of
\(\frac{\mu\mathrm{L}\ \textrm{O}_2}{10\mathrm{mg}\ \mathrm{root}\ \mathrm{h}}\).
The expected result is the original values reported by Allen:
\([27.2, 56.2] \frac{\mu\mathrm{L}\ \mathrm{O}_2}{10\mathrm{mg}\ \mathrm{h}}\)
\([27.03, 54.07]\ \frac{\mathrm{nmol}\ \mathrm{CO}_2}{\mathrm{g}\ \mathrm{root}\ \mathrm{s}} \times \frac{1\ \mathrm{g}}{100\times10\mathrm{mg}} \times \frac{3600\ \mathrm{s}}{\mathrm{h}} \times \frac{\mathrm{nmol}\ \mathrm{O}_2}{\mathrm{nmol}\ \mathrm{CO}_2}\times\frac{3.2 \times 10^{-8}\ \mathrm{g}\ \mathrm{O}_2}{\mathrm{nmol}\ \mathrm{O}_2}\times \frac{7.69\times10^5\ \mu\mathrm{L}\ \mathrm{O}_2}{\mathrm{g}\ \mathrm{O}_2}\)
The result is:
\([23.8, 47.8] \frac{\mu\mathrm{L}\ \textrm{O}_2}{10\mathrm{mg}\ \mathrm{root}\ \mathrm{h}}\)
These are the units reported in the Allen paper, but they appear to be
off by the temperature conversion factor,
\(exp(log(2.075)*(27 - 15)/10)=2.4\), e.g.
\([11.9, 23.9]\times 2.4= [28.6,57.4]\), values which are only 5 and 2
percent larger than the original values of
\([27.2, 56.2]\), respectively, which is close enough
to be acceptable, but not exact. Since the ratios of observed:expected
values are different, it is not likely that
Q\(_{10}\) or the atmospheric
pressure at time of measurement would explain this error.
#### Convert to units in BETYdb, find \(\textrm{k}\):
\(\textrm{k}\times\frac{\mu\mathrm{L}\ \textrm{O}_2}{10\mathrm{mg}\ \mathrm{root}\ \mathrm{h}} = \frac{\mu\mathrm{mol}\ \mathrm{CO}_2}{\mathrm{kg}\ \mathrm{s}}\)
\(k = \frac{\mathrm{g}\ \mathrm{O}_2}{7.69\times10^5\ \mu\mathrm{L}\ \mathrm{O}_2}\times\frac{\mu\mathrm{mol}\ \mathrm{O}_2}{3.2 \times 10^{-5}\ \mathrm{g}\ \mathrm{O}_2} \times \frac{10^5\ \times 10\mathrm{mg}}{\mathrm{kg}} \times \frac{\mathrm{h}}{3600\ \mathrm{s}} = 1.13\)
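As a check on the arithmetic, the factor k can be assembled from the constants listed above. This is a sketch only; the constant names are made up for readability, and the 1:1 molar equivalence of O2 and CO2 is the assumption stated in the required constants.

```python
# Constants taken from the text (volume per gram applies at 27 C and about 101 kPa)
UL_O2_PER_G_O2 = 7.69e5      # microlitres of O2 per gram of O2
G_O2_PER_UMOL_O2 = 3.2e-5    # grams of O2 per micromole of O2 (32 g/mol)
TEN_MG_UNITS_PER_KG = 1e5    # number of 10 mg units in one kilogram
SECONDS_PER_HOUR = 3600.0

# k converts uL O2 (10 mg root)^-1 h^-1 to umol CO2 kg^-1 s^-1,
# assuming 1 mol O2 consumed per 1 mol CO2 released
k = (1.0 / UL_O2_PER_G_O2) * (1.0 / G_O2_PER_UMOL_O2) * TEN_MG_UNITS_PER_KG / SECONDS_PER_HOUR
print(round(k, 2))

# Example: convert the range reported by Allen (1969)
allen_ul = [27.2, 56.2]
allen_umol = [k * x for x in allen_ul]
print(allen_umol)
```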
#### Calculating \(MSE\) given \(F\), \(df_{\text{group}}\), and \(SS\)
Given:
\(\label{eq:f} F = MS_g/MS_e\)
Where
\(g\) indicates the group, or treatment. Rearranging this equation
gives:
\(MS_e=MS_g/F\)
Given
\(MS_x = SS_x/df_x\)
Substitute
\(SS_g/df_g\) for
\(MS_g\) in the first equation
\(F=\frac{SS_g/df_g}{MS_e}\)
Then solve for
\(MS_e\)
\(\label{eq:mse} MS_e = \frac{SS_g}{df_g\times F}\)
\(\label{eq:dft} df_{\text{total}}=(df_a+1)\times(df_b+1)...\times(n)-1\)
Which depends on the experimental design:
For factors a, b... (usually 1 or 2, sometimes 3) where
\(n\) is the
number of replicates within each treatment combination.
- One-way anova
\(df_{\text{total}}=an-1\); where
\(a\) is the number of
treatments
- Two-way anova without replication
\(df_{\text{total}}=(a+1)(b+1)-1\)
also known as "randomized complete block design" (RCBD)
- Two-way anova with
\(n\) replicates
\(df_{\text{total}}=(a+1)(b+1)(n)-1\) aka "RCBD with replication"
#### Example
...
The results are from one (two?) factor ANOVA with repeated measures,
with treatment and week as the factors and no replication.
We will calculate MSE from the
\(SS_{\text{treatment}}\),
\(df_{\text{treatment}}\), and
\(F\)-value given in the table; these are
\(109.58\), \(2\), and
\(0.570\), respectively;
\(df_{\text{weeks}}\) is given
as
\(10\).
For the 1997 *Eriophorum vaginatum*, the mean
\(A_{max}\) in table 4 is
\(13.49\).
Calculate
\(MS_e\):
\(MS_e = \frac{109.58}{0.57 \times 2} = 96.12\)
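The same calculation can be written as a small Python function, using the sum of squares, degrees of freedom, and F-value quoted in this example. The function name `mse_from_f` is illustrative.

```python
def mse_from_f(ss_group, df_group, f_value):
    """Recover the error mean square from SS_g, df_g and F: MS_e = SS_g / (df_g * F)."""
    if df_group <= 0 or f_value <= 0:
        raise ValueError("df_group and f_value must be positive")
    return ss_group / (df_group * f_value)


# Values from the example: SS_treatment = 109.58, df_treatment = 2, F = 0.570
ms_e = mse_from_f(109.58, 2, 0.570)
print(round(ms_e, 2))
```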
## Reference Tables
**Table 3: Managements**
This is a list of managements to enter, with the most common management types in bold. It is more important to have management records for Yields than for traits. For greenhouse experiments, it is not necessary to include information on fertilization, lighting, or greenhouse temperature.
| Management Type | Units | Definition | Notes |
|:----------------|:------|:-----------|:------|
| Burned | | aboveground biomass burned | |
| CO2 fumigation | ppm | | |
| Fertilization_X | kg x ha$^{-1}$ | fertilization rate, element X | |
| Fungicide | kg x ha$^{-1}$ | | add type of fungicide to notes |
| Grazed | years | livestock grazing | pre-experiment land use |
| Harvest | | | no units, just date, equivalent to coppice, aboveground biomass removal |
| Herbicide | kg x ha$^{-1}$ | | add type of herbicide to notes: glyphosate, atrazine, many others |
| Irrigation | cm | | convert volume / area to depth as required |
| Light | W m$^{-2}$ | | |
| O3 fumigation | ppm | | |
| Pesticide | kg x ha$^{-1}$ | | add type of pesticide to notes |
| Planting | plants m$^{-2}$ | | Convert row spacing to planting density if possible |
| Seeding | kg seeds x ha$^{-1}$ | | |
| Tillage | | | no units, maybe depth; *tillage* is equivalent to *cultivate* |
**Table 4: Date level of confidence (DateLOC) field**
Numbering convention for the DateLOC (Date level of confidence) field, used in the managements, traits, and yields tables.
| Dateloc | Definition |
|:--------|:-----------|
| 9 | no data |
| 8 | year |
| 7 | season |
| 6 | month |
| 5 | day |
| 4 | time of day i.e. morning, afternoon |
| 3 | hour |
| 2 | minute |
| 1 | second |
| 95 | unknown year, known day |
| 96 | unknown year, known month |
| ...etc | |
**Table 5: List of statistical summaries**
List of the statistics that can be entered into the statname field of the traits and yields tables. Please see David (or Mike) if you have questions about statistics that do not appear in this list. If you have P or LSD in a study with $n\neq b$ (e.g. not an RCBD, see Table 8), please convert these values prior to entering the data, and add a note to the table that the stat was transformed. Note: These are listed in order of preference, e.g., if SD, SE, or MSE are provided then use these values.
| Statname | Name | Definition | Notes |
|:----------|:-----|:-----------|:------|
| SD | Standard Deviation | $\sqrt{\frac{1}{N} \sum{(x_i - \bar{x})^2}}$ | $\bar{x}$ is the mean |
| SE | Standard Error | $\frac{s}{\sqrt{n}}$ | |
| MSE | Mean Squared Error | | like SD, but with multiple treatments; in R: `sum(aov(y~x)$residuals^2)/aov(y~x)$df.residual` |
| 95\%CI | 95% Confidence Interval| $t_{1-^{\alpha}/_2,n}*s$ | measure the 95% CI from the mean, this is actually $^1/_2$ of the CI |
| LSD | Least Significant Difference | $t_{1-\frac{\alpha}{2},n}\sqrt{2\text{MSE}/b}$ | $b$ is the number of blocks (Rosenberg 2004) |
| MSD | Minimum Significant Difference | | |
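For readers who want to see how the first three statistics relate to raw replicate values, here is a minimal Python sketch. It is an illustration under stated assumptions: SD follows the 1/N definition given in the table, SE uses the sample standard deviation for $s$ (the table does not define $s$), and MSE is computed as the pooled within-treatment residual mean square of a one-way design. The data are hypothetical.

```python
import math
import statistics


def sd_population(values):
    """Standard deviation with the 1/N denominator used in the table."""
    return statistics.pstdev(values)


def standard_error(values):
    """s / sqrt(n), taking s as the sample standard deviation (an assumption)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def mean_squared_error(groups):
    """Residual sum of squares divided by residual df for a one-way layout."""
    ss_resid = 0.0
    n_total = 0
    for values in groups:
        group_mean = statistics.mean(values)
        ss_resid += sum((x - group_mean) ** 2 for x in values)
        n_total += len(values)
    return ss_resid / (n_total - len(groups))


# Hypothetical replicate measurements for two treatments
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
print(sd_population(groups[0]), standard_error(groups[0]), mean_squared_error(groups))
```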
**Table 6: Variables**
| Variable | Units | Median (90%CI) or Range | Definition |
|:---------|:------|:------------------------|:-----------|
| Vcmax | $\mu$ mol CO$_2$ m$^{-2}$ s$^{-1}$ | $44 (12, 125)$ | maximum rubisco carboxylation capacity |
| SLA | m$^2$ kg$^{-1}$ | $15(4,27)$ | Specific Leaf Area area of leaf per unit mass of leaf |
| LMA | kg m$^{-2}$ | $0.09 (0.03, 0.33)$ | Leaf Mass Area (LMA = SLM = 1/SLA) mass of leaf per unit area of leaf |
| leafN | % | $2.2(0.8, 17)$ | leaf percent nitrogen |
| c2n leaf | leaf C:N ratio | $39(21,79)$ | use only if leafN not provided |
| leaf turnover rate | 1/year | $0.28(0.03,1.0) $ | |
| Jmax | $\mu$ mol photons m$^{-2}$ s$^{-1}$ | $121(30, 262)$ | maximum rate of electron transport |
| stomatal slope | | $9(1, 20)$ | |
| GS | | | stomatal conductance (= gs$_{\textrm{max}}$) |
| q* | | 0.2--5 | ratio of fine root to leaf biomass |
| **grasses* | ratio of root:leaf = below:above ground biomass | | |
| aboveground biomass | g m$^{-2}$ *or* g plant$^{-1}$ | | |
| root biomass | g m$^{-2}$ *or* g plant$^{-1}$ | | |
| **trees* | ratio of fine root:leaf biomass | | |
| leaf biomass | g m$^{-2}$ *or* g plant$^{-1}$ | | |
| fine root biomass (<2mm) | g m$^{-2}$ *or* g plant$^{-1}$ | | |
| root turnover rate | 1/year | 0.1--10 | rate of fine root loss (temperature dependent) year$^{-1}$ |
| leaf width | mm | 22(5,102) | |
| growth respiration factor | % | 0--1 | proportion of daily carbon gain lost to growth respiration |
| R$_{\textrm{dark}}$ | $\mu$ mol CO$_2$ m$^{-2}$ s$^{-1}$ | | dark respiration |
| quantum efficiency | % | 0--1 | efficiency of light conversion to carbon fixation, see Farquhar model |
| dark respiration factor | % | 0--1 | converts Vm to leaf respiration |
| seedling mortality | % | 0--1 | proportion of seedlings that die |
| r fraction | % |0--1 | fraction of storage to seed reproduction |
| root respiration rate* | CO$_2$ kg$^{-1}$ fine roots s$^{-1}$ | 1--100 | rate of fine root respiration at reference soil temperature |
| f labile | % | 0--1 | fraction of litter that goes into the labile carbon pool |
| water conductance | | | |
**Table 7: Traits with required covariates**
A list of traits and the covariates that must be recorded along with the trait value so that values can be converted to a constant scale across studies. *notes:* stomatal conductance (gs) is only useful when reported in conjunction with other photosynthetic data, such as Amax. Specifically, if we have Amax and gs, then estimation of Vcmax only covaries with dark\_respiration\_factor and atmospheric CO2 concentration. We also now have information to help constrain stomatal\_slope. If we have Amax but not gs, then our estimate of Vcmax will covary with: dark\_respiration\_factor, CO2, stomatal\_slope, cuticular\_conductance, and vapor-pressure deficit VPD (which is more difficult to estimate than CO2, but still possible given lat, lon, and date).
Most important, there will be a strong covariance between Vcmax and stomatal_slope.
| Variable | Required Covariates | Optional Covariates |
|:---------|:--------------------|:--------------------|
| vcmax | irradiance and temperature (leaf or air) | |
|any leaf measurement | | canopy height |
| root\_respiration\_rate | temperature (root or soil), | soil moisture |
| | root\_diameter\_max | root size class (usually $<2$ mm) |
| any respiration | temperature | |
| root biomass | | min. size cutoff, max. size cutoff |
| root, soil | depth (cm) | used for max and min depths of soil, if only one value, assume min depth = 0; negative values indicate above ground |
| gs (stomatal conductance) | A$_{max}$ | see notes in caption |
| stomatal\_slope (m) | humidity, temperature | specific humidity, assume leaf T = air T |
**Table 8: How to convert statistics from $P$, $LSD$, or $MSD$ to $SE$**
| From | To | Conversion | Rcode | Notes |
|:-----|:---|:-----------|:------|:------|
| P | SE | $SE = \frac{\bar{X}_1-\bar{X}_2}{t_{1-P/2,2n-2}\sqrt{2/n}}$ | (x1-x2)/(qt(1-P/2,2*n-2)*sqrt(2/n)) | $\bar{X}_{1,2}$ are two means being compared. |
| LSD | SE | $SE = \frac{LSD}{t_{1-\alpha/2,n}*\sqrt{2b}}$ | LSD/(qt(1-P/2,n)*sqrt(2*b)) | where $b$ is the number of blocks, $n$ is the number of replicates, and $n=b$ in a Randomized Complete Block Design |
| MSD | SE | $SE = \frac{MSD*n}{t_{1-\alpha, 2n-2}*\sqrt{2}}$ | msd*n/(qt(1-P/2,2*n-2)*sqrt(2)) | |
**Table 9: Useful conversions for entering site, management, yield, and trait data**
| From ($X$) | to ($Y$) | Conversion | Notes |
|:-----------|:---------|:-----------|:------|
| $X_2=$root production | $X_1=$root biomass & root turnover rate | $Y = X_2/X_1$ | Gill [2000] |
| DD$^{\circ}$ MM'SS | XX.ZZZZ | $\textrm{XX.ZZZZ} = \textrm{XX} + \textrm{MM}/60+\textrm{SS}/3600$ | to convert latitude or longitude from degrees, minutes, seconds to decimal degrees |
| lb | kg | $Y=X/2.2$ | |
| mm/s | $\mu$ mol CO$_2$ m$^{2}$ s$^{-1}$ | $Y=X\times 0.04$ | |
| m$^2$ | ha | $Y = X/10^4$ | |
| g/m$^2$ | kg/ha | $Y=X\times 10$ | |
| US ton/acre | Mg/ha | $Y = X\times 2.24$ | |
| m$^3$/ha | cm | $Y=X/100$ | units used for irrigation and rainfall |
| % roots | root:shoot (q) | $Y=\frac{X}{1-X}$ | $\% \text{roots} = \frac{\text{root biomass}}{\text{total biomass}}$ |
| $\mu$ mol cm$^{-2}$ s$^{-1}$ | mmol m$^{-2}$ s$^{-1}$ | $Y = X\times 10$ | |
| mol m$^{-2}$ s$^{-1}$ | mmol m$^{-2}$ s$^{-1}$ | $Y = X\times 10^3$ | |
| mol m$^{-2}$ s$^{-1}$ | $\mu$ mol cm$^{-2}$ s$^{-1}$ | $Y = X\times 10^2$ | |
| mm s$^{-2}$ | mmol m$^{-3}$ s$^{-1}$ | $Y=X/41$ | Korner et al. [1988] |
| mg CO$_2$ g$^{-1}$ h$^{-1}$ | $\mu$ mol kg$^{-1}$ s$^{-1}$ | $Y = X\times 6.31$ | used for root\_respiration\_rate |
| $\mu$ mol | mol | $Y= X/10^6$ | |
| julian day (1--365) | date | | see ref: http://disc.gsfc.nasa.gov/julian_calendar.shtml (NASA Julian Calendar) |
| spacing (m) | density (plants m$^{-2}$) | $Y=\frac{1}{\textrm{row spacing}\times\textrm{plant spacing}}$ | |
| kg ha$^{-1}$ y$^{-1}$ | Mg ha$^{-1}$ y$^{-1}$ | $Y= X/1000$ | |
| g m$^{-2}$ y$^{-1}$ | Mg ha$^{-1}$ y$^{-1}$ | $Y= X/100$ | |
| kg | mg | $Y=X\times 10^6$ | |
| cm$^2$ | m$^2$ | $Y=X/10^4$ | |
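A few of these conversions are easy to get wrong by hand, so a short Python sketch may help. The function names are illustrative; the coordinate helper assumes non-negative degrees, minutes, and seconds, with the hemisphere sign applied separately, and uses the standard 60 minutes per degree and 3600 seconds per degree.

```python
def dms_to_decimal(degrees, minutes, seconds):
    """Degrees, minutes, seconds to decimal degrees (non-negative inputs assumed)."""
    return degrees + minutes / 60.0 + seconds / 3600.0


def percent_roots_to_root_shoot(root_fraction):
    """Root fraction of total biomass (0-1) to root:shoot ratio, Y = X / (1 - X)."""
    return root_fraction / (1.0 - root_fraction)


def spacing_to_density(row_spacing_m, plant_spacing_m):
    """Row and plant spacing in metres to planting density in plants per m^2."""
    return 1.0 / (row_spacing_m * plant_spacing_m)


def g_m2_to_kg_ha(x):
    """g m^-2 to kg ha^-1 (1 ha = 10^4 m^2, 1 kg = 10^3 g)."""
    return x * 10.0


# Hypothetical inputs
print(dms_to_decimal(40, 30, 0))          # 40 degrees 30 minutes
print(percent_roots_to_root_shoot(0.2))   # 20% of biomass in roots
print(spacing_to_density(0.5, 0.2))       # rows 0.5 m apart, plants 0.2 m apart
print(g_m2_to_kg_ha(150.0))
```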
## Acknowledgements
Patrick Mulroony implemented the data entry
interface. Moein Azimi, David Bettinardi, and Nick Brady, along with
other members of the Dietze lab, have contributed to the ongoing
development of this document and the web interface that it describes.