David LeBauer deleted file Data Entry.md  almost 10 years ago

Commit id: a8fa5d3b629a8c64a63ef3388aa9bb9a98fc4d65

deletions | additions      

         

# Data Entry Overview  Before entering data, it is first necessary to add and select the  citation that is the source of the data. It is also necessary for each  data point to be associated with a Site, Treatment, and Species.  Cultivar information is also required when available, but it is only  relevant for domesticated species. Fields with an asterisk (*) are  required.     ## Adding a Trait  In general, a 'trait' is a phenotype (a characteristic that the plant  exhibits). The traits that we are primarily interested in collecting  data for are listed in [Table 6](#Table 6). Before adding trait data, it is necessary to have the citation, treatments, and site information already entered. If the correct  citation is not identified at the top of the page [Figure 8](#Figure 8). To add a new Trait,  go to the [new trait](http://www.betydb.org/traits/new)  page: `Trait` → `new`.  Presently, we are also using the Trait table to record ecosystem level  measurements other than Yield. Such ecosystem level measurements can  include leaf area index or net primary productivity, but are only  collected when required for a particular project. Most of the fields in the Traits table are also used in the Yields table. Here is a list of the fields with a brief description, followed  by more thorough explanations:  **Species**: Search for species in the database using the search box; if species  is not found   **Cultivar**: primarily used for crops; If the cultivar being used is not found in  drop-down box   **DateLOC**: Date Level of confidence. See for values.   **Mean**: mean is in units of tons per hectare per year (t/ha)   **Stat name**: is the name of the statistical method used (usually one of SE, SD, MSE,  CI, LSD, HSD, MSD). See for more details.   **Statistic**: is the value of the statistic associated with Stat name.   **N**: Always record N if provided. N is the number of experimental  replicates, often referred to as the sample size; N represents the  number of independent units within each treatment: in a field  setting, this is often the number of plots in each treatment, but in  a greenhouse, growth chamber, or pot-study this may be the number of  chambers, pots, or individual plants. Sometimes this value is not  clearly stated.   ### dateLOC  The date level of confidence (DateLOC) provides an indication of how accurately the date associated with the trait or yield observation is  known. It provides the values that should be entered in this field. If the  event occurred at a level of precision not defined by an integer in this  table, then use fractions. For example, we commonly use 5.5 to indicate a one week level of precision. If the exact year is not known, but the time of  year is, then use 91 to 97, with the second digit to indicate the information  known within the year.    **Figure 9**: [Form used to enter a new trait](https://www.betydb.org/traits/new)  #### Statistics  Our goal is to record statistics that can be used to estimate standard  deviation or standard error. Many different methods can be used to  summarize data, and this is reflected in the diversity of statistics  that are reported. An overview of these methods is given in a  description below.  Where available, direct estimates of variance are preferred, including  Standard Error (SE), sample Standard Deviation (SD), or Mean Squared  Error (MSE). SE is usually presented in the format of mean   (±SE). MSE is usually presented in a table. When  extracting SE or SD from a figure, measure from the mean to the upper or  lower bound. This is different than confidence intervals and range  statistics (described below), for which the entire range is collected.  If MSE, SD, or SE are not provided, it is possible that LSD, MSD, HSD,  or CI will be provided. These are range statistics and the most  frequently found range statistics include a Confidence Interval (95%CI),  Fisher’s Least Significant Difference (LSD), Tukey’s Honestly  Significant Difference (HSD), and Minimum Significant Difference (MSD).  Fundamentally, these methods calculate a range that indicates whether  two means are different or not, and this range uses different approaches  to penalize multiple comparisons. The important point is that these are  ranges and that we record the entire range.  Another type of statistic is a “test statistic”; most frequently there  will be an F-value that can be useful, but this should not be recorded  if MSE is available. Only if there is no other information available should you  record the P-value.  ## Adding a Yield and Covariate  The protocol for entering yield data is identical to entering data for a  trait, with a few exceptions:  1. There are no covariates associated with yield data  2. Yield data is always the dry harvestable biomass; if necessary,  moisture content can be added as a trait  Yield is equivalent to aboveground biomass on a per-area  basis, and has units of Mg ha^-1 y1   To add a new yield, go to the [new  yield](http://www.betydb.org/yields/new) page  Covariates are required for many of the traits. Covariates generally  indicate the environmental conditions under which a measurement was  made. Without covariate information, the trait data will have limited  value.  A complete list of required covariates can be found in . For all  respiration rates and photosynthetic parameters, temperature is recorded  as a covariate. Soil moisture, humidity, and other such variables that  were measured at the time of the measurement may be required in  order to standardize across studies.  When root data is recorded, the root size class needs to be entered as a  covariate. The term ’fine root’ often refers to the \(<\)2mm size class,  and in this case, the covariate `root_maximum_diameter` would be set to 2.   If the size class is a range, then the `root_minimum_diameter` can also be used.  To add a new covariate, go to the [new covariate](http:www.betydb.org/covariates/new) page