Heritability of growth strain in *Eucalyptus bosistoana*: a Bayesian approach with left-censored data

Abstract

Narrow-sense heritabilities of the wood properties of two-year-old *Eucalyptus bosistoana* were estimated from 623 stems. Heritability estimates for growth-strain (0.63), density (0.54), diameter (0.76), volumetric shrinkage (0.29), acoustic velocity (0.97) and stiffness (0.82) are presented. A modified version of the splitting test for detecting growth-strain is described. The modified rapid testing procedure results in left-censored growth-strain data, a Bayesian approach is implemented to reduce errors associated with censored data sets. Correlations between wood properties are presented and discussed, as well as trade-offs when shifting trait means by selective breeding.

The main use of plantation *Eucalyptus* species is the production of biomass for the pulp and paper, and bioenergy industries. These trees are fast-growing and can potentially produce high-quality timber for appearance, structural and engineered wood products. Unfortunately, this potential is hindered by the frequent presence of large growth-strains, which are associated with log splitting, warp, collapse and brittleheart, imposing substantial costs on processing (Yamamoto, 2007).

A few technological mitigation strategies have been developed to reduce the incidence of wood defects caused by growth-strain, but they are costly and only partially effective (Yamamoto, 2007). An alternative approach to the problem is to rely on the genetic control of growth-strain—shown in this article to be highly heritable—to select and grow individuals with low growth-strain. However, measuring growth strains in large numbers of trees (as needed for a successful breeding programme) has been difficult, time consuming and expensive until now. As an example, the largest reported studies to date assessed only 164 (Murphy et al., 2005) and 216 (Naranjo et al., 2012) trees.

The University of Canterbury has developed and implemented a rapid growth-strain testing procedure, based on the work by Chauhan et al. (2010) and Entwistle et al. (2014). In order to minimise the time taken to measure growth-strain on each tree, the rapid testing procedure does not account for negative values, where the wood in the centre of the stem is under tension rather than compression, assigning instead a zero that results in left-censored datasets.

Left-censored data are common in research areas where detection limits are high compared to the measured values, such as testing for the presents of drugs in an animal. There are many approaches to deal with censoring (e.g. Senn 2012) and in this article we use a Bayesian framework to impute the missing data from known data, reducing the error induced by zero inflation. A Bayesian approach makes it easier to include model complexity (e.g. censoring) while accounting for the hierarchical nature of the data. In addition, one can easily obtain complex distributions of functions of covariance components, like heritabilities, as a byproduct of the estimation process (Cappa et al., 2006). There are several examples of Bayesian applications in forest genetics; for example: (Soria et al., 1998) (univariate analysis of growth traits), (Cappa et al., 2006) (multivariate analysis of growth traits) and (Apiolaza et al., 2011a) (multivariate analysis of early wood properties).

We ran a pilot study consisting of two *Eucalyptus bosistoana* progeny tests from both seed and coppice grown stems, which included 623 individual stems from 40 half sibling families. Our estimates of narrow-sense heritability were obtained from left-censored growth-strain data and other wood properties, utilising a Bayesian approach. These results were used to design a much larger evaluation of the *E. bosistoana* breeding population currently underway.

An *E. bosistoana* open-pollinated progeny trial was established at an irrigated nursery site in Harewood, Christchurch, New Zealand. The trial represented 40 families from two provenances, for a total of 423 seedlings planted into 100 L bags, which were coppiced after the first harvest, giving a total of 623 tested samples. Two separate plantings (or trial sections) occurred in 2010 and 2012. The 2010 families originated from South East Australia, were harvested and coppiced in 2012, and harvested again in December 2014. The 2012 families originated from higher altitude in New South Wales and were harvested in 2013 (this data was not included in the analysis, due to the magnitude of errors induced by small, malformed stems) and again in October 2015 (at age two). All seedlings were established following a completely randomised design.

When harvested, each sample was processed for growth strain, volumetric shrinkage (displacement method, before and after drying), stem diameter (measured under bark using digital calipers), basic density (mass and displacement method) and dry acoustic velocity (resonance). Growth-strain was measured using a modified version of Chauhan et al. (2010) and Entwistle et al. (2014) method. The newly developed “rapid splitting test” substantially reduces measurement time, enabling larger numbers of samples to be processed. The method involves stripping the bark and measuring the under-bark large-end diameter of a clear section of the stem, giving an over estimate of the average diameter used by Chauhan et al. (2010). The sample is then cut lengthwise, with the slit length determined by the length of clear wood and diameter of the sample. Diameter and slit length are measured, recorded and the stem cut. The small-end of the sample is left intact with the large-end free to distort, which removes the need to clamp the two halves together. Finally, the opening is measured and recorded. The calculation of strain is unchanged (with the exception that average radius is now large-end radius) from Chauhan et al. (2010) and calculated using Equation \ref{gse}. It is important to note that the over estimate of radius slightly reduces the strain value, but does so linearly over all samples.

\[\label{gse} Y_u = \frac{0.87 \epsilon L^2}{R}\]

where \(Y_u\) is the deflection, \(\epsilon\) is the strain, \(L\) is the cut length and \(R\) is the big end cross-section radius.

All specimens were grown on the same site; however, they were grown during different time periods which are confounded with the effect of the two provenances and hence are included as a trial effect. The tree effect accounts for the measurement unit, as the same trees were assessed as both seedlings and coppice.

The analyses used a Bayesian approach to estimate the posterior distributions for the heritability of growth-strain and other wood properties. We implemented a hierarchical model where \(y_{ijklm}\) follows a left-censored normal distribution \(N(\omega_{ijklm}, \tau_{j|i})\) with predicted value \(\omega_{ijklm}\) and a trial-dependent precision \(\tau_{j|i}\). The precision (reciprocal variance) \(\tau[x_1]\) for each trial was given a vague gamma prior \(\Gamma(0.01, 0.01)\).

The predicted value for the \(i^{th}\) assessment is modelled as a function of an overall intercept, the effect of the \(j^{th}\) trial, \(k^{th}\) coppicing level, \(l^{th}\) family and \(m^{th}\) tree (to account for repeated assessment pre- and post-coppicing):

\[\omega_{jklm|i} = \mu + \alpha [x_{1 j|i}] + \beta [x_{2 k|i}] + \gamma [x_{3 l|i}] + \delta[x_{4 m|i}]\]

where \(x_1\), \(x_2\), \(x_3\) and \(x_4\) represent indicator variables for the levels of the factors.

The overall intercept (\(\mu\)), and individual-level coefficients for coppicing (\(\alpha_j\)) and site (\(\beta_k\)) were given vague normal prior distributions:

\[\begin{split} \mu &\sim N(0.5, 10^{-12}) \\ \alpha &\sim N(0.5, 10^{-12}) \\ \beta &\sim N(0.5, 10^{-12}) \end{split}\]

The family (\(\gamma_l\)) and tree (\(\delta_m\)) effects were assumed to come from normal distributions \(N(0, \tau_f)\) and \(N(0, \tau_t)\), with vague gamma priors \(\tau_f \sim \Gamma(0.01, 0.01)\) and \(\tau_t \sim \Gamma(0.01, 0.01)\) respectively. The statistical model is also presented graphically following Kruschke (2014) in Figure \ref{graph:model}.

Narrow-sense heritability at the trial level for all properties was calculated using Equation \ref{H2}. The constant of 2.5 used was suggested by Griffin et al. (1988) due to the unknown proportions of selfing, full-siblings and half-siblings within the open-pollinated families (loosely refereed to as half-sibling families in this article).

\[\label{H2} h^2 = \frac{2.5 \times \sigma_f^2}{\sigma_f^2 + \sigma_t^2 + \sigma^2}\]

where the \(\sigma^2\) variances were obtained as the reciprocal of the precisions (\(\tau\)) mentioned in the model description.

All the models were fitted using rjags, an R (R Core Team, 2016) interface to JAGS (Just Another Gibbs Sampler, (Plummer, 2015)) which uses Gibbs Sampling to estimate the marginal posterior distributions for the parameters of interest. Approximately 35% of the strain data was left-censored, and missing values due to censoring were imputed by JAGS as a random value below the limit of detection (using the function aureplacedverbatimaa ; for details see Lunn et al. (2013)), but above -1.5 (equivalent to a negative strain of 1500 micro-strains). We used a family model instead of an animal model as, for the purposes of a simple one-generation pedigree the models are equivalent (*sensu* identical expected values and variance, (Henderson, 1985)), while the family model is much less computationally intensive. The R/JAGS code is available as supplementary material.