\documentclass[alpha-refs]{wiley-article}
\usepackage{graphicx}
\usepackage[space]{grffile}
\usepackage{latexsym}
\usepackage{textcomp}
\usepackage{longtable}
\usepackage{tabulary}
\usepackage{booktabs,array,multirow}
\usepackage{amsfonts,amsmath,amssymb}
\usepackage{natbib}
\usepackage{url}
\usepackage{hyperref}
\hypersetup{colorlinks=false,pdfborder={0 0 0}}
\usepackage{etoolbox}
\makeatletter
\patchcmd\@combinedblfloats{\box\@outputbox}{\unvbox\@outputbox}{}{%
\errmessage{\noexpand\@combinedblfloats could not be patched}%
}%
\makeatother
% You can conditionalize code for latexml or normal latex using this.
\newif\iflatexml\latexmlfalse
\providecommand{\tightlist}{\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}%
\AtBeginDocument{\DeclareGraphicsExtensions{.pdf,.PDF,.eps,.EPS,.png,.PNG,.tif,.TIF,.jpg,.JPG,.jpeg,.JPEG}}
\usepackage[utf8]{inputenc}
\usepackage[english]{babel}
% Add any additional LaTeX packages and macros here
\usepackage{siunitx}
%---
\iflatexml
% Add any LateXML specific commands here
%---
\else
% The commands below will only change the exported PDF. Edit or remove as needed
\usepackage{lineno}
\linenumbers
\renewcommand{\sfdefault}{ptm}
\renewcommand{\familydefault}{\sfdefault}
\paperfield{Field of the paper}
\abbrevs{ABC, a black cat; DEF, doesn't ever fret; GHI, goes home immediately.}
\corraddress{Author One PhD, Department, Institution, City, State or Province, Postal Code, Country}
\corremail{correspondingauthor@email.com}
\presentadd{Department, Institution, City, State or Province, Postal Code, Country}
\fundinginfo{Funder One, Funder One Department, Grant/Award Number: 123456, 123457 and 123458; Funder Two, Funder Two Department, Grant/Award Number: 123459}
\fi
%---
\papertype{Original Article}
\title{Regularized Variational Data Assimilation for Bias Treatment using the
Wasserstein Metric}
\author[1]{Sagar Tamang}
\affil[1]{Affiliation not available}
\runningauthor{Sagar Tamang}
\begin{document}
\maketitle
\selectlanguage{english}
\begin{abstract}
This paper presents a new variational data assimilation (VDA) approach
for the formal treatment of bias in both model outputs and observations.
This approach relies on the Wasserstein metric stemming from the theory
of optimal mass transport to penalize the distance between the
probability histograms of the analysis state and its a priori reference
dataset, which is likely to be more uncertain but less biased than both
model and observations. Unlike previous bias-aware VDA approaches, the
new Wasserstein metric VDA (WM-VDA) dynamically treats systematic biases
of unknown magnitude and sign in both model and observations through
assimilation of the reference data in the probability domain and can
fully recover the probability histogram of the analysis state. The
performance of WM-VDA is compared with the classic three-dimensional VDA
(3D-Var) scheme on first-order linear dynamics and the chaotic Lorenz
attractor. Under positive systematic biases in both model and
observations, we consistently demonstrate a significant reduction in the
forecast bias and unbiased root mean squared error.
\textbf{Keywords} --- Variational data assimilation, bias treatment,
optimal mass transport, Wasserstein distance, regularization, chaotic
systems%
\end{abstract}%
\section{INTRODUCTION}
{\label{707961}}
The predictive accuracy of Earth System Models (ESMs) relies on a
series of differential equations that are often sensitive to their
initial conditions. Even a small error in estimates of these initial
conditions can lead to large forecast uncertainties. In short-to-medium
range forecast systems, an open-loop run of coupled land and weather
models often diverges from the true states and shows low forecast skill
as the error accumulates over time
\citep{charney1951dynamic,kalnay20074}. To extend the
forecast lead time, the science of data assimilation (DA) attempts to
use the information content of observations for improved estimates
of the ESMs' initial conditions, thus reducing their forecast
uncertainties
\citep{leith1993numerical,kalnay2003atmospheric}. DA
methods often involve iterative cycles in which the
{\it observations} are optimally integrated with the
previous-time forecasts ({\it background states}) to
obtain an {\it a posteriori} estimate of the initial
conditions ({\it analysis state}) with reduced
uncertainty in a Bayesian setting
\citep{rabier2005overview,asch2016data}. In the
literature, two major categories of DA methodologies exist, namely
filtering and variational methods
\citep{law2012evaluating}. Advanced approaches such as
hybrid DA schemes are also being developed, which aim to combine
the unique benefits of the two DA classes
\citep{wang2008hybrid,lorenc2015comparison}.
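In the variational class, on which this paper builds, the analysis state is typically obtained by minimizing a quadratic cost function. As a schematic reference (the notation here is generic and may differ from that of the later sections), the classic three-dimensional variational (3D-Var) cost reads
\begin{equation*}
J(\mathbf{x}) \;=\; \frac{1}{2}\,(\mathbf{x}-\mathbf{x}_b)^{\mathrm T}\,\mathbf{B}^{-1}(\mathbf{x}-\mathbf{x}_b) \;+\; \frac{1}{2}\,\big(\mathbf{y}-\mathcal{H}(\mathbf{x})\big)^{\mathrm T}\,\mathbf{R}^{-1}\big(\mathbf{y}-\mathcal{H}(\mathbf{x})\big),
\end{equation*}
where $\mathbf{x}_b$ denotes the background state, $\mathbf{y}$ the observations, $\mathcal{H}$ the observation operator, and $\mathbf{B}$ and $\mathbf{R}$ the background and observation error covariance matrices, respectively.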
\par\null
Although both classic DA approaches have been widely used in land and
weather forecast systems, they often rest on a strict underlying
assumption that the error is drawn from a zero-mean Gaussian
distribution, which is not always realistic. Bias exists in
land-atmosphere models mainly due to under-representation of the
governing laws of physics and erroneous boundary conditions. Observation
bias also exists, largely due to systematic errors in sensing systems and
in the retrieval algorithms represented by the observation operator in DA
systems \citep{dee2003detection,dee2005bias}. From a
mathematical point of view, bias is simply the expected value of the
error, which could be removed if the ground truth of the process were
known; this is rarely feasible in practice. The problem is often exacerbated
by the difficulty of attributing the bias to the model, the
observations, or both. At the same time,
point-scale observations such as those from {\it
in-situ} gauges and radiosondes are often considered to be closer
to the ground truth; however, their assimilation into gridded model
outputs is not straightforward due to the existing scale gaps.
\par\null
Bias correction strategies in DA systems mainly fall under two general
categories: (i) dynamic bias-aware and (ii) re-scaling bias correction
schemes. Apart from these two general
categories, machine learning techniques have also been developed to
learn relationships between observations and ancillary variables for
bias correction \citep{jin2019machine}. The dynamic
bias-aware schemes make prior assumptions about the nature of the bias
and attribute it either to the model or to the observations, which may not be
realistic as both models and observations suffer from systematic errors.
Early attempts to dynamically treat model biases are based on a two-step
bias estimation and correction approach, which is applied prior to the
analysis step
\citep{dee1998data,radakovich2001results,chepurin2005forecast}.
Variants of the bias-aware Kalman filter for colored noise
\citep{drecourt2006bias} and a weakly constrained
four-dimensional VDA (4D-Var)
\citep{zupanski1997general} have also been proposed to
account for non-zero-mean model errors. At the same time, there exists
another class of dynamic observational bias correction techniques that
rely on variants of the variational bias-correction (VarBC) method, which
makes an {\it a priori} estimate of the bias and
dynamically updates it using innovation information
\citep{auligne2007adaptive,dee2009variational,zhu2014enhanced}.
Apart from VarBC, more recently, a new approach has been proposed to treat
observation biases through iterative updates of the observation operator
\citep{hamilton2019correcting}. A body of research has also
been devoted to simultaneously treating model and observation
biases using a multi-stage hybrid filtering technique
\citep{pauwels2013simultaneous}. However, the above
schemes still lack the ability to leverage climatologically unbiased
information from reference observations (e.g., {\it in
situ} data) and have not yet been tested for effective bias correction
in chaotic systems. More importantly, the developed schemes largely
focus on retrieving an unbiased expected value of the forecast and remain
limited to characterizing second-order forecast uncertainty.
\par\null
The re-scaling techniques do not make any explicit assumptions about the
relative accuracy of the model and observation systems
\citep{reichle2004bias,crow2005relevance,reichle2007comparison,kumar2009role,reichle2010assimilation,liu2018contributions}.
This family of methods often involves mapping the observations onto the
model space by matching their cumulative distribution functions (CDFs).
While the CDF-matching technique is easier to implement
than the dynamic approach and prevents numerical
instabilities in model simulations, it implicitly assumes that model
forecasts are unbiased and partly ignores the information content of the
observations. For example, if the observations were less biased than the
model outputs, this approach would fail to effectively remove the
bias. Furthermore, it is a static scheme, and there exists no formal way
to extend the CDF-matching scheme to dynamically account for changes in the
bias \citep{kumar2012comparison} and its seasonality
\citep{de2016soil}.
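In its simplest form, and in generic notation not drawn from the cited studies, CDF matching maps an observation $y$ onto the model space through the quantile-mapping transformation
\begin{equation*}
y' \;=\; F_m^{-1}\big(F_o(y)\big),
\end{equation*}
where $F_o$ and $F_m$ denote the climatological CDFs of the observations and the model forecasts, respectively. The static nature of the scheme is apparent here: both CDFs are estimated offline and remain fixed during assimilation.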
\par\null
Conceptually, CDF-matching techniques move probability masses from one
distribution to another. To transform the static CDF matching into a
dynamic scheme, there are two key questions that we aim to answer: Can
we quantify the movement of probability masses as a cost through a
convex metric? How can this cost be employed to assimilate relatively
unbiased {\it in situ} data for dynamic bias correction
in the VDA framework?
\par\null
The Wasserstein metric (WM)
\citep{villani2008optimal,santambrogio2015optimal},
also known as the Earth Mover's distance
\citep{rubner2000earth}, stems from the theory of
optimal mass transport (OMT)
\citep{monge1781memoire,kantorovich1942translocation,villani2003topics},
which provides a concrete ground for comparing probability measures
\citep{brenier1991polar,gangbo1996geometry,benamou2015iterative,chen2017matrix,chen2018optimal,chen2018wasserstein}.
Specifically, this distance metric quantifies the dissimilarity
between two probability histograms in terms of the amount of ``work''
done during displacement of probability masses between them. Thus, we
hypothesize that inclusion of such a metric in the VDA cost function can
reduce analysis biases. The rationale is that the work done during
displacement of the probability masses is not only a function of the
shape of the probability histograms but also of the difference between their
central positions, as described in Section
\ref{sec:OMT}. The use of the Wasserstein metric in
DA has been explored previously
\citep{ning2014coping,feyeux2018optimal}.
\citet{ning2014coping} introduced the concept of OMT in
the classical VDA framework and demonstrated that bias in the background
state results in an unrealistic bimodal distribution of the analysis
state. However, that study was conducted on linear systems only, for model
bias correction, without accounting for any form of observation bias.
\citet{feyeux2018optimal} proposed to fully replace the
quadratic costs in classic VDA with the Wasserstein metric. Even
though the latter approach extends classic VDA beyond a minimum
mean squared error approximation, it does not provide any road map for
bias correction, which is the central focus of this paper.
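To make the above intuition concrete, we recall the standard discrete (Kantorovich) formulation from the OMT literature; the notation is generic and may differ from the formulation adopted later. For two probability histograms $\mathbf{p}$ and $\mathbf{q}$ supported on points $\{x_i\}_{i=1}^{m}$ and $\{y_j\}_{j=1}^{n}$, the 2-Wasserstein distance solves the linear program
\begin{equation*}
W_2^2(\mathbf{p},\mathbf{q}) \;=\; \min_{\mathbf{U}\ge 0}\; \sum_{i=1}^{m}\sum_{j=1}^{n} U_{ij}\,\|x_i-y_j\|^2
\quad \text{s.t.} \quad \mathbf{U}\mathbf{1}_n=\mathbf{p},\;\; \mathbf{U}^{\mathrm T}\mathbf{1}_m=\mathbf{q},
\end{equation*}
where $U_{ij}$ denotes the probability mass transported from $x_i$ to $y_j$. The sensitivity of this metric to bias is evident in the well-known closed form for two Gaussian densities $\mathcal{N}(\mu_1,\sigma_1^2)$ and $\mathcal{N}(\mu_2,\sigma_2^2)$, for which $W_2^2=(\mu_1-\mu_2)^2+(\sigma_1-\sigma_2)^2$: the first term directly penalizes the difference between the central positions.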
\par\null
This paper presents a new VDA approach that regularizes the classic
VDA problem with the cost associated with the Wasserstein metric, hereafter
referred to as the Wasserstein metric VDA (WM-VDA). Unlike previous VDA
techniques, WM-VDA treats unknown biases of different magnitudes and
signs in both the model dynamics and the observations. To that end, WM-VDA
needs to be informed by an {\it a priori} reference
distribution or histogram (e.g., from {\it in situ}
data) that encodes the space-time variability of the state variables of
interest in the probability domain. This {\it a priori}
histogram must be less biased, but could exhibit larger higher-order
uncertainties, than the observations and model forecasts.
More importantly, unlike classic DA methods,
WM-VDA allows full recovery of the probability histogram of the analysis
state in the probability domain, which can lead to forecast uncertainty
quantification beyond second-order statistics. The idea is tested on a
first-order linear dynamical system as a test-bed and on the chaotic
Lorenz-63 \citep{lorenz1963deterministic} attractor
that represents nonlinear dynamics of convective circulation in a
shallow fluid layer. The results demonstrate that the presented approach
is capable of preserving the geometric shape of the distribution of the
analysis state when both the background state and the observations are
systematically biased, and of extending the forecast skill by controlling the
propagation of bias in the phase space of a highly chaotic system.
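Schematically, and deferring the precise formulation to Section 3, the regularized cost function can be thought of as the classic variational cost augmented with a Wasserstein penalty,
\begin{equation*}
J(\mathbf{x}) \;=\; \frac{1}{2}\,(\mathbf{x}-\mathbf{x}_b)^{\mathrm T}\mathbf{B}^{-1}(\mathbf{x}-\mathbf{x}_b) \;+\; \frac{1}{2}\,\big(\mathbf{y}-\mathcal{H}(\mathbf{x})\big)^{\mathrm T}\mathbf{R}^{-1}\big(\mathbf{y}-\mathcal{H}(\mathbf{x})\big) \;+\; \lambda\, W_2^2\big(\mathbf{p}_{\mathbf{x}},\,\mathbf{p}_{\mathrm{ref}}\big),
\end{equation*}
where $\mathbf{p}_{\mathbf{x}}$ and $\mathbf{p}_{\mathrm{ref}}$ denote the probability histograms of the analysis state and of the {\it a priori} reference dataset, and $\lambda>0$ is a regularization parameter balancing the quadratic and transport costs. This sketch is only illustrative; the symbols and the exact form of the penalty are defined in Section 3.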
\par\null
The paper is organized as follows: Section 2 discusses the concept of
classic VDA, focusing on the 3D-Var. In this section, a summary of the
theory of OMT and the Wasserstein metric is also provided. The
mathematical formulation of the proposed WM-VDA is explained in Section
3. Section 4 implements the WM-VDA on a first-order linear and the
nonlinear Lorenz-63 dynamical systems. The results are interpreted and
compared with the 3D-Var and the CDF-matching
technique. Summary and concluding remarks are presented in Section
5.
\selectlanguage{english}
\FloatBarrier
\bibliography{bibliography/converted_to_latex.bib%
}
\end{document}