Authorea

Pavel Erofeev edited GPR.tex over 9 years ago

Commit id: 7d68d9c1b80094f950037a11c28de44dfdff4691

In a multidimensional regression problem we assume that $f: \mathcal{X} \rightarrow \mathbb{R}$, $\mathcal{X}\subset\mathbb{R}^m$, is an unknown dependency function. We are given a noisy \textit{learning set} $D = \left\{\left(\mathbf{x}_i, y_i\right)\right\}_{i=1}^{N}$, where $y_i = f(\mathbf{x}_i) + \varepsilon_i$, $\mathbf{x}_i\in\mathcal{X}$, $\varepsilon_i\sim\mathcal{N}(0,\sigma^2)$, and the pairs are sampled independently and identically distributed (i.i.d.) from some unknown distribution. The goal is to predict the response $\hat{y}^*$ at unseen test points $\mathbf{x}^*$ with small mean-squared error under the data distribution, i.e. to find a function $\hat{f}$ from a specific class $\mathcal{C}$ such that the approximation error on the test set $D_{test} = \bigl(X_*, Y_*\bigr) = \bigl\{\bigl(\mathbf{x}_j, y_j = f(\mathbf{x}_j)\bigr), j = \overline{1, N_*}\bigr\}$,
\begin{equation} \label{eq:ApproxError}
\varepsilon(\hat{f} | D_{test}) = \sqrt{\frac{1}{N_*} \sum\limits_{j = 1}^{N_*} \bigl(y_j - \hat{f}(\mathbf{x}_j)\bigr)^2},
\end{equation}
is minimal.
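
As a minimal sketch of how the approximation error in (\ref{eq:ApproxError}) could be evaluated numerically, the following Python snippet computes the root-mean-squared error of a predictor on a noise-free test set. The target function, the sample sizes, the noise level and the constant baseline predictor are all illustrative assumptions and are not part of the original text; only the error formula itself follows the definition above.

```python
import numpy as np


def approximation_error(f_hat, X_test, y_test):
    """Root-mean-squared error of f_hat on the test set (X_test, y_test)."""
    residuals = y_test - f_hat(X_test)
    return float(np.sqrt(np.mean(residuals ** 2)))


# Hypothetical setup: the "unknown" dependency f, the input dimension m = 2,
# the sample sizes and the noise level sigma are chosen only for illustration.
rng = np.random.default_rng(0)


def f(X):
    return np.sin(X).sum(axis=1)


sigma = 0.1
X_train = rng.uniform(-1.0, 1.0, size=(50, 2))
y_train = f(X_train) + rng.normal(0.0, sigma, size=50)  # noisy learning set D

X_test = rng.uniform(-1.0, 1.0, size=(200, 2))
y_test = f(X_test)  # test responses are noise-free, as in D_test

# Simplest possible baseline: predict the mean training response everywhere.
mean_response = y_train.mean()


def f_hat(X):
    return np.full(X.shape[0], mean_response)


err = approximation_error(f_hat, X_test, y_test)
print(f"approximation error of the constant baseline: {err:.4f}")
```

Any candidate $\hat{f}$ from the class $\mathcal{C}$ can be passed to `approximation_error` in place of the constant baseline, so the same function can be used to compare models on a common test set.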