Interpretation and Complexity Reduction for Gaussian Processes Regression


Abstract


Introduction


Gaussian Processes Regression

\label{sec:GaussianProcessesRegression}

In a multidimensional regression problem we assume that \(f: \mathcal{X} \rightarrow \mathbb{R}\), \(\mathcal{X}\subset\mathbb{R}^m\), is an unknown dependency function. We are given a noisy learning set \(D = \left\{\left(\mathbf{x}_i, y_i\right)\right\}_{i=1}^{N}\), where \(y_i = f(\mathbf{x}_i) + \varepsilon_i\), \(\mathbf{x}_i\in\mathcal{X}\), \(\varepsilon_i\sim\mathcal{N}(0,\sigma^2)\) for \(i=1,\dots,N\), and the samples are drawn independently and identically distributed (i.i.d.) from some unknown distribution. The goal is to predict the response \(\hat y^*\) at unseen test points \(\mathbf{x}^*\) with small mean-squared error under the data distribution, i.e. to find a function \(\hat{f}\) from a specific class \(\mathcal{C}\) that minimizes the root-mean-square approximation error on a test set \(D_{test} =\left\{\left(\mathbf{x}_j, y_j = f(\mathbf{x}_j)\right)\middle|\, j = 1,\dots,N_*\right\}\): \[\label{eq:approx_error} \varepsilon\left(\hat{f} \middle| D_{test}\right) = \sqrt{\frac{1}{N_*} \sum\limits_{j = 1}^{N_*} \bigl(y_j - \hat{f}(\mathbf{x}_j)\bigr)^2}.\]
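The approximation error defined above can be illustrated with a minimal Python sketch. Everything in it beyond the formula itself is an assumption made for illustration: the helper name `approx_error`, the toy dependency `f`, the sample sizes, the noise level `sigma`, and the least-squares linear model standing in for \(\hat{f}\) are hypothetical and are not taken from the text.

```python
import numpy as np


def approx_error(f_hat, X_test, y_test):
    """Root-mean-square error of the predictor f_hat on a test set (X_test, y_test)."""
    X_test = np.asarray(X_test, dtype=float)
    y_test = np.asarray(y_test, dtype=float)
    residuals = y_test - f_hat(X_test)
    return float(np.sqrt(np.mean(residuals ** 2)))


# Hypothetical toy problem: m = 2 inputs, N = 50 noisy training points.
rng = np.random.default_rng(0)


def f(X):
    # Stand-in for the unknown dependency function (assumption).
    return np.sin(X).sum(axis=1)


sigma = 0.1  # assumed noise standard deviation
X_train = rng.uniform(-1.0, 1.0, size=(50, 2))
y_train = f(X_train) + rng.normal(0.0, sigma, size=50)

# Test set with noise-free responses y_j = f(x_j), as in the definition of D_test.
X_test = rng.uniform(-1.0, 1.0, size=(200, 2))
y_test = f(X_test)

# A deliberately simple f_hat: ordinary least-squares linear fit (not a Gaussian process).
A_train = np.column_stack([np.ones(len(X_train)), X_train])
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)


def f_hat(X):
    return np.column_stack([np.ones(len(X)), X]) @ coef


err = approx_error(f_hat, X_test, y_test)
print(f"approximation error on the test set: {err:.4f}")
```

Any predictor that maps an array of shape \((N_*, m)\) to \(N_*\) responses can be passed in place of the linear fit, so the same helper could be used to compare candidate functions from a class \(\mathcal{C}\).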