# Interpretation and Complexity Reduction for Gaussian Processes Regression


Abstract


# Introduction


# Gaussian Processes Regression

In a multidimensional regression problem we assume that $$f: \mathcal{X} \rightarrow \mathbb{R}$$, $$\mathcal{X}\subset\mathbb{R}^m$$, is an unknown dependency function. We are given a noisy learning set $$D = \left\{\left(\mathbf{x}_i, y_i\right)\right\}_{i=1}^{N}$$, where $$y_i = f(\mathbf{x}_i) + \varepsilon_i$$, $$\mathbf{x}_i\in\mathcal{X}$$, $$\varepsilon_i\sim\mathcal{N}(0,\sigma^2)$$, and the samples, $$i=1,\dots,N$$, are drawn independently and identically distributed (i.i.d.) from some unknown distribution. The goal is to predict the response $$\hat y^*$$ at unseen test points $$\mathbf{x}^*$$ with small mean-squared error under the data distribution, i.e. to find a function $$\hat{f}$$ from a specific class $$\mathcal{C}$$ that minimizes the approximation error on the test set $$D_{test} =\left\{\left(\mathbf{x}_j, y_j = f(\mathbf{x}_j)\right)\middle| j = \overline{1, N_*}\right\}$$: $\label{eq:approx_error} \varepsilon\left(\hat{f} \middle| D_{test}\right) = \sqrt{\frac{1}{N_*} \sum\limits_{j = 1}^{N_*} \bigl(y_j - \hat{f}(\mathbf{x}_j)\bigr)^2}.$
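To make the setting concrete, the following is a minimal sketch in Python with NumPy of how the approximation error on a test set could be computed. Everything beyond the error formula is an illustrative assumption rather than part of the method described here: the dependency `f`, the uniform sampling of inputs on $$[-1, 1]^m$$, the values of $$N$$, $$N_*$$ and $$\sigma$$, and the linear least-squares model standing in for $$\hat{f}$$ are all hypothetical.

```python
import numpy as np


def approx_error(f_hat, X_test, y_test):
    """Root of the mean squared residual of f_hat over the test set."""
    residuals = y_test - f_hat(X_test)
    return float(np.sqrt(np.mean(residuals ** 2)))


def f(X):
    """Hypothetical dependency function on R^2 (illustration only)."""
    return np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1]


rng = np.random.default_rng(0)
N, N_test, m, sigma = 200, 500, 2, 0.1  # assumed sizes and noise level

# Noisy learning set D: y_i = f(x_i) + eps_i, eps_i ~ N(0, sigma^2).
X = rng.uniform(-1.0, 1.0, size=(N, m))
y = f(X) + rng.normal(0.0, sigma, size=N)

# Test set D_test with noise-free responses y_j = f(x_j).
X_test = rng.uniform(-1.0, 1.0, size=(N_test, m))
y_test = f(X_test)

# Placeholder approximator: linear least squares, NOT a Gaussian process.
A = np.column_stack([np.ones(N), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)


def f_hat(X_new):
    """Linear model fitted on the learning set."""
    return coef[0] + X_new @ coef[1:]


def f_mean(X_new):
    """Constant baseline that always predicts the mean training response."""
    return np.full(len(X_new), y.mean())


err = approx_error(f_hat, X_test, y_test)
baseline = approx_error(f_mean, X_test, y_test)
print(f"linear model: {err:.3f}, constant baseline: {baseline:.3f}")
```

Any approximator from the chosen class $$\mathcal{C}$$ can be passed as `f_hat`, so the same function can be used to compare candidate models on a common test set.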