Diffusion tensors allow for the modeling and measurement of water diffusion inside the brain. They are used as indirect measures of brain connectivity, aligning with white matter fiber directions. A diffusion tensor is usually fitted to the diffusion weighted images (DWI) by solving the Stejskal-Tanner equation:

\begin{equation} \label{eq:ST} \label{eq:ST}S_{k}=S_{0}\exp(-\beta g_{k}^{T}Dg_{k})\\ \end{equation}where \(S_{k}\) are the measured diffusion weighted images in the gradient direction \(g_{k}\), \(S_{0}\) is the baseline image and \(D\) is the diffusion tensor. Note that \(D\) is a symmetric positive definite matrix.

The classic method for fitting the tensor to the diffusion data is by means of linear regression, such that the following sum of squared differences error is minimized:

\begin{equation} \label{eq:LS} \label{eq:LS}D=\mbox{argmin}_{Q\in{\cal S}}\frac{1}{2}\sum_{k}\left(\log\left(\frac{S_{k}}{S_{0}}\right)+\beta g_{k}^{T}Qg_{k}\right)^{2}\\ \end{equation}where \({\cal Q}\) is the space of symmetric tensors. The minimization in the space of symmetric tensors is imposed by solving only for the \(6\) degrees of freedom of the symmetric \(D\). However, there is no constraint that guarantees the resulting diffusion tensor to be PSD, i.e., the regression fitting is not computed in the manifold of PSD tensors.

Although indefinite diffusion tensors are non-physical, they commonly arise from diffusion estimation in locations where the DWI signal is corrupted by noise and the degree of anisotropy is high (Skare 2000), (Koay 2006). Indefiniteness occurs when the eigenvalues of \(D\) are of different signs. Several authors proposed method to restrict the estimated tensor to the PSD cone. To estimate a positive semidefinite diffusion tensor one can:

- •
project the tensor back onto the face of the cone of positive semidefinite tensors (at every iteration step in an iterative scheme),

- •
assure positive semidefiniteness during estimation enforcing the tensor flow to stay within the manifold of SPD matrices, or

- •
assure by construction that the estimation can never leave the cone of SPD tensors.

In this paper we will follow the third strategy by estimating the tensors directly in the manifold of SPD matrices using Riemannian arguments and exploiting the differential geometry of the underlying manifold studied intensively in (Moakher 2010). We start by proposing a new elegant formulation of the classic formulation \ref{eq:LS} that leads in a natural manner to the normal equations.

Let \(A\) be an \(m\times n\) matrix and \(a_{i}\) its \(i\)th column. Then \(\mbox{vec}(A)\) is the \(mn\times 1\) vector

\begin{equation} \mbox{vec}(A)=\begin{pmatrix}a_{1}\\ a_{2}\\ \vdots\\ a_{mn}\end{pmatrix}\nonumber \\ \end{equation}The properties of vec operator was extensively studied by Magnus (Magnus 1989) and we will use the nice following one: let \(A,B\) and \(C\) three matrices such that the matrix \(ABC\) is defined, then :

\begin{equation} \mbox{vec}(ABC)=(C^{T}\otimes A)\mbox{vec}(B),\nonumber \\ \end{equation}where \(\otimes\) is the Kronecker product of two matrices. Hence, the term \(g_{k}^{T}Qg_{k}\) in \ref{eq:LS} can be rewritten as \(g_{k}^{T}Qg_{k}=\left(g_{k}^{T}\otimes g_{k}^{T}\right)\mbox{vec}(Q).\) The \(3\times 3\) matrice \(Q\) being symmetric, only \(6\) entries are independent, we then use the v operator which takes only the \(6\times 1\) independents entries of \(\mbox{vec}(Q)\) (see Magnus (Magnus 1989) , for instance) by stacking together the lower triangular part of \(Q\). More precisely, for an \(n\times n\) symmetric matrice, v is defined by

\begin{equation} \mbox{vec}(Q)=D_{n}\mbox{v}(Q)\nonumber \\ \end{equation}where \(D_{n}\) is the duplication matrix (see (Abadir) for a definition). Note that \(D_{n}\) is \(n^{2}\times\frac{n(n+1)}{2}\); in the case \(n=3\) \(D_{n}\) is :

\begin{equation} D_{n}=\begin{bmatrix}1&.&.&.&.&.\\ .&1&.&.&.&.\\ .&.&1&.&.&.\\ .&1&.&.&.&.\\ .&.&.&1&.&.\\ .&.&.&.&1&.\\ .&.&1&.&.&.\\ .&.&.&.&1&.\\ .&.&.&.&.&1\end{bmatrix}\nonumber \\ \end{equation}where the dots stand for zeros.

The equation \ref{eq:LS} can rewritten as:

Let's note by \(m\) is the number of gradients, i.e., number of DWI which must be at least equal to \(6\). With the sequel notations

- •
\(m\times 1\) column vector \(b=\left(-\frac{1}{\beta}\log\left(\frac{S_{k}}{S_{0}}\right)\right)_{k=1:m}\)

- •
\(A\) is the \(m\times\frac{n(n+1)}{2}\) such that

\begin{equation} A=\begin{bmatrix}\left(g_{1}^{T}\otimes g_{1}^{T}\right)D_{n}\\ \left(g_{2}^{T}\otimes g_{2}^{T}\right)D_{n}\\ \vdots\\ \left(g_{k}^{T}\otimes g_{k}^{T}\right)D_{n}\\ \vdots\\ \left(g_{m}^{T}\otimes g_{m}^{T}\right)D_{n}\end{bmatrix}\nonumber \\ \end{equation} - •
\(f(x)=\frac{1}{2}\parallel Ax-b\parallel^{2}_{2},\)

the least-squares problem (\ref{eq:LS}) becomes the standard one

\begin{equation} \label{eq:SLS} \label{eq:SLS}\hat{x}=\mbox{Argmin}_{x\in\mbox{v}({\cal P}(n))}f(x)\\ \end{equation}where \(x=\mbox{v}(Q)\). We know that any minimizer \(\hat{x}\) of \(f\) must satisfy

\begin{equation} \nabla f(\hat{x})=0,\nonumber \\ \end{equation}where \(\nabla f(\hat{x})=0\) is the gradient of \(f\) evaluated at \(\hat{x}\). This gradient can be expressed in matrix form as \(\nabla f(x)=A^{T}(Ax-b).\) Any minimizer \(\hat{x}\) of \(f(x)\) must satisfy the normal equations

\begin{equation} \label{eq:NE} \label{eq:NE}A^{T}A\hat{x}=A^{T}b\\ \end{equation}We assume that the columns of \(A\) are linearly independent (which is experimentally guaranted when there is no colinear couple of gradients \(g_{k}\)) and then we have the invertibility of Gram matrix \(A^{T}A\). Finally we obtain that

\begin{equation} \hat{x}=\left(A^{T}A\right)^{-1}A^{T}b\nonumber \\ \end{equation}is the unique solution of the normal equations (\ref{eq:NE}).