MATH stuff

Complex derivative

Here we define the ‘complex’ derivative of a real-valued function \(f : {\mathbb{C}}^n \to {\mathbb{R}}\) with respect to its complex variables. The notation \(f : {\mathbb{C}}^n \to {\mathbb{R}}\) means “\(f\) is a mapping (or function) from the set of column vectors of size \(n\) with complex components (denoted \({\mathbb{C}}^n\)) into the set of real numbers (denoted \({\mathbb{R}}\)).”

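For a real-valued function \(f\) of a single complex variable \(x = a + jb \in {\mathbb{C}}\), \(a,b \in {\mathbb{R}}\), the complex derivative of \(f\) with respect to \(x\) is defined as \[\frac{df}{dx} = \frac{\partial f}{\partial a} + j\frac{\partial f}{\partial b}.\]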
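The definition can be tried out numerically. The sketch below is only an illustration: it approximates the two partial derivatives (with respect to the real part \(a\) and the imaginary part \(b\)) of a real-valued function by central differences and combines them as real part plus \(j\) times imaginary part. The helper name `complex_derivative`, the step size `h`, and the test point `x0` are assumptions made for this example and do not come from the text.

```python
def complex_derivative(f, x, h=1e-6):
    """Approximate df/da + j*df/db for a real-valued f at x = a + jb.

    Hypothetical helper for illustration; h is an assumed step size.
    """
    # Central difference along the real axis (perturbs a).
    df_da = (f(x + h) - f(x - h)) / (2 * h)
    # Central difference along the imaginary axis (perturbs b).
    df_db = (f(x + 1j * h) - f(x - 1j * h)) / (2 * h)
    return df_da + 1j * df_db


x0 = 0.7 - 1.2j  # arbitrary test point

# f(x) = a has partial derivatives (1, 0), so the definition gives 1.
d_real = complex_derivative(lambda x: x.real, x0)

# f(x) = b has partial derivatives (0, 1), so the definition gives j.
d_imag = complex_derivative(lambda x: x.imag, x0)
```

Under these assumptions `d_real` should be close to `1` and `d_imag` close to `1j`, up to finite-difference and rounding error.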

Example 1.

Given \(x = a + jb \in {\mathbb{C}}\), \(a,b \in {\mathbb{R}}\), what is \(D|x|\)?

Solution:
We have \[|x| = \sqrt{x^*x} = \sqrt{(a-jb)(a+jb)} = \sqrt{a^2 + b^2}. \nonumber\] Applying the definition of the complex derivative yields, for \(x \neq 0\), \[\begin{aligned} \frac{d|x|}{dx} &= \frac{\partial |x|}{\partial a} + j\frac{\partial |x|}{\partial b} \nonumber\\ &= \frac{2a}{2\sqrt{a^2 + b^2}} + j\frac{2b}{2\sqrt{a^2 + b^2}} \nonumber\\ &= \frac{a}{\sqrt{a^2 + b^2}} + j\frac{b}{\sqrt{a^2 + b^2}} \nonumber\\ &= \frac{x}{|x|}. \nonumber\end{aligned}\]
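As a sanity check of Example 1, the following sketch compares a finite-difference approximation of the derivative of \(|x|\) with the closed form \(x/|x|\) at a single nonzero point. The helper `complex_derivative`, the step size `h`, and the point `x0` are assumptions chosen for illustration only.

```python
def complex_derivative(f, x, h=1e-6):
    """Approximate df/da + j*df/db for a real-valued f at x = a + jb.

    Hypothetical helper for illustration; h is an assumed step size.
    """
    df_da = (f(x + h) - f(x - h)) / (2 * h)            # perturb the real part
    df_db = (f(x + 1j * h) - f(x - 1j * h)) / (2 * h)  # perturb the imaginary part
    return df_da + 1j * df_db


x0 = 3.0 + 4.0j  # arbitrary nonzero test point, |x0| = 5

numeric = complex_derivative(abs, x0)  # finite-difference estimate
closed_form = x0 / abs(x0)             # result derived in Example 1

error = abs(numeric - closed_form)
```

The point must be nonzero, since \(x/|x|\) is undefined at \(x = 0\); away from the origin `error` should be small compared with the step size.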

Example 2.

Given \(x = a + jb \in {\mathbb{C}}\), \(a,b \in {\mathbb{R}}\), what is \(D|x|^2\)?

Solution:
We have \[|x|^2 = x^*x = (a-jb)(a+jb) = a^2 + b^2. \nonumber\] Applying the definition of the complex derivative yields \[\begin{aligned} \frac{d|x|^2}{dx} &= \frac{\partial |x|^2}{\partial a} + j\frac{\partial |x|^2}{\partial b} \nonumber\\ &= 2a + j2b \nonumber\\ &= 2x. \nonumber\end{aligned}\]

Suppose \(f: {\mathbb{C}}^n \to {\mathbb{R}}\) is a real-valued function and \(x \in {\mathop{\bf int}}{\mathop{\bf dom}}f\). The derivative \(Df(x)\) is a \(1 \times n\) matrix (a row vector), defined by \[\label{eqn:derivative} Df(x) = \left[ \frac{\partial f}{\partial x_1}(x), \dots, \frac{\partial f}{\partial x_n}(x) \right].\]

Example 3.

Given \(x = [x_1, \ldots, x_n]^T \in {\mathbb{C}}^n\) with \(x_i = a_i + jb_i \in {\mathbb{C}}\), \(a_i,b_i \in {\mathbb{R}}\), what is \(D\|x\|_{\ell_2}^2\)?

Solution:
We have \[\begin{aligned} \|x\|_{\ell_2}^2 &= \sum_{i=1}^n |x_i|^2 = \sum_{i=1}^n x_i^*x_i \nonumber\\ &= \sum_{i=1}^n (a_i +jb_i)^*(a_i +jb_i) \nonumber\\ &= \sum_{i=1}^n (a_i -jb_i)(a_i +jb_i) \nonumber\\ &= \sum_{i=1}^n (a_i^2 +b_i^2). \nonumber\end{aligned}\] We first look at the first element of Equation \ref{eqn:derivative} with \(f(x) = \|x\|_{\ell_2}^2\). Applying the definition of the complex derivative gives \[\begin{aligned} \frac{\partial \|x\|_{\ell_2}^2}{\partial x_1} &= \frac{\partial \|x\|_{\ell_2}^2}{\partial a_1} + j\frac{\partial \|x\|_{\ell_2}^2}{\partial b_1} \nonumber\\ &= \frac{\partial }{\partial a_1} \left(\sum_{i=1}^n (a_i^2 +b_i^2)\right) + j\frac{\partial }{\partial b_1} \left(\sum_{i=1}^n (a_i^2 +b_i^2)\right) \nonumber\\ &= 2a_1 + j2b_1 \nonumber\\ &= 2x_1. \nonumber\end{aligned}\] The same computation applies to every component \(x_i\). Therefore, it follows that \[\begin{aligned} Df(x) &= \left[ \frac{\partial \|x\|_{\ell_2}^2}{\partial x_1}, \dots, \frac{\partial \|x\|_{\ell_2}^2}{\partial x_n} \right] \nonumber\\ &= \left[2x_1, \ldots, 2x_n \right] \nonumber\\ &= 2x^T. \nonumber\end{aligned}\]
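The result of Example 3 can be checked numerically in the same spirit. The sketch below builds the row vector of per-component derivatives by perturbing the real and imaginary parts of one component at a time, and compares it with \(2x^T\). The names `squared_l2_norm` and `derivative_row`, the step size `h`, and the test vector `x0` are assumptions introduced for this illustration.

```python
def squared_l2_norm(x):
    """Sum of |x_i|^2 over the components of x (a list of complex numbers)."""
    return sum(abs(xi) ** 2 for xi in x)


def derivative_row(f, x, h=1e-6):
    """Approximate the row [df/dx_1, ..., df/dx_n] for a real-valued f.

    Each entry is df/da_k + j*df/db_k, estimated by central differences.
    Hypothetical helper for illustration; h is an assumed step size.
    """
    row = []
    for k in range(len(x)):
        def shifted(delta):
            # Evaluate f with only component k perturbed by delta.
            y = list(x)
            y[k] = y[k] + delta
            return f(y)

        df_da = (shifted(h) - shifted(-h)) / (2 * h)
        df_db = (shifted(1j * h) - shifted(-1j * h)) / (2 * h)
        row.append(df_da + 1j * df_db)
    return row


x0 = [1.0 + 2.0j, -0.5 + 0.25j, 3.0 - 1.0j]  # arbitrary test vector

numeric = derivative_row(squared_l2_norm, x0)
closed_form = [2 * xi for xi in x0]  # entries of 2 x^T from Example 3

max_error = max(abs(n - c) for n, c in zip(numeric, closed_form))
```

Because the function is quadratic in each \(a_i\) and \(b_i\), the central differences are exact up to rounding, so `max_error` should be very small.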

Example 4.

Suppose \(A \in {\mathbb{C}}^{m \times n}\), and \(x = [x_1, \ldots, x_n]^T \in {\mathbb{C}}^n\) with \(x_i = a_i + jb_i \in {\mathbb{C}}\), \(a_i,b_i \in {\mathbb{R}}\). What is \(D(Ax)\)?

Solution:
Since \(f(x) = Ax : {\mathbb{C}}^n \to {\mathbb{C}}^m\), we have \[D(Ax) = \left[ \frac{\partial (Ax)}{\partial x_1}, \dots, \frac{\partial (Ax)}{\partial x_n} \right]. \nonumber\] Since \(Ax \in {\mathbb{C}}^m\), we express it as \[Ax = \left[ \begin{array}{c} (Ax)_1 \\ \vdots \\ (Ax)_m \end{array} \right] = \left[ \begin{array}{c} \sum_{i=1}^n A_{1i}x_i \\ \vdots \\ \sum_{i=1}^n A_{mi}x_i \end{array} \right], \nonumber\] and it follows that \[\frac{\partial (Ax)}{\partial x_1} = \left[ \begin{array}{c} \frac{\partial (Ax)_1}{\partial x_1} \\ \vdots \\ \frac{\partial (Ax)_m}{\partial x_1} \end{array} \right] = \left[ \begin{array}{c} A_{11} \\ \vdots \\ A_{m1} \end{array} \right]. \nonumber\] Using the expression above, we write the derivative of \(Ax\) as \[D(Ax) = \left[ \begin{array}{ccc} \frac{\partial (Ax)_1}{\partial x_1} & \cdots & \frac{\partial (Ax)_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial (Ax)_m}{\partial x_1} & \cdots & \frac{\partial (Ax)_m}{\partial x_n} \end{array} \right] = \left[ \begin{array}{ccc} A_{11} & \cdots & A_{1n} \\ \vdots & \ddots & \vdots \\ A_{m1} & \cdots & A_{mn} \end{array} \right] = A. \nonumber\]
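A small numerical check of \(D(Ax) = A\) is sketched below. Note the assumption: since \(Ax\) is complex-valued and linear in \(x\), this sketch uses the ordinary difference quotient \((A(x + he_k) - Ax)/h\) for each column, which matches the entrywise differentiation used in the solution above, rather than the split into real and imaginary parts used for real-valued functions. The helpers `matvec` and `jacobian`, the matrix `A`, the point `x0`, and the step `h` are all made up for this example.

```python
def matvec(A, x):
    """Matrix-vector product for A given as a list of rows."""
    return [sum(a_ik * x_k for a_ik, x_k in zip(row, x)) for row in A]


def jacobian(f, x, h=1e-6):
    """Approximate the m x n matrix whose column k is (f(x + h e_k) - f(x)) / h.

    Hypothetical helper for illustration; h is an assumed step size.
    """
    fx = f(x)
    columns = []
    for k in range(len(x)):
        y = list(x)
        y[k] = y[k] + h  # perturb component k only
        fy = f(y)
        columns.append([(fy_i - fx_i) / h for fy_i, fx_i in zip(fy, fx)])
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]


A = [[1 + 2j, 0.5 - 1j, -2j],
     [3, -1 + 1j, 2 + 0.5j]]           # arbitrary 2 x 3 complex matrix
x0 = [0.3 - 0.7j, 1.5 + 0.2j, -1.0j]   # arbitrary test point

J = jacobian(lambda x: matvec(A, x), x0)

max_error = max(
    abs(J[i][k] - A[i][k]) for i in range(len(A)) for k in range(len(A[0]))
)
```

For a linear map the difference quotient does not depend on the point `x0`, so `J` should reproduce `A` up to rounding error.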

Complex gradient

Suppose \(f: {\mathbb{C}}^n \to {\mathbb{R}}\) is a real-valued function and \(x \in {\mathop{\bf int}}{\mathop{\bf dom}}f\). The gradient \(\nabla f(x)\) is an \(n \times 1\) matrix (a column vector), defined by \[\nabla f(x) = Df(x)^T = \left[ \begin{array}{c} \frac{\partial f}{\partial x_1}(x) \\ \vdots \\ \frac{\partial f}{\partial x_n}(x) \end{array} \right]. \nonumber\]

Chain rule

Let \(f : {\mathbb{C}}^n \to {\mathbb{C}}^m\) be differentiable at \(x \in {\mathop{\bf int}}{\mathop{\bf dom}}f\), and let \(g : {\mathbb{C}}^m \to {\mathbb{R}}\) be differentiable at \(f(x) \in {\mathop{\bf int}}{\mathop{\bf dom}}g\). Define the composite function \(h = g \circ f: {\mathbb{C}}^n \to {\mathbb{R}}\) by \(h(x) = g(f(x))\), with \({\mathop{\bf dom}}h = \{x \, | \, f(x) \in {\mathop{\bf dom}}g\}\). Then \(h\) is differentiable at \(x\), with derivative \[Dh(x) = Dg(f(x))Df(x).\] Taking the transpose of \(Dh(x) = Dg(f(x))Df(x)\) gives the gradient of \(h(x)\): \[\begin{aligned} \nabla h(x) &= Dh(x)^T \nonumber\\ &= (Dg(f(x))Df(x))^T \nonumber\\ &= Df(x)^T Dg(f(x))^T \nonumber\\ &= \nabla f(x) \nabla g(f(x)), \nonumber\end{aligned}\] where, for the vector-valued function \(f\), \(\nabla f(x)\) denotes \(Df(x)^T \in {\mathbb{C}}^{n \times m}\).

Suppose \(f : {\mathbb{C}}^n \to {\mathbb{C}}^m\) and \(x \in {\mathop{\bf int}}{\mathop{\bf dom}}f\). It follows that if \[f(x) = \left[ \begin{array}{c} f_1(x_1, \ldots, x_n) \\ \vdots \\ f_m(x_1, \ldots, x_n) \end{array} \right], \nonumber\] then the derivative of \(f\) at \(x\), denoted \(Df(x) \in {\mathbb{C}}^{m \times n}\), is given by \[\begin{aligned} Df(x) &= \left[ \frac{\partial f}{\partial x_1}(x), \dots, \frac{\partial f}{\partial x_n}(x) \right] \nonumber\\ &= \left[ \begin{array}{ccc} \frac{\partial f_1}{\partial x_1}(x_1, \ldots, x_n) & \cdots & \frac{\partial f_1}{\partial x_n}(x_1, \ldots, x_n) \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1}(x_1, \ldots, x_n) & \cdots & \frac{\partial f_m}{\partial x_n}(x_1, \ldots, x_n) \\ \end{array} \right]. \nonumber\end{aligned}\]

Sparse MRI Appendix A revisited

Here we provide a detailed derivation of the ‘complex’ gradient of each term in the cost function as defined in Equation [A1]. Since by definition the ‘complex’ gradient is the transpose of the ‘complex’ derivative, we first find expressions in terms of derivatives. The cost function \(f : {\mathbb{C}}^{{n_v}} \to {\mathbb{R}}\) is given by \[f(m) = \| {\mathcal{F}_u}m - y\|^2_{\ell_2} + \lambda \| \Psi m\|_{\ell_1}.\] The ‘complex’ de