ROUGH DRAFT authorea.com/107780
  • MATH stuff

    Complex derivative

    Here we provide a definition for the ‘complex’ derivative of a real-valued function \(f : {\mathbb{C}}^n \to {\mathbb{R}}\) with respect to its complex variables. The notation \(f : {\mathbb{C}}^n \to {\mathbb{R}}\) means “\(f\) is a mapping (or function) from the set of column vectors of size \(n\) with complex components (denoted \({\mathbb{C}}^n\)) into the set of real numbers (denoted \({\mathbb{R}}\)).”

    The complex derivative of a real-valued function \(f\) with respect to \(x = a + jb \in {\mathbb{C}}\), \(a,b \in {\mathbb{R}}\), is defined as \[\frac{df}{dx} = \frac{df}{da} + j\frac{df}{db}.\]
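The definition can be explored numerically. The following is a minimal sketch, not part of the original derivation: the helper name `complex_derivative` and the step size `h` are assumptions made for illustration. It approximates the derivative with respect to \(a\) and the derivative with respect to \(b\) by central differences and combines them as in the definition.

```python
def complex_derivative(f, x, h=1e-6):
    """Central-difference estimate of df/da + j*df/db at x = a + jb.

    f is assumed to be real-valued; the name and the step size h are
    illustrative choices, not taken from the text.
    """
    df_da = (f(x + h) - f(x - h)) / (2 * h)            # perturb the real part a
    df_db = (f(x + 1j * h) - f(x - 1j * h)) / (2 * h)  # perturb the imaginary part b
    return df_da + 1j * df_db


x0 = 1.5 + 2.0j
d_real = complex_derivative(lambda z: z.real, x0)  # f(x) = a, expected close to 1
d_imag = complex_derivative(lambda z: z.imag, x0)  # f(x) = b, expected close to j
print(d_real, d_imag)
```

For \(f(x) = a\) the definition gives \(1\), and for \(f(x) = b\) it gives \(j\); the printed values should agree with these up to floating-point error.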

    Example 1.

    Given \(x = a + jb \in {\mathbb{C}}\), \(a,b \in {\mathbb{R}}\), what is \(D|x|\)?

    Solution:
    We have \[|x| = \sqrt{x^*x} = \sqrt{(a-jb)(a+jb)} = \sqrt{a^2 + b^2}. \nonumber\] Applying the definition of the complex derivative yields \[\begin{aligned} \frac{d|x|}{dx} &=& \frac{d|x|}{da} + j\frac{d|x|}{db} \nonumber\\ &=& \frac{2a}{2\sqrt{a^2 + b^2}} + j\frac{2b}{2\sqrt{a^2 + b^2}} \nonumber\\ &=& \frac{a}{\sqrt{a^2 + b^2}} + j\frac{b}{\sqrt{a^2 + b^2}} \nonumber\\ &=& \frac{x}{|x|}. \nonumber\end{aligned}\]
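As a sanity check on Example 1, the sketch below compares a finite-difference estimate of \(\frac{d|x|}{da} + j\frac{d|x|}{db}\) with the closed form \(x/|x|\). The helper `complex_derivative`, the step size, and the test point \(x = 3 - 4j\) are assumptions chosen for illustration; the result is undefined at \(x = 0\), so the test point is kept away from the origin.

```python
def complex_derivative(f, x, h=1e-6):
    """Central-difference estimate of df/da + j*df/db at x = a + jb (f real-valued)."""
    df_da = (f(x + h) - f(x - h)) / (2 * h)            # derivative along the real part
    df_db = (f(x + 1j * h) - f(x - 1j * h)) / (2 * h)  # derivative along the imaginary part
    return df_da + 1j * df_db


x = 3.0 - 4.0j                       # hypothetical test point, |x| = 5
numeric = complex_derivative(abs, x) # built-in abs() returns the modulus of a complex number
closed_form = x / abs(x)             # the result derived in Example 1
print(numeric, closed_form)
```

The two printed values should agree to within the finite-difference error.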

    Example 2.

    Given \(x = a + jb \in {\mathbb{C}}\), \(a,b \in {\mathbb{R}}\), what is \(D|x|^2\)?

    Solution:
    We have \[|x|^2 = x^*x = (a-jb)(a+jb) = a^2 + b^2. \nonumber\] Applying the definition of the complex derivative yields \[\begin{aligned} \frac{d|x|^2}{dx} &=& \frac{d|x|^2}{da} + j\frac{d|x|^2}{db} \nonumber\\ &=& 2a + j2b \nonumber\\ &=& 2x. \nonumber\end{aligned}\]

    Suppose \(f: {\mathbb{C}}^n \to {\mathbb{R}}\) is a real-valued function and \(x \in {\mathop{\bf int}}{\mathop{\bf dom}}f\). The derivative \(Df(x)\) is a \(1 \times n\) matrix (a row vector), defined by \[\label{eqn:derivative} Df(x) = \left[ \frac{\partial f}{\partial x_1}(x), \dots, \frac{\partial f}{\partial x_n}(x) \right].\]

    Example 3.

    Given \(x = [x_1, \ldots, x_n]^T \in {\mathbb{C}}^n\) with \(x_i = a_i + jb_i \in {\mathbb{C}}\), \(a_i,b_i \in {\mathbb{R}}\), what is \(D\|x\|_{\ell_2}^2\)?

    Solution:
    We have \[\begin{aligned} \|x\|_{\ell_2}^2 &=& \sum_{i=1}^n |x_i|^2 = \sum_{i=1}^n x_i^*x_i \nonumber\\ &=& \sum_{i=1}^n (a_i +jb_i)^*(a_i +jb_i) \nonumber\\ &=& \sum_{i=1}^n (a_i -jb_i)(a_i +jb_i) \nonumber\\ &=& \sum_{i=1}^n (a_i^2 +b_i^2). \nonumber\end{aligned}\] We first look at the first element of Equation \ref{eqn:derivative} with \(f(x) = \|x\|_{\ell_2}^2\). Applying the definition of the complex derivative gives \[\begin{aligned} \frac{\partial \|x\|_{\ell_2}^2}{\partial x_1} &=& \frac{\partial \|x\|_{\ell_2}^2}{\partial a_1} + j\frac{\partial \|x\|_{\ell_2}^2}{\partial b_1} \nonumber\\ &=& \frac{\partial }{\partial a_1} \left(\sum_{i=1}^n (a_i^2 +b_i^2)\right) + j\frac{\partial }{\partial b_1} \left(\sum_{i=1}^n (a_i^2 +b_i^2)\right) \nonumber\\ &=& 2a_1 + j2b_1 \nonumber\\ &=& 2x_1. \nonumber\end{aligned}\] Therefore, it follows that \[\begin{aligned} Df(x) &=& \left[ \frac{\partial \|x\|_{\ell_2}^2}{\partial x_1}, \dots, \frac{\partial \|x\|_{\ell_2}^2}{\partial x_n} \right] \nonumber\\ &=& \left[2x_1, \ldots, 2x_n \right] \nonumber\\ &=& 2x^T. \nonumber\end{aligned}\]
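The result of Example 3 can be checked entry by entry. The sketch below is an illustration under stated assumptions: the function names `norm_sq` and `complex_derivative_row`, the step size, and the test vector are all hypothetical. Each entry of the row vector is estimated by perturbing \(a_i\) and \(b_i\) separately, then compared with \(2x_i\).

```python
def norm_sq(x):
    """Squared l2 norm of a list of complex numbers."""
    return sum(abs(xi) ** 2 for xi in x)


def complex_derivative_row(f, x, h=1e-6):
    """Row vector [df/dx_1, ..., df/dx_n] with df/dx_i = df/da_i + j*df/db_i."""
    row = []
    for i in range(len(x)):
        def shifted(delta, i=i):
            y = list(x)          # copy so the input vector is left unchanged
            y[i] = y[i] + delta  # perturb only the i-th component
            return f(y)
        df_da = (shifted(h) - shifted(-h)) / (2 * h)
        df_db = (shifted(1j * h) - shifted(-1j * h)) / (2 * h)
        row.append(df_da + 1j * df_db)
    return row


x = [1.0 + 2.0j, -0.5 + 0.25j, 3.0 - 1.0j]   # hypothetical test vector
numeric_row = complex_derivative_row(norm_sq, x)
closed_form_row = [2 * xi for xi in x]        # the entries of 2x^T from Example 3
print(numeric_row)
print(closed_form_row)
```

The two printed rows should match up to finite-difference error.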

    Example 4.

    Suppose \(A \in {\mathbb{C}}^{m \times n}\), and \(x = [x_1, \ldots, x_n]^T \in {\mathbb{C}}^n\) with \(x_i = a_i + jb_i \in {\mathbb{C}}\), \(a_i,b_i \in {\mathbb{R}}\). What is \(D(Ax)\)?

    Solution:
    Since \(f(x) = Ax : {\mathbb{C}}^n \to {\mathbb{C}}^m\) is complex-valued, the definition above for real-valued functions does not apply directly; each component \((Ax)_k = \sum_{i=1}^n A_{ki}x_i\) is linear in \(x\), so its partial derivatives are taken in the ordinary complex sense, \(\partial (Ax)_k/\partial x_i = A_{ki}\). We have \[D(Ax) = \left[ \frac{\partial (Ax)}{\partial x_1}, \dots, \frac{\partial (Ax)}{\partial x_n} \right]. \nonumber\] Since \(Ax \in {\mathbb{C}}^m\), we express it as \[Ax = \left[ \begin{array}{c} (Ax)_1 \\ \vdots \\ (Ax)_m \end{array} \right] = \left[ \begin{array}{c} \sum_{i=1}^n A_{1i}x_i \\ \vdots \\ \sum_{i=1}^n A_{mi}x_i \end{array} \right], \nonumber\] and it follows that \[\frac{\partial (Ax)}{\partial x_1} = \left[ \begin{array}{c} \frac{\partial (Ax)_1}{\partial x_1} \\ \vdots \\ \frac{\partial (Ax)_m}{\partial x_1} \end{array} \right] = \left[ \begin{array}{c} A_{11} \\ \vdots \\ A_{m1} \end{array} \right]. \nonumber\] Using the expression above, we write the derivative of \(Ax\) as \[D(Ax) = \left[ \begin{array}{ccc} \frac{\partial (Ax)_1}{\partial x_1} & \cdots & \frac{\partial (Ax)_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial (Ax)_m}{\partial x_1} & \cdots & \frac{\partial (Ax)_m}{\partial x_n} \end{array} \right] = \left[ \begin{array}{ccc} A_{11} & \cdots & A_{1n} \\ \vdots & \ddots & \vdots \\ A_{m1} & \cdots & A_{mn} \end{array} \right] = A. \nonumber\]
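The following sketch illustrates Example 4 numerically, under the assumption that each entry \(\partial (Ax)_k / \partial x_i\) is read as an ordinary complex difference quotient of the linear component \((Ax)_k\). The names `matvec` and `jacobian_column`, the matrix, the vector, and the step size are hypothetical. Because \(Ax\) is linear in \(x\), the quotient should give the same column of \(A\) whether the step is taken along the real or the imaginary direction.

```python
def matvec(A, x):
    """Product Ax for A given as a list of rows and x as a list of complex numbers."""
    return [sum(a_ki * x_i for a_ki, x_i in zip(row, x)) for row in A]


def jacobian_column(A, x, i, step):
    """Difference quotient (A(x + step*e_i) - Ax) / step, where step may be complex."""
    shifted = list(x)
    shifted[i] = shifted[i] + step
    base, moved = matvec(A, x), matvec(A, shifted)
    return [(m - b) / step for m, b in zip(moved, base)]


# Hypothetical 3 x 2 complex matrix and test vector.
A = [[1 + 1j, 2 - 1j],
     [0.5j, -3 + 0j],
     [2 + 2j, 1j]]
x = [1 - 2j, 0.5 + 0.5j]
h = 1e-4

cols_real_step = [jacobian_column(A, x, i, h) for i in range(2)]       # step along a_i
cols_imag_step = [jacobian_column(A, x, i, 1j * h) for i in range(2)]  # step along j*b_i
print(cols_real_step)
print(cols_imag_step)
```

Both lists should reproduce the columns of \(A\), that is, entry \(k\) of column \(i\) should be close to \(A_{ki}\).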

    Complex gradient

    Suppose \(f: {\mathbb{C}}^n \to {\mathbb{R}}\) is a real-valued function and \(x \in {\mathop{\bf int}}{\mathop{\bf dom}}f\). The gradient \(\nabla f(x)\) is an \(n \times 1\) matrix (a column vector), defined by \[\nabla f(x) = Df(x)^T = \left[ \begin{array}{c} \frac{\partial f}{\partial x_1}(x) \\ \vdots \\ \frac{\partial f}{\partial x_n}(x) \end{array} \right]. \nonumber\]

    Chain rule

    Let \(f : {\mathbb{C}}^n \to {\mathbb{C}}^m\) be differentiable at \(x \in {\mathop{\bf int}}{\mathop{\bf dom}}f\), and let \(g : {\mathbb{C}}^m \to {\mathbb{R}}\) be differentiable at \(f(x) \in {\mathop{\bf int}}{\mathop{\bf dom}}g\). Define the composite function \(h = g \circ f: {\mathbb{C}}^n \to {\mathbb{R}}\) by \(h(x) = g(f(x))\), with \({\mathop{\bf dom}}h = \{x \in {\mathop{\bf dom}}f \, | \, f(x) \in {\mathop{\bf dom}}g\}\). Then \(h\) is differentiable at \(x\), with derivative \[Dh(x) = Dg(f(x))Df(x).\] Taking the transpose of \(Dh(x) = Dg(f(x))Df(x)\) gives the gradient of \(h(x)\): \[\begin{aligned} \nabla h(x) &=& Dh(x)^T \nonumber\\ &=& (Dg(f(x))Df(x))^T \nonumber\\ &=& Df(x)^T Dg(f(x))^T \nonumber\\ &=& \nabla f(x) \nabla g(f(x)). \nonumber\end{aligned}\]

    Suppose \(f : {\mathbb{C}}^n \to {\mathbb{C}}^m\) and \(x \in {\mathop{\bf int}}{\mathop{\bf dom}}f\). It follows that if \[f(x) = \left[ \begin{array}{c} f_1(x_1, \ldots, x_n) \\ \vdots \\ f_m(x_1, \ldots, x_n) \end{array} \right], \nonumber\] then the derivative of \(f\) at \(x\), denoted \(Df(x) \in {\mathbb{C}}^{m \times n}\), is given by \[\begin{aligned} Df(x) &=& \left[ \frac{\partial f}{\partial x_1}(x), \dots, \frac{\partial f}{\partial x_n}(x) \right] \nonumber\\ &=& \left[ \begin{array}{ccc} \frac{\partial f_1}{\partial x_1}(x_1, \ldots, x_n) & \cdots & \frac{\partial f_1}{\partial x_n}(x_1, \ldots, x_n) \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1}(x_1, \ldots, x_n) & \cdots & \frac{\partial f_m}{\partial x_n}(x_1, \ldots, x_n) \\ \end{array} \right]. \nonumber\end{aligned}\]

    Sparse MRI Appendix A revisited

    Here we provide a detailed derivation of the ‘complex’ gradient of each term in the cost function as defined in Equation [A1]. Since by definition the ‘complex’ gradient is the transpose of the ‘complex’ derivative, we first find expressions in terms of derivatives. The cost function \(f : {\mathbb{C}}^{{n_v}} \to {\mathbb{R}}\) is given by \[f(m) = \| {\mathcal{F}_u}m - y\|^2_{\ell_2} + \lambda \| \Psi m\|_{\ell_1}.\] The ‘complex’ derivative of the cost function \(f\) at \(m\), denoted \(Df(m) \), is a \(1 \times {n_v}\) row vector, and is given by \[Df(m) = D\| {\mathcal{F}_u}m - y\|^2_{\ell_2} + \lambda D\| \Psi m\|_{\ell_1}.\] Let’s look at each term separately.

    For the first term, \(D\| {\mathcal{F}_u}m - y\|^2_{\ell_2}\), we apply the chain rule to the composite function \(h({\mathcal{F}_u}m - y)\), where \(h(x) = \|x\|^2_{\ell_2} : {\mathbb{C}}^{{n_k}} \to {\mathbb{R}}\) and the inner map \(m \mapsto {\mathcal{F}_u}m - y\) takes \({\mathbb{C}}^{{n_v}}\) to \({\mathbb{C}}^{{n_k}}\). It follows that \[\begin{aligned} D\| {\mathcal{F}_u}m - y\|^2_{\ell_2} &=& Dh({\mathcal{F}_u}m - y) D({\mathcal{F}_u}m - y) \nonumber\\ &=& 2({\mathcal{F}_u}m - y)^T {\mathcal{F}_u}. \nonumber\end{aligned}\]

    For the second term, Lustig et al. used a smooth approximation of the absolute value of a complex number \(x = a + jb \in {\mathbb{C}}\), given as \[|x| \approx \sqrt{x^*x + \mu}, \nonumber\] where \(\mu\) is a positive smoothing parameter. With this approximation, applying the definition of the complex derivative yields \[\begin{aligned} \frac{d|x|}{dx} &=& \frac{d|x|}{da} + j\frac{d|x|}{db} \nonumber\\ &\approx& \frac{d}{da} \left( \sqrt{(a + jb)^*(a + jb) + \mu} \right) + j\frac{d}{db} \left( \sqrt{(a + jb)^*(a + jb) + \mu} \right) \nonumber\\ &\approx& \frac{d}{da} \left( \sqrt{a^2 + b^2 + \mu} \right) + j\frac{d}{db} \left( \sqrt{a^2 + b^2 + \mu} \right) \nonumber\\ &\approx& \frac{2a}{2\sqrt{a^2 + b^2 + \mu}} + j\frac{2b}{2\sqrt{a^2 + b^2 + \mu}} \nonumber\\