Properties of (the Inverse of) the Gradient of a (Strictly) Convex \(\mathrm{C}^{1}\) Functions

In this short note, I present the characterizations of *convex*, continuously differentiable functions in a Eucledean space \(\mathbb{R}^{n}\) as the functions whose derivatives are monotone. If it is *strictly convex*, then its derivatives are also *strictly* monotone and are one-to-one mappings. This note is the supplemental material of Section 3.3 of the paper [5] since it is obvious by the results that if a strictly convex \(\mathrm{C}^{1}\) function \(S|_{\mathsf{int}}:\mathcal{U}_{\mathsf{int}}\to\mathbb{R}\) with an \(m\)-dim. manifold \(\mathcal{U}_{\mathsf{int}}\subseteq\mathbb{R}^{m}\) satisfies \(\mathrm{Im}(\nabla S|_{\mathsf{int}}^{\mathsf{T}})=\mathbb{R}^{m}\), then

- 1.
its gradient \(\nabla S|_{\mathsf{int}}\) is strictly monotone and bijective (by Theorem 1 below);

- 2.
its inverse \(\sigma\doteq\big{(}\nabla S|_{\mathsf{int}}^{\mathsf{T}}\big{)}^{-1}\) exists and is also strictly monotone, continuous, and bijective (by Theorem 3 below).

Throughout the note, we let \(D\) be a given open subset of \(\mathbb{R}^{m}\).

As a first step, define convex sets and convex functions as follows:

Definition 1. \(D\) is *convex* if:

Definition 2. A function \(f:D\to\mathbb{R}\) is *convex* if

- 1)
\(D\) is convex;

- 2)
for any \(x,y\in D\): \(\quad f(tx+(1-t)y)\leq t\,f(x)+(1-t)\,f(y)\qquad\forall t\in[0,1]\).

\(\;\,f\) is said to be *strictly convex* if 2) is replaced by

- \(\;\,\)2\({}^{\prime}\))
for any \(x,y\in D\) such that \(x\neq y\), \(\quad f(tx+(1-t)y)<t\,f(x)+(1-t)\,f(y)\qquad\forall t\in(0,1)\).

By a \(\mathrm{C}^{1}\) function, it is meant a function \(f:D\to\mathbb{R}\) whose first partial derivatives exist and all continuous. The following lemma is essential in the proof of the theorems.

Lemma 1. Suppose \(f:D\to\mathbb{R}\) is convex and \(\mathrm{C}^{1}\). Then, for any \(x\), \(y\in D\),

\begin{equation} f(y)\geq f(x)+\nabla f(x)(y-x).\nonumber \\ \end{equation}Moreover, if \(f\) is strictly convex, then for any \(x\), \(y\in D\) such that \(x\neq y\),

\begin{equation} f(y)>f(x)+\nabla f(x)(y-x).\nonumber \\ \end{equation}Here, we show that the gradient of a convex (resp. strictly convex) \(\mathrm{C}^{1}\) function \(f:D\to\mathbb{R}\) is monotone (resp. strictly monotone). For this, we precisely define the monotone function as follows.

Definition 3. A map \(F:D\to\mathbb{R}^{m}\) on a convex open subset \(D\) of \(\mathbb{R}^{m}\) is monotone if for all \(x\), \(y\in D\),

\begin{equation} (y-x)^{\mathsf{T}}(F(y)-F(x))\geq 0.\nonumber \\ \end{equation}\(F\) is said to be strictly monotone if for all \(x\), \(y\in D\) such that \(x\neq y\),

\begin{equation} (y-x)^{\mathsf{T}}(F(y)-F(x))>0.\nonumber \\ \end{equation}Theorem 1. Let \(f:D\to\mathbb{R}\) be \(\mathrm{C}^{1}\). Then, its gradient \(\nabla f^{\mathsf{T}}:D\to\mathbb{R}^{m}\) is monotone (resp. strictly monotone) if \(f\) is convex (resp. strictly convex).

Proof. For any \(x\), \(y\in D\), we have

\begin{equation} f(y)\geq f(x)+\nabla f(x)(y-x)\nonumber \\ \end{equation}by Lemma 1. By choosing the points reverse, we also have

\begin{equation} f(x)\geq f(y)+\nabla f(y)(x-y)\nonumber \\ \end{equation}Adding these inequalities and rearranging it finally yield

\begin{equation} (\nabla f(y)-\nabla f(x))(y-x)\geq 0.\nonumber \\ \end{equation}The proof of the strictly convex case can be done in the exactly same manner with the strict inequality ‘\(>\)’ rather than ‘\(\geq\).’ ∎

When \(f:D\to\mathbb{R}\) is strictly convex and \(\mathrm{C}^{1}\) function, then its gradient \(\nabla f^{\mathsf{T}}:D\to\mathbb{R}^{m}\) is an injective mapping. This can be shown via applying the following lemma:

Lemma 2. A map \(F:D\to\mathbb{R}^{m}\) is injective if it is strictly monotone.

Proof. The proof can be easily done by contradiction. Suppose there exist \(x\), \(y\in D\) such that \(F(x)=F(y)\) but \(x\neq y\). Then, we can directly see the contradiction from the definition of the strict monotonicity as

\begin{equation} 0=(y-x)^{\mathsf{T}}\mathbf{0}=(y-x)^{\mathsf{T}}(F(y)-F(x))>0,\nonumber \\ \end{equation}which implies “\(0>0\),” where \(\mathbf{0}\) is the zero vector in \(\mathbb{R}^{m}\). Hence, if \(x\), \(y\in D\) satisfies \(F(x)=F(y)\), then \(x=y\), so that \(F\) is injective. ∎

Theorem 2. Suppose \(f:D\to\mathbb{R}\) is strictly convex and \(\mathrm{C}^{1}\). Then, its gradient \(\nabla f^{\mathsf{T}}\) is an injective mapping.

Proof. By Theorem 1, \(\nabla f^{\mathsf{T}}\) is strictly monotone, and hence it is injective by Lemma 2. ∎

The key point given by Theorem 2 is that if \(\mathrm{Im}(\nabla f^{\mathsf{T}})=\mathbb{R}^{m}\), then \(\nabla f\) is now bijective, so that the inverse function \((\nabla f^{\mathsf{T}})^{-1}:\mathbb{R}^{m}\to D\) is defined over \(\mathbb{R}^{m}\). This inverse function has the following properties.

Theorem 3. Suppose \(f:D\to\mathbb{R}\) is strictly convex and \(\mathrm{C}^{1}\). If \(\mathrm{Im}(\nabla f^{\mathsf{T}})=\mathbb{R}^{m}\), then

- 1.
\(\nabla f\) is bijective and the inverse function \((\nabla f^{\mathsf{T}})^{-1}:\mathbb{R}^{m}\to\mathcal{D}\) is defined over \(\mathbb{R}^{m}\);

- 2.
the inverse \((\nabla f^{\mathsf{T}})^{-1}\) is also strictly monotone, bijective, and continuous.

Proof. The first argument and the fact that the inverse \((\nabla f^{\mathsf{T}})^{-1}\) is also bijective are obvious by the above discussion. To show strict monotonicity, suppose \(z,w\in\mathbb{R}^{m}\) and let \(x^{\mathsf{T}}\doteq(\nabla f)^{-1}(z)\) and \(y^{\mathsf{T}}\doteq(\nabla f)^{-1}(w)\). Then, since \(\nabla f\) (and \((\nabla f)^{-1}\)) is bijective, we have \(z=\nabla f^{\mathsf{T}}(x)\) and \(w=\nabla f^{\mathsf{T}}(y)\). Therefore, by strict monotonicity of \(\nabla f\) (Theorem 1), we conclude that

\begin{equation} \big{(}(\nabla f)^{-1}(z)-(\nabla f)^{-1}(w)\big{)}(z-w)=(x-y)^{\mathsf{T}}(\nabla f^{\mathsf{T}}(x)-\nabla f^{\mathsf{T}}(y))>0,\nonumber \\ \end{equation}meaning that \((\nabla f^{\mathsf{T}})^{-1}\) is also strictly monotone. Finally, as \(\nabla f^{\mathsf{T}}\) is bijective and continuous, the application of invariance of domain [4] proves continuity of \((\nabla f^{\mathsf{T}})^{-1}\). ∎