\section{KPA in CPS}\label{sec:sysid}
In this section, we first apply closed-loop system identification techniques to the CPS and investigate the identifiability conditions of $\mG(z)$ and $\mK(z)$ in Section~\ref{sec:idGK} (an algorithm to perform the identification was proposed in \cite{Yuan2015}). A stealthy attack enabled by KPA is discussed in Section~\ref{sec:stealthy}.

\begin{abstract}
A substantial amount of research on the security of cyber-physical systems assumes that the physical system model is available to the adversary. In this paper, we argue that this assumption can be relaxed, since the adversary may still be able to identify the system model by observing the control inputs and sensory data of the system. In such a setup, an attack that identifies the system model from the knowledge of input-output data can be categorized as a Known-Plaintext Attack (KPA) in the information security literature. We first prove a necessary condition and a sufficient condition under which the adversary can successfully identify the transfer function of the physical system. We then provide a low-rank controller design which renders the system unidentifiable to the adversary, at the cost of a loss in LQG performance.
\end{abstract}

\subsection{Identification algorithm}\label{sec:identification}
This subsection is devoted to providing a numerical algorithm, based on spectral factorization~\cite{Sayed:2001yu}, with which the adversary can derive $\mG(z)$ when $\mK(z)$ has full normal row rank.
% \ye{change} Given $\Phi_{y,u}(z)$, we can decompose the system to the following form
%\begin{equation}
% \Phi(z)=\mC(z) D\mC^*(z)=\mZ(z)+\mZ^T(-z),
%\end{equation}
%% in which $\mZ(z)$ is the one-sided z-transform of $\mathbb{E}\{y(k) u(k)\}$
%in which $D=\begin{bmatrix} Q & 0 \\ 0 & R \end{bmatrix}$.
Since the feedback system is asymptotically stable, $\Phi_{y,u}(z)$ has no poles on the unit circle. Consider the M\"obius transform $z=\frac{1+s}{1-s}$ and let $\Psi_{y,u}(s)=\Phi_{y,u}\left(\frac{1+s}{1-s}\right)$. Then there exists a positive-real matrix $\mS(s)$ \cite{keith} such that
\begin{equation} \label{Zdef}
\mS(s) + \mS^T(-s) = \Psi(s)=\mW(s)\mW^T(-s).
\end{equation}
\begin{mydef}[Global Minimality]
For a given spectral density $\Psi(s)$, the globally-minimal degree is the smallest degree of all its spectral factors $\mW(s)$.
\end{mydef}
Any system of globally-minimal degree is said to be \emph{globally minimal}. Anderson \cite{anderson} provides an algebraic characterization of all realizations of all spectral factors. Minimal realizations of $\mS$ are related to globally-minimal realizations of spectral factors of $\Psi$ by the following lemma.
\begin{lemma}[\cite{anderson}] \label{lemma:andlem}
Let $(A,B_s,C,D_s)$ be a minimal realization of the positive-real matrix $\mS(s)$ of \eqref{Zdef}. Then the system $(A,B,C,D)$ is a globally-minimal realization of a spectral factor $\mW(s)$ of $\Psi(s)$ if and only if the following equations hold:
\begin{equation} \label{eq:and}
\begin{aligned}
RA^T + AR &= -BB^T\\
RC^T &= B_s - BD^T\\
2D_s &= DD^T
\end{aligned}
\end{equation}
\noindent for some positive-definite and symmetric matrix ${R\in \RR^{n \times n}}$.
\end{lemma}
For a properly chosen $R$, $\mW(s)$ can be computed from its realization. Since $\mW\left(\frac{z-1}{z+1}\right)=\mC(z)D^{1/2}J\triangleq \mC(z)\hat{D}$ for some signed identity matrix $J$ \cite{hayden}, we have
\begin{equation}
\lim_{z\rightarrow\infty}\mW\left(\frac{z-1}{z+1}\right)=\begin{bmatrix} 0 & I \\
0 & 0
\end{bmatrix} \hat{D}.
\end{equation}
We partition $\mW$ and $\hat{D}$ into four blocks with dimensions corresponding to those of $\mC$ in \eqref{eq:Cz}. Then it follows that
\begin{equation}\label{eq:D22}
\hat{D}_{22}=\lim_{z\rightarrow\infty}\mW_{12}\left(\frac{z-1}{z+1}\right).
\end{equation}
Finally, once $\hat{D}_{22}$ is obtained, we can compute an estimate of the closed-loop transfer function
\begin{equation}\label{eq:estC}
\hat{\mC}(z)=\mW\left(\frac{z-1}{z+1}\right)\begin{bmatrix} I & 0 \\ 0 & \hat{D}^{-1}_{22}  \end{bmatrix},
\end{equation}
and then the transfer functions of the plant and the controller, $\mG(z)$ and $\mK(z)$, using \eqref{eq:khg}. We summarize the identification procedure in Algorithm~\ref{alg:main}.
%
%
%\begin{itemize}
%\item Each element in $\Psi[i,j](s)$ can be expanded as a sum of partial fractions and a term $\Psi[i,j](\infty)$.
%\item Those partial fractions with poles in $Re[s] < 0$ may then be summed together, and when add to $\frac{1}{2}\Psi[i,j](\infty)$ yield the i-j entry of $\mZ(s)$.
%\item The sum of the partial fractions with poles in $Re[s] > 0$ and $\frac{1}{2}\Psi[i,j](\infty)$ yields the j-i entry of $\mZ(-s)$.
%\end{itemize}
\begin{algorithm}
\caption{Identification algorithm for $\mG(z)$, $\mK(z)$}
\textbf{Inputs:} Input-output data $y_k$ and $u_k$.\\
\textbf{Outputs:} The transfer functions $\mG$ and $\mK$.
\begin{step}
Compute $\Phi_{y,u}(z)$ from the input-output data $y_k$ and $u_k$ and let $\Psi_{y,u}(s)=\Phi_{y,u}(\frac{1+s}{1-s})$;
\end{step}
\begin{step}
Expand each element of $\Psi[i,j](s)$ as a sum of partial fractions and a term $\Psi[i,j](\infty)$. Sum the partial fractions with poles in $Re[s] < 0$; adding $\frac{1}{2}\Psi[i,j](\infty)$ to this sum yields $\mS[i,j](s)$;
\end{step}
\begin{step}
Compute $\mW(s)$ from \eqref{eq:and} for a minimal realization of $\mS(s)$ and a properly chosen $R\succ0$;
\end{step}
\begin{step}
Compute $\hat{D}_{22}$ from $\mW\left(\frac{z-1}{z+1}\right)$ using \eqref{eq:D22};
\end{step}
\begin{step}
Compute an estimate $\hat{\mC}(z)$ using \eqref{eq:estC} and obtain $\mG(z)$ and $\mK(z)$ using \eqref{eq:khg}.
\end{step}
\label{alg:main}
\end{algorithm}
\begin{remark}
Since the main theme of this paper is to raise a potential security issue in classic feedback systems and to propose a new control architecture that is robust to such attacks, the numerical issues in spectral factorization, i.e., how the estimate of $\Phi_{y,u}(z)$ depends on the number of samples and how this error propagates into the identification of $\mG(z)$ and $\mK(z)$, are out of the scope of this paper.
\end{remark}
%
% \subsection{Identifiability of $(A,B,C)$}
% Once $\mG(z)$ and $\mK(z)$ are uniquely determined from data, the next question is whether the attacker is able to obtain the state-space model. From a modelling perspective, the state-space carries more information about the underlying physics.
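Step~1 of Algorithm~\ref{alg:main} presumes an estimate of the joint spectral density from finite data. As an illustration only (not part of the paper's development), an adversary could assemble a Welch estimate of $\Phi_{y,u}$ pointwise in frequency using `scipy.signal`; the scalar AR(1) system below is a toy stand-in for the closed loop.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
n = 1 << 14

# Toy stationary signals standing in for (y, u): u is white noise, y is u
# passed through a stable AR(1) filter plus independent measurement noise.
u = rng.standard_normal(n)
y = signal.lfilter([1.0], [1.0, -0.5], u) + 0.1 * rng.standard_normal(n)

# Welch estimates of the auto- and cross-spectra that make up Phi_{y,u}.
f, Pyy = signal.welch(y, nperseg=256)
_, Puu = signal.welch(u, nperseg=256)
_, Pyu = signal.csd(y, u, nperseg=256)

# Assemble Phi_{y,u} as a 2x2 Hermitian matrix at each frequency point.
Phi = np.stack([np.stack([Pyy, Pyu], axis=-1),
                np.stack([np.conj(Pyu), Puu], axis=-1)], axis=-2)
```

The spectral-factorization steps of the algorithm would then operate on this matrix-valued estimate; its accuracy as a function of the sample size is exactly the numerical issue set aside in the remark above.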
%  % \begin{mydef}  % Given the spectrum $\Phi_{y,u}(z)$ generated by input-output data, we shall say $$\Pi\triangleq\left(A,B,C, K, L, U, R, S,P \right)$$ are admissible to $\Phi_{y,u}$ if $\Pi$ is consistent to the prior knowledge and has spectrum $\Phi_{y,u}$ following the computation above.   % \end{mydef}  %  % \begin{mydef}  % We say the feedback system $(A,B,C)$ is identifiable from the input-output data $(y,u)$ if any admissible $$\hat{\Pi}=\left(\hat{A},\hat{B},\hat{C}, \hat{K}, \hat{L}, \hat{U}, \hat{R}, \hat{S},\hat{P} \right)$$ has $\hat{A}=A$, $\hat{B}=B$ and $\hat{C}=C$.  % \end{mydef}  %  % \begin{myass}  % Assume that the state-space model $(A,B,C)$ is both controllable and observable and therefore minimal.  % \end{myass}  %  % In this section, the identifiability of the state-space model shall be investigated. This differs from close-loop system identification in the sense that the attacker has extra information that the controller knows the plant dynamics as well as uses a predefined control strategy. Once the attacker has learned $\mG(z)$, $\mH(z)$ and $\mK(z)$ with the following form   % \begin{align}  % \mG(z)&=C(zI-A)^{-1}B,\\  % \mH(z)&=C(zI-A)^{-1}V_{11},\\  % \mK(z)&=K(zI-A+BK+LC)^{-1}L,  % \end{align}  % the next step is to determine whether the state-space can be uniquely identified from $\mG(z)$ and $\mK(z)$. If not, what is the ``minimal'' additional information that is needed to make state-space realization identifiable.   %  % Starting from a transfer function $\mG$, we can obtain the system matrices up to a symmetric transformation   % \begin{align*}  % \hat{A}=T^{-1}AT,~\hat{B}=T^{-1}B,~\hat{C}=CT,  % \end{align*}  % in which $T\in GL(n)$.  
%   % \ye{I am here and will continue tomorrow}  % \begin{proposition}\label{prop:mK}  % Given the closed-loop system with LQG controller, given a state-space realization of $\mG(z)$: $\left(\hat{A},\hat{B},\hat{C}\right)$ and there only exists a realization of $\mK(z)$,   % \end{proposition}  % \begin{proof}  % A realization of $\mK(z)$ has the following form (let $\hat{X}\triangleq \hat{A}+\hat{B}\hat{K}+\hat{L}\hat{C}$)  % \begin{align*}  % \bar{X}=T_1^{-1}\hat{X}T_1,~\bar{L}=T_1^{-1}\hat{L},~\bar{K}=\hat{K}T_1,  % \end{align*}  %for some $T_1\in GL(n)$. If we were able to solve $T_1$, then we shall be able to obtain   %  % This is equivalent to have   % \begin{equation}  % \begin{aligned}  % \hat{X}&=W^{-1}(\hat{A}+\hat{B}\hat{K}W^{-1}+W\hat{L}\hat{C})W\\  % &=W^{-1}\hat{A}W+ W^{-1}\hat{B}\hat{K}+\hat{L}\hat{C}W.  % \end{aligned}  % \end{equation}  %   % Given $\left(\hat{A},\hat{B},\hat{C}\right)$  %   %   %   % \end{proof}  %   %   %  % \begin{proposition}\label{prop:t}  % Assume the feedback control system uses the LQG control strategy, using the estimated $\hat{A}, \hat{B}, \hat{C}$, then the attacker can identify $\hat{K}, \hat{L}, \hat{U},\hat{R}, \hat{S},\hat{P}$ with the following form  % \begin{align*}  % \hat{K}=KT,~\hat{L}=T^{-1}L,~\hat{R}=R,~\hat{U}=U,\\  % \hat{S}=T^{-T}ST^{-1},~\hat{P}=TPT^T.  % \end{align*}  % \end{proposition}  % \begin{proof}  % This can be easily shown from the definition.  % \end{proof}  %  %\begin{remark}  % The attacker can identify $R$ from the spectral factorization and $U$.  %\end{remark}  %  % \begin{theorem}\label{thm:abc}  % Given the feedback control system defined in previous sections, assume the attacker has identified $R$ using spectral factorization in previous section and there does not exist $T\in GL(n)$ such that   % \begin{align}  % TQT^T=Q,\\  % TWT^T=W,  % \end{align}  % then the attacker would be able to identify $(A,~B,~C)$.  % \end{theorem}  % \begin{proof}  % It follows from definitions.  
% \end{proof}  %  %  % \begin{corollary}\label{coro:}  % If $Q$ and $W$ satisfy that $Q^{-1/2}WQ^{-1/2}\neq kI$ for any $k\in\mathbb{R}$, then there is a unique solution to the identification problem.   % \end{corollary}  % \begin{proof}  % Since $Q$ is positive definite, let $\bar{T}=Q^{-1/2}TQ^{1/2}$,   % \begin{align}  % \bar{T}\bar{T}^T&=I\\  % \bar{T}Q^{-1/2}WQ^{-1/2}\bar{T}^T &=Q^{-1/2}WQ^{-1/2},   % \end{align}  % then $\bar{T}$ is a unitary matrix. Let $M\triangleq Q^{-1/2}WQ^{-1/2}$, then $M$ is a diagonal matrix satisfying $\bar{T}M\bar{T}^T=M$. In this case $\bar{T}M\bar{T}^T=\sum_{i}M[i,i]t_it_i^T=M.$  %  % Let $\bar{T}=\begin{bmatrix}  % t_1 & t_2 & \ldots & t_n  % \end{bmatrix},$ since $\bar{T}$ is unitary then $t^T_it_i=1$ and $t_i^Tt_j=0$ for $i\neq j$, then we right multiply $t_i$ at both sides of the equation and obtain   % \begin{equation}  % M[i,i]t_i=Mt_i, ~\Leftrightarrow M[i,i]t_i[j]=M[j,j]t_i[j] \forall i, j,  % \end{equation}  % which can easily lead to $M[i,i]=c,~\forall i$ since $\forall i$, $t_i[j]$ can not be $0$ for all $j$.  % \end{proof}  %  % \begin{remark}  % Both the plant $\mG$ including the state-space model $(A,B,C)$ and the LQG controller $\mK$ are identifiable under certain assumptions. Here is a table that summaries the different assumptions.  % \end{remark}  %   %\begin{table}[b]  %\centering  %\begin{tabular}[htbp]{| c | c | c | }  %\hline  %State & Identifiable & Identifiability condition \\ \hline  %$\mH$ & $\mH$ & $x_6$ \\ \hline  %$\mG$ & Yes & $x_7$ \\ \hline  %$\mK$ & Yes & $x_{15}$ \\\hline  %\end{tabular}  %\caption{A summary of identifiability conditions.}  %\label{table:summary}  %\end{table}  %  % % \subsection{Identification procedure}  % % This section shall propose an algorithm that takes $\Phi_{y,u}$ to the $(A,B,C)$. Once the identifiability of $\mG$ and $\mK$ and $(A,B,C)$ have been guaranteed, an identification algorithm shall be proposed to obtain $\mG(z)$ and $\mH(z)$ and return $(A,B,C)$.   
% %  % %  % % Some observations for understanding the proposed algorithm  % % \begin{itemize}  % % \item $Q^{-1/2}WQ^{-1/2}=Q^{-1}W$ since both $Q$ and $W$ are diagonal;  % % \item $\bar{T}=Q^{-1/2}TQ^{1/2}$ \ye{can we get $T$ from $\bar{T}$?}  % % \end{itemize}  % %  % %  % %  % % \begin{algorithm}  % % \caption{Identification algorithm for $\mG$, $\mK$ and $(A,B,C)$ for input-output data}  % % \textbf{Inputs:} Input-output data $y_k$ and $u_k$.\\  % % \textbf{Outputs:} The transfer functions $\mG$ and $\mH$ and state-space realization $(A,B,C)$.  % % \begin{step}  % % Compute $\Phi_{y,u}(z)$ from data;  % % \end{step}  % % \begin{step}  % % Factorize $\Phi_{y,u}$ and compute a stable, minimum-phase transfer matrix $\mC(z)$;  % % \end{step}  % % \begin{step}  % % Compute $\mG(z)$, $\mK(z)$ from eq.~\eqref{eq:khg};  % % \end{step}  % % \begin{step}  % % Find a minimal realization $(A_1, B_1, C_1)$ of $\mG(z)$;  % % \end{step}  % % \begin{step}  % % Compute the corresponding parameters in the controller $\Pi_1$ from the defition;  % % \end{step}  % % \begin{step}  % % Compute $M_1=Q_1^{-1}W_1$ and obtain a unitary matrix $T_1=Q^{1/2}T_2Q^{-1/2}$ from a Takagi decomposition of $M_1$ as follows $M_1=T_2MT_2^T$, in which $M$ is a diagonal matrix and positive definite;   % % \end{step}  % % \begin{step}  % % Obtain $(A,B,C)$ from the following   % % \begin{equation}  % % A=T_1^{-1}A_1T_1,~B=T_1^{-1}B_1,~C=C_1T_1.   % % \end{equation}  % % \end{step}  % % \end{algorithm}  % % \begin{lemma}[Takagi decomposition]  % % For a symmetric matrix $A$, it can be decompose to $A=VDV^T$ where $D$ is a non-negative matrix   % % \end{lemma}  % %   % % \ye{code up}           

\appendix
% \comments{Yilin: is it necessary? The following lemma is needed to prove of Theorem~\ref{lemma:idgk}.
% \begin{lemma}\label{lemma:lt}
% [Liouville's theorem] Every holomorphic function $f$ for which there exists a positive number $M$ such that $|f(z)| \le M$ for all $z\in\mathbb{C}$ is constant.
% \end{lemma}}
\begin{proof}[Proof of Lemma~\ref{lemma:idgk}]
From the definition, $\Phi_{y,u}(z)$ is real rational and positive semi-definite for $|z|=1$. The closed-loop transfer function $\mC(z)$ is stable and minimum phase. Therefore, $\mC(z)$ is analytic in $|z|\ge 1$. Since $\mG, \mH$ are strictly proper, we have $\mG(\infty) = 0,\,\mH(\infty)=0$. On the other hand, since $\mK$ is proper and rational, $\mK(\infty)$ exists. Hence
\begin{align*}
\lim_{z\rightarrow\infty} \mC(z)= \begin{bmatrix}
0 & I \\
0 & \mathcal K(\infty)
\end{bmatrix}.
\end{align*}
Assume that both $\mC,~D=\begin{bmatrix}
Q & 0 \\
0 & R
\end{bmatrix}$ and $\hat \mC,~\hat D=\begin{bmatrix}
\hat Q & 0 \\
0 & \hat R
\end{bmatrix}$ give the same $\Phi_{y,u}$ and satisfy
\begin{enumerate}
\item $D$ and $\hat D$ are block-diagonal and positive-definite matrices;
\item both $\mC$ and $\hat \mC$ are stable and minimum phase.
\end{enumerate}
Then there exists a paraunitary matrix $\mathcal V(z)$ such that \cite{anderson3}
\begin{align}
\hat \mC(z)=\mC(z)\mathcal V(z),\label{eq:c1c2}\\
\hat D= \mathcal V(z) D\mathcal V^*(z).\label{eq:q1q2}
\end{align}
From \eqref{eq:c1c2}, since both $\mC(z)$ and $\hat \mC(z)$ are stable and minimum phase, $\mathcal V(z)$ is stable and minimum phase, which implies that $\mathcal V(z)$ is a constant matrix independent of $z$ \cite{anderson, hayden}. Therefore, we denote it simply as $V$.
Taking $z\rightarrow\infty$ on both sides of \eqref{eq:c1c2} yields
\begin{equation}
\begin{bmatrix}
0 & I \\
0 & \hat \mK(\infty)
\end{bmatrix}= \begin{bmatrix}
0 & I \\
0 & \mK(\infty)
\end{bmatrix}\lim_{z\rightarrow\infty}V,
\end{equation}
which leads to
\begin{align}
V_{21}=0,\, V_{22}=I.
\end{align}
Since $VV^* = I$, we have $V_{12} = 0$ and $V_{11}V_{11}^* = I$. As a result, \eqref{eq:c1c2} and \eqref{eq:q1q2} imply that
\begin{equation}
\begin{aligned}
&\hat\mC=\mC\begin{bmatrix}V_{11} & 0 \\ 0 & I \end{bmatrix}
\Leftrightarrow\left\{
\begin{array}{r@{\;=\;}l}
\hat\mC_{11} & \mC_{11} V_{11}\\
\hat\mC_{12} & \mC_{12} \\
\hat\mC_{21} & \mC_{21}V_{11}\\
\hat\mC_{22} & \mC_{22}
\end{array}
\right. ~~\text{and}\\
&\hat{D}=\begin{bmatrix}V^*_{11} & 0 \\ 0 & I \end{bmatrix}D\begin{bmatrix}V_{11} & 0 \\ 0 & I \end{bmatrix}
\Leftrightarrow\left\{
\begin{array}{r@{\;=\;}l}
\hat{Q} & V^*_{11}QV_{11}\\
\hat{R} & R
\end{array}\right..
\end{aligned}
\end{equation}
% \comments{Yilin: The original proof. Is it necessary?}
%
% Due to symmetry, $ V^*(z)$ is stable and minimum phase. Take $z\rightarrow\infty$ on both sides of \eqref{eq:c1c2} yields
% \begin{equation}
% \begin{bmatrix}
% 0 & I \\
% 0 & \hat \mK(\infty)
% \end{bmatrix}= \begin{bmatrix}
% 0 & I \\
% 0 & \mK(\infty)
% \end{bmatrix}\lim_{z\rightarrow\infty}V(z),
% \end{equation}
% which leads to
% \begin{align}
% \lim_{z\rightarrow\infty}V_{21}(z)=0,\, \lim_{z\rightarrow\infty}V_{22}(z)=I.
% \end{align}
%
% Given the following facts:
% \begin{itemize}
% \item[a)] since $V(z)$ is stable and minimum phase, $V_{21}(z)$ is analytic and nonsingular $|z|>1$;
% \item[b)] similarly since $V^*(z)$ is stable and minimum phase, $V^*_{21}(z)=V^T_{21}(z^{-1})$ is analytic and nonsingular $|z|>1$, in other words, $V_{21}(z)$ is analytic and nonsingular $|z|<1$;
% \item[c)] since $V(z)D_2V^*(z)=D_1$ on $|z|=1$, $V(z)$ is bounded there so is $V_{21}(z)$.
% \end{itemize}
%
% Hence $V_{21}(z)$ is analytic everywhere with $\lim_{z\rightarrow\infty}V_{21}(z)=0$. By Lemma~\ref{lemma:lt}, $V_{21}(z)=0,~\forall z\in\mathbb{C}$. Following similar deduction, we can obtain $V_{22}(z)=I~ \forall z\in\mathbb{C}$.
%
% Since $V(z)$ is paraunitary, $V(z)V^*(z)=I$, which implies that $V_{12}(z)=0$ and $V_{11}(z)V^*_{11}(z)=I$. Therefore, $V_{11}(z)$ is a stable, minimum-phase paraunitary matrix, which implies that $V_{11}(z)$ must be a constant unitary matrix independent of $z$. Thus, we denote it as $V_{11}$. From \eqref{eq:q1q2}, it can be shown that
% \begin{equation}
% \begin{aligned}
% V_{11}Q_2V_{11}^T=Q_1.
% \end{aligned}
% \end{equation}
%
% Based on the derivation, for any $\hat{\mC}$ and $\hat{D}$ that satisfy
% \begin{equation}
% \begin{aligned}
% &\hat\mC=\mC\begin{bmatrix}V_{11} & 0 \\ 0 & I \end{bmatrix}
% \Leftrightarrow\left\{
% \begin{array}{r@{\;=\;}l}
% \hat\mC_{11} & \mC_{11} V_{11}\\
% \hat\mC_{12} & \mC_{12} \\
% \hat\mC_{21} & \mC_{21}V_{11}\\
% \hat\mC_{22} & \mC_{22}
% \end{array}
% \right. ~~\text{and}\\
% &\hat{D}=\begin{bmatrix}V^*_{11} & 0 \\ 0 & I \end{bmatrix}D\begin{bmatrix}V_{11} & 0 \\ 0 & I \end{bmatrix}
% \Leftrightarrow\left\{
% \begin{array}{r@{\;=\;}l}
% \hat{Q} & V^*_{11}QV_{11}\\
% \hat{R} & R
% \end{array}\right..
% \end{aligned}
% \end{equation}
%
% Finally, combining Proposition~\ref{prop:feedback}, we can obtain the identifiability condition for admissible $\Sigma$, which completes the proof.
\end{proof}
\begin{proof}[Proof of Lemma~\ref{lemma:X}]
Assume that $(X,\,\tilde S)$ is an optimal solution for \eqref{eq:optimization3}.
Since $\tilde S\geq g_X(\tilde S)$ and $g_X$ is monotonically non-decreasing in $\tilde S$, we know that
\begin{align*}
\tilde S\geq g_X(\tilde S)\geq g_X^{(2)}(\tilde S)\geq \dots \geq 0,
\end{align*}
where
\begin{align*}
g_X^{(1)}(\tilde S) \triangleq g_X(\tilde S),\, g_X^{(n+1)}(\tilde S) \triangleq g_X\left(g_X^{(n)}(\tilde S)\right).
\end{align*}
Since the sequence $g_X^{(n)}(\tilde S)$ is non-increasing and bounded below by $0$, it converges to a matrix $\tilde S^* = g_X(\tilde S^*)\leq \tilde S$. Therefore, $(X,\,\tilde S^*)$ is also an optimal solution of \eqref{eq:optimization3}, which finishes the proof.
\end{proof}
% \begin{proof}[Proof to Lemma~\ref{lemma:pro}]
% It is easy to see that $\mathbb X$ is convex and closed. Assume that $X$ is a symmetric projection matrix with rank $q$. Since $X^2 = X$, $X$ has $q$ eigenvalues to be $1$ and all the other eigenvalues to be $0$. As a result, we know that
% \begin{align*}
% 0\preceq X\preceq I,\,\tr(X) = q.
% \end{align*}
% Therefore, $\mathbb X$ contains all the symmetric projection matrix of rank $q$. Now suppose that $X$ is an extreme point of $\mathbb X$. In other words, if $X$ can be written as
% \begin{align*}
% X = \alpha X_1 + (1-\alpha)X_2,
% \end{align*}
% where $X_1,X_2\in \mathbb X$ and $0\leq \alpha\leq 1$, then $\alpha$ must be either $0$ or $1$. We will prove that an extreme point $X$ must be a projection matrix by contradiction. Suppose that $X\in\mathbb X$ is an extreme point and $X$ is not a projection matrix. We know that $X$ can be written as
% \begin{align*}
% X = V^T \Lambda V,
% \end{align*}
% where $V$ is an orthonormal matrix and $\Lambda = \diag(\lambda_1,\dots,\lambda_p)$, such that $0\leq \lambda_i\leq 1$ and $\sum_{i=1}^p \lambda_i = q$. Since $X$ is not a projection matrix, there must be at least two $\lambda_i$s that are neither $0$ nor $1$. Without loss of generality, let us assume that $0< \lambda_1\leq \lambda_2<1$.
% Hence, there exists $\delta > 0$, such that
% \begin{align*}
% \lambda_1 - \delta \geq 0,\,\lambda_2 + \delta \leq 1.
% \end{align*}
% Therefore, we can define
% \begin{align*}
% X_1 &= V^T\diag(\lambda_1-\delta,\lambda_2+\delta,\lambda_3,\dots,\lambda_p)V,\\
% X_2 &= V^T\diag(\lambda_1+\delta,\lambda_2-\delta,\lambda_3,\dots,\lambda_p)V.
% \end{align*}
% It is easy to check that $X_1,\,X_2\in \mathbb X$ and $X = 0.5X_1+0.5X_2$, which contradicts with the fact that $X$ is an extreme point. Therefore, only projection matrices can be the extreme points of $\mathbb X$. Now by Krein-Milman Theorem, we know that $\mathbb X$ is the closed convex hull of projection matrices.
% \end{proof}
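The monotone-iteration argument in the proof of Lemma~\ref{lemma:X} can be mimicked numerically. The sketch below substitutes the discrete Lyapunov map $g(S)=ASA^T+Q$ for $g_X$ (a stand-in chosen purely for illustration, since both maps are monotone in $S$): starting from any $S\succeq g(S)$, repeated application produces a non-increasing sequence that converges to a fixed point.

```python
import numpy as np
from scipy import linalg

# A stable A makes g(S) = A S A^T + Q monotone with a unique PSD fixed point.
A = np.array([[0.5, 0.1],
              [0.0, 0.3]])
Q = np.eye(2)

def g(S):
    return A @ S @ A.T + Q

# Start above the fixed point: S0 >= g(S0) holds for a large multiple of I.
S = 100.0 * np.eye(2)
assert np.all(np.linalg.eigvalsh(S - g(S)) >= 0)

# The iterates g^{(n)}(S0) are non-increasing and bounded below by 0.
for _ in range(200):
    S = g(S)

# The limit solves the fixed-point equation S = A S A^T + Q.
S_star = linalg.solve_discrete_lyapunov(A, Q)
```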

\subsection{What can the attacker do after KPA?}
\label{sec:stealthy}
In this subsection, we briefly describe a stealthy attack on the CPS after the adversary has obtained the transfer function $\mG(z)$. The goal is to demonstrate that KPA can enable other attacks discussed in the literature. For a more detailed discussion of stealthy attacks, please refer to~\cite{Pasqualetti2013}.
We assume that the adversary has compromised a subset of actuators and sensors and can change the corresponding control inputs and sensor measurements, respectively. As a result, the system equation becomes:
\begin{align*}
x(k+1) &= Ax(k) + B\left[u(k) + \Gamma_u u^a(k)\right]+w(k),\\
y(k) &= Cx(k) + v(k) + \Gamma_y y^a(k),\\
u(k) &= \mathcal K(z)y(k),
\end{align*}
where $u^a(k)$ and $y^a(k)$ are the biases injected by the adversary at time $k$ on the control inputs and the sensor measurements, respectively. $\Gamma_u$ ($\Gamma_y$) is a diagonal matrix with binary diagonal entries, such that the $i$th diagonal entry is $1$ if and only if the $i$th actuator (sensor) is compromised by the attacker. Since the matrices $\Gamma_u$ and $\Gamma_y$ represent the set of compromised actuators and sensors, they are known to the attacker. Let us define
\begin{align*}
\mG_a(z) \triangleq C(zI-A)^{-1}B\Gamma_u = \mG(z)\Gamma_u.
\end{align*}
Clearly, the whole trajectory of the sensor measurements $y$ is a function of the noise processes $w,\,v$, the initial condition $x(0)$ and the adversary's actions $u^a,\,y^a$. Therefore, we shall denote it as
\begin{align*}
y = f(w,v,x(0),u^a,y^a).
\end{align*}
Notice that we omitted the control input $u$ since $u$ can be calculated from $y$. Now if there exist a scalar $z_*\in \mathbb C$ and two vectors $u_*\in \mathbb C^p$ and $y_*\in \mathbb C^m$, such that
\begin{align*}
\mG_a(z_*)u_* + \Gamma_y y_* = 0,
\end{align*}
then the adversary can choose
\begin{align}
u^a(k) = z_*^k u_*,\,y^a(k) = z_*^k y_*.
\label{eq:stealthyinput}
\end{align}
Let us define $x_* \triangleq (z_*I-A)^{-1}B\Gamma_u u_*$. One can verify that
\begin{align*}
f(w,v,x(0)+x_*,u^a,y^a) = f(w,v,x(0),0,0).
\end{align*}
Therefore, the attack is stealthy: given the sensory data $y$, the controller cannot distinguish between the following two cases:
\begin{enumerate}
\item the initial condition is $x(0)+x_*$ and the adversary injected $u^a$ and $y^a$ defined in \eqref{eq:stealthyinput};
\item the initial condition is $x(0)$ and no adversary exists.
\end{enumerate}
\begin{remark}
It is worth noticing that the adversary only needs to compute $z_*,\,u_*$ and $y_*$ to launch the attack, which requires only the knowledge of $\mG(z),\,\Gamma_u$ and $\Gamma_y$.
\end{remark}
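The stealthiness claim above is easy to verify numerically. The following sketch uses an illustrative two-state plant (none of the matrices come from the paper) with the single actuator and sensor compromised, so $\Gamma_u=\Gamma_y=I$: the output difference between the attacked trajectory started at $x(0)+x_*$ and the nominal one is identically zero, hence the feedback loop contributes no difference either.

```python
import numpy as np

# Illustrative plant (not from the paper); z_* must not be an eigenvalue of A.
A = np.array([[0.9, 0.2],
              [0.0, 0.7]])
B = np.array([[1.0],
              [0.5]])
C = np.array([[1.0, 0.0]])
Gu = np.eye(1)   # Gamma_u: the single actuator is compromised
Gy = np.eye(1)   # Gamma_y: the single sensor is compromised

zs = 0.5                       # z_*
us = np.array([1.0])           # u_*
G_at_zs = C @ np.linalg.solve(zs * np.eye(2) - A, B @ Gu)   # G_a(z_*)
ys = -np.linalg.solve(Gy, G_at_zs @ us)   # y_* solving G_a(z_*)u_* + Gy y_* = 0
xs = np.linalg.solve(zs * np.eye(2) - A, B @ Gu @ us)       # x_*

# Difference dynamics between the attacked run (initial state shifted by x_*)
# and the nominal run: dx(k+1) = A dx(k) + B Gu u^a(k), dx(0) = x_*, and
# dy(k) = C dx(k) + Gy y^a(k). The controller term cancels because dy = 0.
dx = xs.copy()
max_dy = 0.0
for k in range(50):
    dy = C @ dx + Gy @ (zs ** k * ys)
    max_dy = max(max_dy, float(np.max(np.abs(dy))))
    dx = A @ dx + B @ Gu @ (zs ** k * us)
```

Here $y_*$ exists because $\Gamma_y$ is invertible in this toy case; in general the adversary needs $\mG_a(z_*)u_*$ to lie in the range of $\Gamma_y$.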

\section{Conclusion}
\label{sec:conclusion}
We consider KPA in CPS and provide a necessary condition and a sufficient condition under which the transfer function of the physical system can be uniquely identified by an adversary who passively observes the control input and sensory data. Our results demonstrate the vulnerability of classical MIMO feedback control systems to KPA. A low-rank controller design framework is then proposed to prevent the adversary from identifying the exact physical system model, and the design trade-off between system performance and security is investigated.

\section{Low-Rank Controller Design against KPA}
\label{sec:countermeasure}
By Theorem~\ref{thm:nonidentifiable}, one way to prevent the adversary from identifying $\mathcal G(z)$ is to enforce the factorization \eqref{eq:factorization} on the controller transfer function $\mK(z)$. Let us define the following ``virtual'' control input:
\begin{equation}
\tilde u(k) \triangleq \tilde{\mathcal K}(z) y(k).
\label{eq:tildeudef}
\end{equation}
Hence, $u(k) = \mathcal K(z) y(k) = F \tilde u(k)$. The factorization of $\mK(z)$ yields the CPS diagram illustrated in Fig.~\ref{fig:factorization}.
\begin{figure}[ht]
\begin{center}
\begin{tikzpicture}[>=stealth',
box/.style={rectangle, draw=blue!50,fill=blue!20,rounded corners, semithick,minimum size=7mm},
point/.style={coordinate},
every node/.append style={font=\small}
]
\matrix[row sep = 5mm, column sep = 5mm]{
%first row
\node (p1) [] {$w(k)$};
& \node (noisetransfer) [box] {$\mathcal H(z)$};
&
&\node (p2) [point] {};
&\\
%second row
\node (p3) [point] {};
&\node (plant) [box] {$\mathcal G(z)$};
&
& \node (plus) [circle,draw,inner sep=2pt] {};
& \node (p5) [point] {};\\
%third row
&
&
& \node (p6) [] {$v(k)$};
& \\
%fourth row
\node (p7) [point] {};
&\node (F) [box] {$F$};
&
& \node (controller) [box] {$\tilde \mK (z)$};
& \node (p8) [point] {};\\
};
\draw [semithick,->] (p1)--(noisetransfer);
\draw [semithick,->] (noisetransfer)--(p2)--(plus);
\draw [semithick,->] (p6)--(plus);
\draw [semithick,->] (plant)--(plus);
\draw [semithick,->] (plus)--(p5)-- node[midway,right]{$y(k)$} (p8) -- (controller);
\draw [semithick,->] (controller)-- node[midway,above]{$\tilde u(k)$} (F);
\draw [semithick,->] (F)--(p7)-- node[midway,left]{$u(k)$}(p3)--(plant);
\draw [semithick] (plus.north)--(plus.south);
\draw [semithick] (plus.east)--(plus.west);
\end{tikzpicture}
\end{center}
\caption{The diagram of the CPS with a low-rank controller design, where $\mK(z)$ is factorized into $F\tilde
\mK(z)$.}
\label{fig:factorization}
\end{figure}
Since we restrict ourselves to a low-rank controller, the performance of the system may not be optimal. In this section, we consider the problem of optimizing the following infinite-horizon LQG performance:
\begin{equation}
J = \limsup_{T\rightarrow \infty}\frac{1}{T}\min_{u(k)}\mathbb{E}\left[\sum_{k=0}^{T-1} x(k)^TWx(k)+u(k)^TUu(k)\right],
\label{eq:lqgcost}
\end{equation}
under the constraint that $F \in \mathbb R^{p\times q}$, where $q$ is given. The matrices $W,\,U$ are assumed to be positive semidefinite. We shall first consider how to design $\tilde {\mathcal K}(z)$ when $F$ is given. We then provide a heuristic algorithm, based on convex relaxation, to compute the optimal $F$.
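Before developing the optimal design, it may help to see the factorization \eqref{eq:factorization} in its simplest form. The sketch below is only an illustration with a random static gain (not the design procedure of this section): it projects a full gain matrix onto rank $q$ with a truncated SVD, so that every control input $u(k)=F\tilde u(k)$ lies in the $q$-dimensional column space of the constant matrix $F$.

```python
import numpy as np

rng = np.random.default_rng(2)
p, m, q = 4, 3, 2                      # p actuators, m sensors, rank budget q < p
K_full = rng.standard_normal((p, m))   # illustrative full static gain

# Truncated SVD: K_full ~= F @ K_tilde with F a constant p x q matrix and
# K_tilde a q x m gain; this is the best rank-q approximation in Frobenius norm.
U, s, Vt = np.linalg.svd(K_full, full_matrices=False)
F = U[:, :q]
K_tilde = np.diag(s[:q]) @ Vt[:q, :]
K_lowrank = F @ K_tilde
```

By Theorem~\ref{thm:nonidentifiable}, a controller implemented this way leaves $B$ unidentifiable, since the data only reveal how $B$ acts on the range of $F$; the performance loss relative to the full gain is what the LQG formulation above trades off.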

\usepackage{subfigure}
\usepackage{amssymb, amsmath, amsfonts}
\usepackage{amsthm}
\usepackage{tikz}
\usetikzlibrary{matrix,fit,external,arrows}
\usepackage{pgfplots}
\newtheorem{proposition}{Proposition}
\newtheorem{theorem}{Theorem}
\newtheorem{definition}{Definition}
\newtheorem{lemma}{Lemma}
\newtheorem{conjecture}{Conjecture}
\newtheorem{corollary}{Corollary}
\newtheorem{remark}{Remark}
\newtheorem{assumption}{Assumption}
\newtheorem{step}{Step}
\newlength\figureheight
\newlength\figurewidth
\DeclareMathOperator*{\argmin}{arg\; min} % argmin
\DeclareMathOperator*{\argmax}{arg\; max} % argmax
\DeclareMathOperator*{\tr}{tr} % trace
\DeclareMathOperator{\Cov}{Cov}
\DeclareMathOperator{\logdet}{log\;det}
\DeclareMathOperator{\rank}{rank}
\newtheorem{mydef}{Definition}
\newtheorem{mylem}{Lemma}
\newtheorem{mythe}{Theorem}
\newtheorem{myrem}{Remark}
\newtheorem{myalg}{Algorithm}
\newtheorem{mycol}{Corollary}
\newtheorem{myegg}{Example}
\newtheorem{myass}{Assumption}
\newtheorem{myprob}{Problem}
\newcommand{\RR}{\mathbb{R}}
\newcommand{\bbS}{\mathbb{S}}
\newcommand{\ha}{\hat{a}}
\newcommand{\hb}{\hat{b}}
\newcommand{\hc}{\hat{c}}
\newcommand{\hk}{\hat{k}}
\newcommand{\din}{\textrm{dim}}
\newcommand{\diag}{\textrm{diag}}
\newcommand{\al}{\alpha}
\newcommand{\mG}{\mathcal{G}}
\newcommand{\mH}{\mathcal{H}}
\newcommand{\mC}{\mathcal{C}}
\newcommand{\mK}{\mathcal{K}}
\newcommand{\mS}{\mathcal{S}}
\newcommand{\mW}{\mathcal{W}}
\newcommand{\mZ}{\mathcal{Z}}
\newcommand{\mE}{\mathcal{E}}
\newcommand{\mPhi}{\Phi}
\newcommand{\ga}{\gamma}

\subsection{On the identifiability of $\mG(z),~\mK(z)$}\label{sec:idGK}
This subsection is devoted to deriving the identifiability condition of $\mG(z)$ and $\mK(z)$. The identifiability of such systems has been investigated based on spectral factorization.
\begin{mydef}
%Let $e_{i}(k)$ be a discrete-time zero-mean, jointly wide-sense stationary random processes and the vector
Let $e(k)=(e_1(k),\dots,e_{N}(k))^T$ be an $N$-dimensional discrete-time, zero-mean, wide-sense stationary random process. For any $\tau \in \mathbb{Z}$, define its autocorrelation function $R_e(\tau)$ and power spectral density $\Phi_e(z)$ as
\begin{align*}
R_{e}(\tau)&\triangleq \mathbb{E}[e(k)e^T(k+\tau)],\\
\Phi_{e}(z)&\triangleq\sum_{\tau=-\infty}^{\infty}R_{e}(\tau)z^{-\tau}.
\end{align*}
%Given a $N$-dimensional discrete-time, zero-mean, wide-sense stationary random process $e(k)=(e_1(k),..,e_{N}(k))^T$, we
\end{mydef}
Since we assume that the closed-loop system is asymptotically stable, $\left[\begin{smallmatrix}y(k)\\u(k)\end{smallmatrix}\right]$ converges to a stationary process. Hence, the adversary can compute (or estimate) the joint power spectral density $\Phi_{y,u}$ of the limiting stationary process if it observes the system for a sufficient amount of time. By \eqref{eq:utoy} and \eqref{eq:ytou}, we know that $\Phi_{y,u}$ satisfies the following equation:
\begin{equation}
\Phi_{y,u}(z)=\mC(z)\begin{bmatrix}
Q & 0 \\ 0 & R
\end{bmatrix}\mC^*(z),
\end{equation}
where the closed-loop transfer function $\mC(z)$ has the following form:
\begin{align}
\mC(z) &=\begin{bmatrix}
\mC_{11}(z) & \mC_{12}(z)\\
\mC_{21}(z) & \mC_{22}(z)
\end{bmatrix}\label{eq:Cz}\\
&\triangleq\begin{bmatrix}
(I-\mG\mK)^{-1}\mH & (I-\mG\mK)^{-1}\\
\mK(I-\mG\mK)^{-1}\mH & \mK(I-\mG\mK)^{-1}
\end{bmatrix}.\nonumber
\end{align}
\begin{myass}\label{ass:stable}
$\mC(z)$ is asymptotically stable and minimum phase, i.e., all the poles and zeros of $\mC(z)$ lie strictly inside the unit disk.
\end{myass}
\begin{remark}
This is a commonly adopted assumption, which guarantees input-output stability and internal stability.
\end{remark}
%\begin{mydef}
% Given the spectrum $\Phi_{y,u}(z)$, we shall say $\Sigma\triangleq\left(\mG, \mK, \mH, Q, R\right)$ are admissible to $\Phi_{y,u}$ if any element in $\Sigma$ is consistent to the prior knowledge and gives $\Phi_{y,u}$ following the computation above.
%\end{mydef}
%\begin{mydef}
% We say the feedback system $\mG(z)$ and $\mK(z)$ is identifiable from the input-output data $(y,u)$ if for any admissible $\hat{\Sigma}\triangleq\left(\hat{\mG}, \hat{\mK}, \hat{\mH}, \hat{Q}, \hat{R}\right)$ that $\hat{\mG}=\mG$ and $\hat{\mK}=\mK$.
%\end{mydef}
We first consider the identifiability of $\mC(z)$ from the joint spectral density $\Phi_{y,u}$.
\begin{lemma}\label{lemma:idgk}
Under Assumptions~\ref{ass:wp} and~\ref{ass:stable}, if there exist $\mC(z),\,Q,\,R$ and $\hat \mC(z),\,\hat Q,\,\hat R$ that lead to the same $\Phi_{y,u}$, then there exists a unitary matrix $V_{11}$ such that
\begin{align}
\hat \mC_{11}(z) &= \mC_{11}(z)V_{11},
&\hat \mC_{12}(z)&= \mC_{12}(z),\nonumber\\
\hat \mC_{21}(z) &= \mC_{21}(z)V_{11},
&\hat \mC_{22}(z)&= \mC_{22}(z),\nonumber\\
\hat Q &=V^*_{11}QV_{11},
&\hat R &= R.\label{eq:Cidentifability}
\end{align}
\end{lemma}
% \begin{proof}
% See Appendix.
% \end{proof}
We now consider the identifiability of $\mG(z)$, $\mK(z)$ and $\mH(z)$ from $\mC(z)$.
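The block structure \eqref{eq:Cz} already suggests how $\mK$, $\mH$ and $\mG\mK$ can be read off from the blocks of $\mC$, which the next proposition makes precise. As a pointwise sanity check (random square matrices standing in for the transfer functions evaluated at a fixed $z_0$; an illustration only, not part of the derivation):

```python
import numpy as np

rng = np.random.default_rng(3)
m = 3
G = rng.standard_normal((m, m))   # stands for G(z0) at some fixed z0
K = rng.standard_normal((m, m))   # stands for K(z0)
H = rng.standard_normal((m, m))   # stands for H(z0)

# Blocks of the closed-loop transfer matrix C(z0), following eq. (eq:Cz).
S = np.linalg.inv(np.eye(m) - G @ K)   # (I - G K)^{-1}
C11, C12 = S @ H, S
C21, C22 = K @ C11, K @ C12

# Recovering K, H and the product G K from the blocks of C alone:
K_hat = C22 @ np.linalg.inv(C12)
H_hat = np.linalg.inv(C12) @ C11
GK_hat = np.eye(m) - np.linalg.inv(C12)
```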
Before continuing, we need the following definition:
\begin{mydef}
We define the normal rank of a transfer matrix $\mathcal A(z)$ to be the maximum rank of $\mathcal A(z)$ over all $z\in \mathbb{C}$.
\end{mydef}
\begin{proposition}\label{prop:feedback}
Given $\mC(z)$, the following transfer functions can be uniquely specified:
\begin{equation}\label{eq:khg}
\begin{aligned}
\mK(z)&=\mC_{22}(z)\mC^{-1}_{12}(z),\\
\mH(z)&= \mC^{-1}_{12}(z)\mC_{11}(z),\\
\mG(z)\mK(z)&=I-\mC^{-1}_{12}(z).
\end{aligned}
\end{equation}
If $\mK(z)$ has full normal row rank, then $\mG(z)$ can be uniquely determined from the following equality
\begin{equation}
\mG(z)=(I-\mC^{-1}_{12}(z))\mK^{\dagger}(z),
\end{equation}
where $\mK^{\dagger}(z)$ is the unique transfer matrix satisfying $\mK(z)\mK^{\dagger}(z)=I$.
\end{proposition}
% \begin{proof}
% This is straightforward from the definition of $\mC(z)$.
% \end{proof}
%\begin{lemma}\label{lemma:sf}(\cite{anderson1}) Let $\Phi(z)=\Phi^*(z)$ be real rational, nonsingular almost everywhere and nonnegative for $|z|=1$, there exists a unique real rational $\bar{\mC}$ which is analytic in $|z|\ge1$ and $\bar{\mC}(\infty)=I$ and a unique positive definite $\bar{D}$ such that 
%\begin{equation}
%\Phi(z)=\bar{\mC}(z)\bar{D}\bar{\mC}^*(z).
%\end{equation} 
%\end{lemma}
Based on Lemma~\ref{lemma:idgk} and Proposition~\ref{prop:feedback}, we have the following theorem about the identifiability of $\mG(z)$ and $\mK(z)$.
\begin{theorem}\label{thm:gk}
Consider the feedback control scheme described in Section~\ref{sec:model}.
Under Assumptions~\ref{ass:wp} and~\ref{ass:stable}, the following statements hold:
\begin{itemize}
\item $\mG(z)\mK(z)$ and $\mK(z)$ are uniquely identifiable;
\item $R$ is uniquely identifiable;
\item $\mH(z)$ and $Q$ can be identified up to the following transformation
\begin{equation}
\begin{aligned}
\hat{\mH}(z)&=\mH(z) V_{11},\\
\hat{Q}&=V^*_{11}QV_{11},
\end{aligned}
\end{equation}
in which $V_{11}$ is a unitary matrix. 
\end{itemize}
Furthermore, if $\mK(z)$ is of full normal row rank, then $\mG(z)$ is uniquely identifiable.
\end{theorem}
% \begin{proof}
% Let $\mC(z)$ be the true closed-loop transfer function. By Lemma~\ref{lemma:idgk}, any $\hat \mC(z)$ we derive from $\Phi_{y,u}$ must satisfy \eqref{eq:Cidentifability}. Hence, by Proposition~\ref{prop:feedback}, 
% \begin{align*}
% \hat \mK(z) = \hat \mC_{22}(z) \hat \mC_{12}^{-1}(z) = \mC_{22}(z) \mC_{12}^{-1}(z) = \mK(z),
% \end{align*}
% which implies that $\mK(z)$ is uniquely identifiable. The other statements can be proved by similar arguments.
% \end{proof}
We now provide a sufficient condition under which the system is not identifiable by the adversary:
\begin{theorem}
\label{thm:nonidentifiable}
Let $w(k),\,v(k)$ be a realization of the noise process and $x(k),\,y(k),\,u(k)$ be the corresponding system state, sensor measurements and control input that satisfy \eqref{eq:systemdescription}, \eqref{eq:sensordescription} and \eqref{eq:utoy}. If $\mK(z)$ can be factorized as
\begin{align}
\mK(z) = F\tilde \mK(z), 
\label{eq:factorization}
\end{align}
where $F\in \mathbb R^{p\times q}$ is a constant matrix with $q < p$ and $\tilde \mK(z)\in \mathbb C^{q\times m}$ is a transfer function, then there exists a matrix $\hat B\neq B$ such that the following equalities hold for $\hat B$:
\begin{align*}
x(k+1) &= Ax(k) + \hat Bu(k) + w(k),\\
y(k) & = Cx(k) + v(k),\,u(k) = \mK(z)y(k).
\end{align*}
\end{theorem}
% \begin{proof}
% Since $q < p$, $F$ is not full row rank, which implies the existence of a real matrix $\Delta B\neq 0$, such that $\Delta BF =0$. One can verify that $\hat B = B + \Delta B$ is the required matrix.
% \end{proof}
\begin{remark}
Clearly, if the factorization described by \eqref{eq:factorization} is possible, then the adversary cannot tell the difference between the physical system model $\mG(z) = C(zI-A)^{-1}B$ and $\hat \mG(z) = C(zI-A)^{-1}\hat B$, since they share the same input-output relation. This is due to the fact that the controller only injects control inputs that lie in the column space of $F$, and hence there is some ambiguity in the $B$ matrix.
It is worth noticing that \eqref{eq:factorization} implies that $\mK(z)$ is not of full normal row rank; in fact, the normal rank of $\mK(z)$ is at most $q$. On the other hand, a $\mK(z)$ without full normal row rank can always be decomposed as $\mK(z) = \mathcal F(z)\tilde \mK(z)$, where $\mathcal F(z)$ is a $p$ by $q$ transfer matrix with $q < p$.
\end{remark}
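The mechanism behind Theorem~\ref{thm:nonidentifiable} can be illustrated with a short simulation: when every input lies in the column space of a rank-deficient $F$, perturbing $B$ by any $\Delta B$ with $\Delta B F = 0$ leaves the trajectory unchanged. A minimal numpy sketch (random data, a random $\tilde u$ standing in for $\tilde\mK(z)y$; all names hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, q, T = 4, 3, 2, 50       # q < p: the input lives in a q-dimensional subspace

A = 0.5 * rng.standard_normal((n, n))   # scaled down to keep trajectories bounded
B = rng.standard_normal((n, p))
F = rng.standard_normal((p, q))

# Build Delta B with Delta B @ F = 0 from an orthonormal basis of null(F^T)
U_f, _, _ = np.linalg.svd(F)
N = U_f[:, q:]                           # p x (p - q) basis, F^T @ N = 0
DeltaB = rng.standard_normal((n, p - q)) @ N.T
B_hat = B + DeltaB

# The same noise and the same inputs u(k) = F u~(k) drive both models;
# u~ is random here rather than K~(z)y(k), which does not affect the argument.
x = np.zeros(n); x_hat = np.zeros(n)
for _ in range(T):
    u = F @ rng.standard_normal(q)
    w = 0.1 * rng.standard_normal(n)
    x = A @ x + B @ u + w
    x_hat = A @ x_hat + B_hat @ u + w

print(np.allclose(x, x_hat), np.allclose(B, B_hat))  # -> True False
```

Since $(\hat B - B)u(k) = \Delta B F\tilde u(k) = 0$ for every $k$, the two models are indistinguishable from the data even though $\hat B\neq B$.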

\section{Introduction}
Cyber-Physical Systems (CPSs) refer to the embedding of widespread sensing, networking, computation, and control into physical spaces with the goal of making them safer, more efficient and more reliable. Driven by the miniaturization and integration of sensing, communication, and computation in cost-effective devices, CPSs are bound to transform several industries such as aerospace, transportation, built environment, energy, health-care, and manufacturing, to name a few. While the use of dedicated communication networks has so far sheltered systems from the outside world, the use of off-the-shelf networking and computing, combined with unattended operation of a plethora of devices, provides several opportunities for malicious entities to inject attacks on CPSs. A wide variety of motivations exist for launching an attack on CPSs, ranging from economic reasons such as drawing a financial gain, all the way to terrorism. Any attack on safety-critical CPSs may significantly hamper the economy and lead to the loss of human lives. While the threat of attacks on CPSs tends to be underplayed at times, the Stuxnet worm provided a clear sample of the future to come~\cite{Chen2010,Fidler2011}. 
A substantial amount of research effort has been dedicated to identifying possible security vulnerabilities of CPSs and developing countermeasures. To this end, many attack models, such as the stealthy attack\footnote{The stealthy attack is also referred to as the false data injection attack or the zero dynamics attack in the literature.}~\cite{Liu2009,Sundaram2011,Pasqualetti2013,Fawzi2014,Teixeira2015a}, the replay attack~\cite{Mo2014,Mo2015} and the covert attack~\cite{Smith2011}, have been proposed by various researchers.
Teixeira et al.~\cite{Teixeira2015} propose a characterization of different attack models based on the attacker's resources, which are divided into three categories: knowledge of the system model, knowledge of the real-time control and sensory data (disclosure resources) and the capability to modify the control and sensory data (disruptive resources). Their results illustrate that many attack models proposed in the literature require the adversary to know the system model. For example, in the stealthy attack scenario~\cite{Pasqualetti2013}, the adversary injects an external control input into the physical system and then removes the physical system's response to this malicious input from the sensors' measurements. The system operator will not be able to detect the attack if the response to the malicious control input is removed perfectly. However, such an attack requires the adversary to know an exact model of the physical system, which may be difficult to acquire in many practical scenarios, since the modeling information is usually stored inside the controller. On the other hand, we argue that in many situations, the control and sensory data are much easier to acquire. This is due to the fact that these data are typically not encrypted for many CPSs~\cite{Koscher2010}. Furthermore, even if the control and sensory data are encrypted, it might be easier to break the security of sensors and actuators due to their low computational capability. Thus, for the adversary, the disclosure resources may be more readily available than the model knowledge. 
In this paper, we discuss whether the adversary can use its disclosure resources to gain the model knowledge by means of system identification. We model the CPS as a linear feedback control system, which is illustrated in Fig~\ref{fig:feedback}. The adversary is assumed to \emph{only use} its disclosure resources.
In other words, it can only passively observe the control input $u$ and the sensory data $y$ and cannot inject any disturbances into the system. The goal of the adversary is to learn the physical system model $\mathcal G(z)$, which further enables the adversary to launch other attacks, such as the stealthy attack and the covert attack. 
Such an attack model is very similar to the Known-Plaintext Attack (KPA) studied in information security, where the adversary has samples of both the plaintext and the corresponding ciphertext and wants to deduce the encryption key. For our case, one can view the system model, the control input $u$ and the sensory data $y$ as the encryption key, plaintext and ciphertext, respectively. 
%
%It is worth mentioning that with additional disruptive resources, the adversary can also launch a more powerful Chosen-Plaintext Attack (CPA), where it can actively modify the control input $u$ and observe the corresponding system output $y$. However, if the attacker changes the control input $u$ carelessly, it may result in a substantial change in the sensor measurement $y$, which could enable the system to detect the presence of the malicious third party. If the stealthiness of the attack is of concern to the attacker, then a reasonable strategy of the adversary is to first launch a passive KPA without risking being detected. After a coarse system model is learned, the attacker can then design a stealthy control input $u$ to identify a more accurate model. In such a scenario, the KPA is the first step for the adversary to gain model knowledge.
%In Computer Science, there are three types of attack models as listed below. Imagine in the feedback control context, the plant text can be thought as the input of the system while the cipher text output. We shall refer to the corresponding analogy in systems theory.
%{\bf Known-plaintext attack:} it is assumed that pairs of plaintext and the corresponding enciphered text are available to the analyst.
%During World War II, the Allies used known-plaintexts in their successful cryptanalysis of the Enigma machine cipher. The plaintext samples are called ``cribs''; the term originated at Bletchley Park, the British World War II decryption operation.
% In the system and control theory, consider a feedback configuration in Fig.~\ref{fig:feedback}, this kind of attack can be modelled as only the output is available to the attacker.
%{\bf Ciphertext-only attack:} it is assumed that only the cipher text is available to the attacker. In the noiseless case, Yuan et al. \cite{yeautomatica, yeACC} proposed an algorithm that can predict the future output by using historical values of $y$. 
%{\bf Chosen-plaintext attack:} in this scenario, the attacker is able to choose a number of plaintexts to be enciphered and have access to the resulting ciphertext. This corresponds to the experimental design in system identification. 
As a result, we will focus on KPA in this paper. The main contributions of the paper are twofold:
\begin{enumerate}
\item We provide a necessary condition and a sufficient condition, under which the system is vulnerable to KPA, i.e., the adversary can successfully identify the system model $\mathcal G(z)$. The results can be viewed as an application of classical system identification \cite{anderson1, anderson2, anderson3, Ljung, keith, anderson} to the closed-loop system, described in Section~\ref{sec:sysid}. 
\item We design a countermeasure to KPA by using a ``low-rank'' controller design strategy for $\mK(z)$, while trading off the LQG control performance.
\end{enumerate}
%\begin{itemize}
% \item Attacker model has been introduced the 
% \item Comparing to existed literature in cyber-physical systems theory, 
% \Ye{cite others' work in assuming knowing $\mG$} this paper can obtain $\mG$ with theoretical guarantee from data.
% This furthermore provides a 
% \item Design tradeoff between the performance of feedback controller versus security consideration has been investigated. 
% \item This also adds new results to the system identification community that 
%\end{itemize}
The rest of the paper is organized as follows: In Section~\ref{sec:model}, we model the system as a linear feedback control system subject to Gaussian process and measurement noise. In Section~\ref{sec:sysid}, we provide a necessary condition and a sufficient condition under which the adversary can identify the system model $\mathcal G(z)$. We further provide a numerical algorithm for the adversary to compute $\mathcal G(z)$. In Section~\ref{sec:countermeasure}, we present a controller design which is resilient to KPA while incurring only a minimal control performance loss.


\section{System model} \label{sec:model}
We model the physical system as a linear time-invariant system of the following form:
\begin{align}
\label{eq:systemdescription}
x(k+1) &= Ax(k) + Bu(k) + w(k), \\
y(k) &= C x(k) + v(k),
\label{eq:sensordescription}
\end{align}
where $x(k) \in \mathbb R^n$, $u(k)\in \mathbb R^p$, $y(k)\in \mathbb R^m$ are the state, the control input and the sensor measurement at time $k$, respectively, and $w(k)\in \mathbb R^n$, $v(k)\in \mathbb R^m$ are the process and measurement noise at time $k$. We assume that $w(k),\,v(k),\,x(0)$ are jointly independent zero-mean Gaussian random variables with covariances $Q$, $R$ and $\Sigma$, respectively. We further assume that $Q,R\succ 0$ are strictly positive definite, $(A,B)$ is stabilizable and $(A,C)$ is detectable.
%\begin{remark}
% At the first glance, it seems that our choice of estimator and controller are quite limited. However, all the analysis carried out can be easily generalised to any constant gain linear estimator and controller.
%\end{remark}
%\begin{myass} \label{assGM} The system $(A,B,C,D)$ is globally minimal. 
%\end{myass}
%%The open loop expression for the output 
%%\begin{equation*}
%%y(k)=\mG(z)u(k)+H_1(z)w(k),
%%\end{equation*}
%%We assume that the feedback control has the form that 
%%\begin{equation*}
%%u(k)= \mK(z)y(k)+H_2(z)v(k).
%%\end{equation*}
%%where $z^{-1}$ denotes the unit delay operator. 
%%
%%We can compute the joint spectral density
%%\begin{equation}
%%\Phi_{yu}(\omega)=C(z)\begin{bmatrix} Q & 0 \\ 0 & R \end{bmatrix}C^*(z)~ \text{and}~ z=e^{jw},
%%\end{equation} 
%%where the closed-loop transfer function
%%\begin{equation}
%%C(z)=\begin{bmatrix}
%%(I-GK)^{-1}H_1 & G(I-KG)^{-1}H_2\\
%%K(I-GK)^{-1}H_1 & (I-KG)^{-1}H_2
%%\end{bmatrix}. 
%%\end{equation} 
%%
%%The learning can be two-step procedure, 
%%first, 
%%
%%Once the identifiability condition satisfies, we can then compute the 
%
%\Ye{yun...
% zhege configuration keyi identify G. ok, let's consider another configuration}
% 
%Consider systems defined by the following discrete-time Linear, Time-Invariant (LTI) plant:
%\begin{equation}\label{eq:LTI00}
%\begin{aligned}
%x(k+1) &= Ax(k) + Bu(k)+ w(k) \\
%y(k) &= Cx(k) + v(k)
%\end{aligned}
%\end{equation}
%\noindent with input ${u(k) \in \RR^m}$, state ${x(k) \in \RR^n}$, output ${y(k) \in \RR^p}$, system matrices ${A \in \mathbb{R}^{n \times n}}$, ${B \in \mathbb{R}^{n \times m}}$, ${C \in \mathbb{R}^{p \times n}}$ and ${D \in \mathbb{R}^{p \times m}}$. 
%\Ye{add intro to LQG }
From the system model \eqref{eq:systemdescription}, we can write down the relation between the sensor measurement $y$, the control input $u$ and the noise processes $w$ and $v$ as follows:
\begin{equation}
y(k)=\mG(z)u(k)+\mH(z)w(k)+v(k),
\label{eq:utoy}
\end{equation}
in which $\mG(z)\triangleq C(zI-A)^{-1}B$ and $\mH(z)\triangleq C(zI-A)^{-1}$, and $z^{-1}$ is the unit delay operator. We assume that the controller is also linear time-invariant. Therefore, the control input can be written as 
\begin{equation}
u(k)=\mK(z)y(k).
\label{eq:ytou}
\end{equation}
%in which $\mK(z)\triangleq L(zI-A+BL+KCA)^{-1}K$ \ye{check this}.
%We assume that the feedback control has the form that 
%\begin{equation*}
%u(k)= \mK(z)y(k).
%\end{equation*}
%where $z^{-1}$ denotes the unit delay operator. 
%Make the following conventional assumptions:
%\begin{myass} \label{asse}The system is driven by unknown white noise $w(t)$ and $v(t)$ have a joint Gaussian distribution with zero mean and covariance 
%\begin{equation}
%\mathbb{E}\begin{bmatrix} w(k) \\ v(k) \end{bmatrix} \begin{bmatrix}w^T(\tau) & v^T(\tau)\end{bmatrix} = \begin{bmatrix} Q & 0 \\ 0 & R\end{bmatrix} \delta(k-\tau).
%\end{equation} 
%\end{myass}
We restrict the following discussion to controllers that satisfy the assumption below:
\begin{myass}[Controller]\label{ass:wp}
The transfer function of the controller $\mK(z)$ is a proper rational function of $z$. Furthermore, the closed-loop system is asymptotically stable.
\end{myass}
\begin{remark}
Since $\mK(z)$ is rational, $\mK(z)$ being proper is equivalent to the controller being causal, and $\mK(\infty)\triangleq\lim_{z\rightarrow\infty}\mK(z)$ is well-defined and finite. Moreover, since $\mG(z)$ is a strictly proper transfer function, it follows that $\lim_{z\rightarrow\infty} \mG(z)\mK(z) = 0$, which implies that $I-\mG(z)\mK(z)$ is invertible almost everywhere.
\end{remark}
We assume that an adversary passively observes the control input $u(k)$ and the sensory data $y(k)$ from time $0$ to $\infty$. The goal of the adversary is to infer the physical system model $\mG(z)$ from $u(k)$ and $y(k)$.
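For concreteness, $\mG(z) = C(zI-A)^{-1}B$ in \eqref{eq:utoy} is simply the $z$-transform of the impulse response $CA^{k-1}B$, $k\ge 1$, which is the object the adversary ultimately has to recover. A short numpy check on a random stable system (all names hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, m = 4, 2, 3   # state, input and output dimensions (illustrative)

A = rng.standard_normal((n, n))
A *= 0.5 / max(abs(np.linalg.eigvals(A)))   # rescale: spectral radius 0.5
B = rng.standard_normal((n, p))
C = rng.standard_normal((m, n))

z = 2.0   # any |z| above the spectral radius makes the series converge
G_exact = C @ np.linalg.solve(z * np.eye(n) - A, B)
G_series = sum(C @ np.linalg.matrix_power(A, k - 1) @ B / z**k
               for k in range(1, 60))

print(np.allclose(G_exact, G_series))  # -> True
```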

\subsection*{Notations}
$A\succeq B$: $A-B$ is a positive semi-definite matrix. $\mathbb{E}$: expected value. $\mathbb S^n$: the set of $n\times n$ symmetric matrices. If $U$ is a positive semidefinite matrix, then $U^{1/2}$ is the positive semidefinite matrix that satisfies $U^{1/2}U^{1/2} = U$. We use calligraphic letters to denote transfer matrices and normal letters to denote constant matrices. A rational transfer function is said to be \emph{proper} if the degree of the numerator does not exceed the degree of the denominator, and \emph{strictly proper} if the degree of the numerator is less than that of the denominator. For a rational transfer matrix $\mathcal V(z)$, we define $\mathcal V^*(z)=\mathcal V^T(\frac{1}{z})$.

\subsection{Optimal $F$}
We now consider how to design the optimal $F$ matrix in order to minimize the LQG cost. Since the second term on the RHS of \eqref{eq:optlqgcost} is independent of $F$, the problem can be formulated as the following optimization: 
\begin{align}
&\mathop{\textrm{minimize}}\limits_{F\in\mathbb R^{p\times q}}&
& \tr(\tilde SY). \label{eq:optimization}
\end{align}
By applying the matrix inversion lemma to the RHS of \eqref{eq:tildeS}, we have
\begin{align}
\tilde S = A^T\left(\tilde S^{-1} + \tilde B \tilde U^{-1} \tilde B^T \right)^{-1}A + W,
\label{eq:tildeS2}
\end{align}
where
\begin{align*}
&\tilde B\tilde U^{-1}\tilde B^T = BF\left(F^TUF\right)^{-1}F^TB^T \\
&= B U^{-1/2} \left[U^{1/2}F\left( F^TUF \right)^{-1}F^TU^{1/2} \right] U^{-1/2}B^T.
\end{align*}
Let us denote
\begin{equation}
\begin{split}
X \triangleq U^{1/2}F\left( F^TUF \right)^{-1}F^TU^{1/2},\, \bar B \triangleq BU^{-1/2}.
\end{split}
\label{eq:defX}
\end{equation}
It is easy to verify that $X^2 = X$ and $X=X^T$; hence $X$ is a symmetric projection matrix. Furthermore, $\rank(X) = \rank(F) = q$.
Conversely, assume that $X$ is a symmetric projection matrix of rank $q$, and let $v_1,\dots,v_q$ be an orthonormal basis of the column space of $X$. Then the following $F$ satisfies \eqref{eq:defX}:
\begin{align}
F = U^{-1/2} \begin{bmatrix}
v_1 & \dots&v_q
\end{bmatrix}.
\label{eq:XtoF}
\end{align}
Therefore, instead of optimizing over $F$, the optimization problem \eqref{eq:optimization} can be recast as
\begin{align}
&\mathop{\textrm{minimize}}\limits_{X\in\mathbb S^p}&
& \tr(\tilde SY) \label{eq:optimization2}\\
&\textrm{subject to}&
&\tilde S = g_X(\tilde S), \nonumber\\
&&
&X = X^T,\,X^2 = X,\,\rank(X) = q,\nonumber
\end{align}
where
\begin{equation}
g_X(\tilde S) \triangleq A^T\left(\tilde S^{-1} + \bar BX\bar B^T \right)^{-1}A + W.
\label{eq:defriccati}
\end{equation}
We first manipulate the constraint $\tilde S = g_X(\tilde S)$ into Linear Matrix Inequalities (LMIs). To this end, we need the following intermediate result~\cite{Sinopoli:2004br}:
\begin{proposition}
For a fixed $X$, $g_X(\tilde S)$ is monotonically non-decreasing in $\tilde S$.
\end{proposition}
Consider the following optimization problem:
\begin{align}
&\mathop{\textrm{minimize}}\limits_{X\in\mathbb S^p,\,\tilde S}&
& \tr(\tilde SY) \label{eq:optimization3}\\
&\textrm{subject to}&
&\tilde S \succeq g_X(\tilde S), \nonumber\\
&&
&X = X^T,\,X^2 = X,\,\rank(X) = q,\nonumber
\end{align}
where we relax the constraint $\tilde S = g_X(\tilde S)$ in \eqref{eq:optimization2} to $\tilde S\succeq g_X(\tilde S)$. The next lemma shows that \eqref{eq:optimization2} and \eqref{eq:optimization3} are equivalent:
\begin{lemma}\label{lemma:X}
There exists an optimal solution $(X,\,\tilde S)$ of the optimization problem~\eqref{eq:optimization3} (not necessarily unique), such that the following equality holds:
\begin{align*}
\tilde S = g_X(\tilde S).
\end{align*}
\end{lemma}
% \begin{proof}
% See Appendix.
% \end{proof}
We now rewrite the constraint $\tilde S \succeq g_X(\tilde S)$ as an LMI. To this end, take the inverse on both sides of $\tilde S\succeq g_X(\tilde S)$ and apply the matrix inversion lemma to the RHS, which yields 
\begin{align}
\label{eq:lmi1}
W^{-1} - \tilde S^{-1} - W^{-1}A^TZ^{-1}AW^{-1}\succeq 0,
\end{align}
where
\begin{displaymath}
Z = \tilde S^{-1}+AW^{-1}A^T+\bar BX\bar B^T.
\end{displaymath}
Defining $T =\tilde S^{-1}$ and using the Schur complement, \eqref{eq:lmi1} is equivalent to:
\begin{align}
\label{eq:lmi2}
\begin{bmatrix}
T+AW^{-1}A^T+\bar BX\bar B^T & AW^{-1}\\
W^{-1}A^T & W^{-1} - T
\end{bmatrix} \succeq 0 .
\end{align}
Therefore, optimization problem~\eqref{eq:optimization3} is equivalent to:
\begin{align}
&\mathop{\textrm{minimize}}\limits_{X,\,\tilde S,\,T}&
& \tr(\tilde SY) \label{eq:optimization4}\\
&\textrm{subject to}&
&\begin{bmatrix}
\tilde S&I\\
I&T
\end{bmatrix}\succeq 0, \nonumber\\
&&
&\begin{bmatrix}
T+AW^{-1}A^T+\bar BX\bar B^T & AW^{-1}\\
W^{-1}A^T & W^{-1} - T
\end{bmatrix} \succeq 0 ,\nonumber\\
&&
&X = X^T,\,X^2 = X,\,\rank(X) = q.\nonumber
\end{align}
The first constraint is equivalent to $\tilde S\succeq T^{-1}\succeq 0$. Since we are minimizing $\tr(\tilde SY)$ and $Y\succeq 0$, the optimal solution must have $\tilde S = T^{-1}$.
We now relax the constraint on $X$ into a convex constraint, based on the following lemma:
\begin{lemma}\label{lemma:pro}
The closed convex hull of all rank-$q$ projection matrices $X\in \mathbb S^p$ is given by
\begin{align*}
\mathbb X = \{X\in \mathbb S^p: 0\preceq X\preceq I,\, \tr(X) = q\} .
\end{align*}
\end{lemma}
% \begin{proof}
% If $X$ is a symmetric projection matrix of rank $q$, then $X$ has $q$ eigenvalues at $1$ and $p-q$ eigenvalues at $0$. It is easy to verify that $\mathbb X$ is convex and contains $X$. One can further verify that the projection matrices are the extreme points of $\mathbb X$, and hence the lemma can be proved by the Krein--Milman theorem. The detailed proof is omitted due to space limit.
% \end{proof}
% 
Hence, by Lemma~\ref{lemma:pro}, the optimization problem can be relaxed to the following semidefinite program and solved efficiently:
\begin{align}
&\mathop{\textrm{minimize}}\limits_{X,\,\tilde S,\,T}&
& \tr(\tilde SY) \label{eq:optimization5}\\
&\textrm{subject to}&
&\begin{bmatrix}
\tilde S&I\\
I&T
\end{bmatrix}\succeq 0, \nonumber\\
&&
&\begin{bmatrix}
T+AW^{-1}A^T+\bar BX\bar B^T & AW^{-1}\\
W^{-1}A^T & W^{-1} - T
\end{bmatrix} \succeq 0 ,\nonumber\\
&&
&X = X^T,\,0\preceq X\preceq I,\,\tr(X) = q.\nonumber
\end{align}
\begin{remark}
In summary, the optimization problems~\eqref{eq:optimization}, \eqref{eq:optimization2}, \eqref{eq:optimization3} and \eqref{eq:optimization4} are all equivalent. On the other hand, the constraint on $X$ in \eqref{eq:optimization4} is relaxed into a convex constraint in \eqref{eq:optimization5}. Therefore, the optimal value of \eqref{eq:optimization5} is no greater than the optimal value of \eqref{eq:optimization}, \eqref{eq:optimization2}, \eqref{eq:optimization3} and \eqref{eq:optimization4}. 
\end{remark}
Denote the optimal solution of \eqref{eq:optimization5} as $(X_*,\,\tilde S_*,\, T_*)$. Since we relaxed the constraint on $X$, $X_*$ is not necessarily a projection matrix. To derive a projection matrix from $X_*$, one can perform an eigendecomposition and write $X_*$ as
\begin{align*}
X_* = U_*\diag(\lambda_1,\dots,\lambda_p)U_*^T,
\end{align*}
where $U_*$ is an orthogonal matrix and $\lambda_1\geq \dots\geq \lambda_p$. We can then define a projection matrix $X_0$ from $X_*$ as
\begin{align*}
X_0 = U_* \diag(\underbrace{1,\dots,1}_q,\underbrace{0,\dots,0}_{p-q})U_*^T. 
\end{align*}
Denote the corresponding fixed point of $\tilde S = g_{X_0}(\tilde S)$ as $\tilde S_0$, and the optimal value of \eqref{eq:optimization} as $\alpha$. Clearly, $X_0$ lies in the feasible set of the optimization problem \eqref{eq:optimization2}.
Therefore, $\tr(\tilde S_0Y)\geq \alpha$. On the other hand, since \eqref{eq:optimization5} is a relaxed problem, we have $\alpha \geq \tr(\tilde S_*Y)$. Therefore, the optimality gap of our heuristic solution is bounded by
\begin{align*}
\tr(\tilde S_0Y) - \alpha\leq \tr(\tilde S_0Y)-\tr(\tilde S_*Y).
\end{align*}
Furthermore, if $X_*$ is itself a projection matrix, then the optimality gap is $0$ and we solve \eqref{eq:optimization} exactly.
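The rounding heuristic above, together with the map \eqref{eq:XtoF} back to $F$, takes only a few lines. A numpy sketch in which a random feasible point of the relaxed set stands in for the SDP solution $X_*$ (trace constraint omitted for brevity; all names hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
p, q = 6, 3

# A random element of {0 <= X <= I} standing in for the relaxed solution X_*
V, _ = np.linalg.qr(rng.standard_normal((p, p)))
X_star = V @ np.diag(rng.uniform(0, 1, p)) @ V.T

# Round: keep the eigenvectors of the q largest eigenvalues
evals, evecs = np.linalg.eigh(X_star)        # ascending order
Vq = evecs[:, ::-1][:, :q]                   # top-q eigenvectors
X0 = Vq @ Vq.T                               # symmetric rank-q projection

# Recover F as in (eq:XtoF): F = U^{-1/2} [v_1 ... v_q], for a weight U > 0
M = rng.standard_normal((p, p)); Uw = M @ M.T + p * np.eye(p)
wu, Vu = np.linalg.eigh(Uw)
U_half = Vu @ np.diag(np.sqrt(wu)) @ Vu.T
U_half_inv = Vu @ np.diag(1.0 / np.sqrt(wu)) @ Vu.T
F = U_half_inv @ Vq

# This F reproduces X0 through the definition (eq:defX)
X_check = U_half @ F @ np.linalg.inv(F.T @ Uw @ F) @ F.T @ U_half
print(np.allclose(X_check, X0), np.allclose(X0 @ X0, X0))  # -> True True
```

Note that $F^TUF = I_q$ by construction, so the inverse in \eqref{eq:defX} is trivially well-conditioned for the recovered $F$.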

\subsection{Optimal $\tilde {\mathcal K}(z)$}
Since $u(k) = F \tilde u(k)$, we can rewrite the system equation as
\begin{align*}
x(k+1) = A x(k) + \tilde B\tilde u(k) + w(k),
\end{align*}
where $\tilde B\triangleq BF$. Furthermore, the objective function of the LQG problem can be rewritten as
\begin{align*}
&J = \limsup_{T\rightarrow \infty}\frac{1}{T}\min_{\tilde u(k)}\mathbb{E}\left[\sum_{k=0}^{T-1} x(k)^TWx(k)+\tilde u(k)^T\tilde U\tilde u(k)\right],
\end{align*}
where $\tilde U \triangleq F^TUF\in \mathbb R^{q\times q}$. Therefore, the optimal control is given by a Kalman filter and an LQR controller~\cite{Schenato:2007dh}:
\subsubsection*{Kalman Filter}
The state estimate of the Kalman filter (with a fixed gain) is given by
\begin{align*}
\hat x(k)& = \hat x(k|k - 1) + K (y(k) - C \hat x (k|k - 1) ) ,\\
\hat x (k + 1|k) & = A \hat x(k) +Bu(k),
\end{align*}
where
$ K= P C^T (CP C^T + R)^{ - 1}, $
and $P$ is the fixed point of the following Riccati equation:
\begin{align*}
P = APA^T+Q - APC^T(CPC^T+R)^{-1}CPA^T.
\end{align*}
\subsubsection*{LQR controller}
The optimal control can then be derived as a linear function of the state estimate:
\begin{align}
\tilde u(k) = \tilde L \hat x(k),
\end{align}
where
\begin{align*}
\tilde L = -(\tilde B^T\tilde S \tilde B + \tilde U)^{-1}\tilde B^T\tilde S A,
\end{align*}
and $\tilde S$ is the solution of the following Riccati equation:
\begin{align}
\tilde S = A^T \tilde SA+W - A^T\tilde S\tilde B(\tilde B^T\tilde S\tilde B+\tilde U)^{-1}\tilde B^T \tilde SA.
\label{eq:tildeS}
\end{align}
The corresponding $\tilde \mK(z)$ is given by
\begin{align*}
\tilde \mK(z) = z\tilde L\left[zI - (I-KC)(A+\tilde B\tilde L)\right]^{-1}K.
\end{align*}  The corresponding LQG cost is given by  \begin{align}  J^*& = \tr(\tilde SQ) + \tr[(W+A^T\tilde SA-\tilde S)(P-KCP)] \nonumber\\  &= \tr(\tilde S Y)+ \tr[W(P-KCP)] ,  \label{eq:optlqgcost}  \end{align}  where   \begin{equation}  \begin{split}  Y &\triangleq Q+A(P-KCP)A^T-(P-KCP) \\  &= PC^T(CPC^T+R)^{-1}CP\succeq 0.  \end{split}  \end{equation}           
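Both Riccati equations above can be solved with standard tools, and the algebraic identity behind the two forms of $Y$ (and of $J^*$) checked numerically. A sketch using `scipy.linalg.solve_discrete_are` on a random stable system (all names hypothetical):

```python
import numpy as np
from scipy.linalg import solve_discrete_are

rng = np.random.default_rng(4)
n, m, p, q = 4, 3, 3, 2

A = rng.standard_normal((n, n))
A *= 0.7 / max(abs(np.linalg.eigvals(A)))   # stable, so the ARE assumptions hold
B = rng.standard_normal((n, p))
C = rng.standard_normal((m, n))
F = rng.standard_normal((p, q))
Q, R = np.eye(n), np.eye(m)                 # noise covariances
W, U = np.eye(n), np.eye(p)                 # LQG weights

# Filter Riccati P = APA^T + Q - APC^T (CPC^T + R)^{-1} CPA^T, via duality
P = solve_discrete_are(A.T, C.T, Q, R)
K = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)

# Control Riccati (eq:tildeS) with B~ = BF, U~ = F^T U F
Bt, Ut = B @ F, F.T @ U @ F
S = solve_discrete_are(A, Bt, W, Ut)

# The two expressions for Y coincide ...
Y1 = Q + A @ (P - K @ C @ P) @ A.T - (P - K @ C @ P)
Y2 = P @ C.T @ np.linalg.inv(C @ P @ C.T + R) @ C @ P
# ... and so do the two forms of the LQG cost (eq:optlqgcost)
J1 = np.trace(S @ Q) + np.trace((W + A.T @ S @ A - S) @ (P - K @ C @ P))
J2 = np.trace(S @ Y1) + np.trace(W @ (P - K @ C @ P))
print(np.allclose(Y1, Y2), np.allclose(J1, J2))  # -> True True
```

The equality $Y_1 = Y_2$ is just the filter Riccati equation rearranged, since $P = A(P-KCP)A^T + Q$ implies $Q + A(P-KCP)A^T - (P-KCP) = KCP$.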

\section{Numerical Example}
\label{sec:simulation}
In this section, we provide a numerical example to illustrate our low-rank controller design. We assume that $n = 15,\,m = p =10$. The $Q,\,R,\,W,\,U$ matrices are all chosen to be identity matrices. The matrices $A,\,B,\,C$ are randomly generated, where each entry is independently drawn from a uniform distribution on $[0,1]$.
We compare the LQG performance of a low-rank controller against the performance of an optimal controller with no rank constraint. We define the relative performance loss as
\begin{align*}
\frac{J\text{ of the low-rank controller}}{J\text{ of the optimal controller}} - 1.
\end{align*}
We choose $q$ from $5$ to $9$ and, for each $q$, perform $1000$ random experiments. Fig~\ref{fig:lqgloss} shows the box-and-whisker diagram of the relative LQG performance loss generated by the random experiments.
One can see that even for $q = 5$, i.e., when only half of the degrees of freedom are used to design the controller, the LQG loss is still small, with a median loss of $6\%$.
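A simplified version of this experiment can be reproduced without solving the SDP: fix a naive rank-$q$ structure $F$ (here the first $q$ columns of $I_p$, not the optimized choice), solve the two Riccati equations, and compare the $F$-dependent cost term $\tr(\tilde S Y)$ with its unconstrained counterpart. A scipy sketch with smaller dimensions than the paper's experiment (all names hypothetical):

```python
import numpy as np
from scipy.linalg import solve_discrete_are

rng = np.random.default_rng(5)
n, m, p, q = 6, 4, 4, 2

# Entries uniform on [0,1] as in the experiment; A rescaled for stability
A = rng.uniform(0, 1, (n, n))
A *= 0.9 / max(abs(np.linalg.eigvals(A)))
B = rng.uniform(0, 1, (n, p))
C = rng.uniform(0, 1, (m, n))
Q, R, W, U = np.eye(n), np.eye(m), np.eye(n), np.eye(p)

P = solve_discrete_are(A.T, C.T, Q, R)       # filter Riccati, via duality
Y = P @ C.T @ np.linalg.inv(C @ P @ C.T + R) @ C @ P

def cost_term(F):
    """F-dependent part tr(S~ Y) of the LQG cost for input structure u = F u~."""
    Bt, Ut = B @ F, F.T @ U @ F
    S = solve_discrete_are(A, Bt, W, Ut)
    return np.trace(S @ Y)

J_full = cost_term(np.eye(p))                # unconstrained optimal controller
J_low = cost_term(np.eye(p)[:, :q])          # naive rank-q restriction
print(J_low >= J_full - 1e-8)                # restricting the input cannot help
```

Restricting the input to a $q$-dimensional subspace can only increase the LQR value function, so the printed comparison holds for any choice of $F$; the SDP of the previous section chooses the subspace that makes the increase smallest.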