Scott Fluhrer edited section_Abstract_This_paper_explores__.tex  almost 9 years ago

Commit id: b17984ec66121ef694234f4e5e514f4c901d1395

deletions | additions      

       

During decryption, the decryptor recovers the values $m$ and $b$; it then uses the above logic to recompute $R$ (and checks to make sure that was the $R$ used to generate the ciphertext; this prevents invalidly generated ciphertexts from becoming an issue). What this means is that the encryptor must use this formula to generate $R$ from $b$, $m$ and the public key.  \section{Key Recovery Attack 1}  The NTRU public key $H$ is the polynomial $F^{-1}G$ (computed over $\mod q$), where $F$ and $G$ are sparse polynomials; the polynomial $F$ is the private key. One way to try to recover the private key is to search for an $F$ such the product polynomial $FH$ is sparse (which, in this case, means considers of $dG+1$ coefficients of the value $p$, $dG$ coefficients of the value $-p$, and the rest 0).  Now, the defined NTRU parameter sets work in one of two ways; in the straight-forward method, the key generation process selects a random polynomial whose coefficients consists of $dF$ 1's, $dF$ -1's, and the rest 0 (where $dF$ is a parameter from the parameter set). The other method, known as the product form, has the key generation process select three polynomials $F_1, F_2, F_3$, and effectively sets $F = F_1F_2 + F_3$. Each of these polynomials $F_1, F_2, F_3$ is sparser than the target $F$ polynomial; and computing $F_1(F_2H) + F_3H$ is faster than computing $FH$ directly.  To use Grover's algorithm, we could search for the polynomial $F$ for which $FH$ is sparse. Now, if we do that in the straight-forward manner, that doesn't work (as for all defined parameter sets, there are more than $2^{2k}$ possible values of $F$).  If we are working with a parameter set that uses the product form, one way to rewrite the equation $FH = G$ is in the form $F_1F_2H = G - F_3H$, where we know that each coefficient of $G$ is either $0, p$ or $-p$.  The next obvious improvement is to check for this equality $\bmod p$; as all coefficients of $G$ are $0 \bmod p$, this simplifies the equation to $F_1F_2 \equiv -F_3H \pmod{p}$.  Now, there is an obvious objection to this; both $F_1F_2$ and $F_3H$ are computed modulo $q$, which is relatively prime to $p$; what happens if an addition within $G - F_3H$ which "wraps"; that is, $-F_3H$ has one sign, and $G - F_3H$ has the other? Well, the obvious answer is "the search for the public key fails in that case". It is worth noting that the probability of such a wrap happening is reasonable; the parameter sets in question have $p=3$, $q=2048$ and $G$ having between 267 and 495 nonzero entries; assuming that the coefficents of $F_3H$ are random, this gives us a success probability between $0.67$ and $0.48$. So, the obvious way to address this failure mode is, if the initial search fails, we add a random constant to both sides of the equation, and rerun the search; this has the effect of doubling the expected search time.  So, the obvious search algorithm is to store the coefficients of $- F_3H \pmod p$ (for all possible $F_3$ values), and then perform the Quantum Computer search over all possible $F_1, F_2$ values for where the coefficients of $F_1F_2H \pmod 3$ is a match for one of the precomputed values. With reasonable probability the correct private key will be the only one that gives a plausible setting; in fact, we don't need to search over all the coordinates, instead it is sufficient to search over enough to make any false hit unlikely. If we find a match, this will allow us to rederive $F$ in (strictly speaking) fewer than $2^k$ operations.  In addition, we can reduce this work effort by taking advantage of the rotational symmetry of the multiply operation within $F[x]/x^N-1$; if we consider the product $AB$, and then consider the product $A'B$ (where $A'$ has the same coefficients as $A$, only rotated by $m$ positions, then the product $A'B$ will have the same coefficients as $AB$, only rotated by $m$ positions. This can easily be seen as the operation of rotating by $m$ positions is the operation of multiplying by the polynomial $x^m$, and $(x^mA)B = x^m(AB)$. This allows us to reduce the number of $F_3$ polynomials we consider by a factor of N (because if a specific value of $F_3$ gives a solution with small coefficients, so will any rotation of $F_3$ and the corresponding $F_1F_2$.  Ttere is also a second symmetry that we can take advantage of; if we rotate the elements of $F_1$ left by $m$ positions, and the elements of $F_2$ right by $m$ positions, the resulting product $F'_1 F'_2$ will be unchanged. That is, $(x^m F_1)(x^{-m}) F_2 = F_1 F_2$. This is a second symmetry that reduces the number of the products $F_1 F_2$ we need to consider by a factor of $N$.  The result of these two improvements reduce the number of $- F_3H \bmod p$ values we need to precompute by a factor of $N$, and the number of $F_1 F_2 \bmod p$ polynomials that we need the Quantum Computer to search over by a factor of N.  When we consider the parameter set EES593EP1 (which has a design strength of 192), we find it has $N = 593$, and $dF_1 = 10$, $dF_2 = 10$ and $dF_3 = 8$. This implies that the total number of $F_3$ polynomials is $\binom{593}{16}\binom{16}{8}$ (because each $F_3$ polynomial coefficients consists of 8 $p$s, 8 $-p$'s, and 577 $0$s), and the number of $F_1F_2$ polynomials as $\binom{593}{20}\binom{20}{10}\binom{593}{20}\binom{20}{10}$. When we take into account the factor of $N$ decrease (because of the rotational symmetry) , this gives us $|F_3|/N \approx 2^{107.285}$. When we consider the number of $F_1 F_2$ polynomials, we take account of the factor $N$ decrease (because of the second rotational symmetric), this gives us $|F_1||F_2|/N \approx 2^{271.165}$. A Quantum Computer is able to search over a set of this size in approximately $2^{136}$ time, and this second step would dominate. When we account for the failure probability (and the possibility we're need to rerun this procedure), this gives us an overall time of $O(2^{137})$, which is considerably smaller than the original design goal of 192 bits.  When we consider all four product form parameter sets, we find that the 112 bit product form set (EES401EP2) can be attacked with an expected $O(2^{104})$ work, the 128 bit set (EES439EP1) can be attacked with an expected $O(2^{112})$ work, and the 256 bit set (EES743EP1) can be attacked with an expected $O(2^{197})$ work. In this last case, it's actually the derivation of the possible values of $-F_3H$ which is the dominating factor (because $dF_3=15$ for this parameter set, which is comparitively large.  Now, this approach has managed to recover the private key using fewer NTRU multiplications than expected. On the other hand, this approach is quite impractical (even beyond the number of operations involved); it assumes that we can practically check for the existence of an entry in a table with more than $2^{65}$ entries in constant time.  An obvious alternative approach would be to search for equalities between $F_1F_2H$ and $G-F_3H$ modulo 2; this would allow us to ignore the sign differences in the $F_1, F_2, F_3$ coefficients (and avoids the possibility of the attack failing because of a wrap, as all have parameter sets in question has q being a power of 2). However, this approach turns out not to work, because $G$ turns out to be relatively dense evaluated modulo 2, and so there's no good way to determine when we've detected the correct $F_1, F_2, F_3$ set.  \section{Key Recovery Attack 2}  When the NTRU library generates a key, it selects a random polynomial $F$ using the same procedure as it does when generating a  \section{Plaintext Recovery Attack 1}  If someone has a Quantum Computer, one way they may try to recover the plaintext given a ciphertext $C$ and a public key $H$ is to run Grover's algorithm, with the guess being $\rho$ the output of the hashed string. That is, they would use the function that takes a value for $\rho$, run it through the random number generator to generate a guess of $R$, and check if the polynomial $C-HR$ consists solely of the coefficients $\{0, 1, -1\}$; the correct guess of the hash will do that (as $C-HR$ is the encoded message $M$, which is a polynomial with those coefficients); an incorrect guess is quite unlikely to have coefficients limited to that range.