Ericpony's Blog: A predictable PRG is exactly an insecure PRG

Sunday, June 30, 2013

A predictable PRG is exactly an insecure PRG

Let $A(x)$ be an algorithm and $n=|x|$ denote the length of the binary encoding of its input $x$. We say that an event regarding $A$ and $x$ holds with negligible probability if it holds with probability $O(1/poly(n))$. We say the event holds with significant probability if it holds with probability $1-O(1/poly(n))$. Note that the property of being negligible or significant is closed under polynomial additions and multiplications. That is, the probability of the union and the interception of a polynomial number of negligible (resp.$\ $significant) events is still negligible (resp.$\ $significant).

We say that a PRG $G$ is predictable if given a random seed $k$, there is an efficient algorithm that can predict some $i$th bit of the output according to first $i-1$ bits with non-negligible probability.

We say that a PRG $G$ is insecure if given a random seed $k$, there is an efficient algorithm that can distinguish it from a true random number generator with non-negligible probability.

Example. Let $G:\mathcal K \rightarrow \{0,1\}^n$ be a PRG such that one can compute its first $n/2$ bits from the last $n/2$ bits with non-negligible probability. We shall show that $G$ is predictable. Suppose that we can compute the first half of bits from the last half via algorithm $A:\{0,1\}^{n/2} \rightarrow \{0,1\}^{n/2}$. Let $s\in\{0,1\}^n$ be an output of $G$ and write $s[n/2+1...k]$ as $s_k$ for notational simplicity. If $$P\{A(s_{n-1}||1) \neq A(s_{n-1}||0)\}$$ is non-negligible, then we can compute $s[n]$ from $s[1...n-1]$ with non-negligible probability. Otherwise, the value of $s_n$ has negligible effects in the computation of $A$. Therefore, we can assume $s[n]=0$ and consider $$P\{A(s_{n-2}||10) \neq A(s_{n-2}||00)\}$$ instead in previous argument. If $P\{A(s_k||10...0) \neq A(s_k||0...0)\}$ is negligible for all $n/2\le k< n$, then the first half of the output of $G$ is equal to $s[1...n/2]$ with non-negligible probability, which implies $G$ is predictable. Thus without loss of generality we can assume that there exists $k$ with $n/2\le k< n$, such that $s[k+1]$ can be computed from $s[1...k]$ with non-negligible probability. Now, if we let $S$ be the output of $G$ , then the probability that at least one bit of $S$ is predictable is $$\sum_{s,k} P\{S=s\}P\{s[k+1] \mbox{ is computable from } s[1...k]\},$$ which is non-negligible. Hence we can conclude that $G$ is predictable. #

It is clear that if a PRG is predictable, then it cannot be secure. The reverse direction, however, is a nontrivial result that together with the previous one establishes the equivalence between predictability and security.
Theorem. [Yao'82] If a PRG is unpredictable, then it is secure.
Hence the PRG defined in the previous example is predictable by Yao's Theorem, as the first half of a random string is independent of the rest of the string and thus can be guessed only with negligible probability.

Computational indistinguishability

Let $\mathcal X$ and $\mathcal Y$ be two distributions. We say $\mathcal X$ and $\mathcal Y$ are computational indistinguishable, written as $\mathcal X\approx_p \mathcal Y$, if for all PPT algorithms $A$, $$Adv_A(\mathcal X, \mathcal Y)=|P\{A(x)| x\leftarrow \mathcal X\}-P\{A(y)| y\leftarrow \mathcal Y\}| \le \frac{1}{poly(n)},$$ where $n$ is the secure parameter (eg. the length of the input, the amount of time or memory needed for the computation, etc), and $Adv_A(\mathcal X, \mathcal Y)$ is called the advantage of adversary $A$ over $\mathcal X$ and $\mathcal Y$. Below we list some useful properties of computationally indistinguishable distributions.
(Transitivity) If $\mathcal X \approx_p \mathcal Y$ and $\mathcal Y \approx_p \mathcal Z$, then $\mathcal X \approx_p \mathcal Z$.
(Independent component) If $\mathcal X \approx_p \mathcal Y$ and there is a PPT algorithm whose output distribution is $\mathcal Z$, then $\mathcal X\times \mathcal Z \approx_p \mathcal Y\times \mathcal Z$ and $\mathcal Z\times \mathcal X \approx_p \mathcal Z\times \mathcal Y$.
(Multiple sampling) If there are PPT algorithms whose output distributions are $\mathcal X$ and $\mathcal Y$, then for any $m\in\mathbb N$, $\mathcal X \approx_p \mathcal Y$ iff $\mathcal X^m \approx_p \mathcal Y^m$.
The first two properties are clear and can be easily proven by contradiction. The most straightforward approach to prove last property is to apply a simple form of a commonly used proof technique called the hybrid argument, which we shall show here as an illustration of how the argument works. To simply the presentation we let $m=3$, though the same idea works for $m=poly(n)$. Suppose that $\mathcal X \approx_p \mathcal Y$ and $Adv_A(\mathcal X^3, \mathcal Y^3) > \epsilon$. Using telescope sum we have
$$Adv_A(\mathcal X^3, \mathcal X^2 \times \mathcal Y) + Adv_A(\mathcal X^2 \times \mathcal Y, \mathcal X \times \mathcal Y^2) + Adv_A(\mathcal X \times \mathcal Y^2, \mathcal Y^3) ≥ Adv_A(\mathcal X^3, \mathcal Y^3) > \epsilon.$$ However, at least one of the three summands on the LHS of the above inequality must be no less than $\epsilon/3$. When $\epsilon=1/poly(n)$ this implies that $\mathcal X \not\approx_p \mathcal Y$ by the property of independent component, a contradiction.

Statistical randomness and cryptographically security

The judge of randomness are tests for statistical randomness, among which are the Diehard tests and the more stringent TestU01 Crush randomness tests. A PRG would look truly random if it passes most of the tests. Efficient statistically random PRG such as the Mersenne twister is good for use with simulation, and play important roles in fields such as cellular biology and computational finance.
However, a statistically good PRG may cause a disaster when it is used in cryptographic purpose. One example is the RAND Corporation's 1950s publication of a million random digits; it has passed every statistical test for randomness thus far and is thought to be actually random. But, having been published, it is fully predictable and is thus insecure. Another example is the Mersenne twister; it has a very long period, passes numerous tests of statistical randomness, and is very fast. Unfortunately, it is predictable, as observing a sufficient number of iterations allows one to predict all future iterations. In practice, a PRG suitable for cryptographic purpose usually has a proof that reduces its security to a computationally difficult problem such as prime number factorisation.

Ericpony's Blog

Sunday, June 30, 2013

A predictable PRG is exactly an insecure PRG

Computational indistinguishability

Statistical randomness and cryptographically security

No comments:

Post a Comment