Open Source Security Podcast Episode 8 Listener Feedback

This is a lengthy one, so grab yourself a beverage and something to eat before reading on.

In episode 8 of the Open Source Security Podcast, the hosts talked about the issue of having a small chance of false-positives for probability-tested prime candidates, and they expressed their desire to have either a means to prove the primality after generation and/or to generate primes that come with a "certificate of authenticity".

This article tries to address those concerns.

First off, let's talk about probabilistic primality tests.

Random tests and Fermat

The classic since the 80's is the Miller-Rabin test, or, more precisely, the Rabin unconditional probabilistic extension [Rab80] to Miller's ERH (Extended Riemann Hyptothesis) conditional primality test [Mil75].

The test is a typical Fermat-residual test that has been proven, for a random number N to be tested, to yield a probability of less than (1)/(4) for a false positive (i.e., claims of primality for a composite number) per independently chosen a.

A while after that test, better bounds have been determined for Miller-Rabin, see [DamLanPom93]. You can also read more about those bounds in this post of mine and the results in that particular section.

The problem with Miller-Rabin, and virtually any other Fermat-based primality test is the issue of Carmichael Numbers.

Carmichael Numbers are composite numbers (of at least 3 primes) such that for all a with gcd(a, n) = 1, the Fermat-residual tests will still hold true.

Even worse, it has been proven that there are infinitely many Carmichael Numbers [AlfAndPom94].

So if someone supplies one with a number n and claims that n is prime, then given those Fermat-based primality tests, the tests would fail to detect that. In this scenario, an attacker would have a backdoored number that appears to be prime while it isn't, which means that the attacker could gain several bits about the actual secret exponent in a discrete logarithm setting as it is typical for Diffie-Hellman amongst other algorithms, or even use the backdoor to compute the discrete logarithm as this effectively reduces the size of the subgroup in which the computation takes place.

Non-random tests and Fermat

If the ERH were to be proven true, then the classic Miller test would suffice and one would have to merely run 2⋅(log₂n)² iterations (i.e. try that many different, independently chosen, values for a) [Bach90].

Verified Primality

The years after the introduction of Miller-Rabin saw a boom of new and improved primality tests. Amongst those, two stick out.

First off is the Cohen-Lenstra modification to the APR test, now commonly known as APR-CL [AdlPomRum83], [CohLen84]. Even on late-80's hardware (albeit the back-then equivalent to a supercomputer), this algorithm enabled the primality proving of numbers with up to 213 decimal digits ( ≈ 708 bit) within about 10 minutes [CohLen87]. The other major new algorithm was the Elliptic Curve Primality Proving (or ECPP) Algorithm [AtkMor93], [GolKil86]. This one is even faster than APR-CL, but involves a bit more fancy mathematics.

So after scouring the Internet for a bit, the author of this text found an implementation of ECPP by one of its principal authors, yet that implementation is now closed-source and the binaries are still targeting Pentium Pro as the latest architecture.

Then he found Primo [Mar01] which is a blazingly fast ECPP implementation. The problem, however, is that it is also closed-source, yet the binary runs at least on Linux.

After some more time, by pure chance, he found a mailing-list entry for Sagemath [Sag] that introduced a patch for a provable primality test -- it turns out that N.is_prime() will prove the primality if and only if one sets a certain runtime config flag. This yielded

with proof.WithProof('arithmetic', True):
    pari(x).is_prime(2)

And the following runtimes:

length (bit)	runtime (s)
1536	46.028165
2048	143.709494
3072	1068.515684
4096	1865.429281
6144	8885.877632
7680	20387.676015

There's also a ticket on Sage's bugtracker that would add ECPP to it.

Readers with a mathematical background might recognize the word pari in the above code snippet, and rightfully so. Sagemath uses Pari/GP's [Par] isprime(n, flags) function for primality testing.

So the author of this piece downloaded Pari/GP, compiled it with CFLAGS="-fPIC -O3 -Wall -mtune=native -fno-strict-aliasing -fomit-frame-pointer" ./Configure --tune --graphic=Qt --with-gmp --with-qt and came up with the following little python script that takes a hexadecimal integer to be tested for primality as a command-line argument, and spits out a gp script that can then be used with Pari/GP:

import sys


N = int(sys.argv[1], base=16)

template = '''\o 2
n=%d;
print(if(isprime(n,2),"prime","composite"));
quit;
'''

print(template % N)

Mind the flag of 2 in the call to isprime -- without it, it won't prove the primality necessarily.

Invoking the resulting script looks something like this: gp --fast --quiet -s 1000000000 test_n.gp

A little fun fact is that Pari/GP's original author is the C in APR-CL ;-)

length (bit)	runtime Sage (s)	runtime Pari/GP (s)
1536	46.028165	17.92
2048	143.709494	55.94
3072	1068.515684	380.41
4096	1865.429281	868.51
6144	8885.877632	3282.09
7680	20387.676015	7680 (sic!)
8192	?	10285

And as one can clearly see, using an optimized Pari/GP over default binary Sage yielded a massive performance boost -- and at the same time reduced the codebase that would need to be audited for truly verified primality by most likely several orders of magnitude.

In fact, these times are good enough to warrant the primality proving of any modulus that is shipped by default for openssh as we can now check every 8192-bit modulus in about 3 hours time. Best of all, Pari/GP is single-threaded, so we can make this easily more parallelized via the aforementioned python script when it comes to batch-processing of multiple moduli.

Better ways to generate primes

Ueli Maurer named algorithms to generate verifiable primes more than 20 years ago [Mau95], and there's a CPAN Module that seems to implement the algorithm.

For the idea of having a good means to generate nothing-up-my-sleeve primes, you may want to take a look at the Million Dollar Curve and the SafeCurves projects that do something similar for elliptic curves.

RFC 2631 in section 2.2.1 actually defines an algorithm for creating the parameters p, q, and later on defines a means to create a generator, too.

And FIPS-186-4 C.6 has yet another method to construct a provable prime.

But even with that in place, there are other issues. For example the issue of identifying the library/hardware used for the generation of RSA moduli [SNSKFKM16] and even worse, deviations from uniform randomness that were sometimes rather extreme.

Finally, another big issue can be found in the following section.

Bonus: Recommended lengths and pitfalls

Another issue that finds little attention is the length of the moduli.

The following assumes that one wants to achieve a security-level that is comparable to a symmetric key of 128 bit.

A length of 1536 bit, the shortest that is part of f.e. Fedora's openssh moduli, doesn't hold much weight anymore. The german BSI, an organization similar to the NSA, f.e. recommends a length of at least 3200 bit and requires that the generated subgroup has at least 2²⁵⁰ elements [BSI16, "Tabelle 3.2" and p. 36] (In "Tabelle 3.1", they list 2000 bit but with the recommendation of 3000 if the value should be used past the year 2016).

Based on the older RFC 3766, Table in Section 5, one would need more than 2000 bit for the preferred security level.

Schneier, Ferguson, and Kohno in [SchFerKoh10, Chapter 11.7] recommend a length of at least 6800 bit for the modulus, assuming a symmetric-key security equivalent of 128 bit. They also recommend that p is actually a secure prime, i.e. (p − 1)/(2) ∈ ℙ.

Lastly, given recent developments for computing Discrete Logarithms [BarGauJouTho2014], it only seems prudent to up the minimal lengths to at least those values that are recommended by the aforementioned sources.

There's also this handy website allows you to compare different standard recommendations.

Remarks

1

There seems to be a bit of confusion between primes for discrete logarithm, and assumed hardness of factoring into primes for the likes of RSA. Primality can be proven (see also [AKS], [Bor03]) for a sufficiently small number ≤ 10⁴ bit as shown above, way longer with ECPP). So primes aren't hard to determine, nor is compositeness (as it is the mere inverse to primality), but given a composite n of a large enough size, it is assumed to be hard to find all the factors of it, for as long as those factors aren't too small (f.e. even n = has a factor of 2). However, this (mostly) only matters for factoring-based asymmetric crypto.

2

There's randomness, randomness, and randomness, namely:

"Proper randomness" as ought to occur in nature.
Deterministically generated randomness in Software/Hardware, and of which
Cryptographically Secure Pseudorandom Number Generators (CSPRNG), such as Fortuna [SchFerKoh10, Chapter 9]

and as far as tests go, even more demanding than Dieharder is TestU01.

For CSPRNGs, there's also academic papers, f.e. about CryptGenRandom, the Linux (CS)PRNG, etc. [1], [2], [3] , [4], [5].

3

The reason for finding a primality test in music-related software was most likely due to Fast-Fourier-Transforms (FFTs) that in their most basic form require a prime for them to work. FFTs in music have lots of uses, ranging from identifying notes and overtones to actually having a portrait of the artist in the song (Aphex Twin -- Strange Formula).

4

There's a good reason for the sparseness of crypto error messages -- different error message strings can lead to oracle and timing attacks (see. f.e https://web-in-security.blogspot.com/2014/08/old-attacks-on-new-tls-implementations.html ).

5

In regards to "were the primes in f.e. TLS proven to be prime". RFC 2412 by Schroeppel defines some groups and claims that the values have been rigorously proven to be prime, but it lacks the actual proof as well as the means of generation (the errata link to a very interesting pdf btw).

The curves in RFC 5114 were, as section 2 therein states, taken from NIST/NSA standards, so one might indeed need to worry about (S)NFS'ed values, given the history of the now infamous DUAL_EC_DRBG (see esp. section 8 therein).

PS

(non-rechargeable) Lithium batteries reach ≤ 1.8 MJ/kg while TNT reaches 4.6 MJ/kg -- the original statement on xkcd was a bit different and still mostly wrong ;)

Oh and Germany also celebrates Thanksgiving in October.

Hope this helps and am looking forward to the "That crazy dude who gave feedback" episode.

References

[AdlPomRum83] Adleman, Leonard M., Carl Pomerance, and Robert S. Rumely.: "On Distinguishing Prime Numbers from Composite Numbers." Annals of Mathematics, Second Series, 117, no. 1 (1983): 173-206.
[AKS] Manindra Agrawal, Neeraj Kayal, Nitin Saxena.: "Primes is in P." Annals of Mathematics, Vol. 160 (2004), no. 2: 781-793.
[AlfAndPom94] Alford, W. R., Andrew Granville, and Carl Pomerance.: "There Are Infinitely Many Carmichael Numbers." Annals of Mathematics, Second Series, 139, no. 3 (1994): 703-22.
[AtkMor93] A. O. L. Atkin and François Morain.: "Elliptic curves and primality proving." Mathematics of Computation 61, no. 203 (1993): 29-68.
[Back90] Eric Bach: "Explicit bounds for primality testing and related problems." Mathematics of Computation 55, no. 191 (1990): 355–380.
[BarGauJouTho2014] Razvan Barbulescu, Pierrick Gaudry, Antoine Joux, and Emmanuel Thomé.: "A Heuristic Quasi-Polynomial Algorithm for Discrete Logarithm in Finite Fields of Small Characteristic". In Proceedings of Advances in Cryptology -- EUROCRYPT 2014: 1-16.
[Bor03] Folkmar Bornemann.: "PRIMES Is in P: A Breakthrough for 'Everyman'" in Notices of the AMS, Vol. 50 (2003), no. 5: 545-552
[BSI16] Bundesamt für Sicherheit in der Informationstechnik (Publ.),: BSI TR-02102-1 "Kryptographische Verfahren: Empfehlungen und Schlüssellängen" (Cryptographic Methods: Recommendations and Keylengths), version 2016-01.
[CohLen84] Henri Cohen, and Hendrik Willem Lenstra Jr.: "Primality testing and Jacobi sums." Mathematics of Computation 42, no. 165 (1984): 297-330.
[CohLen87] Henri Cohen, and Arjen Klaas Lenstra.: "Implementation of a new primality test." Mathematics of Computation 48, no. 177 (1987): 103-121.
[DamLanPom93] Damgård, Ivan, Peter Landrock, and Carl Pomerance.: "Average Case Error Estimates for the Strong Probable Prime Test." Mathematics of Computation 61, no. 203 (1993): 177-94.
[GolKil86] S Goldwasser and J Kilian.: "Almost all primes can be quickly certified." Proceedings of the eighteenth annual ACM symposium on Theory of computing (STOC '86): 316-329.
[Mar01] Marcel Martin: "Primo" https://www.ellipsa.eu/public/primo/primo.html
[Mau95] Ueli M. Maurer.: "Fast generation of prime numbers and secure public-key cryptographic parameters." Journal of Cryptology 8, no. 3 (1995): 123-155.
[Mil75] Gary L. Miller.: "Riemann's Hypothesis and Tests for Primality." Proceedings of Seventh Annual ACM Symposium on Theory of Computing (STOC '75): 234--239
[Par] The Pari Group.: "Pari/GP version 2.7.6" https://pari.math.u-bordeaux.fr/
[Rab80] Michael O. Rabin.: "Probabilistic algorithm for testing primality." Journal of Number Theory, Vol 12, no 1 (1980): 128-138
[Sag] W.A. Stein & The Sage Development Team.: "Sage Mathematics Software, version 7.3" https://www.sagemath.org/
[SchFerKoh10] Bruce Schneier, Niels Ferguson, Tadayoshi Kohno.: "Cryptography Engineering." Wiley Pub. Inc., 2010.
[SNSKFKM16] Petr Švenda, Matúš Nemec, Peter Sekan, Rudolf Kvašňovský, David Formánek, David Komárek, and Vashek Matyáš.: The Million-Key Question --- Investigating the Origins of RSA Public Keys. Proceedings of the 25th USENIX Security Symposium (USENIX Security 16): 893-910.

[TestU01] https://simul.iro.umontreal.ca/testu01/tu01.html