This is going to be a short one, so this is more of a Q&A style – It implicitly assumes OpenSSH.

Why Custom SSH Moduli?

  1. Because it is fun to do
  2. You get way better probability guarantees for the primality if doing it right
  3. Custom moduli = spy agencies have to work a little harder

How?

The short version

ssh-keygen -G moduli.candidates -b <len> -M 127
ssh-keygen -T moduli -f moduli.candidates -M 127 -a <#tests>

-G specifies to generate moduli candidates and puts the results into the given filename. -b is the length of the prime + 1 bit – so if you set it to 4096, you would get 4095 bit. -M increases the memory used to speed things up a bit – 127 (MiB) is the maximum, though.

-T tells it to test the candidates and put the results into the given filename. -f specifies the file with the candidates. -a sets the number of tries for Miller-Rabin per candidate – 100 by default.

Anything said for the short version can be found in this stackoverflow thread and the manpage.

Another gotcha is that OpenSSH is sequentially filtering candidates for a given invocation, so if you want to spread things out a bit, you need to run it multiple times.

Do not use 100 (or do)

While it might be tempting to cut down on the number of tries, it is actually counter-productive.

While virtually every textbook on primality testing will tell you that the odds for Rabin-Miller to return a false positive after $k$ independent tests is $\leq 2^{-2k}$, one can do actually way better than that.

In the paper Average case error estimates for the strong probable prime test, by Damgård, Landrock, and Pomerance, the authors give better conditional bounds.

Notation

As we will quote from the paper, we’ll use $k$ as the size of the prime in bit, and $t$ as the number of tries. $p_{k,t}$ denotes probability that a composite number of the given length $k$ has survived $t$ rounds of Miller-Rabin.

NOTE: This, including the textbook bound, only applies to randomly generated numbers that you generated. If an adversary sends you a «random» number, then NO BOUND WILL HOLD as said adversary could always send a Carmichael Number.

The first bound

The first bound occurs in section 5, Theorem 3 of the aforementioned paper and states:

For $k, t$ with $k\geq 21,\, 3 \leq t \leq \frac{k}{9}$ we have $p_{k,t} < k^\frac{3}{2}\frac{2^t}{\sqrt{t}}4^{2-\sqrt{tk}}$

So this even applies to the default 100 tries as used in OpenSSH

The mighty bound

This one occurs in section 6, Theorem 6 of the paper and reads:

For $k, t$ with $k\geq 21$ and $t\geq \frac{k}{9}$ we have $p_{k,t} < \frac{7}{20}k2^{-5t} + \frac{1}{7}k^\frac{15}{4}2^{-\frac{k}{2}\,-2t} + 12k2^{-\frac{k}{4}\,-3t}$.

So, we need our number of tries to be at least 1/9th of the bitlength of the number we want to test. Even for shitty-security 1024-bit numbers, we would need 114 tries at least.

So what’s the difference between those 3 guarantees

Glad you’re asking! I wrote a simple Python 3 script that computes the bounds (please note that it’ll abort with value errors as most intermediate values are outside of the double-precision floating points that Python’s providing, so I also used SageMath to compute the other ones):

from gmpy2 import mpz, mpfr, log, sqrt, ceil


def ld(x):
    return ceil(log(mpfr(x)) / log(2))


def tb(t):
    '''the textbook bound'''
    return pow(mpfr(2), -2*t)


def t3(k, t):
    '''the theorem 3 bound'''
    return pow(mpfr(k), 1.5) * (2 ** t) / sqrt(t) * pow(4, -sqrt(t*k))


def t6(k, t):
    '''the 'mighty bound' '''
    chunk_a = mpfr(7)/20 * k * (2 ** (-5 * t))
    chunk_b = mpfr(1)/7 * pow(k, 15/4) * pow(2, -k/2 - 2*t)
    chunk_c = mpfr(12) * k * pow(2, -k/4 - 3*t)

    return chunk_a + chunk_b + chunk_c


for k in range(2048, 8192 + 1, 1024):
    print(';'.join(str(x) for x in (k, -ld(tb(100)), -ld(t3(k, 100)), -ld(t6(k, ceil(k/9))))))

which after some processing yields the following table:

length (bit) textbook (100 tests) Theorem 3 (100 tests) Theorem 6 (‘mighty bound’)
2048 200 791 1130
3072 200 994 1699
4096 200 1165 2269
5120 200 1315 2834
6144 200 1452 3403
7168 200 1577 3973
8192 200 1694 4543

Note that the values are $-log_2$, so a value of $x$ means $2^{-x}$ or $1:2^x$.