This is going to be a short one, so this is more of a Q&A style – It implicitly assumes OpenSSH.

## Why Custom SSH Moduli?

- Because it is fun to do
- You get
**way**better probability guarantees for the primality if doing it right - Custom moduli = spy agencies have to work a little harder

## How?

### The short version

```
ssh-keygen -G moduli.candidates -b <len> -M 127
ssh-keygen -T moduli -f moduli.candidates -M 127 -a <#tests>
```

`-G`

specifies to generate moduli *candidates* and puts the results into the given filename.
`-b`

is the length of the prime + 1 bit – so if you set it to 4096, you would get 4095 bit.
`-M`

increases the memory used to speed things up a bit – 127 (MiB) is the maximum, though.

`-T`

tells it to test the candidates and put the results into the given filename.
`-f`

specifies the file with the candidates.
`-a`

sets the number of tries for Miller-Rabin per candidate – 100 by default.

Anything said for the short version can be found in this stackoverflow thread and the manpage.

Another gotcha is that OpenSSH is sequentially filtering candidates for a given invocation, so if you want to spread things out a bit, you need to run it multiple times.

### Do not use 100 (or do)

While it might be tempting to cut down on the number of tries, it is actually counter-productive.

While virtually every textbook on primality testing will tell you that the odds for Rabin-Miller to return a false
positive after $k$ *independent* tests is $\leq 2^{-2k}$, one can do actually way better than that.

In the paper *Average case error estimates for the strong
probable prime test*, by Damgård, Landrock, and Pomerance,
the authors give better conditional bounds.

### Notation

As we will quote from the paper, we’ll use $k$ as the size of the prime in bit, and $t$ as the number of tries. $p_{k,t}$ denotes probability that a composite number of the given length $k$ has survived $t$ rounds of Miller-Rabin.

NOTE: This, including the textbook bound, only applies to randomly generated numbers that **you** generated. If an adversary sends you a «random»
number, then *NO BOUND WILL HOLD* as said adversary could always send a Carmichael Number.

### The first bound

The first bound occurs in section 5, Theorem 3 of the aforementioned paper and states:

For $k, t$ with $k\geq 21,\, 3 \leq t \leq \frac{k}{9}$ we have $p_{k,t} < k^\frac{3}{2}\frac{2^t}{\sqrt{t}}4^{2-\sqrt{tk}}$

So this even applies to the default 100 tries as used in OpenSSH

### The mighty bound

This one occurs in section 6, Theorem 6 of the paper and reads:

For $k, t$ with $k\geq 21$ and $t\geq \frac{k}{9}$ we have $p_{k,t} < \frac{7}{20}k2^{-5t} + \frac{1}{7}k^\frac{15}{4}2^{-\frac{k}{2}\,-2t} + 12k2^{-\frac{k}{4}\,-3t}$.

So, we need our number of tries to be at least 1/9th of the bitlength of the number we want to test. Even for shitty-security 1024-bit numbers, we would need 114 tries at least.

## So what’s the difference between those 3 guarantees

Glad you’re asking! I wrote a simple Python 3 script that computes the bounds (please note that it’ll abort with value errors as most intermediate values are outside of the double-precision floating points that Python’s providing, so I also used SageMath to compute the other ones):

```
from gmpy2 import mpz, mpfr, log, sqrt, ceil
def ld(x):
return ceil(log(mpfr(x)) / log(2))
def tb(t):
'''the textbook bound'''
return pow(mpfr(2), -2*t)
def t3(k, t):
'''the theorem 3 bound'''
return pow(mpfr(k), 1.5) * (2 ** t) / sqrt(t) * pow(4, -sqrt(t*k))
def t6(k, t):
'''the 'mighty bound' '''
chunk_a = mpfr(7)/20 * k * (2 ** (-5 * t))
chunk_b = mpfr(1)/7 * pow(k, 15/4) * pow(2, -k/2 - 2*t)
chunk_c = mpfr(12) * k * pow(2, -k/4 - 3*t)
return chunk_a + chunk_b + chunk_c
for k in range(2048, 8192 + 1, 1024):
print(';'.join(str(x) for x in (k, -ld(tb(100)), -ld(t3(k, 100)), -ld(t6(k, ceil(k/9))))))
```

which after some processing yields the following table:

length (bit) | textbook (100 tests) | Theorem 3 (100 tests) | Theorem 6 (‘mighty bound’) |
---|---|---|---|

2048 | 200 | 791 | 1130 |

3072 | 200 | 994 | 1699 |

4096 | 200 | 1165 | 2269 |

5120 | 200 | 1315 | 2834 |

6144 | 200 | 1452 | 3403 |

7168 | 200 | 1577 | 3973 |

8192 | 200 | 1694 | 4543 |

Note that the values are $-log_2$, so a value of $x$ means $2^{-x}$ or $1:2^x$.