Counting the ways to express a number as a sum of primes
Given a natural number N greater than unity, one might wonder how many ways exist to write it as a sum of primes.
The answer is trivial for small values of N. There is only a single way to write numbers 2, 3 and 4: 2 = 2, 3 = 3 and 4 = 2 + 2. There are two ways to write either 5 or 6 as sums of primes: 5 = 5 and 5 = 2 + 3, as well as 6 = 3 + 3 and 6 = 2 + 2 + 2. For the number 7 we get three ways: 7 = 7, 7 = 5 + 2 and 7 = 3 + 2 + 2. After that things gain a bit of momentum, and e.g. for N = 17 there are 17 ways to write it as a sum of primes.
(The latter observation by John D. Cook and the resulting discussion with other Twitter user known by a nickname Maths Is a Language has actually motivated me to get an in-depth look at the subject.)
What happens for larger values of N? We might expect that there would be still more ways to write N as a sum of primes. At this point it’s practical to write up some code and count those ways. The result is shown on Figure 1.
On a first look, the function shown on Fig. 1 seems to grow exponentially. But the rate of growth also seems to slow down.
In the rest of this article we will try to obtain an estimate for the number of ways to write a number N as a sum of primes, that should be easier to compute than counting all such ways directly. We will also need a name for that function. So let’s say, we are going to find an estimate of G(N), which is the number of ways to write N as a sum of primes.
How do we approach it?
Let’s start with some definitions.
For a given number N, there would be n prime numbers less than or equal to N. Let’s denote them as f(1), f(2), …, f(n).
There might be ways to express N as a sum involving a single prime number (such as 8 = 2 + 2 + 2 + 2), or as a sum of two distinct prime numbers (such as 19 = 5 + 5 + 3 + 3 + 3), or, speaking generally, N could be expressed as a sum of q distinct prime numbers, where q≤n.
Suppose we are considering a single prime number f(i), where i = 1…n. What is the chance that N is a multiple of f(i)? If we divide N by f(i), the remainder could be any number between 0 and f(i)-1 with equal probability, so the chance the remainder is equal to zero is 1 / f(i). Equivalently, we could say that there are 1 / f(i) sums involving only the prime number f(i), that equal to N.
Now let’s say we have two prime numbers f(i) and f(j), where i, j = 1…n, i≠j. How many sums involving the multiples of f(i) and f(j) would equal to N? There are N/f(j) possible factors for f(j), and for each of those, there is an 1/f(i) chance to find an appropriate factor for f(i) for the sum to equal N. Overall, there are about N/(f(i)×f(j)) such sums.
What changes when we are to use three prime numbers, f(i), f(j) and f(k), where i, j, k = 1…n, i≠j, j≠k, k≠i? Quite similarly, we need to try all possible factors for f(k) and find the number of sums for each:
Continuing in the same manner, we get the following formula for the number of sums which equal N and involve q prime numbers f(i), …, f(l):
Now we are ready to count all such sums. Let’s say there are S(q) sums of q different prime numbers which equal N. All we need to do is cycle through all possible combinations of q different primes. Suppose that the set of q prime numbers involves primes f(i), f(j), …, f(k), f(l). So we get:
That is not really useful. To get somewhere from here it would be really helpful if indices i, j, …, l would not depend on each other but would just run freely from 1 to n. (Looks like a dream world, doesn’t it?) The first thing to observe is that such a simplified sum would involve exactly q! instances of S(q), because there are q! possible permutations of q different indices. That’s good news. The bad news is that it would also involve the cases in which two or more indices coincide, which are meaningless for us. Accounting for all such cases is not quite a trivial problem! So let’s just introduce a scaling factor E(q, n) to (at least) get the right number of combinations:
Note how we have neatly collapsed the q consecutive sums into a single exponent. But now we need to produce some estimate for E(q, n). By definition, it equals the ratio of all possible combinations of q different primes chosen from a set of n primes, to the value we get by dividing the number of all possible combinations of q (not necessarily different) primes by q!:
In principle, that’s how far we get. But for values of n >> 1, we would only be interested in the values of q much smaller than n. Which justifies one further approximation:
Collecting all the stuff together, we get the final expression for the desired number of ways to express the number N as a sum of primes:
The final estimate for G(N) we have obtained looks pretty much like Taylor series, except for the exponential term. However, since the exponential term is less than 1, the series converges at least as fast as Taylor series.
One more thing to observe is than in our expression for G(N), n is the number of prime numbers less than or equal to N. That value is referred to as the prime-counting function π(N). So let’s just rewrite our estimate using the standard notation:
where f(i) is an i-th prime number.
While not the handiest of expressions, our formula for G(N) is much easier to compute than directly counting the number of sums of primes which equal N. Also, the sum in the expression for G(N) converges much sooner than q = π(N), so calculating all the terms is not necessary.
Now, how does it stand up to the reality?
Figure 2 compares the number of ways to express a number N as a sum of primes, obtained either by a direct computation (the solid black line), or by using our estimate for G(N) (the dashed blue line), for the values of N below 10000.
Since the value of G(N) rises almost exponentially, it’s hard to assess the accuracy of our formula for G(N) while looking at Figure 2. So let’s divide our estimate of G(N) by its true value, and have a look at the resulting error, which is shown on Figure 3.
To sum it all up (no pun intended), in this article we have derived an estimate for G(N) — the number of ways to express a natural number N as a sum of prime numbers — and compared it to the true value of G(N) obtained by directly computing the number of such ways. We hope that you have got some fun while reading it. And, oh, if you need the raw data for the exact values of G(N), here it is.
Thanks for your attention!