Candidates for the Berry Paradox Number
The Berry Paradox
The Berry Paradox describes the following number:
The least integer not nameable in fewer than nineteen syllables
In this post, I argue that the best candidate for the Berry Paradox number (that I could find) is: 10,001,000,770,777.
I’ll explain my reasoning and give some additional candidates for this Berry Paradox Number below.
Candidate #1: Nothing
It’s obviously a paradox, so there can be no answer.
The paradox was first proposed by Oxford University librarian Mr. G. Berry, as noted by Bertrand Russell in 1908. Russell noted: “‘The least integer not nameable in fewer than nineteen syllables’ is itself a name consisting of eighteen syllables; hence the least integer not nameable in fewer than nineteen syllables can be named in eighteen syllables, which is a contradiction.”
But this is a boring answer. What if we can ignore the paradox and think about the problem anyway? If we ignore this specific 18-syllable description, objecting that it’s recursive and paradoxical and makes no sense, we can still wonder what other numbers are difficult to describe in fewer than 19 syllables. Of course there’s nothing really deep going on here; the amount of information in a fixed amount of text is obviously finite, so only finitely many numbers can be described, and there must be a smallest number that can’t be. Finding it sounds like a fun question to try to answer.
Candidate #2: Nothing again but for a different reason
Integers can be positive or negative, and “least” can refer to large negative numbers. Arbitrarily enormous negative integers are impossible to describe in small numbers of syllables. So you could argue that this question is equivalent to determining “the least integer” which clearly has no answer. But again, that seems like a dumb answer, so I’ll assume we’re talking about positive integers from now on.
Candidate #3: 10,001,000,770,777
One way we could describe numbers concisely is by just stating each digit one after the other. This is often more compact than the number’s proper spoken name (consider “seventy seven” with five syllables vs “seven seven” with four syllables). And I think it’s still clear enough to function as a description without explanation beforehand.
Note that each digit from zero to nine is one syllable, except for zero and seven. I will also allow “oh” to be used in place of zero, for example “one oh one” meaning 101 (but I’m not allowing “sev” to be used in place of “seven”, that seems too unclear). This means that counting up every number from oh, one, two, etc. by listing out each digit, we first reach nineteen syllables at the following number: “one, seven seven seven, seven seven seven, seven seven seven”, or 1,777,777,777. Now, we have to consider whether there is any more compact description of this number using fewer syllables.
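To make the counting rule concrete, here's a minimal Python sketch of the digit-by-digit syllable count. The function name and lookup table are mine, purely for illustration; the only rules assumed are the ones above (zero is said as “oh”, and “seven” is the lone two-syllable digit).

```python
# Syllables per spoken digit, saying "oh" for zero (the convention used above).
DIGIT_SYLLABLES = {d: 1 for d in "0123456789"}
DIGIT_SYLLABLES["7"] = 2  # "seven" is the only two-syllable digit left

def digit_reading_syllables(n: int) -> int:
    """Syllables needed to read n aloud one digit at a time."""
    return sum(DIGIT_SYLLABLES[d] for d in str(n))

print(digit_reading_syllables(77))             # 4, vs five for "seventy seven"
print(digit_reading_syllables(777_777_777))    # 18, the worst nine-digit case
print(digit_reading_syllables(1_777_777_777))  # 19
```

This only measures the plain digit-by-digit reading, so it says nothing about whether a shorter description exists.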
“One then nine sevens” is only five syllables, and unambiguously describes the same number. For any other 1,XXX,XXX,XXX number, there will be fewer sevens and therefore fewer syllables in the simple digit-saying description. So replacing just this one number, we can keep counting higher.
The next time we reach a problem is at “seven, oh seven seven, seven seven seven, seven seven seven” or 7,077,777,777 for 19 syllables. Again, we can compact this to “seven oh then eight sevens” for 7 syllables. We follow the same idea all the way through the ten digit numbers; we will always have either fewer than nine sevens, or at least three sevens in a row which can be condensed as described above (“then three sevens then” vs “seven seven seven”) for fewer syllables.
For the 11 digit numbers, the same ideas apply. Here, we reach a problem at “one oh, oh seven seven, seven seven seven, seven seven seven” or 10,077,777,777 with 19 syllables. For 11 digit numbers with eight sevens, there are almost always at least three sevens in a row which can be condensed. But eventually, we reach “seven seven oh seven seven oh seven seven oh seven seven” or 77,077,077,077 with 19 syllables and no condensed version following the patterns above.
Let’s try to think of more creative ways to compact 77,077,077,077 into fewer syllables (remembering that unclear communications aren’t allowed). “Let A be seven seven. A oh A oh A oh A” is 14 syllables. That takes a bit of thinking to figure out, but I think it seems pretty clear.
In fact, using this type of pattern with “A” replacing “seven” lets us get all the way through the 13 digit numbers. “Let A be seven. A, A A A, A A A, A A A, A A A” is 18 syllables, and any “A” can be replaced by a non-seven digit for the same number of syllables.
Now tackling the 14 digit numbers. Here, we will be fine if we have four or fewer sevens, since we could then save a syllable by dropping the “Let A be seven” prefix and just saying “seven” each time. We will also be fine if the same digit appears five times in a row, since we could replace “A A A A A” with “then five As then” to save a syllable (I assert that neglecting a “then” makes the number description unclear). The first case that violates both of these is 10,000,100,707,777. Here, we can save a syllable by doing “Let A be seven. Ten trillion plus one oh oh, A oh A, A A A” for 18 syllables, our first time using addition in our descriptions. (Some could argue that this is ambiguous because the traditional British long scale would name 10^12 “billion” rather than trillion, but I consider that system outdated and bad and ignorable.)
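As a sanity check on that “first case”, here is a rough Python sketch that looks for the smallest 14 digit number with at least five sevens and no digit repeated five times in a row. It only encodes those two rules and nothing else from this post, and the function name and parameters are my own invention.

```python
def first_hard_case(n_digits: int = 14, min_sevens: int = 5, max_run: int = 4) -> int:
    """Smallest n_digits-digit number with at least min_sevens sevens and
    no run of identical digits longer than max_run."""

    def extend(prefix: str, sevens: int, last: str, run: int):
        remaining = n_digits - len(prefix)
        if min_sevens - sevens > remaining:
            return None  # not enough positions left for the required sevens
        if remaining == 0:
            return prefix
        for d in "0123456789":  # ascending, so the first hit is the smallest
            if not prefix and d == "0":
                continue  # no leading zero
            new_run = run + 1 if d == last else 1
            if new_run > max_run:
                continue  # a run this long could be condensed
            found = extend(prefix + d, sevens + (d == "7"), d, new_run)
            if found is not None:
                return found
        return None

    return int(extend("", 0, "", 0))

print(f"{first_hard_case():,}")
```

Because digits are tried in ascending order and dead ends are pruned early, the search finishes almost instantly instead of counting up through every 14 digit number.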
The next problem case is 10,001,000,707,777, where now there aren’t enough starting zeros for the “trillion” trick to save a syllable. With four sevens at the end, we can save a syllable by neglecting one “then” word: “Let A be seven. one oh, oh oh one, oh oh oh, A oh then four As” for 18 syllables.
The next problem case is 10,001,000,770,777. Now that the sevens are rearranged a bit, the three sevens at the end can’t be condensed. “Let A be seven. one oh, oh oh one, oh oh oh, A A oh, A A A” is 19 syllables.
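The syllable counts quoted throughout this candidate can be tallied mechanically. Below is a small Python sketch that does so; the hand-made syllable table and the function name are my own assumptions, and it only knows the handful of words used in these particular descriptions.

```python
import re

# Hand-made syllable table covering only the words used in the descriptions.
SYLLABLES = {
    "let": 1, "a": 1, "as": 1, "be": 1, "oh": 1, "one": 1, "then": 1,
    "four": 1, "eight": 1, "nine": 1, "ten": 1, "plus": 1,
    "seven": 2, "sevens": 2, "trillion": 2,
}

def count_syllables(phrase: str) -> int:
    """Sum the syllables of every word; raises KeyError on an unknown word."""
    words = re.findall(r"[a-z]+", phrase.lower())
    return sum(SYLLABLES[w] for w in words)

# (description, syllable count claimed in the text)
CLAIMS = [
    ("One then nine sevens", 5),
    ("seven oh then eight sevens", 7),
    ("Let A be seven seven. A oh A oh A oh A", 14),
    ("Let A be seven. A, A A A, A A A, A A A, A A A", 18),
    ("Let A be seven. Ten trillion plus one oh oh, A oh A, A A A", 18),
    ("Let A be seven. one oh, oh oh one, oh oh oh, A oh then four As", 18),
    ("Let A be seven. one oh, oh oh one, oh oh oh, A A oh, A A A", 19),
]

for phrase, claimed in CLAIMS:
    print(count_syllables(phrase), claimed, phrase)
```

Each printed line shows the computed count next to the claimed one, so any mismatch would stand out immediately.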
I don’t see an easy way to condense this further. Even if we can’t condense the digits, there may still be some other way to reference this number. Googling it with and without commas produces no results. This number isn’t in the OEIS. Perhaps a further condensation is possible, but I’ll stop here for now.
I would describe this as our most pure candidate. Candidates 1 and 2 were just helping explain the rules really, and candidates 4 and 5 below change the rules to add some more fun.
Candidate #4: ≈15,831^18
What if you’re allowed to pre-arrange a communication system beforehand? The number of possible syllables in English has been estimated as 15,831. So with 18 syllables, that would give us an estimate of 15,831^18. Of course, this estimate is approximate, and would be very difficult or impossible to implement in practice. Adding different vocal frequencies and durations could make this potentially much larger, but that would make the system even more convoluted and impractical, and I’ll leave it as an exercise to the reader.
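For a sense of scale, here's a quick Python sketch of that estimate. The constant names are mine, and the 15,831 figure is just the estimate quoted above. I've also included the total over all utterances of 1 to 18 syllables, which is my own addition, since “fewer than nineteen” allows shorter ones too; it barely changes the answer.

```python
SYLLABLE_INVENTORY = 15_831  # estimate quoted above, not an exact figure
MAX_SYLLABLES = 18           # "fewer than nineteen"

exactly_18 = SYLLABLE_INVENTORY ** MAX_SYLLABLES
up_to_18 = sum(SYLLABLE_INVENTORY ** k for k in range(1, MAX_SYLLABLES + 1))

print(len(str(exactly_18)))   # 76 decimal digits
print(f"{exactly_18:.2e}")    # same number in scientific notation
print(up_to_18 / exactly_18)  # ratio is only slightly above 1
```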
Candidate #5: ≈36^53
As far as I can tell, the “nineteen syllables” phrasing is the original description of Berry’s paradox, but it’s not the only definition that people use. Currently the Wikipedia page instead defines it like this:
“The smallest positive integer not definable in under sixty letters”
This phrase has 57 letters, not counting spaces. If we use this definition, we could go through the above logic chain again, counting letters rather than syllables, but it would basically be a similar exercise. Instead, what if we allow numerical digits in addition to letters?
Here, we can clearly do 99,999,999,999,999,999,999,999,999,999,999,999,999,999,999,999,999,999,999,999 or 10^59-1, which is 59 characters (where I’ve added commas for clarity, but they wouldn’t show up in the official version). This would make the Berry Paradox number one higher, 10^59. But can we do better?
What about “Base 36 ZZ,ZZZ,ZZZ,ZZZ,ZZZ,ZZZ,ZZZ,ZZZ,ZZZ,ZZZ,ZZZ,ZZZ,ZZZ,ZZZ,ZZZ,ZZZ,ZZZ,ZZZ”, where it’s fairly clear that base 36 consists of the digits 0-9 followed by the letters A-Z. This is 59 characters (again not counting the space or commas), and numerically equal to 36^53-1, giving us a Berry Paradox number of 36^53.
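The character counts here are easy to double check. This is a minimal Python sketch, assuming (as above) that spaces and commas don't count toward the limit; the variable names are mine. Python's built-in int() happens to use the same 0-9 then A-Z convention for base 36.

```python
phrase = "The smallest positive integer not definable in under sixty letters"
letters = sum(ch.isalpha() for ch in phrase)

all_nines = "9" * 59            # the plain decimal attempt, 10^59 - 1
base36 = "Base36" + "Z" * 53    # spaces and commas left out of the count
value = int("Z" * 53, 36)       # interpret 53 Z's as a base-36 numeral

print(letters)                        # 57
print(len(all_nines), len(base36))    # 59 59
print(value == 36**53 - 1)            # True
print(value > int(all_nines))         # True
```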
This specific number could again be condensed, but I doubt you could go much higher without being unclear or leaving any gaps, so I’ll stop here and leave it as an exercise to the reader.
Other potential spinoffs
What if you’re not allowed to use “numbery” words? No words or characters that by themselves imply a specific number. That means no digits and no words like first, even, cube, quartet, etc. I’ll allow superlatives, comparatives, and the word “multiple”. For fun, I’ll also require that these are mathy-definitions, rather than trivia about non-mathy things (no “number of fingers on a typical human” etc.). I’ll also require that the mathy-definitions are common knowledge (no “the number of Heegner numbers that exist” etc.).
1: smallest positive integer
2: smallest prime number
3: smallest number bigger than the smallest prime number
4: smallest composite number
5: smallest number bigger than the smallest composite number
6: smallest number with multiple prime factors
I honestly expected to get much further, but after a bit of thought, couldn’t find a solution for 7 that follows the strict rules above and is less than 60 characters.
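Here's a rough Python sketch that checks the six descriptions above by brute force and measures their lengths. The helper names are mine, and I'm reading “multiple prime factors” as two or more distinct primes (counting repeats would make 4 qualify instead).

```python
from itertools import count

def is_prime(n: int) -> bool:
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def is_composite(n: int) -> bool:
    return n >= 4 and not is_prime(n)

def distinct_prime_factors(n: int) -> int:
    return sum(1 for p in range(2, n + 1) if n % p == 0 and is_prime(p))

def smallest(predicate) -> int:
    """Smallest positive integer satisfying the predicate."""
    return next(n for n in count(1) if predicate(n))

two = smallest(is_prime)
four = smallest(is_composite)
descriptions = {
    "smallest positive integer": smallest(lambda n: True),
    "smallest prime number": two,
    "smallest number bigger than the smallest prime number":
        smallest(lambda n: n > two),
    "smallest composite number": four,
    "smallest number bigger than the smallest composite number":
        smallest(lambda n: n > four),
    "smallest number with multiple prime factors":
        smallest(lambda n: distinct_prime_factors(n) >= 2),
}

for text, value in descriptions.items():
    print(value, len(text), text)  # value, length including spaces, text
```

The lengths printed include spaces, so they are a conservative check against the 60 character limit.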
What if you’re even stricter about not using anything numbery in your description, to the extent that you have to define the integers yourself? Using a set theory definition, you can’t get very far, and five already has too many characters:
0: {}
1: {{}}
2: {{}, {{}}}
3: {{}, {{}}, {{}, {{}}}}
4: {{}, {{}}, {{}, {{}}}, {{}, {{}}, {{}, {{}}}}}
5: {{}, {{}}, {{}, {{}}}, {{}, {{}}, {{}, {{}}}}, {{}, {{}}, {{}, {{}}}, {{}, {{}}, {{}, {{}}}}}}
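The blow-up in length can be reproduced with a short Python sketch. I'm assuming the von Neumann construction here, where each number is the set of all smaller numbers; that's my reading of “a set theory definition”, and other constructions would give different strings. The function name is mine, and the 60 character budget is borrowed from the Wikipedia phrasing above.

```python
def von_neumann(n: int) -> str:
    """Write n as the set of all smaller numbers, using only braces and commas."""
    parts = []  # parts[k] holds the string for the number k
    for _ in range(n):
        parts.append("{" + ", ".join(parts) + "}")
    return "{" + ", ".join(parts) + "}"

for n in range(6):
    text = von_neumann(n)
    verdict = "ok" if len(text) < 60 else "too long"
    print(n, len(text), verdict)
```

Under this construction each string is a bit more than twice as long as the previous one, which is why the budget runs out so quickly.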
I was originally imagining a test where you can prove that a description implies a number by sampling only the most probable numerical tokens output by a large language model at zero temperature. This would allow you to answer more vaguely than a normal human would accept, while still proving that you communicate exactly one specific number. But I realized this would probably just allow adversarial attacks where random words imply certain numbers for no discernible reason, and it might not be very interesting.