Know your (large) numbers

As a child, I was fascinated by large numbers. I recall being unbelievably excited when my dad gave me a clipping of a newspaper trivia column which listed the names of numbers from a million up to decillion. While most of us had some practical knowledge of millions, billions, and trillions, the numbers beyond that were far more obscure. It felt good to be privy to even a small part of the arcana of numerical nomenclature.

Not long after that, I stumbled upon a list of large numbers up to vigintillion in the phone book-sized dictionary in my elementary school classroom. This was incredible! Now I could give precise names to any number up to about 10^65. Still, knowing that there were numbers even bigger than that, I was not satisfied.

Much later, I was able to turn to the internet to resolve many previous research roadblocks. There I found references to far larger numbers using extensions to the established Latin-based conventions — trigintillion, quadragintillion, and so on. By then, the early magic had faded, yet my idle curiosity about large numbers never quite went away. In fact, one of the first programs I wrote while learning C# was a “numbers to words” converter which supported thousands of digits.

Since I’m feeling nostalgic, let’s recreate that program now! First the boilerplate to read the number (as a string — since we will quickly shoot past the limit of a primitive integral type) and handle cases like negative and zero:

namespace n2w
{
    using System;
    using System.Collections.Generic;
    using System.Linq;

    internal sealed class Program
    {
        private const string Negative = "negative";
        private const string Zero = "zero";

        public static void Main(string[] args)
        {
            string words;
            try
            {
                words = ToWords(Num(args));
            }
            catch (FormatException e)
            {
                words = "ERROR: " + e.Message;
            }

            Console.WriteLine(words);
        }

        private static string Num(string[] args)
        {
            if (args.Length != 1)
            {
                throw new FormatException("Invalid arguments.");
            }

            return args[0].Trim();
        }

        private static string ToWords(string number)
        {
            string prefix = null;
            if ((number.Length > 0) && (number[0] == '-'))
            {
                prefix = Negative + ' ';
                number = number.Substring(1);
            }

            if (number.Length == 0)
            {
                throw new FormatException("No digits.");
            }

            string words = string.Join(" ", Groups(number).Where(g => g != null));
            if (words.Length == 0)
            {
                return Zero;
            }

            return prefix + words;
        }

        private static IEnumerable<string> Groups(string number)
        {
            yield return null; // TODO
        }
    }
}

We’ve laid the groundwork here to separate the number into groups of digits and then name each group. Right now, though, it simply calls every number “zero.” Let’s extend this to name every number up to 999,999 — what I’ll call the “small numbers.”

To name small numbers in English, we need to know only a few dozen unique words and a few rules, to be applied in digit groupings of three. So, we end up with the single digits, the numbers for the multiples of 10 up to 90, the special cases for 11-19, and the suffixes for 100 and 1000. We combine names of 1000s, 100s, 10s and digits to build up the word in groups, omitting any portion if it is zero (since we don’t say “two thousand zero hundred” for 2000). Add to that the one orthographical peculiarity where we hyphenate 10s names like “twenty-one” and we have a complete algorithm:

//...
private static IEnumerable<string> Groups(string number)
{
    int start = number.Length % 3;
    int n = (number.Length - 1) / 3;
    if (start == 1)
    {
        yield return Group('0', '0', number[0], n--);
    }
    else if (start == 2)
    {
        yield return Group('0', number[0], number[1], n--);
    }

    for (int i = start; i < number.Length; i += 3)
    {
        yield return Group(number[i], number[i + 1], number[i + 2], n--);
    }
}

private static string Group(char h, char t, char u, int n)
{
    string g = Small.Hundreds(D(h), D(t), D(u));
    if (g == null)
    {
        return null;
    }

    return Join(g, ' ', Small.Suffix(n != 0));
}

private static string Join(string a, char s, string b) => (b == null) ? a : a + s + b;

private static int D(char c)
{
    int d = c - '0';
    if ((d < 0) || (d > 9))
    {
        throw new FormatException("Bad digit '" + c + "'.");
    }

    return d;
}

private static class Small
{
    private static readonly string[] Units = new string[]
    {
        null, "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"
    };

    private static readonly string[] Teens = new string[]
    {
        "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"
    };

    private static readonly string[] Tens = new string[]
    {
        null, null, "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"
    };

    private static readonly string[] Names = new string[] { null, "hundred", "thousand" };

    public static string Suffix(bool thousand) => Names[thousand ? 2 : 0];

    public static string Hundreds(int h, int t, int u)
    {
        switch (h)
        {
            case 0: return TensPart(t, u);
            default: return Join(Units[h] + ' ' + Names[1], ' ', TensPart(t, u));
        }
    }

    private static string TensPart(int t, int u)
    {
        switch (t)
        {
            case 0: return Units[u];
            case 1: return Teens[u];
            default: return Join(Tens[t], '-', Units[u]);
        }
    }
}
//...

This is a start, but remember that we’re talking about large numbers here! The program so far will name every grouping “thousand,” so we need to teach it the rules of big numbers. Above I summarized what are sometimes called “dictionary numbers” containing the names from million up to vigintillion. The extension to this system by Conway and Guy is explained fairly well by Sbiis.ExE (whose large numbers journey bears more than a passing similarity to my own). But basically, it uses the dictionary numbers as a starting point, and extrapolates Latin names for the units, tens, and hundreds places of ever larger number groups. For example, 10^39 = 10^(3N+3) where N=12, which gives us duodecillion: the units portion is duo (2) and the tens portion is deci (10). Similarly, for 10^1002, N=333, or tre (3) + triginta (30) + trecent(i) (300) + illion (note that we drop the last vowel before adding the -illion suffix).
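To make the exponent arithmetic concrete, here is a minimal sketch of how N falls out of a power of ten, and out of a digit group's position. The class and method names (IllionIndex, FromExponent, FromGroupIndex) are hypothetical helpers for illustration only; they are not part of the program in this post.

```csharp
using System;

// Hypothetical helper for illustration; not part of the n2w program.
internal static class IllionIndex
{
    // For a power of ten 10^exponent with exponent = 3N + 3, returns N.
    public static int FromExponent(int exponent)
    {
        if ((exponent < 6) || (exponent % 3 != 0))
        {
            throw new ArgumentOutOfRangeException(
                nameof(exponent), "Expected a multiple of 3 that is at least 6.");
        }

        return (exponent - 3) / 3;
    }

    // A three-digit group with zero-based index n (counting from the right)
    // is scaled by 10^(3n), so for n >= 2 its N works out to n - 1.
    public static int FromGroupIndex(int groupIndex)
    {
        if (groupIndex < 2)
        {
            throw new ArgumentOutOfRangeException(
                nameof(groupIndex), "Groups 0 and 1 are units and thousands.");
        }

        return FromExponent(3 * groupIndex);
    }
}
```

With this sketch, IllionIndex.FromExponent(39) gives 12 and IllionIndex.FromExponent(1002) gives 333, matching the two examples above, while IllionIndex.FromGroupIndex(2) gives 1 (million).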

Now for the changes to make the program observe these rules. First, we need to edit the Group function to handle the big suffixes:

// OLD CODE:
//   return Join(g, ' ', Small.Suffix(n != 0));
// NEW CODE:
return Join(g, ' ', (n < 2) ? Small.Suffix(n != 0) : Big.Suffix(n - 1));

And now the rules above, codified into a private inner class Big:

private static class Big
{
    private const string Ending = "illion";

    private static readonly string[] Ones = new string[]
    {
        null, "m", "b", "tr", "quadr", "quint", "sext", "sept", "oct", "non"
    };

    private static readonly string[] Units = new string[]
    {
        null, "un", "duo", "tre", "quattuor", "quinqua", "sex", "septem", "octo", "novem"
    };

    private static readonly string[] Tens = new string[]
    {
        null, "deci", "viginti", "triginta", "quadraginta", "quinquaginta", "sexaginta", "septuaginta", "octoginta", "nonaginta"
    };

    private static readonly string[] Hundreds = new string[]
    {
        null, "centi", "ducenti", "trecenti", "quadringenti", "quingenti", "sescenti", "septingenti", "octingenti", "nongenti"
    };

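    public static string Suffix(int n)
    {
        // The tables only cover N up to 999; fail cleanly instead of indexing past Hundreds.
        if (n > 999)
        {
            throw new FormatException("Number too large.");
        }

        string prefix;
        if (n < 10)
        {
            prefix = Ones[n];
        }
        else
        {
            // Null entries concatenate as empty strings; then drop the final vowel.
            prefix = Units[n % 10] + Tens[(n / 10) % 10] + Hundreds[n / 100];
            prefix = prefix.Substring(0, prefix.Length - 1);
        }

        return prefix + Ending;
    }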
}

It is possible to extend this naming system even further, but I think 10^3002 is a good stopping point for today’s program.
