Science

The Infamous Collatz Conjecture: 3n + 1

Naomi Blair

23 Oct 2024 — 8 min read

The Collatz Conjecture was named after and introduced by Lothar Collatz in 1937. The Collatz Conjecture is one of the most infamous unsolved problems in mathematics. It concerns a sequence of integers in which each term is obtained from the previous term following this process: if the previous term is even, the next term is one half of the previous term. If the previous term is odd, the next term is 3 times the previous term plus 1. The conjecture is that these sequences always reach 1, no matter which positive integer is chosen to start the sequence. The conjecture has been shown to hold for all positive integers up to 2.95×10²⁰, but no general proof has been found.

Statement and Example of the Problem

Pick a number, any number. Let us say that that number is 5. Since the number is odd, we multiply it by three and add one. 5 times 3 is 15, plus 1 is 16. Since 16 is even, we divide it in half, 16 divided by 2 is 8. 8 is even so we also divide it by 2, this gets us 4. 4 is even, so yet again, we divide it by 2, this gets us 2. And finally, since 2 is even we divide it by 2, resulting in 1. However if we continue this sequence we will notice something interesting: since 1 is an odd number, we multiply it by 3 and add a 1, which will get us 4. 4 is an even number so we divide it by 2, and since 2 is an even number as well, we divide by 2 again, which gets us back to 1. Does this sound familiar? Well, we have now found ourselves in a never-ending loop of 1, 4, 2, 1, 4, 2, 1 .... going on to infinity. That is the conjecture. As simple as that, yet his astounded mathematicians for decades.

We can go back a few steps to truly understand the mechanism behind this sequence.

In modular arithmetic notation, we define the function f as follows:

What we are essentially stating here is that if the remainder is 0 when n is divided by 2, we apply the first rule to n: n/2. To simplify it even more: if n is even, divide n by 2. As for the second statement, if the remainder is 1 when n is divided by 2, we apply the second rule to n: 3n+1. That is to say, if n is odd, we multiply it by 3 and add 1. This notation establishes the rules in producing the sequence.

The following notation shows that a_i is the value of f applied to n recursively i times; a_i = f ⁱ(n)).

The Collatz conjecture states that: This process will eventually reach the number 1, regardless of which positive integer is chosen initially. That is, for each n there is some i with a_i = 1.

Stopping time is a concept used in the analysis of the Collatz Conjecture. For a given starting number n, the stopping time is the smallest number of times you need to apply the function to get a result that is less than n. For example, if you start with the number 12 and apply the function, you get the sequence: 12, 6, 3, 10, 5, 16, 8, 4, 2, 1. The stopping time for 12 is 3, because it takes three applications of the function to reach a number smaller than 12. There is also a related concept called total stopping time, which is the smallest number of times you need to apply the function to reach 1.

Plot of numbers 1-9999 against their Collatz total stopping time. - Image by Cirne on Wikimedia Commons

If the conjecture is false, it can only be because the starting number gives rise to a sequence that does not contain 1, ever. Such a sequence would either enter a repeating cycle that excludes 1, (another repeating loop other than 4, 2, 1) or simply increase without bound. However, no such sequence has been found.

Now, at first glance it may seem like the sequence should continue to grow indefinitely, right? After all, the odd numbers are tripled, while the even numbers are merely split in half. Well, the fact of the matter is, if you triple an odd number and add one, you will always end up with an even number. 50% of the time, dividing by two brings you back to an odd number. 25% of the time, you can divide by four (divide by 2 two times) before you get to the next odd number. 12.5% of the time, you can divide by eight to get to the next odd number. 1/16 of the time, you can divide by 16 and so on. So, in reality, the sequence experiences an overall downward trend.

Attempts at Analyzing the Conjecture

Another name for the Collatz conjecture is the Hailstone Sequence because the numbers go up and down like hailstones in a storm cloud, at some point eventually plummeting to the ground.

Mathematicians decided to look at the paths of the hailstone numbers to identify any patterns. Sure, the obvious pattern is that they all end up at one, but what about the paths they took to get there?

Histogram of total stopping times for the numbers 1 to 10⁸. Total stopping time is on the x axis, frequency on the y axis. - Image by AllenMcC. on Wikimedia Commons

Here is a graph of natural numbers from 1 to 10⁸. The graph peaks and drops so low that any changes can be deemed as negligible. But if you take the logarithm, you will find that the graph fluctuates around a downward trend. It resembles, in a way, the stock market on "a bad day". This is no coincidence, however, as both are examples of geometric Brownian motion.

A Geometric Brownian Motion (GBM) is a mathematical model used to describe the random behavior of stock prices or other financial assets over time. It's a type of stochastic process, meaning it incorporates randomness into the model.

Another way to analyse the sequence is by looking at the leading digit of each number. You can count how many numbers start with the digits 1, 2, 3, ... up to 9, and creating a histogram to visualise it. For the first billion sequences, the number 1 is the most common leading digit. 30% of all numbers start with 1, 17.47% start with 2, 12.09% start with 3, and the frequency decreases for higher digits with under 5% of all numbers starting with 9. This distribution is known as Benford's law.

Benford's Law works best when the numbers involved span several orders of magnitude as the 3n+1 does. However, it does not disclose whether the numbers will all end up in the 4, 2, 1 loop or not. That requires a different sort of analysis.

Directed Graph

Another way to visualise the Collatz Conjecture is through a directed graph. A directed graph is defined as a type of graph where the edges have a direction associated with them.

Such a graph is constructed for each natural number as follows:

Node: In the graph, every natural number (like 1, 2, 3, etc.) is a "node."
Parent Node: Each node has a "parent node" (a node that points to it). The rule to find the parent depends on whether the number is even or odd:
- If the number (n) is even, the parent is n/2.
- If the number is odd, the parent is (3n+1)/2
Left Child: The left child of a node is always 2n (just multiply the number by 2).
Right Child: The right child follows a special rule:
- If n ≡ 2 mod 3 (this means when you divide n by 3, the remainder is 2), the right child is (2n-1)/3.
- If n ≠ 2 mod 3 (the remainder is not 2), then there is no right child.

There are four basic types of graphs based on these rules. Each graph is formed differently depending on whether the numbers (nodes) are even, odd, or satisfy the mod 3 condition for the right child.

To apply this to all the sequences in the Collatz Conjecture, we must connect the basic directed graphs. A simple rule to connect two basic directed graphs is that these two basic directed graphs must have a common edge as shown below:

If the conjecture is true it would mean that every single number is connected to this group, every tiny stream out to infinity eventually flows into the massive river of 4, 2, 1.

Directed graph showing the orbits of small numbers under the Collatz map, skipping even numbers. The Collatz conjecture states that all paths eventually lead to 1. - Image by Keenan Pepper on Wikimedia Commons

Concluding Thoughts

As with most of the math topics I discuss here, I have not even reached the tip of the iceberg when it comes to the Collatz Conjecture. There are still so many proofs out there and different ways of looking at it, involving a level of mathematics that I have yet to comprehend.

Even though it "is an extraordinarily difficult problem, completely out of reach of present day mathematics" (Jeffrey Lagarias, 2010) I believe it still something worth investigating and learning about, given how a simple set of rules could spark so much confusion and ultimately stump some of the world's best mathematicians.

What also interested me the most was the visually-striking diagrams used to portray the Collatz conjecture. For example: