Big O Notation



Time for me to type 2000 words on a random tech subject because I'm bored. This time, I'm going to talk about the basics of Big O notation and why it (might) matter to you.


Big O is a tool we use to answer a simple question:

“About how bad is this thing going to get when the pile of stuff it has to cope with grows?”

The “thing” can be an algorithm, a database query, a React component, whatever. The “pile of stuff” might be the number of customers, photos, tweets, torrented movies you can't bear to delete even though you really didn't like watching them, whatever. Big O is your way to tell if your server will blow up.

Before we talk about it, though, you may have a few questions.

The Questioner

"Why not just time it?"

We could just run the code and look at a stopwatch afterwards, but:

  1. Your phone, my MacBook, and the Azure us-east-2c instance all have different opinions about time. And latency. And consistency.
  2. “Works fine on 20 items” is a lie you tell yourself to feel better about your impending scrum deadline. “Works fine on 200,000 items” is too. A timing at one size tells you nothing reliable about any other size.
  3. Most people either cannot or do not have the ability to test suitable real-world scenarios, whether because of hardware, software, or just plain time limitations. (Unless you have an 8-digit budget and aren't concerned with spinning up an entire AWS instance just to play craps with a test database. But then, presumably, you are a C-suite exec, which explains why you're reading this instead of listening to your subordinates.)

Big O lets us forget the hardware for a second and just focus on how the algorithm itself behaves as input size grows to infinity, or, in regular terms, “how the thing behaves as we shove increasingly large piles of junk into it”. (Look, if you take offense to me calling your inputs 'junk', you have not met the average end user. I have seen entire Slack threads pasted inside Search columns.)

"How are we going to count to infinity?"

Patience and a really big abacus. Hope you're willing to work off the clock!

Just kidding. If you took first-year calculus, you can probably skip this part since you already know the answer. See, turns out a lot of algorithms' running times behave like functions, and if we take the limit of the running time as the input grows, $\lim_{ n \to \infty } T(n)$, we can study how it behaves as it approaches the big infinity. This also lets us drop constant factors, small terms, and terms that don't grow quickly. For example, we can say stuff like $O(100k \cos^{2}{n} \cdot n) = O(n)$, since $\cos^{2}{n}$ never exceeds $1$. If you're a big-words guy, this is called asymptotic behavior. But, as you may see later, quite a bit of the time we won't even need to go there: there is often a rule-of-thumb way to get around actually doing calculus (eww!).
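To see why the constant and the wiggly cosine both drop out, divide that expression by $n$:

$$\frac{100k \cos^{2}{n} \cdot n}{n} = 100k \cos^{2}{n} \le 100k$$

The ratio never exceeds a fixed constant, and "bounded above by a constant multiple of $n$" is exactly what $O(n)$ means.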

"But our input stack will never be infinity!"

Well, it works mighty well as a placeholder for a really big value. Let's say you're making an app. You test some filtering algorithm with 100 items, then 1,000, then you even stretch it out to 100,000 items. You declare this functional, push to production, and all is well. Then, a week later, your app blows up from TikTok or however all the trendy apps blow up these days, and you've got 80,000 users waiting at the front door! Turns out that algorithm of yours was $O(n^{2})$, and now you can't capitalize on your five minutes of fame because your server instance keeps blowing up every time a user so much as breathes on it. Cue stress, anxiety, stroke at 30, death, no more Taco Tuesdays. Don't let this happen to you! Test using Big O Notation™ today!

Anyways, as ridiculous as that sounds, it does happen. Just ask Cloudflare.
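Here's a minimal sketch of how an innocent-looking filter goes quadratic, and the one-line change that fixes it. The function names are made up for illustration:

```python
def common_items_quadratic(a, b):
    """O(n * m): for each item in a, 'x in b' scans the whole list b."""
    return [x for x in a if x in b]

def common_items_linear(a, b):
    """O(n + m): build a set once; each membership check is O(1) on average."""
    b_set = set(b)
    return [x for x in a if x in b_set]

a = list(range(1000))
b = list(range(500, 1500))

# Same answer, wildly different scaling behavior as the lists grow.
assert common_items_quadratic(a, b) == common_items_linear(a, b)
```

Both versions pass your 100-item test. Only one of them survives TikTok.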

The Big O

In any case, let's talk about the notation itself.

It looks something like this: $$O(n)$$

This would be read as "the Big $O$ of $n$". $n$ is the size of your input stack, and $O(n)$ represents the runtime of the algorithm, a.k.a. how much time it takes to run the algorithm as $n$ grows. This one here is the simplest and a very common one, so common we have a name for it: linear time complexity. We call it that because the runtime grows in a straight line with the input: if we have an input stack of size $n=10$, we will need to execute $10$ steps in order to complete the task. $n=1000$, $1000$ steps, and so on. Quite a few of your favorite operations in any given programming language will be $O(n)$, like looping through an array:


    numbers = [1, 2, 3, 4, 5]
    for num in numbers:  # O(n): one operation per element
        print(num)
   

Common $O$ Values
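
The classes you'll bump into most often, from best to worst, are $O(1)$, $O(\log{n})$, $O(n)$, $O(n \log{n})$, and $O(n^{2})$. A quick sketch to make the growth gap concrete (the absolute numbers are meaningless; the jump between columns is the point):

```python
import math

# Rough "step counts" for common complexity classes at a few input sizes.
classes = {
    "O(1)":       lambda n: 1,
    "O(log n)":   lambda n: math.log2(n),
    "O(n)":       lambda n: n,
    "O(n log n)": lambda n: n * math.log2(n),
    "O(n^2)":     lambda n: n ** 2,
}

for name, f in classes.items():
    row = "  ".join(f"{f(n):>15,.0f}" for n in (10, 1_000, 100_000))
    print(f"{name:>10}: {row}")
```

At $n = 100{,}000$, the $O(\log{n})$ row is still under $17$ "steps" while the $O(n^{2})$ row is at ten billion. That spread is the entire sales pitch for this notation.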

That's the basics of O notation. I've written a bit more on some questions and considerations you may have.


How do we pick the “n”?

Whichever thing is growing fastest and matters most. The $n$ is just a stand-in for the variable that measures the size or complexity of your input, and different problems have different definitions.

When you write a Big O bound, pick the variable (or variables) that will cause resource problems first.
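Sometimes more than one variable matters, and collapsing them into a single $n$ hides the real cost. A hypothetical sketch (the function and its inputs are made up for illustration):

```python
def build_feed_pairs(users, posts):
    """Hypothetical feed builder: pairs every user with every post.

    Two inputs grow independently, so the honest bound is O(u * p),
    not O(n). Which variable dominates depends on which one grows
    faster in production.
    """
    return [(u, p) for u in users for p in posts]

# With 3 users and 4 posts we do 3 * 4 = 12 units of work.
result = build_feed_pairs(["ann", "bo", "cy"], [1, 2, 3, 4])
assert len(result) == 3 * 4
```

If posts grow a thousand times faster than users, "it's linear in users" is technically true and practically useless.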

Asterisks and Other Considerations

The Blind Spot

Let me tell you a story.

I was testing a sorting algorithm once. It was this thing that would let you inject multiple typed desiderata into a filtering/sort system to semi-randomly seed your results according to multiple factors. Don't ask, it didn't pan out. Anyways, according to my calculations, it would be $O(n \log{n})$ in complexity: it worked kind of the same as heap sort, I guess. But, for some reason, when I was testing it, it didn't act like that at all. Whether I added $50$, $500$, or $5000$ values, it would always take around $10$ seconds; only when I bumped the dummy load up to $100,000$ values did it start to take a little longer. What the hell, man, I was thinking. What was happening?

Digging into some logs, I started spotting a pattern. Every time I'd start a test, the system would quietly shoot a bunch of GET requests to a dummy API (some vestigial bit from early on where I thought I'd benchmark runs against a database in a Cloudflare D1 bucket). Each request waited for a timeout before continuing. Guess how long the timeout was. If the cloud didn't answer (and it never did, because I hadn't gotten around to setting it up yet), the whole run would hang until the timeout expired.

That's the lesson.

Big O tells you how the algorithm scales, but it doesn't account for everything that can cause a slowdown: slow networks, cache misses, hardware limitations, anything other than the algorithm itself. In this case, my sorting algorithm always had a 10-second weight dragging along behind it. Only when $n$ was so big it generated a second, bigger weight (thanks, $O(n \log n)$! You did your job great :D) did the expected curve appear.

Appendix A

Let's break down how we get $n \log{ n}$ for a very common sorting algorithm- Merge Sort.

But first, a quick TL;DR of how merge sort works: We split the array into halves, sort each, then merge them back together.

Splitting

We are splitting the array until we get subarrays of length $1$. To do this, it turns out we need $\log_{2}{n}$ steps. But how? Each round of splitting cuts every subarray in half, so after $k$ rounds the subarrays have length $n/2^{k}$. They hit length $1$ exactly when $2^{k} = n$, which is to say when $k = \log_{2}{n}$.

Isn't that neat? (Explore recurrence trees for a more scientific explanation of this topic.)
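You can sanity-check the halving count directly. A tiny sketch (assuming $n$ is a power of two so the halving stays exact):

```python
import math

def split_depth(n):
    """Count how many times n can be halved before reaching length-1 subarrays."""
    depth = 0
    while n > 1:
        n //= 2   # one round of splitting halves every subarray
        depth += 1
    return depth

for n in (8, 1024, 2**20):
    print(f"n = {n:>8}: {split_depth(n)} rounds (log2 says {math.log2(n):.0f})")
```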

Merging

Going up each level of the recursion, we need to merge the sorted subarrays back together. Merging two sorted subarrays takes time linear in their combined length, and since every level of the recursion contains all $n$ elements in total, merging an entire level costs $O(n)$.

Intuitively, we thus have two numbers to combine: the number of levels, $O(\log{n})$ (from the splitting on the 'way down'), and the work done at each level, $O(n)$ (from the merging on the 'way up'). We combine them by multiplying:

$$\boxed{\text{ Total Work } = O(n) \cdot O(\log{n}) = O(n \log{n})} $$
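The whole derivation above can be sketched as a textbook merge sort. This is a minimal version for illustration, not production code:

```python
def merge(left, right):
    """Merge two sorted lists by repeatedly taking the smaller head element: O(n)."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])    # at most one side has leftovers
    out.extend(right[j:])
    return out

def merge_sort(arr):
    """O(log n) levels of splitting, O(n) of merging per level: O(n log n) total."""
    if len(arr) <= 1:       # a single element is already sorted
        return arr
    mid = len(arr) // 2
    return merge(merge_sort(arr[:mid]),   # split: recurse on each half
                 merge_sort(arr[mid:]))   # merge: combine on the way back up

assert merge_sort([5, 3, 8, 1, 9, 2]) == [1, 2, 3, 5, 8, 9]
```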

Note that the base $2$ on the logarithm was removed because in Big O notation, we concern ourselves only with the behavior of the function, and the difference between base $10$ and base $x$ is only a constant factor. We use the change of base formula as proof:

$\log_{x}{n}= \frac{\log_{10}{n}}{\log_{10}{x}} = \frac{1}{\log_{10}{x}}\log_{10}{n}$, and since $\frac{1}{\log_{10}{x}}$ is just a constant factor, $O(\log_{x}{n}) = O(\log{n})$.



