How can Async Code Work on a Dumb Computer?
I was wondering recently about how async code actually works.
I can use it alright, but I wanted a more intuitive grasp of how the code is actually executed, i.e., what the machine is actually doing and how it can accomplish that async flow.
Computers are dumb after all and they only do what they’re instructed to do… so I wanted to know what the trick was.
Note: this is not a primer on async programming – you can check out this excellent section on Python’s FastAPI docs for that – nor is it a tutorial on how to write async code in any particular language. Plenty of resources already exist for those topics.
This article is about bridging the gap between the low-level details of machine code execution and the very high-level details of async code execution.
A review of code execution models
First of all, let’s start with a high-level review of execution models.
First question: how many processors do we have?
Two options, really:
- single processor
- multi-processor
Parallelism, in the sense of doing multiple things at the exact same time, is only truly possible on a multi-processor (or multi-core) machine. Since what happens on one processor can be replicated on multiple processors, and since async execution doesn’t have much to do with parallel execution, we’ll just focus on a single processor.
A processor executes one instruction at a time, in sequence, one after the other. If we define a task (or job) as a sequence of instructions forming a complete unit of work, in the context of a single processor we can then have two execution models:
- sequential execution: a job/task is completed fully before proceeding with the next,
- concurrent execution (aka multi-tasking): multiple tasks make progress over the same period of time, without ever running at the same instant. Since we’re in the context of a single processor, concurrent execution essentially means multiple tasks time-sharing the same single processor. Concurrent execution can then be further divided into
  - pre-emptive multi-tasking, aka uncooperative, in which tasks are interrupted and interleaved by a scheduler,
  - cooperative multi-tasking, in which tasks willingly yield control.
Note that async (or event-based) execution is a form of cooperative multi-tasking!
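To make cooperative multi-tasking a bit more concrete, here is a minimal sketch using Python generators. The names `task` and `run_round_robin` are made up for illustration, and this is a toy scheduler under those assumptions, not how any real runtime is implemented. Each task does one step of work and then willingly yields control, and the scheduler simply gives the next task a turn.

```python
from collections import deque

def task(name, steps, log):
    """A hypothetical task: does one unit of work, then yields control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # willingly hand control back to the scheduler

def run_round_robin(tasks):
    """Toy scheduler: give each unfinished task one turn at a time."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # resume the task until its next yield
            ready.append(current)  # not finished yet: back of the line
        except StopIteration:
            pass                   # task finished: drop it

log = []
run_round_robin([task("A", 2, log), task("B", 3, log)])
print(log)  # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

Notice that nothing interrupts a task here: if one of them never reached a `yield`, the others would never get a turn. That is the "cooperative" part.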
The concepts are simple per se. Confusion arises, however, because of overlapping abstractions.
Enter the Operating System
The OS is the runtime system of our code. That is, it is the system and environment in which our code is executed.
On the OS, the execution model is pre-emptive multi-tasking. In particular, we have concurrent execution of uncooperative, independent processes. Each process runs a bit before being interrupted (pre-empted) by the OS. The OS then lets the next process run a bit, and then the next… and so on. It does it all super quick so things look as if they’re running in parallel (but it’s an illusion).
Do note that the OS concurrency is completely transparent to the processes. So when we write our code we can abstract all that away and simply ignore it, just focusing on the single process running our code.
What about threads?
Threads are smaller units of execution inside a process. The OS runs them concurrently, just like it does with processes. And just like processes, they’re useful to give the illusion that different portions of a process are running in parallel.
They’re a lower-level detail that doesn’t add much to our discussion about execution models. So that’s all I want to say about them in this article.
So how does async work?
If you have a bit of a CS background like me, then all the stuff above is probably not new to you. You know that the OS takes care of scheduling processes on the CPU. You know that the execution model is pre-emptive concurrency. And you know that computers are dumb and they just execute one instruction after the other. You know that there’s no magic in the box.
So when you read that async or event-based execution is a cooperative execution model in which code yields control while waiting for an event to happen… you may freak out and disbelieve the whole thing.
I did.
“How can it yield control? How can it know when the event happened? Is it just glorified busy-waiting/polling?”
The trick is in the runtime. The key is realizing that our code is not necessarily the only thing running. Even in our dear independent process.
C was the first programming language I learned. In C, our source code gets compiled into binary code. And this binary code gets executed directly by the machine. Being a pretty low-level programming language, there’s little to no magic involved. You could have a look at the corresponding assembly code of a simple “Hello, World!” and you’d find it pretty straightforward.
Most modern programming languages, however, are a very different story: our code actually has its own runtime system and environment. Consider for example the Python interpreter, the JavaScript engine, the Java Virtual Machine, etc. The runtime – which is also a process on the OS! – runs our code and generally offers various amenities.
In particular, each runtime can then offer various execution models. These are “on top” of the concurrent execution of processes operated by the OS. The OS gives each process a certain slot of time it can use to do stuff. Then what each process does with that time is entirely up to the process itself. The runtime, in particular, uses its time slot to execute our code.
The runtime is actually running the show, including our code! It’s effectively reading our code and executing it. And that’s how we “yield control”. BTW, that’s also how some languages offer garbage collection and other niceties.
Here’s the gist: when the code wants to yield control (e.g., while waiting for an IO operation), it uses a certain syntax to tell the runtime to pause it and to resume execution once a certain event has occurred.
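As a rough illustration of that "pause me and resume me later" idea, here is a small sketch with a Python generator. The `download` task, the `run` function and the event names are all hypothetical. The task yields the name of the event it wants to wait for, and a toy runtime resumes it with the event's payload once that event shows up.

```python
def download(log):
    """Hypothetical task: pauses itself until the 'data_ready' event."""
    log.append("request sent")
    data = yield "data_ready"      # "pause me until this event occurs"
    log.append(f"got {data}")

def run(task, events):
    """Toy runtime: parks the task, resumes it when its event arrives."""
    waiting_for = next(task)       # run the task up to its first yield
    for name, payload in events:   # events arriving from "outside"
        if name != waiting_for:
            continue               # not the event the task is parked on
        try:
            waiting_for = task.send(payload)  # resume with the result
        except StopIteration:
            return                 # task ran to completion

log = []
run(download(log), [("timer", None), ("data_ready", "42 bytes")])
print(log)  # ['request sent', 'got 42 bytes']
```

The task never checks whether the data has arrived. It just stops at the `yield`, and the code around it decides when to continue it.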
The runtime actually has a framework to run our code in an async/event-based fashion! Usually, such a framework basically consists of
- a callback mechanism, a way for the code to specify what instructions to execute when the event it’s interested in happens
- a message queue, on which callback messages are posted once the event happens
- a so-called event-loop, which is just a fancy name to describe a while loop reading messages from the queue and calling back the specified code.
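The three pieces above can be sketched in a few lines of Python. This `EventLoop` class and its `on`, `post` and `run` methods are invented for illustration and are not the API of any real runtime. One simplifying assumption to keep in mind: this toy loop stops when the queue is empty, whereas a real event loop would typically wait for the OS to report new events.

```python
from collections import deque

class EventLoop:
    """Toy event loop: callbacks + message queue + a while loop."""

    def __init__(self):
        self.callbacks = {}   # event name -> list of callbacks
        self.queue = deque()  # messages waiting to be processed

    def on(self, event, callback):
        """Callback mechanism: register code to run when `event` happens."""
        self.callbacks.setdefault(event, []).append(callback)

    def post(self, event, payload):
        """Message queue: record that `event` happened."""
        self.queue.append((event, payload))

    def run(self):
        """Event loop: read messages and call back the registered code."""
        while self.queue:
            event, payload = self.queue.popleft()
            for callback in self.callbacks.get(event, []):
                callback(payload)

loop = EventLoop()
received = []
loop.on("file_read", lambda data: received.append(data.upper()))
loop.on("file_read", lambda data: loop.post("done", len(data)))
loop.on("done", received.append)

loop.post("file_read", "hello")
loop.run()
print(received)  # ['HELLO', 5]
```

Note how a callback can post new messages of its own, which is how one event ends up triggering the next.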
And that’s it!
Some languages add some syntactic sugar on top to make the development experience nicer and more efficient. But that’s the gist of it.
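For example, Python's `async`/`await` keywords together with the standard `asyncio` module are one such layer of sugar. In this small sketch (the `fetch` coroutine and its delays are made up, with `asyncio.sleep` standing in for a real IO wait), each `await` is a point where the code yields control back to the event loop.

```python
import asyncio

async def fetch(name, delay, log):
    """Hypothetical IO-bound task; the sleep stands in for a network wait."""
    log.append(f"{name} start")
    await asyncio.sleep(delay)  # yield control to the event loop
    log.append(f"{name} done")
    return name

async def main():
    log = []
    # Run both coroutines concurrently on the same single thread.
    results = await asyncio.gather(
        fetch("slow", 0.02, log),
        fetch("fast", 0.01, log),
    )
    return log, results

log, results = asyncio.run(main())
print(log)      # ['slow start', 'fast start', 'fast done', 'slow done']
print(results)  # ['slow', 'fast']
```

While "slow" is waiting, the loop runs "fast", and both finish in roughly the time of the longest wait rather than the sum of the two.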
And the trick behind it all is in the runtime. That’s where the magic happens ✨