How Python computes 2 + 5 under the hood (part 2)

(see also Part 1 of this post)

Representing Python objects at C level

In CPython, Python objects are represented as C structs. While struct members can vary depending on the object type, all PyObject instances contain at least the following two members, i.e. the so-called PyObject_HEAD:

  • ob_refcnt – the number of references to the object. Used for garbage
    collection purposes, since the objects that are not referred by anything anymore
    should be cleaned up to avoid memory leaks.
  • ob_type – a pointer to a type object, which is a special object describing
    the referencing object’s type.

The segment of the interpreter code for the BINARY_ADD instruction that was omitted for brevity in Part 1 is the following:

if (PyUnicode_CheckExact(left) &&
         PyUnicode_CheckExact(right)) {
    sum = unicode_concatenate(left, right, f, next_instr);
    /* unicode_concatenate consumed the ref to left */
else {
    sum = PyNumber_Add(left, right);

Here Python checks if the left and right operands are both Unicode instances, i.e. strings. It does that by inspecting their type objects. If both operands are indeed strings, it performs string concatenation on them, but for anything else the PyNumber_Add() function gets called. Since the operands 2 and 5 in our case are integers, this is exactly what happens. There is also some reference count management (the Py_DECREF() macro), but we will not dive into that.

PyNumberAdd() first tries to perform the add operation on the given operands v and w (two pointers to PyObject) by invoking binary_op1(v, w, NB_SLOT(nb_add)). If the result of that call is Py_NotImplemented, it further tries to concatenate the operands as sequences. This is not the case with integers, however, so let’s have a look at the binary_op1() function located in Objects/abstract.c file:

static PyObject *
binary_op1(PyObject *v, PyObject *w, const int op_slot)
    PyObject *x;
    binaryfunc slotv = NULL;
    binaryfunc slotw = NULL;

    if (v->ob_type->tp_as_number != NULL)
        slotv = NB_BINOP(v->ob_type->tp_as_number, op_slot);
    if (w->ob_type != v->ob_type &&
        w->ob_type->tp_as_number != NULL) {
        slotw = NB_BINOP(w->ob_type->tp_as_number, op_slot);
        if (slotw == slotv)
            slotw = NULL;
    if (slotv) {
        if (slotw && PyType_IsSubtype(w->ob_type, v->ob_type)) {
            x = slotw(v, w);
            if (x != Py_NotImplemented)
                return x;
            Py_DECREF(x); /* can't do it */
            slotw = NULL;
        x = slotv(v, w);
        if (x != Py_NotImplemented)
            return x;
        Py_DECREF(x); /* can't do it */
    if (slotw) {
        x = slotw(v, w);
        if (x != Py_NotImplemented)
            return x;
        Py_DECREF(x); /* can't do it */
Delegating the work to the right function

The binary_op1() function expects references to two Python objects and the binary operation that should be performed on them. The actual function that will perform this operation is obtained with the following:

NB_BINOP(v->ob_type->tp_as_number, op_slot)

Remember how each PyObject contains a reference to another object describing the former’s type, i.e. the ob_type struct member? For integers this is the PyLong_Type located in Objects/longobject.c.

PyLong_Type has the tp_as_number member, a reference to a structure holding pointers to all “number” methods available on Python int objects (integers in Python 3 are what is known as the long type in Python 2):

static PyNumberMethods long_as_number = {
    (binaryfunc)long_add,       /*nb_add*/
    (binaryfunc)long_sub,       /*nb_subtract*/
    (binaryfunc)long_mul,       /*nb_multiply*/
    long_mod,                   /*nb_remainder*/

Finally there is the NB_BINOP(nb_methods, slot) macro that picks a particular method from this list. Since in our case binary_op1() is invoked with NB_SLOT(nb_add) as the third argument, the function for adding two integers is returned.

Now, with two operands in the expression left + right, a decision needs to be made which operand should be used to pick the addition function from to compute the result. As explained in a helpful comment above the binary_op1() function, the order is as follows:

  • If right is a strict subclass of left, right.add(left, right) is tried.
  • left.add(left, right) is tried.
  • right.add(left, right) is tried (unless it hast already been tried in the first step).

Python tries to do its best to obtain a meaningful result, i.e. something other than NotImplemented, and if one of the operands does not support the operation, the other one is tried, too.

Nailing it

So which function is the one that actually computes the sum of 2 and 5 in the end?

It’s the long_add() function implemented in Objects/longobject.c. It is perhaps a bit more complex than expected, because it needs to support the addition of integers of arbitrary length, and still performing fast for integers small enough to fit into a CPU register.

Whoa! After all the digging down the rabbit hole we finally found the right function. Quite a lot of extra work for such a simple operation the addition is, but that’s the price we have to pay in order to get the Python’s dynamic nature in exchange. Remember that the same add(x, y) function we wrote in Part 1 of this post works out of the box with different operand types, and I hope the mechanisms behind the scenes that allow for this are now more clear.

>>> add(2, 5)
>>> add('2', '5')
>>> add([2], [5])
[2, 5]

As always, comments, suggestions, praise, and (constructive) criticism are all welcome. Thanks for reading!

How Python computes 2 + 5 under the hood (part 1)

Suppose we have a very simple Python function that accepts two arguments and returns their sum, and let’s name this function with an (un)imaginative name add:

def add(x, y):
    return x + y

>>> add (2, 5)

As a bonus, since Python is a dynamic language, the function also works with (some) other argument types out of the box. If given, say, two sequences, it returns their concatenation:

>>> add([1, 2], [3, 4])
[1, 2, 3, 4]
>>> add('foo', 'bar')

How does this work, you might ask? What happens behind the scenes when we invoke add()? We will see this in a minute.

Python 3.7 will be used in the examples (currently in beta 1 at the time of writing).

The dis module

For a start, we will inspect the add() function by using a handy built-in module dis.

>>> import dis
>>> bytecode = dis.Bytecode(add)
>>> print(
Name:              add
Filename:          <stdin>
Argument count:    2
Kw-only arguments: 0
Number of locals:  2
Stack size:        2
   0: None
Variable names:
   0: x
   1: y

Besides peeking into function’s metadata, we can also disassemble it:

>>> dis.dis(add)
  2           0 LOAD_FAST                0 (x)
              2 LOAD_FAST                1 (y)
              4 BINARY_ADD
              6 RETURN_VALUE

Disassembling shows that the function is comprised of four primitive bytecode instructions, which are understood and interpreted by the Python virtual machine.

Python is a stack-based machine

In CPython implementation, the interpreter is a stack-based machine, meaning that it does not have registers, but instead uses a stack to perform the computations.

The first bytecode instruction, LOAD_FAST, pushes a reference to a particular local variable onto the stack, and the single argument to the instruction specifies which variable that is. LOAD_FAST 0 thus picks a reference to x, because x is the first local variable, i.e. at index 0, which can be also seen from the function’s metadata presented just above.

Similarly, LOAD_FAST 1 pushes a reference to y onto the stack, resulting in the following state after the first two bytecode instructions have been executed:

    +---+  <-- TOS (Top Of Stack)
    | y |
    | x |

The next instruction, BINARY_ADD, takes no arguments. It simply takes the top two elements from the stack, performs an addition on them, and pushes the result of the operation back onto the stack.

At the end, RETURN_VALUE takes whatever the remaining element on the stack is, and returns that element to the caller of the add() function.

Going even deeper (enter the C level)

The bytecode instructions themselves are also just an abstraction, and something needs to make sense of them. That “something” is the Python interpreter. In CPython, its reference implementation, this is a program written in C that loops through the bytecode given to it, and interprets the instructions in it one by one.

The heart of this machinery is implemented in Python/ceval.c file. It runs an infinite loop that contains a giant switch statement with each case (target) handling on of the possible bytecode operations.

This is how the code for the instruction BINARY_ADD looks like:

    PyObject *right = POP();
    PyObject *left = TOP();
    PyObject *sum;

    /* the computation of the "sum" omitted */

    if (sum == NULL)
        goto error;

POP(), TOP(), and SET_TOP() are convenience C macros that perform primitive interpreter stack operations such as popping the top value from the stack, or replacing the current TOS (Top Of Stack) value with a different one.

The code above is actually pretty straightforward. It pops the right-hand operand from the top of the stack, which is a reference to y in our case, and stores it under the name right. It then also stores a reference to the left-hand side operand, i.e. x, that became the new TOS.

After performing the calculation, it sets the sum, i.e. a reference to the result, as the new TOS, performs a quick error check, and dispatches the control to the next bytecode instruction in the line.

In Part 2 the representation of Python objects at C level is explained, and how adding two such objects is done.

%d bloggers like this: