(see also Part 1 of this post)

##### Representing Python objects at C level

In CPython, Python objects are represented as C structs. While struct members can vary depending on the object type, all `PyObject` instances contain at least the following two members, i.e. the so-called `PyObject_HEAD`:

- `ob_refcnt` – the number of references to the object. It is used for garbage collection, since objects that are no longer referenced by anything should be cleaned up to avoid memory leaks.
- `ob_type` – a pointer to a *type object*, which is a special object describing the referencing object’s type.
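Both members can be observed from Python itself. The following is a small sketch, assuming a standard CPython build: `sys.getrefcount()` reports a value derived from `ob_refcnt`, and `type()` returns the object that `ob_type` points to. The helper name `refcount_delta` is made up for this illustration.

```python
import sys


def refcount_delta():
    """Return how much the reference count grows when one more reference is added."""
    obj = object()
    before = sys.getrefcount(obj)
    holder = [obj]  # the list now holds one additional reference to obj
    after = sys.getrefcount(obj)
    del holder
    return after - before


# The type object that ob_type points to is what type() hands back.
int_type = type(2)

print(refcount_delta())
print(int_type)
```

The absolute numbers returned by `sys.getrefcount()` depend on the interpreter version, so the sketch only looks at the difference.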

The segment of the interpreter code for the `BINARY_ADD` instruction that was omitted for brevity in Part 1 is the following:

    if (PyUnicode_CheckExact(left) && PyUnicode_CheckExact(right)) {
        sum = unicode_concatenate(left, right, f, next_instr);
        /* unicode_concatenate consumed the ref to left */
    }
    else {
        sum = PyNumber_Add(left, right);
        Py_DECREF(left);
    }
    Py_DECREF(right);

Here Python checks if the `left` and `right` operands are both Unicode instances, i.e. strings. It does that by inspecting their type objects. If both operands are indeed strings, it performs string concatenation on them, but for anything else the `PyNumber_Add()` function gets called. Since the operands `2` and `5` in our case are integers, this is exactly what happens. There is also some reference count management (the `Py_DECREF()` macro), but we will not dive into that.
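The branching can be mimicked in a few lines of Python. This is only a rough analogue, not the interpreter’s code: the function name `binary_add_sketch` is made up, `type(x) is str` stands in for the exact type check, and `operator.add()` stands in for `PyNumber_Add()`.

```python
import operator


def binary_add_sketch(left, right):
    """Rough Python analogue of the two branches taken for BINARY_ADD."""
    # Exact check: subclasses of str do not qualify for the string branch.
    if type(left) is str and type(right) is str:
        return "".join((left, right))  # stand-in for unicode_concatenate()
    return operator.add(left, right)   # stand-in for PyNumber_Add()


print(binary_add_sketch(2, 5))
print(binary_add_sketch("2", "5"))
```

Reference counting has no counterpart here, because Python code never manages it by hand.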

`PyNumber_Add()` first tries to perform the `add` operation on the given operands `v` and `w` (two pointers to `PyObject`) by invoking `binary_op1(v, w, NB_SLOT(nb_add))`. If the result of that call is `Py_NotImplemented`, it further tries to concatenate the operands as sequences. This is not the case with integers, however, so let’s have a look at the `binary_op1()` function located in the `Objects/abstract.c` file:

    static PyObject *
    binary_op1(PyObject *v, PyObject *w, const int op_slot)
    {
        PyObject *x;
        binaryfunc slotv = NULL;
        binaryfunc slotw = NULL;
        if (v->ob_type->tp_as_number != NULL)
            slotv = NB_BINOP(v->ob_type->tp_as_number, op_slot);
        if (w->ob_type != v->ob_type &&
            w->ob_type->tp_as_number != NULL) {
            slotw = NB_BINOP(w->ob_type->tp_as_number, op_slot);
            if (slotw == slotv)
                slotw = NULL;
        }
        if (slotv) {
            if (slotw && PyType_IsSubtype(w->ob_type, v->ob_type)) {
                x = slotw(v, w);
                if (x != Py_NotImplemented)
                    return x;
                Py_DECREF(x); /* can't do it */
                slotw = NULL;
            }
            x = slotv(v, w);
            if (x != Py_NotImplemented)
                return x;
            Py_DECREF(x); /* can't do it */
        }
        if (slotw) {
            x = slotw(v, w);
            if (x != Py_NotImplemented)
                return x;
            Py_DECREF(x); /* can't do it */
        }
        Py_RETURN_NOTIMPLEMENTED;
    }
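For readers more at home in Python than in C, here is a loose transcription of the same control flow. It is a sketch under assumptions: the classes `Plain` and `Special` and their `nb_add` attribute are hypothetical stand-ins for type objects and their number slots, and reference counting is left out entirely.

```python
def binary_op1_sketch(v, w, slot_name):
    """Mimic the order in which binary_op1() tries the two operands' slots."""
    type_v, type_w = type(v), type(w)
    slotv = getattr(type_v, slot_name, None)
    slotw = None
    if type_w is not type_v:
        slotw = getattr(type_w, slot_name, None)
        if slotw is slotv:
            slotw = None  # same function, no point in trying it twice
    if slotv is not None:
        if slotw is not None and issubclass(type_w, type_v):
            x = slotw(v, w)  # the subclass gets the first chance
            if x is not NotImplemented:
                return x
            slotw = None
        x = slotv(v, w)
        if x is not NotImplemented:
            return x
    if slotw is not None:
        x = slotw(v, w)
        if x is not NotImplemented:
            return x
    return NotImplemented


class Plain:
    """Hypothetical type whose nb_add slot always gives up."""
    def nb_add(v, w):
        return NotImplemented


class Special(Plain):
    """Hypothetical subclass that overrides the slot."""
    def nb_add(v, w):
        return "Special.nb_add"
```

With these stand-ins, `binary_op1_sketch(Plain(), Special(), "nb_add")` is answered by the subclass slot, while two `Plain` operands fall through to `NotImplemented`.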

##### Delegating the work to the right function

The `binary_op1()` function expects references to two Python objects and the binary operation that should be performed on them. The actual function that will perform this operation is obtained with the following:

    NB_BINOP(v->ob_type->tp_as_number, op_slot)

Remember how each `PyObject` contains a reference to another object describing the former’s type, i.e. the `ob_type` struct member? For integers this is the `PyLong_Type` located in `Objects/longobject.c`.

`PyLong_Type` has the `tp_as_number` member, a reference to a structure holding pointers to all “number” methods available on Python `int` objects (integers in Python 3 are what is known as the `long` type in Python 2):

    static PyNumberMethods long_as_number = {
        (binaryfunc)long_add,       /*nb_add*/
        (binaryfunc)long_sub,       /*nb_subtract*/
        (binaryfunc)long_mul,       /*nb_multiply*/
        long_mod,                   /*nb_remainder*/
        ...
    }

Finally, there is the `NB_BINOP(nb_methods, slot)` macro that picks a particular method from this list. Since in our case `binary_op1()` is invoked with `NB_SLOT(nb_add)` as the third argument, the function for *adding* two integers is returned.
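The slot can also be reached from Python: on CPython, `int.__add__` is a thin wrapper around the `nb_add` slot of the integer type. A short sketch, in which the variable name `add_slot` is just an illustration:

```python
# Look the method up on the type object, much like the C code follows ob_type.
add_slot = type(2).__add__

print(add_slot)            # on CPython this is a slot wrapper object
print(add_slot(2, 5))      # the integer addition function does the work
print(add_slot(2, "5"))    # an unsupported operand yields NotImplemented
```

The last call shows the `NotImplemented` value that `binary_op1()` checks for before moving on to the other operand.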

Now, with two operands in the expression `left + right`, a decision needs to be made about which operand supplies the addition function used to compute the result. As explained in a helpful comment above the `binary_op1()` function, the order is as follows:

- If `right` is a strict subclass of `left`, `right.add(left, right)` is tried.
- `left.add(left, right)` is tried.
- `right.add(left, right)` is tried (unless it has already been tried in the first step).

Python tries to do its best to obtain a meaningful result, i.e. something other than `NotImplemented`, and if one of the operands does not support the operation, the other one is tried, too.
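The same priority rule is visible from pure Python through the `__add__` and `__radd__` methods. The sketch below records which method gets called; the class names and the `trace()` helper are made up for this illustration.

```python
calls = []


class Base:
    def __add__(self, other):
        calls.append("Base.__add__")
        return NotImplemented


class Derived(Base):
    def __radd__(self, other):
        calls.append("Derived.__radd__")
        return "handled by Derived"


class Unrelated:
    def __radd__(self, other):
        calls.append("Unrelated.__radd__")
        return "handled by Unrelated"


def trace(left, right):
    """Evaluate left + right and report the result with the methods tried."""
    calls.clear()
    result = left + right
    return result, list(calls)


print(trace(Base(), Derived()))
print(trace(Base(), Unrelated()))
```

With a `Derived` right operand, only `Derived.__radd__` runs, because the subclass gets the first chance. With an `Unrelated` right operand, `Base.__add__` is tried first and `Unrelated.__radd__` is consulted only after it returns `NotImplemented`.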

##### Nailing it

So which function is the one that actually computes the sum of `2` and `5` in the end?

It’s the `long_add()` function implemented in `Objects/longobject.c`. It is perhaps a bit more complex than expected, because it needs to support the addition of integers of arbitrary length while still being fast for integers small enough to fit into a CPU register.
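To give a feel for what adding integers of arbitrary length involves, here is a schoolbook addition over fixed-size digits. It is a sketch of the general idea for non-negative numbers only, not a transcription of `long_add()`; the helper names are invented, and the digit width is taken from `sys.int_info.bits_per_digit`.

```python
import sys

SHIFT = sys.int_info.bits_per_digit  # bits stored per internal digit
MASK = (1 << SHIFT) - 1


def to_digits(n):
    """Split a non-negative integer into little-endian digits of SHIFT bits."""
    digits = []
    while n:
        digits.append(n & MASK)
        n >>= SHIFT
    return digits


def from_digits(digits):
    """Reassemble an integer from its little-endian digits."""
    return sum(d << (SHIFT * i) for i, d in enumerate(digits))


def add_digits(a, b):
    """Add two digit lists, propagating the carry from digit to digit."""
    if len(a) < len(b):
        a, b = b, a  # make a the longer operand
    result = []
    carry = 0
    for i, digit in enumerate(a):
        carry += digit + (b[i] if i < len(b) else 0)
        result.append(carry & MASK)
        carry >>= SHIFT
    if carry:
        result.append(carry)
    return result


def add_via_digits(x, y):
    """Add two non-negative integers through their digit representation."""
    return from_digits(add_digits(to_digits(x), to_digits(y)))


print(add_via_digits(2, 5))
print(add_via_digits(2**100, 2**100 - 1) == 2**101 - 1)
```

Small operands such as `2` and `5` occupy a single digit, which is the case where a shortcut pays off.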

Whoa! After all that digging down the rabbit hole, we finally found the right function. That is quite a lot of extra work for an operation as simple as addition, but it is the price we pay for Python’s dynamic nature. Remember that the same `add(x, y)` function we wrote in Part 1 of this post works out of the box with different operand types, and I hope the mechanisms behind the scenes that allow for this are now clearer.

    >>> add(2, 5)
    7
    >>> add('2', '5')
    '25'
    >>> add([2], [5])
    [2, 5]

As always, comments, suggestions, praise, and (constructive) criticism are all welcome. Thanks for reading!