Distinguishing between None and missing values in dictionaries with a sentinel

Whoa, it’s been quite a while since my last blog post. This is, of course, not my fault (it never is 😛); let’s instead blame the architect of this world who only gave us 24 hours in a single day. That’s not nearly enough to do all the things one wants! Anyhow, now that I have optimistically deluded myself that the excuse I just gave is convincing, I would like to share a tip for dealing with maybe-missing dictionary data.

This situation is quite common when receiving JSON data from an external API, which is normally handed to you, the programmer, parsed into a dictionary. Reading values from such dictionaries requires some caution to avoid unnecessary errors:

data = dict(foo=1, bar=2)
baz_value = data['baz']  # KeyError

The obvious, but somewhat naive, way of dealing with this is to first check if baz exists in the first place:

baz_value = data['baz'] if 'baz' in data else None

It works, but it’s verbose, and there is already a dict.get(key[, default]) method that does exactly that – return the value stored under key, if such key exists, otherwise return the default value (which defaults to None). It is thus a common practice to just use get() instead:

baz_value = data.get('baz')

However, there are cases when one actually needs to distinguish between missing values and values explicitly set to None. If get() returns None, it is not clear, without an additional check, whether the key was missing or its value really was None. It is also not recommended to use a different default value that can “never” possibly represent an actual value in the dictionary, as that assumption can change and break our code:

baz_value = data.get('baz', 'NO VALUE')

if baz_value == 'NO VALUE':
    # handle the "missing" case
else:
    # handle the "normal" case

If baz somehow happens to end up with the value 'NO VALUE' in data, the code above will not work as intended. One solution is to again use an explicit membership test, and only read the value if the latter succeeds:

if 'baz' not in data:
    # handle the "missing" case
else:
    baz_value = data['baz']
    # handle the "normal" case

The downside is that we need to do two dictionary lookups, one for the membership test, and another to actually retrieve the baz value.

Fortunately, there is an elegant fix to this – sentinels. A value picked for a sentinel must be something that is always uniquely distinguishable from the data. One option is to generate a unique string (with, say, uuid.uuid4()), which is “unique enough” for practical purposes; another is to instantiate a new object and test for identity:

MISSING = object()  # a sentinel
baz_value = data.get('baz', MISSING)

if baz_value is MISSING:
    # handle the "missing" case
elif baz_value is None:
    # handle the "normal" case when None
else:
    # handle all other "normal" cases

The explicit check for None is optional and you might not even need it, but it is nevertheless included in the above snippet to demonstrate all three possible outcomes, and how they can be handled with just a single dictionary lookup.
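If this pattern shows up in many places, it can be wrapped into a small helper. Here is a minimal sketch (the helper name get_or_default and the print call are made up for illustration):

_MISSING = object()  # module-level sentinel, shared by all lookups

def get_or_default(data, key, default=None):
    """Like dict.get(), but able to tell a missing key from a None value."""
    value = data.get(key, _MISSING)
    if value is _MISSING:
        # the key is genuinely absent, not merely set to None
        print('key {!r} is missing, falling back to the default'.format(key))
        return default
    return value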

Dropping into a debugger with the built-in breakpoint() function

One of the nice little things Python 3.7 brought us is the new built-in function breakpoint(), which was proposed in PEP 553. As its name suggests, it allows us to create a breakpoint in Python source code.

Now, creating breakpoints in the source is nothing new, but it was always a bit tedious:

name = input("Enter name: ")
import pdb; pdb.set_trace()
user = find_user(name=name)

Quite a lot of typing, and just like the author of the linked PEP, I also mistype it quite often. When deep into the bug tracking, such typos and script re-runs are moderately annoying and unnecessarily contribute to the overall cognitive load. Typing breakpoint() is slightly easier.

Additionally, if the project is set up to use auto code formatting tools such as black on every build, these tools are happy to refactor this debugging snippet to follow the style guide, adding insult to injury:

name = input("Enter name: ")
import pdb

pdb.set_trace()  # argh!!
user = find_user(name=name)

On the other hand, the following looks considerably cleaner and does not suffer from the same problem:

name = input("Enter name: ")
breakpoint()
user = find_user(name=name)

When called, the breakpoint() function calls the sys.breakpointhook() hook. The latter drops you into the built-in debugger pdb by default, but you can override the hook to do something else, such as invoking a completely different debugger or trolling:

import sys

def trolling():
    raise RuntimeError("Debugging not allowed!")

sys.breakpointhook = trolling  # for the lulz

...

breakpoint()  # RuntimeError: Debugging not allowed!

The default implementation of the hook also allows customizing the breakpoint() behavior through the PYTHONBREAKPOINT environment variable (provided that the hook was not overridden as above):

  • If PYTHONBREAKPOINT is not set or set to the empty string, pdb.set_trace() is called.
  • If set to "0", breakpoint() returns immediately and does not do anything. Useful for quickly disabling all breakpoints without modifying the source.
  • If set to anything else, e.g. "ipdb.set_trace", the value is treated as the name of the function to import and run. If importing fails, a warning is issued and the breakpoint is a no-op.
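A quick sketch of the last two behaviors (assuming the default hook is in place; per PEP 553 the environment variable is consulted on every breakpoint() call, so it can even be changed at runtime):

import os

os.environ['PYTHONBREAKPOINT'] = '0'
breakpoint()  # a no-op – all breakpoints are disabled

os.environ['PYTHONBREAKPOINT'] = 'ipdb.set_trace'
breakpoint()  # imports ipdb and calls set_trace(); warns and does nothing if the import fails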

Useful? Tell me what you think!

Enums to replace hardcoded string constants

TL;DR – If you just want to see how to make the following work with string enums:

FooEnum.BAR == "Bar"
# True, without having to say `FooEnum.BAR.value`

… scroll down to the Trick™ section. Otherwise keep reading.


You might have already seen an application that used string literals for common constant values. For example, in an app with business objects that can be in different states, code such as the following can be found:

if obj.state == "Active":
    # do something

...

if all_done():
  obj.state = "Completed"

Object states are represented by strings such as "Open", "Active", and "Completed", and there are many of these scattered around the code. Needless to say this implementation is not the best – it is susceptible to typos, and renaming a state requires a find & replace operation that can never go wrong (right?). A better approach is thus to store state names into constants (“constants” by convention at least), so that any future renamings can be done in a single place:

STATE_NEW = "Open"
STATE_ACTIVE = "Active"
STATE_DONE = "Completed"
...

if obj.state == STATE_ACTIVE:
    # do something

...

if all_done():
  obj.state = STATE_DONE

If there are more than a few such constants defined in the application, it makes sense to group the related ones into namespaces. The most straightforward way is to define them as class members:

class State:
    NEW = "Open"
    ACTIVE = "Active"
    DONE = "Completed"


if obj.state == State.ACTIVE:
    # etc.

Neat.

The State class has several drawbacks, however. Its members can be modified. Its members can be deleted. It is not iterable, thus compiling a list of all possible states is not elegant (one needs to peek into the class __dict__).

>>> State.NEW = "Completed"
>>> del State.ACTIVE
>>> list(
        (key, val) for key, val in State.__dict__.items()
        if not key.startswith('__')
    )
[('NEW', 'Completed'), ('DONE', 'Completed')]
>>> list(State)
Traceback (most recent call last):
    ...
TypeError: 'type' object is not iterable

Canonical solution: Enums

Starting with Python 3.4, the standard library provides the enum module that addresses these shortcomings (there is also a backport for older Python versions).

from enum import Enum

class State(Enum):
    NEW = "Open"
    ACTIVE = "Active"
    DONE = "Completed"

The only change is that the State class now inherits from Enum, suddenly making it more robust:

>>> State.NEW = "Completed"
# AttributeError: Cannot reassign members.
>>> del State.NEW
# AttributeError: State: cannot delete Enum member.
>>> list(State)
[<State.NEW: 'Open'>, <State.ACTIVE: 'Active'>, <State.DONE: 'Completed'>]

Each enum member is an object that has a name and a value:

>>> type(State.NEW)
<enum 'State'>
>>> State.NEW.name
'NEW'
>>> State.NEW.value
'Open'

There is a caveat, however – enum members can no longer be directly compared to string values:

>>> state = fetch_object_state(obj)  # assume "Open"
>>> State.NEW == state
False  # !!!

In order to work as expected, an enum member’s value must be compared:

>>> State.NEW.value == state
True

This is unfortunate, because the extra .value part makes the expression more verbose, and people might (rightfully) start complaining about readability. Not to mention that it represents a trap – it is too easy to forget about the .value suffix.

The standard library provides IntEnum, which makes the following work:

from enum import IntEnum

class Color(IntEnum):
    WHITE = 5
    BLACK = 10

>>> Color.WHITE == 5
True

Sadly, there is no “StringEnum” class, and it seems that you are on your own if you have string members. This reason alone can make some developers consider ditching enums altogether in favor of a plain class (first-hand experience).

The trick™

And now for the primary motivation for this post. Thank you for reading this far. 🙂

It is possible to use an enum while still preserving the convenience of a plain class when comparing the enum members to plain values. The trick is to subclass the type of enum members!

class State(str, Enum):  # <-- look, here
    NEW = "Open"
    ACTIVE = "Active"
    DONE = "Completed"

>>> State.NEW == "Open"
True
>>> State.NEW.value == "Open"
True

Even though this is described in the enum docs, one has to scroll down quite a lot towards the last quarter of the page to find it, thus you cannot blame yourself if you missed it the first time when you were just looking for a quick recipe.

With some creativity it is even possible to construct enums with types other than just the typical boring integers or strings. In a chess program, one could find the following enum useful to represent the corners of the board:

class Corner(tuple, Enum):
    TOP_LEFT = ('A', 8)
    TOP_RIGHT = ('H', 8)
    BOTTOM_LEFT = ('A', 1)
    BOTTOM_RIGHT = ('H', 1)

>>> rook_position = ('H', 8)
>>> is_top_corner = rook_position in (Corner.TOP_LEFT, Corner.TOP_RIGHT)
>>> is_top_corner
True

If you learned something new and found this trick useful, feel free to drop me a note. Thank you for reading!

Python attribute lookup explained in detail

A few months ago I gave a talk at the local Python meetup on how attribute lookup on an object works in Python. It is not as straightforward as it looks on the surface, and I thought it might be an interesting topic to present.

I received highly positive feedback from the listeners, confirming that they learned something new and potentially valuable. I used a Jupyter Notebook for the presentation, and if you prefer running the code examples to reading, you can jump straight into it – just download the notebook and play around with it. It contains quite a few comments, thus the examples should hopefully be self-explanatory.

Storing attributes on an object

Say we have the following instance:

class Foo(object):  # a new-style class
    x = 'x of Foo'

foo_inst = Foo()

We can inspect its attributes by peeking into the instance’s __dict__, which is currently empty, because the x from above belongs to the instance’s class:

>>> foo_inst.__dict__
{}

Nevertheless, an attempt to retrieve x from the instance succeeds, because Python finds it in the instance’s class. The lookup is dynamic and a change to a class attribute is also reflected on the instance:

>>> foo_inst.x
'x of Foo'
>>> Foo.x = 'new x of Foo'
>>> foo_inst.x
'new x of Foo'

But what happens when both the instance and its class contain an attribute with the same name? Which one takes precedence? Let’s inject x into the instance and observe the result:

>>> foo_inst.__dict__['x'] = 'x of foo_inst'
>>> foo_inst.__dict__
{'x': 'x of foo_inst'}
>>> foo_inst.x
'x of foo_inst'

No surprises here, the x is looked up on the instance first and found there. If we now remove the x, it will be picked from the class again:

>>> del foo_inst.__dict__['x']
>>> foo_inst.__dict__
{}
>>> foo_inst.x
'new x of Foo'

As demonstrated, instance attributes take precedence over class attributes – with a caveat. Contrary to what quite a lot of people think, this is not always the case, and sometimes class attributes shadow the instance attributes. Enter descriptors.

Descriptors

Descriptors are special objects that can alter the interaction with attributes. For an object to be a descriptor, it needs to define at least one of the following special methods: __get__(), __set__(), or __delete__().

class DescriptorX(object):

    def __get__(self, obj, obj_type=None):
        if obj is None:
            print('__get__(): Accessing x from the class', obj_type)
            return self

        print('__get__(): Accessing x from the object', obj)
        return 'X from the descriptor'

    def __set__(self, obj, value):
        print('__set__(): Setting x on the object', obj)
        obj.__dict__['x'] = '{0}|{0}'.format(value)

The class DescriptorX conforms to the given definition, and we can instantiate it to turn the attribute x into a descriptor:

>>> Foo.x = DescriptorX()
>>> Foo.__dict__['x']
<__main__.DescriptorX at 0x7fa0b2ff3790>

Accessing a descriptor does not simply return it as is the case with non-descriptor attributes, but instead invokes its __get__() method and returns the result.

>>> Foo.x
# prints: __get__(): Accessing x from the class <class '__main__.Foo'>
<__main__.DescriptorX at 0x7fa0b2ff3790>

Even though the result is actually the descriptor itself, the extra line printed to output tells us that its __get__() method was indeed invoked, returning the descriptor.

The __get__() method receives two arguments – the instance on which an attribute was looked up (can be None if accessing an attribute on a class), and the “owner” class, i.e. the class containing the descriptor instance.

Let’s see what happens if we access a descriptor on an instance of a class, and that instance also contains an attribute with the same name:

>>> foo_inst.__dict__['x'] = 'x of foo_inst is back'
>>> foo_inst.__dict__
{'x': 'x of foo_inst is back'}
>>> foo_inst.x
# prints: __get__(): Accessing x from the object <__main__.Foo object at 0x7fe2bc613350>
'X from the descriptor'

The result might surprise you – the descriptor (defined on the class) took precedence over the instance attribute!

Overriding and non-overriding descriptors

The story does not end here, however – sometimes a descriptor does not take precedence:

>>> del DescriptorX.__set__
>>> foo_inst.x
'x of foo_inst is back'

It turns out there are actually two kinds of descriptors:

  • Data descriptors (overriding) – they define the __set__() and/or the __delete__() method (and usually __get__() as well) and take precedence over instance attributes.
  • Non-data descriptors (non-overriding) – they define only the __get__() method and are shadowed by an instance attribute of the same name.

If descriptor behavior seems similar to a property to you, it is because properties are actually implemented as descriptors behind the scenes. The same goes for class methods, ORM attributes on data models, and several other constructs.

from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class SomeClass(Base):
    __tablename__ = 'some_table'
    id = Column(Integer, primary_key=True)  # a descriptor
    name =  Column(String(50))  # a descriptor

    @property  # creates a descriptor under the name "foo"
    def foo(self):  
        return 'foo'

    @classmethod  # creates a descriptor, too
    def make_instance(cls, **kwargs):
        return cls(**kwargs)

Traversing the inheritance hierarchy

Sometimes an attribute does not exist on a class/instance, but Python does not give up just yet. It continues searching the parent classes, as the attribute might be found there.

Consider the following hierarchy:

class A(object): pass

class B(object):
    x = 'x from B'

class C(A, B): pass

class D(B):
    x = 'x from D'

class E(C, D): pass

Or in a picture, because a picture is worth a thousand words:

(figure: class hierarchy diagram)

The attribute x is defined both on the class D and class B. If accessing it on an instance of E that does not have it, the lookup still succeeds:

>>> e_inst = E()
>>> e_inst.x
'x from D'

The thing to observe here is that, apparently, the lookup algorithm is not depth-first search, otherwise x would be first found on B. The algorithm used is also not breadth-first search, otherwise x would still be picked from D in the following example, but it is instead retrieved from A:

>>> A.x = 'x from A'
>>> e_inst.x
'x from A'

The actual lookup order can be seen by inspecting the method resolution order (MRO) of a class:

>>> E.__mro__
(__main__.E, __main__.C, __main__.A, __main__.D, __main__.B, object)

Python uses the C3 linearization algorithm to construct it and to decide how to traverse the class hierarchy.

What about metaclasses?


Interlude – what is a metaclass?
Putting it simply, a metaclass is a “thing” that can create a new class in the same way a class can be used to create new objects, i.e. instances of itself:

  • metaclass() —> a new Class (an instance of metaclass)
  • Class() —> a new object (an instance of Class)

Or by example:

# "AgeMetaclass" inherits from the metaclass "type",
# thus "AgeMetaclass" is a metaclass, too
class AgeMetaclass(type):
    age = 18

# create an instance of a metaclass to produce a class
Person = AgeMetaclass('Person', (object,), {'age': 5})  # name, base classes, class attributes

# the above is the same as using the standard class definition syntax:
class Person(object):
    __metaclass__ = AgeMetaclass
    age = 5

# NOTE: in Python 3 the metaclass would be specified differently:
class Person(metaclass=AgeMetaclass):
   age = 5  

If an attribute is found on a class, its metaclass does not interfere, nor does it interfere when looking up an attribute on an instance:

>>> Person.age
5
>>> john_doe = Person()
>>> john_doe.age
5

On the other hand, if an attribute is not found on the class, it is looked up on its metaclass:

>>> del Person.age
>>> Person.age
18

There is a caveat, however – a metaclass is not considered when accessing an attribute on a class instance:

>>> john_doe.age
# AttributeError: 'Person' object has no attribute 'age'

The lookup only goes one layer up. It inspects the class of an instance, or a metaclass of a class, but not an “indirect metaclass”1 of a class instance.

What happens if an attribute is not found?

Python does not give up just yet. If implemented, it uses the __getattr__() hook on the class as a fallback.

class Product(object):
    def __init__(self, label):
        self.label = label

    def __getattr__(self, name):
        print('attribute "{}" not found, but giving you a foobar tuple!'.format(name))
        return ('foo', 'bar')

Let’s access an attribute that exists, and then an attribute that does not:

>>> chair = Product('dining chair DC-745')
>>> chair.label
'dining chair DC-745'
>>> chair.manufacturer
# prints: attribute "manufacturer" not found, but giving you a foobar tuple!
('foo', 'bar')

Because of the fallback, the AttributeError was not raised. Just keep in mind that defining __getattr__() on an instance instead of on a class will not work:

>>> del Product.__getattr__
>>> chair.__getattr__ = lambda self, name: 'instance __getattr__'
>>> chair.unknown_attr
# AttributeError: 'Product' object has no attribute 'unknown_attr'

NOTE: __getattr__() or __getattribute__()?

__getattr__() should not be confused with __getattribute__(). The former is a fallback for missing attributes as demonstrated above, while the latter is the method that gets invoked on attribute access, i.e. when using the “dot” operator. It implements the lookup algorithm explained in this post, but can be overridden and customized. The default implementation is in the C function _PyObject_GenericGetAttrWithDict().

Most of the time, however, it is probably the __getattr__() method that you want to override.
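If you do want to hook into every attribute access, a minimal sketch could look like the following (the LoggingProduct class is made up for illustration; the important part is delegating to the default implementation, otherwise every attribute access inside the method would recurse forever):

class LoggingProduct(object):
    def __init__(self, label):
        self.label = label

    def __getattribute__(self, name):
        print('looking up "{}"'.format(name))
        # delegate to the default lookup machinery described in this post
        return super(LoggingProduct, self).__getattribute__(name)

>>> chair = LoggingProduct('dining chair DC-745')
>>> chair.label
# prints: looking up "label"
'dining chair DC-745'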

Summary

Accessing an attribute on a (new-style) class instance invokes the __getattribute__() method that performs the following:

  • Check the class hierarchy using MRO (but do not examine metaclasses):
    • If a data (overriding) descriptor is found in class hierarchy, call its __get__() method;
  • Otherwise check the instance __dict__ (assuming no __slots__ for the sake of example). If an attribute is there, return it;
  • If attribute not in instance.__dict__ but found in the class hierarchy:
    • If (non-data) descriptor, call its __get__() method;
    • If not a descriptor, return the attribute itself;
  • If still not found, invoke __getattr__(), if implemented on a class;
  • Finally give up and raise AttributeError.
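For illustration, here is a rough pure-Python sketch of that order (the lookup() helper is made up; it ignores __slots__, metaclasses, and many details of the real C implementation):

_NOTHING = object()

def lookup(instance, name):
    cls = type(instance)

    # 1. search the class hierarchy in MRO order
    cls_attr = _NOTHING
    for klass in cls.__mro__:
        if name in klass.__dict__:
            cls_attr = klass.__dict__[name]
            break

    # 2. a data (overriding) descriptor found on a class wins over the instance
    if cls_attr is not _NOTHING and (
            hasattr(type(cls_attr), '__set__') or hasattr(type(cls_attr), '__delete__')):
        return type(cls_attr).__get__(cls_attr, instance, cls)

    # 3. then the instance __dict__ is consulted
    if name in instance.__dict__:
        return instance.__dict__[name]

    # 4. then a non-data descriptor or a plain class attribute from the class hierarchy
    if cls_attr is not _NOTHING:
        if hasattr(type(cls_attr), '__get__'):
            return type(cls_attr).__get__(cls_attr, instance, cls)
        return cls_attr

    # 5. finally fall back to __getattr__(), if defined, otherwise give up
    if hasattr(cls, '__getattr__'):
        return cls.__getattr__(instance, name)
    raise AttributeError(name)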

  1. I totally made this term up, do not use it in a conversation when trying to sound smart. :) 

How Python computes 2 + 5 under the hood (part 2)

(see also Part 1 of this post)

Representing Python objects at C level

In CPython, Python objects are represented as C structs. While struct members can vary depending on the object type, all PyObject instances contain at least the following two members, i.e. the so-called PyObject_HEAD:

  • ob_refcnt – the number of references to the object. Used for garbage
    collection purposes, since the objects that are not referred by anything anymore
    should be cleaned up to avoid memory leaks.
  • ob_type – a pointer to a type object, which is a special object describing
    the referencing object’s type.

The segment of the interpreter code for the BINARY_ADD instruction that was omitted for brevity in Part 1 is the following:

if (PyUnicode_CheckExact(left) &&
         PyUnicode_CheckExact(right)) {
    sum = unicode_concatenate(left, right, f, next_instr);
    /* unicode_concatenate consumed the ref to left */
}
else {
    sum = PyNumber_Add(left, right);
    Py_DECREF(left);
}
Py_DECREF(right);

Here Python checks if the left and right operands are both Unicode instances, i.e. strings. It does that by inspecting their type objects. If both operands are indeed strings, it performs string concatenation on them, but for anything else the PyNumber_Add() function gets called. Since the operands 2 and 5 in our case are integers, this is exactly what happens. There is also some reference count management (the Py_DECREF() macro), but we will not dive into that.

PyNumber_Add() first tries to perform the add operation on the given operands v and w (two pointers to PyObject) by invoking binary_op1(v, w, NB_SLOT(nb_add)). If the result of that call is Py_NotImplemented, it further tries to concatenate the operands as sequences. This is not the case with integers, however, so let’s have a look at the binary_op1() function located in the Objects/abstract.c file:

static PyObject *
binary_op1(PyObject *v, PyObject *w, const int op_slot)
{
    PyObject *x;
    binaryfunc slotv = NULL;
    binaryfunc slotw = NULL;

    if (v->ob_type->tp_as_number != NULL)
        slotv = NB_BINOP(v->ob_type->tp_as_number, op_slot);
    if (w->ob_type != v->ob_type &&
        w->ob_type->tp_as_number != NULL) {
        slotw = NB_BINOP(w->ob_type->tp_as_number, op_slot);
        if (slotw == slotv)
            slotw = NULL;
    }
    if (slotv) {
        if (slotw && PyType_IsSubtype(w->ob_type, v->ob_type)) {
            x = slotw(v, w);
            if (x != Py_NotImplemented)
                return x;
            Py_DECREF(x); /* can't do it */
            slotw = NULL;
        }
        x = slotv(v, w);
        if (x != Py_NotImplemented)
            return x;
        Py_DECREF(x); /* can't do it */
    }
    if (slotw) {
        x = slotw(v, w);
        if (x != Py_NotImplemented)
            return x;
        Py_DECREF(x); /* can't do it */
    }
    Py_RETURN_NOTIMPLEMENTED;
}

Delegating the work to the right function

The binary_op1() function expects references to two Python objects and the binary operation that should be performed on them. The actual function that will perform this operation is obtained with the following:

NB_BINOP(v->ob_type->tp_as_number, op_slot)

Remember how each PyObject contains a reference to another object describing the former’s type, i.e. the ob_type struct member? For integers this is the PyLong_Type located in Objects/longobject.c.

PyLong_Type has the tp_as_number member, a reference to a structure holding pointers to all “number” methods available on Python int objects (integers in Python 3 are what is known as the long type in Python 2):

static PyNumberMethods long_as_number = {
    (binaryfunc)long_add,       /*nb_add*/
    (binaryfunc)long_sub,       /*nb_subtract*/
    (binaryfunc)long_mul,       /*nb_multiply*/
    long_mod,                   /*nb_remainder*/
    ...
}

Finally there is the NB_BINOP(nb_methods, slot) macro that picks a particular method from this list. Since in our case binary_op1() is invoked with NB_SLOT(nb_add) as the third argument, the function for adding two integers is returned.

Now, with two operands in the expression left + right, a decision needs to be made about which operand the addition function should be picked from to compute the result. As explained in a helpful comment above the binary_op1() function, the order is as follows:

  • If right is a strict subclass of left, right.__add__(left, right) is tried first.
  • left.__add__(left, right) is tried.
  • right.__add__(left, right) is tried (unless it has already been tried in the first step).

Python tries to do its best to obtain a meaningful result, i.e. something other than NotImplemented, and if one of the operands does not support the operation, the other one is tried, too.
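The same precedence rules can be observed from pure Python through the __add__()/__radd__() special methods, which is roughly what the nb_add slot corresponds to for classes written in Python. A small sketch (Base and Derived are made-up names):

class Base(object):
    def __add__(self, other):
        return 'Base.__add__'

    def __radd__(self, other):
        return 'Base.__radd__'

class Derived(Base):
    def __add__(self, other):
        return 'Derived.__add__'

    def __radd__(self, other):
        return 'Derived.__radd__'

>>> Base() + Base()
'Base.__add__'
>>> Base() + Derived()  # the subclass on the right-hand side gets asked first
'Derived.__radd__'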

Nailing it

So which function is the one that actually computes the sum of 2 and 5 in the end?

It’s the long_add() function implemented in Objects/longobject.c. It is perhaps a bit more complex than expected, because it needs to support the addition of integers of arbitrary length, while still performing fast for integers small enough to fit into a CPU register.

Whoa! After all the digging down the rabbit hole we finally found the right function. Quite a lot of extra work for such a simple operation as addition, but that’s the price we have to pay for Python’s dynamic nature. Remember that the same add(x, y) function we wrote in Part 1 of this post works out of the box with different operand types, and I hope the mechanisms behind the scenes that allow this are now clearer.

>>> add(2, 5)
7
>>> add('2', '5')
'25'
>>> add([2], [5])
[2, 5]

As always, comments, suggestions, praise, and (constructive) criticism are all welcome. Thanks for reading!

How Python computes 2 + 5 under the hood (part 1)

Suppose we have a very simple Python function that accepts two arguments and returns their sum, and let’s give this function the (un)imaginative name add:

def add(x, y):
    return x + y

>>> add(2, 5)
7

As a bonus, since Python is a dynamic language, the function also works with (some) other argument types out of the box. If given, say, two sequences, it returns their concatenation:

>>> add([1, 2], [3, 4])
[1, 2, 3, 4]
>>> add('foo', 'bar')
'foobar'

How does this work, you might ask? What happens behind the scenes when we invoke add()? We will see this in a minute.

Python 3.7 will be used in the examples (currently in beta 1 at the time of writing).

The dis module

For a start, we will inspect the add() function by using the handy standard library module dis.

>>> import dis
>>> bytecode = dis.Bytecode(add)
>>> print(bytecode.info())
Name:              add
Filename:          <stdin>
Argument count:    2
Kw-only arguments: 0
Number of locals:  2
Stack size:        2
Flags:             OPTIMIZED, NEWLOCALS, NOFREE
Constants:
   0: None
Variable names:
   0: x
   1: y

Besides peeking into the function’s metadata, we can also disassemble it:

>>> dis.dis(add)
  2           0 LOAD_FAST                0 (x)
              2 LOAD_FAST                1 (y)
              4 BINARY_ADD
              6 RETURN_VALUE

Disassembling shows that the function consists of four primitive bytecode instructions, which are understood and interpreted by the Python virtual machine.

Python is a stack-based machine

In the CPython implementation, the interpreter is a stack-based machine, meaning that it does not have registers, but instead uses a stack to perform the computations.

The first bytecode instruction, LOAD_FAST, pushes a reference to a particular local variable onto the stack, and the single argument to the instruction specifies which variable that is. LOAD_FAST 0 thus picks a reference to x, because x is the first local variable, i.e. at index 0, which can be also seen from the function’s metadata presented just above.

Similarly, LOAD_FAST 1 pushes a reference to y onto the stack, resulting in the following state after the first two bytecode instructions have been executed:

    +---+  <-- TOS (Top Of Stack)
    | y |
    +---+
    | x |
 ---+---+---

The next instruction, BINARY_ADD, takes no arguments. It simply takes the top two elements from the stack, performs an addition on them, and pushes the result of the operation back onto the stack.

At the end, RETURN_VALUE takes whatever the remaining element on the stack is, and returns that element to the caller of the add() function.

Going even deeper (enter the C level)

The bytecode instructions themselves are also just an abstraction, and something needs to make sense of them. That “something” is the Python interpreter. In CPython, its reference implementation, this is a program written in C that loops through the bytecode given to it, and interprets the instructions in it one by one.

The heart of this machinery is implemented in the Python/ceval.c file. It runs an infinite loop that contains a giant switch statement, with each case (target) handling one of the possible bytecode operations.

This is what the code for the BINARY_ADD instruction looks like:

TARGET(BINARY_ADD) {
    PyObject *right = POP();
    PyObject *left = TOP();
    PyObject *sum;

    /* the computation of the "sum" omitted */

    SET_TOP(sum);
    if (sum == NULL)
        goto error;
    DISPATCH();
}

POP(), TOP(), and SET_TOP() are convenience C macros that perform primitive interpreter stack operations such as popping the top value from the stack, or replacing the current TOS (Top Of Stack) value with a different one.

The code above is actually pretty straightforward. It pops the right-hand operand from the top of the stack, which is a reference to y in our case, and stores it under the name right. It then stores a reference to the left-hand operand, i.e. x, which became the new TOS after the pop (TOP() only peeks at it without popping).

After performing the calculation, it sets the sum, i.e. a reference to the result, as the new TOS, performs a quick error check, and dispatches the control to the next bytecode instruction in the line.

Part 2 explains how Python objects are represented at the C level, and how adding two such objects is actually done.

Giving Python slices a name

Python makes it easy for a developer to work with sequence types such as lists, strings, tuples, and others. This is especially true when extracting sub-sequences from a given sequence.

>>> vowels = ['A', 'E', 'I', 'O', 'U']
>>> vowels[1:3]
['E', 'I']
>>> vowels[3:5]
['O', 'U']
>>> vowels[-4:-2]
['E', 'I']

Out of bound indexes are gracefully handled:

>>> vowels[2:99]
['I', 'O', 'U']
>>> vowels[-5:2]
['A', 'E']

Omitted start/end indexes default to the beginning/end of the sequence, respectively:

>>> vowels[:2]
['A', 'E']
>>> vowels[-2:]
['O', 'U']

If given a step n, only every n-th item in the specified range is included in the result:

>>> vowels[::2]
['A', 'I', 'U']

Step can also be a negative number:

>>> vowels[4:1:-1]
['U', 'O', 'I']

Slice objects

When using the “extended indexing” syntax (I made the name up) from above, what actually happens behind the scenes is that a slice() object is created and passed to the sequence object being sliced. The following two expressions are thus equivalent:

>>> vowels[4:2:-1]
['U', 'O']
>>> vowels[slice(4, 2, -1)]
['U', 'O']

This is great, because it allows us to assign descriptive names to slices, and possibly reuse them if the same sub-slice is used in more than a single place:

>>> FIRST_THREE = slice(0, 3)
>>> ODD_ITEMS = slice(1, None, 2)
>>> vowels[FIRST_THREE]
['A', 'E', 'I']
>>> 'abcdef'[FIRST_THREE]
'abc'
>>> vowels[ODD_ITEMS]
['E', 'O']
>>> 'abcdef'[ODD_ITEMS]
'bdf'

Adding support for slicing to custom objects

It’s worth noting that object slicing is not something that is automatically given to us; Python merely allows us to implement support for it ourselves, if we want to.

When the square brackets notation ([]) is used, Python tries to invoke the __getitem__() magic method on the object, passing the given key to it as an argument. That method can be overridden to define custom indexing behavior.

As an example, let’s try to create a class whose instances can be queried for balance. Even if an instance itself does not contain anything, it will somehow calculate the required amount out of thin air and return that made-up number to us. We will call that class a Bank.

class Bank(object):
    """Can create money out of thin air."""

    def __getitem__(self, key):
        if not isinstance(key, (int, slice)):
            raise TypeError('Slice or integer index expected')

        if isinstance(key, int):
            return key

        # key is a slice() instance
        start = key.start if isinstance(key.start, int) else 0
        stop = key.stop if isinstance(key.stop, int) else 0
        step = key.step if isinstance(key.step, int) else 1
        return sum(range(start, stop, step))

If we query a Bank instance (by indexing it) with a single integer, it will simply return us the amount equal to the given index. If queried by a range (slice), however, it will return the sum of all indices contained in it:1

>>> b = Bank()
>>> b[7]
7
>>> b[-3:0]
-6
>>> b[0:7:2]
12
>>> b[::]
0
>>> b['':5:{}]
10

As the last example demonstrates, slices can contain just about any value, not just integers and None, thus the Bank class must check for these cases and use defaults if needed.

Just a word of caution – if you are sub-classing built-in types in Python 2 and want to implement custom slicing behavior, you need to override the deprecated __getslice__() method (documentation).


  1. Not saying that this is actually the best way to run a bank in real life, nor that (ab)using slices like this will make you popular with people using your sliceable class… 

Handling application settings hierarchy more easily with Python’s ChainMap

Say you have an application that, like almost every more complex application, exposes some settings that affect how that application behaves. These settings can be configured in several ways, such as specifying them in a configuration file, or by using the system’s environment variables. They can also be provided as command line arguments when starting the application.

If the same setting is provided at more than a single configuration layer, the value from the layer with the highest precedence is taken. This could mean that, for example, configuration file settings override those from environment variables, but command line arguments have precedence over both. If a particular setting is not specified in any of these three layers, its default value is used.

Ordered by priority (highest first), our hierarchy of settings layers looks as follows:

  • command line arguments
  • configuration file
  • environment variables
  • application defaults

A natural way of representing settings is by using a dictionary, one for each layer. An application might thus contain the following:

defaults = {
    'items_per_page': 100,
    'log_level': 'WARNING',
    'max_retries': 5,
}

env_vars = {
    'max_retries': 10
}

config_file = {
    'log_level': 'INFO'
}

cmd_args = {
    'max_retries': 2
}

Putting them all together while taking the precedence levels into account, here’s what the application settings would look like:

>>> settings  # all settings put together
{
    'items_per_page': 100,  # a default value
    'log_level': 'INFO',  # from the config file
    'max_retries': 2,  # from a command line argument
}

There is a point in the code where a decision needs to be made on what setting value to consider. One way of determining that is by examining the dictionaries one by one until the setting is found:1

setting_name = 'log_level'

for settings in (cmd_args, config_file, env_vars, defaults):
    if setting_name in settings:
        value = settings[setting_name]
        break
else:
    raise ValueError('Setting not found: {}'.format(setting_name))

# do something with value...

A somewhat verbose approach, but at least better than a series of if-elif statements.

An alternative approach is to merge all settings into a single dictionary before using any of their values:

settings = defaults.copy()
for d in (env_vars, config_file, cmd_args):
    settings.update(d)

Mind that here the order of applying the settings layers must be reversed, i.e. the highest priority layers get applied last, so that lower-priority layers cannot override them. This works quite well, with possibly the only downside being that if any of the underlying settings dictionaries get updated, the “main” settings dictionary must be rebuilt, because the changes are not reflected in it automatically:

>>> config_file['items_per_page'] = 25
>>> settings['items_per_page']
100  # change not propagated

Now, I probably wouldn’t be writing about all this, if there didn’t already exist an elegant solution in the standard library. Python 3.32 brought us collections.ChainMap, a handy class that can transparently group multiple dicts together. We just need to pass it all our settings layers (higher-priority ones first), and ChainMap takes care of the rest:

>>> from collections import ChainMap
>>> settings = ChainMap(cmd_args, config_file, env_vars, defaults)
>>> settings['items_per_page']
100
>>> env_vars['items_per_page'] = 25
>>> settings['items_per_page']
25  # yup, automatically propagated
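ChainMap offers a couple of related conveniences, too (a minimal sketch; the debug_settings name is made up): the underlying dictionaries stay accessible through the maps attribute, and new_child() returns a new chain with an extra highest-priority layer in front, leaving the original untouched.

>>> len(settings.maps)  # the four layers, highest priority first
4
>>> debug_settings = settings.new_child({'log_level': 'DEBUG'})
>>> debug_settings['log_level']
'DEBUG'
>>> settings['log_level']  # the original chain is not affected
'INFO'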

Pretty handy, isn’t it?


  1. In case you spot a mis-indented else block – the indentation is actually correct. It’s a for-else statement, and the else block only gets executed if the loop finishes normally, i.e. without break-ing out of it. 
  2. There is also a polyfill for (some) older Python versions. 

Keyword-only arguments in Python

Python allows you to pass arguments in two ways – either by their position or by their name (keyword):1

def foo(x, y):
    print(x, y)

# the following all print 5 10
foo(5, 10)
foo(5, y=10)
foo(x=5, y=10)  

It works the same if a function accepts keyword arguments:2

def bar(x=5, y=10):
    print(x, y)

# the following all print 4 8
bar(4, 8)
bar(4, y=8)
bar(x=4, y=8)

If a particular keyword argument is not provided, its default value is used:

>>> bar(2)  # y is omitted
2 10

It is quite common to use keyword arguments to define “extra” options that can be passed to a callable, and if omitted, their default value is used. That can improve readability, because it is enough to only pass the options whose value differs from the default:

def cat(x, y, to_upper=False, strip=True):
    """Concatenate given strings."""
    if strip:
       x, y = x.strip(), y.strip()

    result = x + y

    if to_upper:
        result = result.upper()

    return result


# returns 'foobar'
cat('  foo ', 'bar ')

# the following both return '  foo bar '
cat('  foo ', 'bar ', False, False)
cat('  foo ', 'bar ', strip=False)

You will probably agree that the second form is indeed cleaner.

Positional arguments pitfalls

The ability to pass keyword arguments by position as demonstrated in the introduction can, if you are not careful, bite you, especially if you do not have thorough test coverage in place as a safety net. Let’s say that there is a piece of code which invokes the cat() function using positional arguments only:

cat('  foo ', 'bar ', True)  # returns 'FOOBAR'

Let’s also say that suddenly one of the team members gets an inspiration and decides that it would be great to sort all keyword parameters alphabetically. You know, for readability. Before you can express your doubt, he eagerly refactors the function, swapping the two keyword parameters:

def cat(x, y, strip=True, to_upper=False):
    ...

If you have proper tests in place, good for you, but if you don’t, you might not realize that this change just introduced a bug:

cat('  foo ', 'bar ', True)  # now returns 'foobar'

The poor 'FOOBAR' return value just got demoted to its lowercase version. This would not have happened if the option had been passed as a keyword argument, i.e. to_upper=True.

Another source of potential errors is accidentally passing an option value through a positional argument. Let’s imagine another contrived scenario where a new team member uses intuition to deduce how the cat() function works. Of course – it’s just a version of sum() adjusted to work with strings!

>>> cat('  foo ', 'bar ', 'baz')  # the original cat
'FOOBAR'

Erm…

The option to_upper was assigned the value 'baz' which is truthy, but it is probably not what the caller intended to achieve.

It can be argued that this behavior is a bit unintuitive, and that it would be nice if we could somehow force the callers to explicitly pass keyword arguments by their name (keyword), and not their position.

Making the arguments keyword-only (Python 3)

The trick is to swallow any redundant positional arguments, preventing them from filling the keyword arguments:

def demo(x, y, *args, separator='___'):
    print(x, y, args, sep=separator)


>>> demo(10, 20, 30, 40)
10___20___(30, 40)

>>> demo(10, 20, 30, 40, separator='---')
10---20---(30, 40)

Any positional arguments beyond the first two (30 and 40) get swallowed by the args tuple, and the only way to specify a different separator is through an explicit keyword argument. To complete the picture, we just need to prevent callers from passing in too many positional arguments, and we can do this with a simple check that args is empty:

if args:
    raise TypeError('Too many positional arguments given.')

What’s more, if we omit the variable arguments tuple’s name altogether, we get the above check for free!
Plus a useful error message on top of it. A demo:

def demo2(x, y, *, separator='___'):
    print(x, y, sep=separator)                                                                                                                                                                                                                                                                                           


>>> demo2(1, 2, 3, separator=';')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: demo2() takes 2 positional arguments but 3 positional arguments (and 1 keyword-only argument) were given

Keyword-only arguments in Python 2

Unfortunately, if we try the same approach in Python 2, it will complain and raise a syntax error. We cannot specify individual keyword arguments after *args, but we can specify that a function accepts a variable number of keyword arguments, and then manually unpack it:

def foo(x, y, *args, **kwargs):
    option_a = kwargs.pop('option_a', 'default_A')
    option_b = kwargs.pop('option_b', 'default_B')

    if args or kwargs:
        raise TypeError('Too many positional and/or keyword arguments given.')

We also have to manually check if the caller has passed any unexpected positional and/or keyword arguments by testing the args tuple and the kwargs dict (after popping all expected items from it) – both should be empty.
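A quick sketch of how the function above behaves when called (hypothetical values; assume the body goes on to actually use option_a and option_b):

foo(1, 2)                    # OK – both options keep their defaults
foo(1, 2, option_a='A')      # OK – option_a overridden
foo(1, 2, 3)                 # TypeError – extra positional argument
foo(1, 2, option_c='C')      # TypeError – unexpected keyword argument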

Cumbersome indeed, not to mention that the function signature is not as pretty as it could be. But that’s what we have in Python 2.


  1. The following few examples assume either Python 3, or that print_function is imported from the __future__ module if using Python 2. 
  2. The fourth invocation option, i.e. saying foo(x=5, 10), is not listed, because it is a syntax error – positional arguments must precede any keyword arguments. 

Applying a decorator to a class method results in an error

You might have seen this one before – you wrote a decorator in Python and tried to apply it to a class method (or static method, for that matter), only to see an error.

from functools import wraps

def logged(func):
    """A decorator printing a message before invoking the wrapped function."""
    @wraps(func)
    def wrapped_func(*args, **kwargs):
        print('Invoking', func)
        return func(*args, **kwargs)
    return wrapped_func


class Foo(object):
    @logged
    @classmethod
    def get_name(cls):
        return cls.__name__

As the docstring explains, the logged decorator simply prints a message before invoking the decorated function, and it is applied to the get_name() class method of the class Foo. The @wraps decorator makes sure the original function’s metadata is copied to the wrapper function returned by the decorator (docs).

But despite this essentially being a textbook example of a decorator in Python, invoking the get_name() method results in an error (using Python 3 below):

>>> Foo.get_name()
Invoking <classmethod object at 0x7f8e7473e0f0>
Traceback (most recent call last):
    ...
TypeError: 'classmethod' object is not callable

If you just want to quickly fix this issue, because it annoys you, here’s the TL;DR fix – just swap the order of the decorators, making sure that the @classmethod decorator is applied last:

class Foo(object):
    @classmethod
    @logged
    def get_name(cls):
        return cls.__name__

>>> Foo.get_name()
Invoking <function Foo.get_name at 0x7fce90356c80>
'Foo'

On the other hand, if you are curious what is actually happening behind the scenes, please keep reading.

The first thing to note is the output in each example immediately after calling Foo.get_name(). Our decorator prints the object it is about to invoke on the very next line, and in the non-working example that object is actually not a function!

Invoking <classmethod object at 0x7f8e7473e0f0>

Instead, the thing that our decorator tries to invoke is a “classmethod” object, but the latter is not callable, causing the Python interpreter to complain.

Meet descriptors

Let’s take a closer look at a stripped-down version of the Foo class:

class Foo(object):
    @classmethod
    def get_name(cls):
        return cls.__name__

>>> thing = Foo.__dict__['get_name']
>>> thing
<classmethod object at 0x7f295ffc6d30>
>>> hasattr(thing, '__get__')
True
>>> callable(thing)
False

As it turns out, get_name is an object which is not callable, i.e. we cannot say get_name() and expect it to work. By the presence of the __get__ attribute we can also see that it is a descriptor.

Descriptors are objects that behave differently from “normal” attributes. When accessing a descriptor, what happens is that its __get__() method gets called behind the scenes, returning the actual value. The following two expressions are thus equivalent:

>>> Foo.get_name
<bound method Foo.get_name of <class '__main__.Foo'>>
>>> Foo.__dict__['get_name'].__get__(None, Foo)
<bound method Foo.get_name of <class '__main__.Foo'>>

__get__() gets called with two parameters – the object instance the attribute belongs to (None here, because accessing the attribute through a class), and the owner class, i.e. the one the descriptor is defined on (Foo in this case)1.

What the classmethod descriptor does is bind the original get_name() function to its class (Foo) and return a bound method object. When the latter gets called, it invokes get_name(), passing class Foo as the first argument (cls) along with any other arguments the bound method was originally called with.
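For illustration, here is a rough pure-Python sketch of that behavior (the MyClassMethod name is made up; the real classmethod is implemented in C and returns a proper bound method rather than a partial):

import functools

class MyClassMethod(object):
    def __init__(self, func):
        self.func = func

    def __get__(self, obj, owner=None):
        if owner is None:
            owner = type(obj)
        # bind the wrapped function to the owner class, not to the instance
        return functools.partial(self.func, owner)

class Foo(object):
    @MyClassMethod
    def get_name(cls):
        return cls.__name__

>>> Foo.get_name()
'Foo'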

Armed with this knowledge it is now clear why our logged decorator from the beginning does not always work. It assumes that the object passed to it is directly callable, and does not take the descriptor protocol into account.

Making it right

Describing how to adjust the logged decorator to work correctly is quite a lengthy topic, and out of scope of this post. If interested, you should definitely read the blog series by Graham Dumpleton, as it addresses many more aspects than just working well with classmethods. Or just use his wrapt library for writing decorators:

import wrapt

@wrapt.decorator
def logged(wrapped, instance, args, kwargs):
    print('Invoking', wrapped)
    return wrapped(*args, **kwargs)

class Foo(object):
    @logged
    @classmethod
    def get_name(cls):
        return cls.__name__

>>> Foo.get_name()
Invoking <bound method Foo.get_name of <class 'main2.Foo'>>
'Foo'

Yup, it works.


  1. On the other hand, if retrieving a descriptor object directly from the class’s __dict__, the descriptor’s __get__() method is bypassed, and that’s why we used Foo.__dict__['get_name'] at a few places in the examples. 