Distinguishing between None and missing values in dictionaries with a sentinel

Whoa, it’s been quite a while since my last blog post. This is, of course, not my fault (it never is 😛), let’s instead blame the architect of this world that only gave us 24 hours in a single day. That’s not nearly enough to do all the things one wants! Anyhow, now that I optimistically deluded myself that the excuse I just gave is convincing, I would like to share a tip when dealing with maybe-missing dictionary data.

This situation is quite common when receiving JSON data from an external APIs, which is normally handed to you, the programmer, parsed in a dictionary. Reading values from such dictionaries requires some caution to avoid unnecessary errors:

data = dict(foo=1, bar=2)
baz_value = data['baz']  # KeyError

The obvious, but somewhat naive, way of dealing with this is to first check if baz exists in the first place:

baz_value = data['baz'] if 'baz' in data else None

It works, but it’s verbose, and there is already a dict.get(key[, default]) method that does exactly that – return the value stored under key, if such key exists, otherwise return the default value (which defaults to None). It is thus a common practice to just use get() instead:

baz_value = data.get('baz')

However, sometimes there are cases when one actually needs to distinguish between the missing values and the values explicitly set to None. If get() returns None, it is not clear why it returned it without an additional check. It is also not recommended to use a different default value that can “never” possibly represent an actual value in the dictionary, as that assumption can change and break our code.

baz_value = data.get('baz', 'NO VALUE')

if baz_value == 'NO VALUE':
    # handle the "missing" case
else:
    # handle the "normal" case

If baz somehow happens to end up with the value 'NO VALUE' in data, the code above will not work as intended. One solution is to again use an explicit membership test, and only read the value if the latter succeeds:

if 'baz' not in data:
    # handle the "missing" case
else:
    baz_value = data['baz']
    # handle the "normal" case

The downside is that we need to do two dictionary lookups, one for the membership test, and another to actually retrieve the baz value.

Fortunately, there is an elegant fix to this – sentinels. A value picked for a sentinel must be something that is always uniquely distinguishable from the data. One option is to generate a unique string (with, say, uuid.uuid4()), which is “unique enough” for practical purposes, another is to instantiate a new object and test for identity:

MISSING = object()  # a sentinel
baz_value = data.get('baz', MISSING)

if baz_value is MISSING:
    # handle the "missing" case
elif baz_value is None:
    # handle the "normal" case when None
else:
    # handle all other "normal" cases

The explicit check for None is optional and you might not even need it, but it is nevertheless included in the above snippet to demonstrate all three possible outcomes, and how they can be handled with just a single dictionary lookup.

%d bloggers like this: