## Giving Python slices a name

Python makes it easy for a developer to work with sequence types such as lists, strings, tuples, and others. This is especially true when extracting sub-sequences from a given sequence.

```
>>> vowels = ['A', 'E', 'I', 'O', 'U']
>>> vowels[1:3]
['E', 'I']
>>> vowels[3:5]
['O', 'U']
>>> vowels[-4:-2]
['E', 'I']
```

Out-of-bound indexes are handled gracefully:

```
>>> vowels[2:99]
['I', 'O', 'U']
>>> vowels[-5:2]
['A', 'E']
```

Omitted start/end indexes default to the beginning/end of the sequence, respectively:

```
>>> vowels[:2]
['A', 'E']
>>> vowels[-2:]
['O', 'U']
```

If given a step `n`, only every `n`-th item in the specified range is included in the result:

```
>>> vowels[::2]
['A', 'I', 'U']
```

Step can also be a negative number:

```
>>> vowels[4:1:-1]
['U', 'O', 'I']
```

##### Slice objects

When using the “extended indexing” syntax from above (I made that name up), what actually happens behind the scenes is that a `slice()` object is created and passed to the sequence object being sliced. The following two expressions are thus equivalent:

```
>>> vowels[4:2:-1]
['U', 'O']
>>> vowels[slice(4, 2, -1)]
['U', 'O']
```

This is great, because it allows us to assign descriptive names to slices, and possibly reuse them if the same sub-slice is used in more than a single place:

```
>>> FIRST_THREE = slice(0, 3)
>>> ODD_ITEMS = slice(1, None, 2)
>>> vowels[FIRST_THREE]
['A', 'E', 'I']
>>> 'abcdef'[FIRST_THREE]
'abc'
>>> vowels[ODD_ITEMS]
['E', 'O']
>>> 'abcdef'[ODD_ITEMS]
'bdf'
```

##### Adding support for slicing to custom objects

It’s worth noting that object slicing is not something that is automatically given to us; Python merely allows us to implement support for it ourselves, if we want to.

When the square brackets notation (`[]`) is used, Python tries to invoke the `__getitem__()` magic method on the object, passing the given key to it as an argument. That method can be overridden to define custom indexing behavior.

As an example, let’s try to create a class whose instances can be queried for balance. Even if an instance itself does not contain anything, it will somehow calculate the required amount out of thin air and return that made-up number to us. We will call that class a `Bank`.

```
class Bank(object):
    """Can create money out of thin air."""

    def __getitem__(self, key):
        if not isinstance(key, (int, slice)):
            raise TypeError('Slice or integer index expected')

        if isinstance(key, int):
            return key

        # key is a slice() instance
        start = key.start if isinstance(key.start, int) else 0
        stop = key.stop if isinstance(key.stop, int) else 0
        step = key.step if isinstance(key.step, int) else 1
        return sum(range(start, stop, step))
```

If we query a `Bank` instance (by indexing it) with a single integer, it will simply return us the amount equal to the given index. If queried by a range (slice), however, it will return the sum of all indices contained in it[1]:

```
>>> b = Bank()
>>> b[7]
7
>>> b[-3:0]
-6
>>> b[0:7:2]
12
>>> b[::]
0
>>> b['':5:{}]
10
```

As the last example demonstrates, slices can contain just about any value, not just integers and `None`, thus the `Bank` class must check for these cases and use defaults if needed.
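Speaking of checks and defaults: for real sequences, the standard library already provides a normalization helper. Every slice object has an `indices(length)` method that fills in defaults and clamps out-of-range values for a sequence of the given length (our `Bank` has no length, so it cannot use this directly):

```
s = slice(None, 99, None)

# indices(length) returns a normalized (start, stop, step) triple
start, stop, step = s.indices(5)
print((start, stop, step))       # (0, 5, 1)
print('abcde'[start:stop:step])  # 'abcde', same as 'abcde'[:99]

# negative indexes get translated as well
print(slice(-4, -2).indices(5))  # (1, 3, 1)
```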

Just a word of caution – if you sub-class built-in types in Python 2 and want to implement custom slicing behavior, you need to override the deprecated `__getslice__()` method (documentation).

1. Not saying that this is actually the best way to run a bank in real life, nor that (ab)using slices like this will make you popular with people using your sliceable class…

## Handling application settings hierarchy more easily with Python’s ChainMap

Say you have an application that, as almost every more complex application, exposes some settings that affect how that application behaves. These settings can be configured in several ways, such as specifying them in a configuration file, or by using the system’s environment variables. They can also be provided as command line arguments when starting the application.

If the same setting is provided at more than a single configuration layer, the value from the layer with the highest precedence is taken. This could mean that, for example, configuration file settings override those from environment variables, but command line arguments have precedence over both. If a particular setting is not specified in any of these three layers, its default value is used.

Ordered by priority (highest first), our hierarchy of settings layers looks as follows:

• command line arguments
• configuration file
• environment variables
• application defaults

A natural way of representing settings is by using a dictionary, one for each layer. An application might thus contain the following:

```
defaults = {
    'items_per_page': 100,
    'log_level': 'WARNING',
    'max_retries': 5,
}

env_vars = {
    'max_retries': 10
}

config_file = {
    'log_level': 'INFO'
}

cmd_args = {
    'max_retries': 2
}
```

Putting them all together while taking the precedence levels into account, here’s how the application settings would look like:

```
>>> settings  # all settings put together
{
    'items_per_page': 100,  # a default value
    'log_level': 'INFO',  # from the config file
    'max_retries': 2,  # from a command line argument
}
```

There is a point in the code where a decision needs to be made on which setting value to use. One way of determining that is by examining the dictionaries one by one until the setting is found[1]:

```
setting_name = 'log_level'

for settings in (cmd_args, config_file, env_vars, defaults):
    if setting_name in settings:
        value = settings[setting_name]
        break
else:
    value = None  # not found anywhere; use None (or raise an error)

# do something with value...
```

A somewhat verbose approach, but at least better than a series of if-elif statements.

An alternative approach is to merge all settings into a single dictionary before using any of their values:

```
settings = defaults.copy()
for d in (env_vars, config_file, cmd_args):
    settings.update(d)
```

Mind that here the order of applying the settings layers must be reversed, i.e. the highest-priority layers get applied last, so that no lower-priority layers can override them. This works quite well, with possibly the only downside being that if any of the underlying settings dictionaries gets updated, the “main” settings dictionary must be rebuilt, because the changes are not reflected in it automatically:

```
>>> config_file['items_per_page'] = 25
>>> settings['items_per_page']
100  # change not propagated
```

Now, I probably wouldn’t be writing about all this if there didn’t already exist an elegant solution in the standard library. Python 3.3[2] brought us `collections.ChainMap`, a handy class that can transparently group multiple dicts together. We just need to pass it all our settings layers (higher-priority ones first), and `ChainMap` takes care of the rest:

```
>>> from collections import ChainMap
>>> settings = ChainMap(cmd_args, config_file, env_vars, defaults)
>>> settings['items_per_page']
100
>>> env_vars['items_per_page'] = 25
>>> settings['items_per_page']
25  # yup, automatically propagated
```

Pretty handy, isn’t it?
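`ChainMap` has a couple of extras that fit the settings use case well. The `maps` attribute exposes the underlying list of dicts, and `new_child()` (which accepts a mapping argument since Python 3.4) returns a new `ChainMap` with a fresh highest-priority layer in front – handy for temporary overrides. The dicts below are trimmed-down versions of the earlier example:

```
from collections import ChainMap

defaults = {'log_level': 'WARNING', 'max_retries': 5}
cmd_args = {'max_retries': 2}

settings = ChainMap(cmd_args, defaults)

# push a temporary layer with the highest priority
debug_settings = settings.new_child({'log_level': 'DEBUG'})

print(debug_settings['log_level'])    # DEBUG
print(debug_settings['max_retries'])  # 2, still from cmd_args
print(settings['log_level'])          # WARNING, original chain untouched
```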

1. In case you spot a mis-indented `else` block – the indentation is actually correct. It’s a for-else statement, and the `else` block only gets executed if the loop finishes normally, i.e. without `break`-ing out of it.
2. There is also a polyfill for (some) older Python versions.

## Keyword-only arguments in Python

Python allows you to pass arguments in two ways – either by their position or by their name (keyword)[1]:

```
def foo(x, y):
    print(x, y)

# the following all print 5 10
foo(5, 10)
foo(5, y=10)
foo(x=5, y=10)
```

It works the same if a function accepts keyword arguments[2]:

```
def bar(x=5, y=10):
    print(x, y)

# the following all print 4 8
bar(4, 8)
bar(4, y=8)
bar(x=4, y=8)
```

If a particular keyword argument is not provided, its default value is used:

```
>>> bar(2)  # y is omitted
2 10
```

It is quite common to use keyword arguments to define “extra” options that can be passed to a callable, and if omitted, their default value is used. That can improve readability, because it is enough to only pass the options whose value differs from the default:

```
def cat(x, y, to_upper=False, strip=True):
    """Concatenate given strings."""
    if strip:
        x, y = x.strip(), y.strip()

    result = x + y

    if to_upper:
        result = result.upper()

    return result

# returns 'foobar'
cat('  foo ', 'bar ')

# the following both return '  foo bar '
cat('  foo ', 'bar ', False, False)
cat('  foo ', 'bar ', strip=False)
```

You will probably agree that the second form is indeed cleaner.

##### Positional arguments pitfalls

The ability to pass keyword arguments by position, as demonstrated in the introduction, can bite you if you are not careful, especially if you do not have thorough test coverage in place as a safety net. Let’s say that there is a piece of code which invokes the `cat()` function using positional arguments only:

```
cat('  foo ', 'bar ', True)  # returns 'FOOBAR'
```

Let’s also say that suddenly one of the team members gets inspired and decides that it would be great to sort all keyword parameters alphabetically. You know, for readability. Before you can express your doubts, he eagerly refactors the function, swapping the two keyword parameters:

```
def cat(x, y, strip=True, to_upper=False):
    ...
```

If you have proper tests in place, good for you, but if you don’t, you might not realize that this change just introduced a bug:

```
cat('  foo ', 'bar ', True)  # now returns 'foobar'
```

The poor `'FOOBAR'` return value just got demoted to its lowercase version. This would not have happened if the option had been passed as a keyword argument, i.e. `to_upper=True`.

Another source of potential errors is accidentally passing an option value through a positional argument. Let’s imagine another contrived scenario where a new team member uses intuition to deduce how the `cat()` function works. Of course – it’s just a version of `sum()` adjusted to work with strings!

```
>>> cat('  foo ', 'bar ', 'baz')  # the original cat
'FOOBAR'
```

Erm…

The option `to_upper` was assigned the value `'baz'` which is truthy, but it is probably not what the caller intended to achieve.

It can be argued that this behavior is a bit unintuitive, and that it would be nice if we could somehow force the callers to explicitly pass keyword arguments by their name (keyword), and not their position.

##### Making the arguments keyword-only (Python 3)

The trick is to swallow any redundant positional arguments, preventing them from filling the keyword arguments:

```
def demo(x, y, *args, separator='___'):
    print(x, y, args, sep=separator)

>>> demo(10, 20, 30, 40)
10___20___(30, 40)

>>> demo(10, 20, 30, 40, separator='---')
10---20---(30, 40)
```

Any positional arguments beyond the first two (30 and 40) get swallowed by the `args` tuple, and the only way to specify a different separator is through an explicit keyword argument. To complete the picture, we just need to prevent callers from passing in too many positional arguments, which we can do with a simple check that `args` is empty:

```
if args:
    raise TypeError('Too many positional arguments given.')
```
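Putting the two pieces together, a complete sketch might look as follows (`join_two` is a made-up example function):

```
def join_two(x, y, *args, separator='___'):
    """Join two values; 'separator' can effectively only be passed by keyword."""
    if args:  # anything beyond x and y was passed positionally
        raise TypeError('Too many positional arguments given.')
    return '{}{}{}'.format(x, separator, y)

print(join_two(1, 2))                  # 1___2
print(join_two(1, 2, separator='--'))  # 1--2

try:
    join_two(1, 2, '--')               # a positional sneak attempt
except TypeError as exc:
    print(exc)                         # Too many positional arguments given.
```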

What’s more, if we omit the variable arguments tuple’s name altogether, we get the above check for free, plus a useful error message on top of it:

```
def demo2(x, y, *, separator='___'):
    print(x, y, sep=separator)

>>> demo2(1, 2, 3, separator=';')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: demo2() takes 2 positional arguments but 3 positional arguments (and 1 keyword-only argument) were given
```

##### Keyword-only arguments in Python 2

Unfortunately, if we try the same approach in Python 2, it will complain and raise a syntax error. We cannot specify individual keyword arguments after `*args`, but we can specify that a function accepts a variable number of keyword arguments, and then manually unpack it:

```
def foo(x, y, *args, **kwargs):
    option_a = kwargs.pop('option_a', 'default_A')
    option_b = kwargs.pop('option_b', 'default_B')

    if args or kwargs:
        raise TypeError('Too many positional and/or keyword arguments given.')
```

We also have to manually check if the caller has passed any unexpected positional and/or keyword arguments by testing the `args` tuple and the `kwargs` dict (after popping all expected items from it) – both should be empty.

Cumbersome indeed, not to mention that the function signature is not as pretty as it could be. But that’s what we have in Python 2.
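For completeness, here is the Python 2 pattern fleshed out into a runnable sketch (the function and option names are made up; the code happens to run under Python 3 as well):

```
def connect(host, port, **kwargs):
    """Accept only the 'timeout' and 'retries' options, by keyword."""
    timeout = kwargs.pop('timeout', 30)
    retries = kwargs.pop('retries', 3)

    if kwargs:  # anything left over was not expected
        raise TypeError(
            'Unexpected keyword arguments: %s' % ', '.join(sorted(kwargs)))

    return (host, port, timeout, retries)

print(connect('example.com', 80, retries=5))  # ('example.com', 80, 30, 5)

try:
    connect('example.com', 80, retires=5)  # typos get caught, too
except TypeError as exc:
    print(exc)
```

A nice side effect of the leftover check is that misspelled option names raise an error instead of being silently ignored.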

1. The following few examples assume either Python 3, or that `print_function` is imported from the `__future__` module if using Python 2.
2. The fourth invocation option, i.e. saying `foo(x=5, 10)`, is not listed, because it is a syntax error – positional arguments must precede any keyword arguments.

## Applying a decorator to a class method results in an error

You might have seen this one before – you wrote a decorator in Python and tried to apply it to a class method (or static method, for that matter), only to see an error.

```
from functools import wraps

def logged(func):
    """A decorator printing a message before invoking the wrapped function."""
    @wraps(func)
    def wrapped_func(*args, **kwargs):
        print('Invoking', func)
        return func(*args, **kwargs)
    return wrapped_func

class Foo(object):
    @logged
    @classmethod
    def get_name(cls):
        return cls.__name__
```

As the docstring explains, the `logged` decorator simply prints a message before invoking the decorated function, and it is applied to the `get_name()` class method of the class `Foo`. The `@wraps` decorator makes sure the original function’s metadata is copied to the wrapper function returned by the decorator (docs).
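As a quick aside, here is what `@wraps` buys us – without it, the wrapper’s own name and docstring would shadow those of the original function:

```
from functools import wraps

def passthrough(func):
    """A do-nothing decorator that preserves the wrapped function's metadata."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@passthrough
def greet():
    """Say hello."""

print(greet.__name__)  # greet (would be 'wrapper' without @wraps)
print(greet.__doc__)   # Say hello.
```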

But despite this essentially being a textbook example of a decorator in Python, invoking the `get_name()` method results in an error (using Python 3 below):

```
>>> Foo.get_name()
Invoking <classmethod object at 0x7f8e7473e0f0>
Traceback (most recent call last):
...
TypeError: 'classmethod' object is not callable
```

If you just want to quickly fix this issue, because it annoys you, here’s the TL;DR fix – just swap the order of the decorators, making sure that the `@classmethod` decorator is applied last:

```
class Foo(object):
    @classmethod
    @logged
    def get_name(cls):
        return cls.__name__

>>> Foo.get_name()
Invoking <function Foo.get_name at 0x7fce90356c80>
'Foo'
```

On the other hand, if you are curious what is actually happening behind the scenes, please keep reading.

The first thing to note is the output in each example immediately after calling `Foo.get_name()`. Our decorator prints the object it is about to invoke on the very next line, and in the non-working example that object is actually not a function!

```
Invoking <classmethod object at 0x7f8e7473e0f0>
```

Instead, the thing that our decorator tries to invoke is a “classmethod” object, but the latter is not callable, causing the Python interpreter to complain.

##### Meet descriptors

Let’s take a closer look at a stripped-down version of the `Foo` class:

```
class Foo(object):
    @classmethod
    def get_name(cls):
        return cls.__name__

>>> thing = Foo.__dict__['get_name']
>>> thing
<classmethod object at 0x7f295ffc6d30>
>>> hasattr(thing, '__get__')
True
>>> callable(thing)
False
```

As it turns out, `get_name` is an object which is not callable, i.e. we cannot say `get_name()` and expect it to work. The presence of the `__get__` attribute also tells us that it is a descriptor.

Descriptors are objects that behave differently than “normal” attributes. When a descriptor is accessed, its `__get__()` method gets called behind the scenes, returning the actual value. The following two expressions are thus equivalent:

```
>>> Foo.get_name
<bound method Foo.get_name of <class '__main__.Foo'>>
>>> Foo.__dict__['get_name'].__get__(None, Foo)
<bound method Foo.get_name of <class '__main__.Foo'>>
```

`__get__()` gets called with two parameters – the object instance the attribute belongs to (`None` here, because we are accessing the attribute through a class), and the owner class, i.e. the one the descriptor is defined on (`Foo` in this case)[1].
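A tiny made-up descriptor makes those two parameters visible:

```
class Spy(object):
    """A toy descriptor that simply reports what __get__ receives."""

    def __get__(self, instance, owner):
        return (instance, owner)

class Box(object):
    attr = Spy()

print(Box.attr)    # (None, <class 'Box'>) - accessed through the class

box = Box()
print(box.attr)    # (<Box instance>, <class 'Box'>) - through an instance
```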

What the classmethod descriptor does is bind the original `get_name()` function to its class (`Foo`) and return a bound method object. When the latter gets called, it invokes `get_name()`, passing class `Foo` as the first argument (`cls`) along with any other arguments the bound method was originally called with.
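To demystify the binding step, here is a rough pure-Python stand-in for the classmethod descriptor (the real implementation lives in C and handles more edge cases; `functools.partial` is used as a simple substitute for a real bound method):

```
import functools

class my_classmethod(object):
    """A naive reimplementation of the built-in classmethod."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner):
        # Bind the wrapped function to the owner class, not the instance.
        return functools.partial(self.func, owner)

class Foo(object):
    @my_classmethod
    def get_name(cls):
        return cls.__name__

print(Foo.get_name())    # Foo
print(Foo().get_name())  # Foo - works through instances too
```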

Armed with this knowledge it is now clear why our `logged` decorator from the beginning does not always work. It assumes that the object passed to it is directly callable, and does not take the descriptor protocol into account.
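Just to sketch the idea (a naive version that only covers simple cases, not a complete solution): a decorator can detect that it was given a non-callable descriptor and defer its work until attribute access, wrapping the bound method that `__get__()` returns:

```
from functools import wraps

def logged(func):
    """Print a message before invoking the wrapped callable.

    A naive descriptor-aware sketch: if handed a non-callable descriptor
    (such as a classmethod object), defer wrapping until __get__() time.
    """
    if hasattr(func, '__get__') and not callable(func):
        class LoggedDescriptor(object):
            def __get__(self, instance, owner):
                bound = func.__get__(instance, owner)

                @wraps(bound)
                def wrapper(*args, **kwargs):
                    print('Invoking', bound)
                    return bound(*args, **kwargs)
                return wrapper
        return LoggedDescriptor()

    # plain functions take the usual path
    @wraps(func)
    def wrapped_func(*args, **kwargs):
        print('Invoking', func)
        return func(*args, **kwargs)
    return wrapped_func

class Foo(object):
    @logged
    @classmethod
    def get_name(cls):
        return cls.__name__

print(Foo.get_name())  # Foo
```

Note that plain functions are descriptors too (that is how methods get bound), which is why the check also requires the object not to be callable.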

##### Making it right

Describing how to adjust the `logged` decorator to work correctly is quite a lengthy topic, and out of scope of this post. If interested, you should definitely read the blog series by Graham Dumpleton, as it addresses many more aspects than just working well with classmethods. Or just use his wrapt library for writing decorators:

```
import wrapt

@wrapt.decorator
def logged(wrapped, instance, args, kwargs):
    print('Invoking', wrapped)
    return wrapped(*args, **kwargs)

class Foo(object):
    @logged
    @classmethod
    def get_name(cls):
        return cls.__name__

>>> Foo.get_name()
Invoking <bound method Foo.get_name of <class 'main2.Foo'>>
'Foo'
```

Yup, it works.

1. On the other hand, if retrieving a descriptor object directly from the class’s `__dict__`, the descriptor’s `__get__()` method is bypassed, and that’s why we used `Foo.__dict__['get_name']` in a few places in the examples.

## Comparing objects of different types in Python 2

On the project I currently work on, we were recently (briefly) bitten by an interesting bug that initially slipped through the tests and code review. We had a custom database-mapped type that had two attributes defined, let’s call them `value_low` and `value_high`. As their names suggest, the value of the former should be lower than the value of the latter. The model type also defined a validator that would enforce this rule, and assure internal consistency of the model instances.

A simplified version of the type is shown below:

```
from sqlalchemy import Column, Integer
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import validates

Base = declarative_base()

class MyModel(Base):
    __tablename__ = 'my_models'

    id = Column(Integer, primary_key=True)
    value_low = Column(Integer)
    value_high = Column(Integer)

    @validates('value_low', 'value_high')
    def validate_thresholds(self, key, value):
        """Assure that relative ordering of both values remains consistent."""
        if key == 'value_low' and value >= self.value_high:
            raise ValueError('value_low must be strictly lower than value_high')
        elif key == 'value_high' and value <= self.value_low:
            raise ValueError('value_high must be strictly higher than value_low')

        return value
```

The model definition seems pretty sensible, and it works, too:

```
>>> instance = MyModel(value_low=2, value_high=10)
>>> instance.value_low, instance.value_high
(2, 10)
```

Except when you run it on another machine with the same Python version:

```
>>> instance = MyModel(value_low=2, value_high=10)
Traceback (most recent call last):
...
ValueError: value_low must be strictly lower than value_high
```

Uhm, what?

After a bit of research, the following behavior can be noticed, even on the machine where the code initially seemed to work just fine:

```
>>> instance = MyModel()
>>> instance.value_high = 10
>>> instance.value_low = 2
>>> instance.value_low, instance.value_high
(2, 10)

# now let's try to reverse the order of assignments
>>> instance = MyModel()
>>> instance.value_low = 2
# ValueError ...
```

What’s going on here? Why is the order of attribute value assignments important?

This, by the way, explains why the first example worked on one machine, but failed on another. The values passed to `MyModel()` as keyword arguments were applied in a different order[1].

Debugging showed that if `value_low` is set first, while `value_high` is still `None`, the following expression evaluates to `True` (`key` equals `value_low`), resulting in a failed validation:

```
# MIND: value == 2
if key == 'value_low' and value >= self.value_high:
    raise ValueError(...)
```

On the other hand, if we first set `value_high` while `value_low` is still `None`, the corresponding `if` statement condition evaluates to `False` and the error does not get raised. Stripping it down to the bare essentials gives us the following[2]:

```
>>> 2 >= None
True
>>> 10 <= None
False
```

If we further explore this, it gets even better:

```
>>> None < -1
True
>>> None < -float('inf')
True
>>> None < 'foobar'
True
>>> 'foobar' > -1
True
>>> 'foobar' > 42
True
>>> tuple >= (lambda x: x)
True
>>> type > Ellipsis
True
```

Oh… of course. It’s Python 2 that we use on the project (doh).

You see, Python 2 is perfectly happy to compare objects of different types, even when that does not make much sense. Peeking into the CPython source code reveals that `None` is smaller than anything:

```
...
/* None is smaller than anything */
if (v == Py_None)
    return -1;
if (w == Py_None)
    return 1;
```

Different types are ordered by name, with number types being smaller than other types[3].

Now that we know this, the bug from the beginning makes perfect sense. It was just a coincidence that `value_high` could be set on a fresh instance: the validator’s error check (`value <= self.value_low`) would never return `True` when `self.value_low` is `None`, because the latter is smaller than everything.

In the end the issue was, fortunately, quickly discovered, and fixing it was straightforward. We just needed to add an extra check:

```
if (
        key == 'value_low' and
        self.value_high is not None and  # <-- THIS
        value >= self.value_high):
    ...
# and the same for the other if...
```
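Stripped of SQLAlchemy, the corrected validation logic boils down to the following simplified sketch (`set_value` is a made-up helper standing in for the validator hook, and the code runs happily under Python 3 because it never compares anything to `None`):

```
class MyModel(object):
    """A plain-Python sketch of the corrected validator logic."""

    def __init__(self):
        self.value_low = None
        self.value_high = None

    def set_value(self, key, value):
        if (key == 'value_low' and
                self.value_high is not None and
                value >= self.value_high):
            raise ValueError('value_low must be strictly lower than value_high')
        if (key == 'value_high' and
                self.value_low is not None and
                value <= self.value_low):
            raise ValueError('value_high must be strictly higher than value_low')
        setattr(self, key, value)

m = MyModel()
m.set_value('value_low', 2)    # no longer blows up while value_high is None
m.set_value('value_high', 10)
print(m.value_low, m.value_high)  # 2 10
```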

And the application worked correctly and happily ever after…

1. That’s probably because when creating a new instance of the model, SQLAlchemy loops over the `kwargs` dict somewhere down the rabbit hole, but the order of the dictionary keys is arbitrary.
2. Python 2 only. In Python 3 we would get `TypeError: unorderable types`.
3. Provided that types do not define their own custom rich comparison methods, of course.

## Hello, World!

Hey, look, I now have my own technical blog, too! It’s technical, because its posts (will) contain snippets of program code like the one below:

```
from __future__ import print_function

print('Hello, World!')
```

See?

The code above has been carefully crafted to run both under Python 2[1] and Python 3.

Expect more of this in the future, but for now I will just cut this post short, because hello-world posts should be short and simple, right?

1. Well, 2.6+ at least when `print_function` was added to the `__future__` module (yeah, I know, it would also work without that import in this particular case…).