On the project I currently work on, we were recently (briefly) bitten by an interesting bug that initially slipped through the tests and code review. We had a custom database-mapped type with two attributes; let's call them `value_low` and `value_high`. As their names suggest, the value of the former should be lower than the value of the latter. The model type also defined a validator that would enforce this rule and ensure the internal consistency of model instances.
A simplified version of the type is shown below:
    from sqlalchemy import Column, Integer
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import validates

    Base = declarative_base()


    class MyModel(Base):
        __tablename__ = 'my_models'

        id = Column(Integer, primary_key=True)
        value_low = Column(Integer)
        value_high = Column(Integer)

        @validates('value_low', 'value_high')
        def validate_thresholds(self, key, value):
            """Assure that relative ordering of both values remains consistent."""
            if key == 'value_low' and value >= self.value_high:
                raise ValueError('value_low must be strictly lower than value_high')
            elif key == 'value_high' and value <= self.value_low:
                raise ValueError('value_high must be strictly higher than value_low')
            return value
The model definition seems pretty sensible, and it works, too:
    >>> instance = MyModel(value_low=2, value_high=10)
    >>> instance.value_low, instance.value_high
    (2, 10)
Except when you run it on another machine with the same Python version:
    >>> instance = MyModel(value_low=2, value_high=10)
    Traceback (most recent call last):
      ...
    ValueError: value_low must be strictly lower than value_high
Uhm, what?
After a bit of research, the following behavior can be noticed, even on the machine where the code initially seemed to work just fine:
    >>> instance = MyModel()
    >>> instance.value_high = 10
    >>> instance.value_low = 2
    >>> instance.value_low, instance.value_high
    (2, 10)

    # now let's try to reverse the order of assignments
    >>> instance = MyModel()
    >>> instance.value_low = 2
    # ValueError ...
What’s going on here? Why is the order of attribute value assignments important?
This, by the way, explains why the first example worked on one machine, but failed on another. The values passed to `MyModel()` as keyword arguments were applied in a different order [1].
Debugging showed that if `value_low` is set first, while `value_high` is still `None`, the following expression evaluates to `True` (`key` equals `value_low`), resulting in a failed validation:
    # MIND: value == 2
    if key == 'value_low' and value >= self.value_high:
        raise ValueError(...)
On the other hand, if we first set `value_high` when `value_low` is `None`, the corresponding `if` statement condition evaluates to `False` and the error does not get raised. Stripping it down to the bare essentials gives us the following [2]:
    >>> 2 >= None
    True
    >>> 10 <= None
    False
If we further explore this, it gets even better:
    >>> None < -1
    True
    >>> None < -float('inf')
    True
    >>> None < 'foobar'
    True
    >>> 'foobar' > -1
    True
    >>> 'foobar' > 42
    True
    >>> tuple >= (lambda x: x)
    True
    >>> type > Ellipsis
    True
Oh… of course. It’s Python2 that we use on the project (doh).
You see, Python2 is perfectly happy to compare objects of different types, even when that does not make much sense. Peeking into the CPython source code reveals that `None` is smaller than anything:
    ...
    /* None is smaller than anything */
    if (v == Py_None)
        return -1;
    if (w == Py_None)
        return 1;
Different types are ordered by name, with number types being smaller than other types [3].
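To make the rule more tangible, here is a rough sketch that mimics Python2's fallback ordering in code that also runs on Python3. The function name `py2_style_cmp` is made up for this illustration, and the logic is a simplification of what CPython does (for instance, CPython breaks ties between equally named types by memory address, which is left out here), so treat it as an approximation rather than a faithful port.

```python
import numbers


def py2_style_cmp(v, w):
    """Return -1, 0 or 1, roughly like Python2's ordering of arbitrary objects.

    Hypothetical helper for illustration only; not part of any library.
    """
    # First give the objects a chance to compare themselves.
    try:
        return (v > w) - (v < w)
    except TypeError:
        pass  # no natural ordering, use the fallback rules below

    # None is smaller than anything.
    if v is None and w is None:
        return 0
    if v is None:
        return -1
    if w is None:
        return 1

    # Different types are ordered by type name; numbers get an empty name,
    # which makes them smaller than everything else.
    v_name = '' if isinstance(v, numbers.Number) else type(v).__name__
    w_name = '' if isinstance(w, numbers.Number) else type(w).__name__
    return (v_name > w_name) - (v_name < w_name)


# The two comparisons that caused the bug:
print(py2_style_cmp(2, None) >= 0)   # mimics `2 >= None`
print(py2_style_cmp(10, None) <= 0)  # mimics `10 <= None`
```

With this approximation, `'foobar' > 42` boils down to comparing the names `'str'` and `''`, and `type > Ellipsis` to comparing `'type'` and `'ellipsis'`.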
Now that we know this, the bug from the beginning makes perfect sense. It was just a coincidence that `value_high` could be set on a fresh instance: the validator's error check (`value <= self.value_low`) can never evaluate to `True` when `self.value_low` is `None`, because `None` is smaller than everything.
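For contrast, Python3 refuses to order an integer against `None` at all, so the same validator would have failed loudly for either assignment order instead of working by accident. A tiny sketch (the helper name `try_compare` is made up here, and the snippet assumes it is run under Python3):

```python
def try_compare(value, other):
    """Evaluate `value >= other`, reporting a TypeError instead of raising it."""
    try:
        return value >= other
    except TypeError as exc:
        # Python3 ends up here for int vs. None; Python2 would return True.
        return 'TypeError: {}'.format(exc)


print(try_compare(2, 1))     # an ordinary comparison between two integers
print(try_compare(2, None))  # the comparison the validator made by accident
```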
In the end the issue was, fortunately, quickly discovered, and fixing it was straightforward. We just needed to add an extra check:
    if (
        key == 'value_low' and
        self.value_high is not None and  # <-- THIS
        value >= self.value_high
    ):
        ...
    # and the same for the other if...
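Spelled out for both branches, the guarded check could look roughly like the sketch below. To keep it runnable without SQLAlchemy, it is written as a plain function (`validate_threshold`, a hypothetical name) that receives the current attribute values as arguments instead of reading them from `self`; the real validator would keep its `@validates` decorator and method signature.

```python
def validate_threshold(key, value, value_low=None, value_high=None):
    """Check the new `value` for attribute `key` against the other threshold.

    A threshold that is still None is treated as "not set yet" and skipped.
    """
    if (
        key == 'value_low' and
        value_high is not None and
        value >= value_high
    ):
        raise ValueError('value_low must be strictly lower than value_high')
    elif (
        key == 'value_high' and
        value_low is not None and
        value <= value_low
    ):
        raise ValueError('value_high must be strictly higher than value_low')
    return value


# Either assignment order is now accepted...
validate_threshold('value_low', 2)                  # value_high not set yet
validate_threshold('value_high', 10, value_low=2)

# ...while a real violation is still rejected.
try:
    validate_threshold('value_low', 10, value_high=10)
except ValueError as exc:
    print(exc)
```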
And the application worked correctly and happily ever after…
[1] That's probably because when creating a new instance of the model, SQLAlchemy loops over the `kwargs` dict somewhere down the rabbit hole, but the order of the dictionary keys is arbitrary.

[2] Python2 only. In Python3 we would get `TypeError: unorderable types` (the exact wording of the message depends on the Python3 version).

[3] Provided that types do not define their own custom rich comparison methods, of course.