Collection like objects in python
Collections are an important part of every programming language. In python, there is a couple of built-in collections like list, set, dictionary but I’m not going to dig into them now. In this post, I’ll explore what it takes to implement collection like objects on your own using collections protocol.
Before we start implementing (or emulating) collection like objects we should get familiar with slice object. It does exactly what the name suggests - helps with slicing of collections. Beside obvious attributes like start, stop and step which you can provide in the constructor it provides indices method. It can be used to determine start, stop and step attributes that are tailored to collection size provided as an argument.
s = slice(2, 10, 2) s.start, s.stop, s.step # (2, 10, 2) s.indices(5), list(range(*s.indices(5))) # ((2, 5, 2), [2, 4])
Notice how easy it is to convert the result of
slice.indices to a range object which produces indexes of items requested in a slice.
class SlotsMachine: def __init__(self, slots_count): self._slots = list(range(1, slots_count+1)) def __len__(self): return len(self._slots) def __getitem__(self, key): return self._slots[key] def __setitem__(self, key, value): self._slots[key] = value def __delitem__(self, key): del self._slots[key] def __iter__(self): return iter(self._slots) def __reversed__(self): return iter(reversed(self._slots)) def __contains__(self, value): return value in self._slots slot_machine = SlotsMachine(5)
There is a lot that’s happening in there. I’m delegating all operations to the “private” list instance so we’ll be able to focus on what methods do instead of how they are implemented.
# __len__ len(slot_machine) # 5 # __getitem__ slot_machine, slot_machine[1:] # (1, [2, 3, 4, 5]) # __setitem__ slot_machine = 100 # __setitem__ with slice is also possible slot_machine[1:2] = (20, 30) # __delitem__ del slot_machine # __delitem__ with slice del slot_machine[:2]
In __getitem__, __setitem__, __delitem__ key passed as an argument doesn’t have to be int or slice. It can be a string or any other object and you can implement your class in a way that simulates more dictionary-like behavior. If you want to handle negative indices in those methods it’s up to you to implement it (with the list as an underlying collection I can delegate all heavy lifting to it).
# __iter__ list(slot_machine), \ list(filter(lambda s: s > 10, slot_machine)), \ [s for s in slot_machine if s % 10 == 0] # ([100, 20, 30, 3, 4, 5], [100, 20, 30], [100, 20, 30]) # __reveresed__ list(reversed(slot_machine)), list(slot_machine) ([5, 4, 3, 30, 20, 100], [100, 20, 30, 3, 4, 5])
Notice how implementing __iter__ method allows your objects to feel like native Python objects. With it, you can now do list comprehensions and all functional operations on it.
# __contains__ 4 in slot_machine, 1000 in slot_machine (True, False)
A similar time-saver can be __contains__ method which allows us to easily check if a particular object is inside.
Last on to mention is __missing__ method which doesn’t make a lot of sense unless you are extending dict class and is useful only when you don’t override __getitem__:
class DefaultDictLike(dict): def __init__(self, default_value): super().__init__() self.default_value = default_value def __missing__(self, key): return self.default_value(key) d = DefaultDictLike(lambda key: "missing " + key) d["asd"] # 'missing asd'
__missing__ method is called from dict.__getitem__ method when item corresponding to a key is not found.
To give you another example on how you can use those methods to expand your DSL you can implement following class:
from collections import namedtuple Book = namedtuple("Book", "name, isbn, author") class Library: def __init__(self, *books): self.books = [*books] def __getitem__(self, key): book = next((b for b in self.books if Library._matches(b, key)), None) if book is None: raise KeyError(key) return book def __contains__(self, key): return any(book for book in self.books if Library._matches(book, key)) def __iter__(self): return iter(self.books) @staticmethod def _matches(book, key): return book.name.lower() == key.lower() or book.isbn == key.lower() library = Library( Book("First", "isbn1", "author1"), Book("Second", "isbn2", "author2")) library["isbn1"], library["second"] # (Book(name='First', isbn='isbn1', author='author1'), # Book(name='Second', isbn='isbn2', author='author2')) library["unknown isbn"] # KeyError... "isbn1" in library, "first" in library, "unknown" in library # (True, True, False) [book.name for book in library] # ['First', 'Second']
Library instances can be used almost like something built into python. As we don’t have operator overloading in java it’s refreshing to have the possibility to add such synthetic sugar without problems or weird methods.
If you've enjoyed or found this post useful you might also like:
15 Apr 2020 #basics