Special Methods

"Special methods" is a technical term referring to methods that get called automatically. In Python, they usually begin and end with double underscores.

We've already seen one special method, the constructor: __init__.

In [1]:
class C:
    def __init__(self):
        print("I'm inside __init__")

It's special, because it will run whenever an object is created, even though it is never explicitly called (note that __init__ doesn't appear in the following snippet, even though clearly it runs):

In [2]:
obj = C()
I'm inside __init__

We'll cover the following special methods here:

  • for strings: __str__, __repr__, _repr_html_
  • for comparison: __eq__, __lt__
  • for sequences: __len__, __getitem__
  • for context managers: __enter__, __exit__

Strings

When we print or view an object, it must be automatically converted to a string first. There are two ways.

For example, let's create a datetime object.

In [3]:
import datetime 
today = datetime.datetime.now()

print automatically converts it to a string, as you can see:

In [4]:
print(today)
print(str(today))
2021-01-25 15:23:02.067837
2021-01-25 15:23:02.067837

If we make it the last line in a cell, it gets converted to a string too, but differently:

In [5]:
today
Out[5]:
datetime.datetime(2021, 1, 25, 15, 23, 2, 67837)

The idea behind two styles is that there are both non-programmers (who probably prefer the former) and programmers (who might prefer the latter) in the world. Programmers like the latter format because it can be copy/pasted to create new objects:

In [6]:
obj = datetime.datetime(2021, 1, 25, 14, 15, 50, 951625)

Python deals with these two styles by giving every object two special methods: __repr__ and __str__:

In [7]:
print("For coders:", today.__repr__())
print("For others:", today.__str__())
For coders: datetime.datetime(2021, 1, 25, 15, 23, 2, 67837)
For others: 2021-01-25 15:23:02.067837

Immediately above, we're calling the special methods explicitly; they're still special because they usually get called implicitly (e.g., print(today) earlier called __str__ for us).
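
In everyday code, the usual way to trigger these methods is through the built-in functions str() and repr() rather than by naming the double-underscore methods. Here is a small sketch of that; the variable names are illustrative, and a fixed timestamp is assumed so the results are reproducible:

```python
import datetime

# A fixed timestamp (an assumption for this sketch) so the output never changes.
moment = datetime.datetime(2021, 1, 25, 15, 23, 2, 67837)

# str() and repr() call __str__ and __repr__ for us.
for_others = str(moment)
for_coders = repr(moment)

# Inside an f-string, the !r conversion asks for the repr instead of the str.
formatted = f"{moment} vs {moment!r}"

print(for_others)
print(for_coders)
print(formatted)
```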

Let's say we have a class for email messages. We can implement the special methods ourselves.

In [8]:
class Email:
    def __init__(self, to, frm, subject, message, attachment=None):
        self.to = to
        self.frm = frm
        self.subject = subject
        self.message = message
        self.attachment = attachment
        if self.attachment is None and "attached" in message:
            print("WARNING: did you forget to attach a file?")

    def __str__(self):
        return f"TO: {self.to}\nFROM: {self.frm}\nSUBJECT: {self.subject}\n\n{self.message}\n\nATTACHMENT: {self.attachment}"

    def __repr__(self):
        return f"Email({repr(self.to)}, {repr(self.frm)}, {repr(self.subject)}, {repr(self.message)}, {repr(self.attachment)})"

em = Email("jobs@example.com", "somebody@gmail.com", "please hire me!", "Please see attached resume")
WARNING: did you forget to attach a file?
In [9]:
# implicit call to __repr__
em
Out[9]:
Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'Please see attached resume', None)
In [10]:
# implicit call to __str__
print(em)
TO: jobs@example.com
FROM: somebody@gmail.com
SUBJECT: please hire me!

Please see attached resume

ATTACHMENT: None
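
It can help to know what happens when only one of the two methods is defined. The sketch below uses a hypothetical Point class (not part of this lesson) that defines __repr__ but not __str__; in that case str() and print fall back to __repr__, and containers such as lists show the repr of each element:

```python
class Point:
    """Hypothetical class that defines __repr__ but not __str__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Point({self.x!r}, {self.y!r})"

p = Point(1, 2)

# No __str__ is defined, so str() falls back to __repr__.
as_str = str(p)

# A list builds its string from the repr of each element.
as_list = str([p, p])

print(as_str)
print(as_list)
```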

Comparison

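We normally use == to check whether two objects are equal. Built-in types like str already support this, but for our own classes == falls back to checking whether both sides are the very same object, so we need to do a little more work: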

In [11]:
s1 = "hi"
s2 = "hi"
s1 == s2
Out[11]:
True
In [12]:
e1 = Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'Thanks!', None)
e2 = Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'Thanks!', None)
e1 == e2
Out[12]:
False

We need to implement __eq__.

In [13]:
class Email:
    def __init__(self, to, frm, subject, message, attachment=None):
        self.to = to
        self.frm = frm
        self.subject = subject
        self.message = message
        self.attachment = attachment
        if self.attachment is None and "attached" in message:
            print("WARNING: did you forget to attach a file?")

    def __str__(self):
        return f"TO: {self.to}\nFROM: {self.frm}\nSUBJECT: {self.subject}\n\n{self.message}\n\nATTACHMENT: {self.attachment}"

    def __repr__(self):
        return f"Email({repr(self.to)}, {repr(self.frm)}, {repr(self.subject)}, {repr(self.message)}, {repr(self.attachment)})"

    def __eq__(self, other):
        if self.to == other.to and self.frm == other.frm and self.subject == other.subject:
            return True
        return False
In [14]:
e1 = Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'Thanks!', None)
e2 = Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'Thanks!', None)
e3 = Email('returns@example.com', 'somebody@gmail.com', 'refund please?', 'Thanks!', None)
print(e1 == e2)
print(e1 == e3)
True
False

An __eq__ method should check all the important attributes and return True only when they are all equal (and False otherwise). The __eq__ above is incomplete. It doesn't check message, for example, which leads to surprising behavior:

In [15]:
e1 = Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'I will work hard', None)
e2 = Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'I promise', None)
e1 == e2 # should be False, but won't be because our __eq__ doesn't check message
Out[15]:
True
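
One possible way to make the comparison complete is to compare every attribute at once by packing them into tuples. This is only a sketch: the _key helper is a hypothetical name introduced here, and the string methods are left out to keep it short. Returning NotImplemented for other types is a common convention that lets Python fall back to its default behavior.

```python
class Email:
    def __init__(self, to, frm, subject, message, attachment=None):
        self.to = to
        self.frm = frm
        self.subject = subject
        self.message = message
        self.attachment = attachment

    def _key(self):
        # Hypothetical helper: every attribute that matters for equality.
        return (self.to, self.frm, self.subject, self.message, self.attachment)

    def __eq__(self, other):
        if not isinstance(other, Email):
            # Let Python decide how to compare an Email with something else.
            return NotImplemented
        return self._key() == other._key()

e1 = Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'I will work hard')
e2 = Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'I promise')
e3 = Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'I will work hard')

print(e1 == e2)              # messages differ
print(e1 == e3)              # every attribute matches
print(e1 == "not an email")  # different type
```

One side effect to be aware of: a class that defines __eq__ without also defining __hash__ produces objects that cannot be used as dict keys or set members.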

Let's say you have a class for grocery inventory, and you want to sort a list of inventory objects based on value.

In [16]:
class Inventory:
    def __init__(self, item, amount, price):
        self.item = item
        self.amount = amount
        self.price = price
        
    def __repr__(self):
        return f"Inventory({repr(self.item)}, {self.amount}, {self.price})"
    
grocery = [Inventory("apples", 10, 0.3), Inventory("oranges", 2, 0.5), Inventory("kiwis", 9, 0.2)]
In [17]:
try:
    grocery.sort()
except Exception as e:
    print(e)
'<' not supported between instances of 'Inventory' and 'Inventory'

As the above error suggests, we need to implement "<" (less-than, or lt for short) to do sorting:

In [18]:
class Inventory:
    def __init__(self, item, amount, price):
        self.item = item
        self.amount = amount
        self.price = price
        
    def __repr__(self):
        return f"Inventory({repr(self.item)}, {self.amount}, {self.price})"    

    def __lt__(self, other):
        return self.amount*self.price < other.amount*other.price

grocery = [Inventory("apples", 10, 0.3), Inventory("oranges", 2, 0.5), Inventory("kiwis", 9, 0.2)]
grocery.sort()
grocery
Out[18]:
[Inventory('oranges', 2, 0.5),
 Inventory('kiwis', 9, 0.2),
 Inventory('apples', 10, 0.3)]

The items are sorted from least valuable (1 dollar) to most valuable (3 dollars).
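
If a class has no single natural ordering, an alternative to __lt__ is to pass a key function when sorting. The sketch below assumes a version of Inventory without __lt__ and uses sorted with a lambda; the variable names by_value and names are illustrative.

```python
class Inventory:
    def __init__(self, item, amount, price):
        self.item = item
        self.amount = amount
        self.price = price

    def __repr__(self):
        return f"Inventory({repr(self.item)}, {self.amount}, {self.price})"

grocery = [Inventory("apples", 10, 0.3), Inventory("oranges", 2, 0.5), Inventory("kiwis", 9, 0.2)]

# The key function computes the total value used for ordering,
# so the class itself never needs to support "<".
by_value = sorted(grocery, key=lambda inv: inv.amount * inv.price)
names = [inv.item for inv in by_value]

print(names)
```

A key function keeps the ordering rule at the call site, which is handy when the same objects need to be sorted in different ways (by value here, by name elsewhere).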

Sequences

How do brackets work? obj[lookup] is backed by a special method (called __getitem__) that takes lookup and returns a value. Let's create a Sentence class that lets you grab a word in a sentence.

In [19]:
class Sentence:
    def __init__(self, s):
        self.s = s

    def __getitem__(self, lookup):
        print("calling __getitem__ with " + str(lookup))
        return self.s.split()[lookup]
In [20]:
s = Sentence("The quick brown fox jumps over the lazy dog")
s[3]
calling __getitem__ with 3
Out[20]:
'fox'

If we want, we can get clever and take other types. For example, we could interpret the float 3.2 to mean we want the letter at index 2 of the word at index 3 (counting both from 0, of course).

In [21]:
class Sentence:
    def __init__(self, s):
        self.s = s
        
    def __getitem__(self, lookup):
        print("calling __getitem__ with " + str(lookup))
        word_idx = int(lookup)
        word = self.s.split()[word_idx]
        if isinstance(lookup, int):
            return word
        letter_idx = int(round(10*(lookup - word_idx)))
        return word[letter_idx]
    
    def __len__(self):
        return len(self.s.split())
In [22]:
s = Sentence("The quick brown fox jumps over the lazy dog")
s[3.2]
calling __getitem__ with 3.2
Out[22]:
'x'

The type check means the old behavior works too:

In [23]:
s[3]
calling __getitem__ with 3
Out[23]:
'fox'

You might have noticed we implemented __len__ above too. It does what you might guess:

In [24]:
len(s)
Out[24]:
9

One last thing: for loops work. Because Sentence doesn't define __iter__, Python falls back to calling __getitem__ with index 0, then keeps counting up until there is an IndexError (which is hidden from us). Check it out, noticing that no word was returned when index 9 was attempted:

In [25]:
for w in s:
    print(w)
calling __getitem__ with 0
The
calling __getitem__ with 1
quick
calling __getitem__ with 2
brown
calling __getitem__ with 3
fox
calling __getitem__ with 4
jumps
calling __getitem__ with 5
over
calling __getitem__ with 6
the
calling __getitem__ with 7
lazy
calling __getitem__ with 8
dog
calling __getitem__ with 9
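
To make the hidden exception visible, here is a rough sketch of what the for loop does on our behalf, written out by hand. It uses a shortened Sentence (without the print call) and illustrative variable names; the real loop machinery lives inside Python, so treat this as an approximation of the behavior rather than the actual implementation.

```python
class Sentence:
    def __init__(self, s):
        self.s = s

    def __getitem__(self, lookup):
        # Raises IndexError when lookup is past the last word.
        return self.s.split()[lookup]

s = Sentence("The quick brown fox")

# Hand-written version of "for w in s": count up from 0 until IndexError.
words = []
idx = 0
while True:
    try:
        words.append(s[idx])
    except IndexError:
        break  # this is the exception the for loop hides from us
    idx += 1

print(words)
print(list(s))  # list() relies on the same fallback
```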

Context Managers

Context managers work with the "with" statement in Python. They're useful for making sure some code runs before and after a block, even if there is an exception.

In [26]:
import time

class TimeMe:
    def __enter__(self):
        print("start timer")
        self.t0 = time.time()
    
    def __exit__(self, exc_type, exc_value, traceback):
        print("stop timer")
        self.t1 = time.time()
        
    def total(self):
        return self.t1 - self.t0
In [27]:
tm = TimeMe()

with tm:
    total = 1
    for i in range(1000):
        total *= (i+1)
        
tm.total()
start timer
stop timer
Out[27]:
0.0012068748474121094
In [28]:
with tm:
    total = 1
    for i in range(100000):
        total *= (i+1)
        
tm.total()
start timer
stop timer
Out[28]:
2.6177990436553955
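
The examples above never fail inside the with block, so here is a sketch of the "even if there is an exception" case. It is a variation on TimeMe with two changes that are assumptions of this sketch, not part of the class above: __enter__ returns self so that "with ... as tm" works, and __exit__ records whether an exception occurred in a hypothetical failed attribute.

```python
import time

class TimeMe:
    def __enter__(self):
        self.t0 = time.time()
        return self  # whatever __enter__ returns is bound by "as"

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs even when the block raises; exc_type is None if it did not.
        self.t1 = time.time()
        self.failed = exc_type is not None
        return False  # a false value lets the exception keep propagating

    def total(self):
        return self.t1 - self.t0

try:
    with TimeMe() as tm:
        raise ValueError("something went wrong")
except ValueError as e:
    caught = str(e)

print(tm.failed)   # __exit__ saw the exception
print(caught)      # and the exception still reached our except clause
print(tm.total())  # the timer was stopped regardless
```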

File objects are context managers. This is very useful, as __exit__ automatically closes the file. Without context managers:

In [29]:
f = open("hi.txt", "w")
f.write("hello")
f.close() # don't forget to close it!

With context managers:

In [30]:
with open("hi.txt", "w") as f:
    f.write("hello")
# f is closed for us!

Another advantage: even if the code inside the with block fails (for example, maybe there's not enough drive space to write "hello" to the file), the context manager in the last example will still close the file. That's not the case in the prior example, where f.close() would be skipped.
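
For comparison, getting the same guarantee without a context manager takes a try/finally block. The sketch below writes to a temporary directory (an assumption made here so it doesn't touch your working directory) and checks the closed attribute of each file object afterwards.

```python
import os
import tempfile

# Write somewhere disposable rather than the current directory.
path = os.path.join(tempfile.mkdtemp(), "hi.txt")

# Without a context manager: try/finally makes sure close() always runs.
f = open(path, "w")
try:
    f.write("hello")
finally:
    f.close()
closed_after_finally = f.closed

# With a context manager: the same guarantee in fewer lines.
with open(path, "w") as g:
    g.write("hello")
closed_after_with = g.closed

print(closed_after_finally, closed_after_with)
```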

Conclusion

Special methods are crucial to making new types that are nice to use. For example, pandas Series and DataFrames make heavy use of them. That's why we can filter rows in a DataFrame using df[bool_series]: the __getitem__ method is smart enough to do something useful with a boolean Series as the lookup.
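
To give a feel for the idea without needing pandas, here is a toy sketch of a __getitem__ that accepts either an integer or a list of booleans. The Column class is hypothetical and is not how pandas is implemented; it only illustrates how one method can handle different kinds of lookup.

```python
class Column:
    """Hypothetical toy container, loosely inspired by boolean filtering."""

    def __init__(self, values):
        self.values = list(values)

    def __getitem__(self, lookup):
        if isinstance(lookup, list):
            # Treat a list as a boolean mask: keep values where the mask is True.
            return Column(v for v, keep in zip(self.values, lookup) if keep)
        # Otherwise behave like ordinary indexing.
        return self.values[lookup]

    def __repr__(self):
        return f"Column({self.values!r})"

col = Column([10, 25, 3, 40])
mask = [v > 5 for v in col.values]  # [True, True, False, True]
filtered = col[mask]

print(filtered)
print(col[1])
```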