"Special methods" is a technical term referring to methods that get called automatically. In Python, they usually begin and end with double underscores.
We've already seen one special method, the constructor: __init__.
class C:
    def __init__(self):
        print("I'm inside __init__")
It's special because it runs whenever an object is created, even though it is never called explicitly (note that __init__ doesn't appear in the following snippet, even though it clearly runs):
obj = C()
We'll cover the following special methods here:
__str__, __repr__, _repr_html_
__eq__, __lt__
__len__, __getitem__
__enter__, __exit__
When we print or view an object, it must be automatically converted to a string first. There are two ways.
For example, let's create a datetime object.
import datetime
today = datetime.datetime.now()
print automatically converts it to a string, as you can see:
print(today)
print(str(today))
If we make it the last line in a cell, it gets converted to a string too, but differently:
today
The idea behind two styles is that there are both non-programmers (who probably prefer the former) and programmers (who might prefer the latter) in the world. Programmers like the latter format because it can be copy/pasted to create new objects:
obj = datetime.datetime(2021, 1, 25, 14, 15, 50, 951625)
Python deals with these two styles by giving every object two special methods: __repr__ and __str__:
print("For coders:", today.__repr__())
print("For others:", today.__str__())
Immediately above, we're calling the special methods explicitly; they're still special because they usually get called implicitly (e.g., print(today) earlier called __str__).
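In everyday code, the built-in functions str() and repr() are the usual way to trigger these methods rather than calling them by name. The sketch below uses a fixed timestamp (an arbitrary value, chosen only so the result is reproducible) to illustrate the copy/paste property of the repr format:

```python
import datetime

# A fixed moment (an arbitrary value picked for this illustration).
moment = datetime.datetime(2021, 1, 25, 14, 15, 50, 951625)

for_others = str(moment)    # calls moment.__str__()
for_coders = repr(moment)   # calls moment.__repr__()

print(for_others)
print(for_coders)

# The repr text is valid Python, so evaluating it rebuilds an equal object.
# (eval should only ever be used on strings you trust.)
rebuilt = eval(for_coders)
print(rebuilt == moment)
```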
Let's say we have a class for email messages. We can implement the special methods ourselves.
class Email:
    def __init__(self, to, frm, subject, message, attachment=None):
        self.to = to
        self.frm = frm
        self.subject = subject
        self.message = message
        self.attachment = attachment
        # warn when the text mentions an attachment but none was given
        if self.attachment is None and "attached" in message:
            print("WARNING: did you forget to attach a file?")

    def __str__(self):
        return f"TO: {self.to}\nFROM: {self.frm}\nSUBJECT: {self.subject}\n\n{self.message}\n\nATTACHMENT: {self.attachment}"

    def __repr__(self):
        return f"Email({repr(self.to)}, {repr(self.frm)}, {repr(self.subject)}, {repr(self.message)}, {repr(self.attachment)})"
em = Email("jobs@example.com", "somebody@gmail.com", "please hire me!", "Please see attached resume")
# implicit call to __repr__
em
# implicit call to __str__
print(em)
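One detail worth knowing: when an object sits inside a list, printing the list uses each element's __repr__, not its __str__. Here is a small sketch with a hypothetical Tag class (invented for this illustration, not part of the Email example):

```python
class Tag:
    """Hypothetical class used only for this illustration."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return f"#{self.name}"

    def __repr__(self):
        return f"Tag({self.name!r})"


tag = Tag("python")
print(tag)                 # uses __str__
print([tag])               # the list's string form uses each element's __repr__
print(f"{tag} / {tag!r}")  # !r in an f-string asks for repr explicitly
```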
We normally use == to tell if two objects are equal. We need to do a little more work for this to work with our own types:
s1 = "hi"
s2 = "hi"
s1 == s2
e1 = Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'Thanks!', None)
e2 = Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'Thanks!', None)
e1 == e2
We need to implement __eq__.
class Email:
    def __init__(self, to, frm, subject, message, attachment=None):
        self.to = to
        self.frm = frm
        self.subject = subject
        self.message = message
        self.attachment = attachment
        # warn when the text mentions an attachment but none was given
        if self.attachment is None and "attached" in message:
            print("WARNING: did you forget to attach a file?")

    def __str__(self):
        return f"TO: {self.to}\nFROM: {self.frm}\nSUBJECT: {self.subject}\n\n{self.message}\n\nATTACHMENT: {self.attachment}"

    def __repr__(self):
        return f"Email({repr(self.to)}, {repr(self.frm)}, {repr(self.subject)}, {repr(self.message)}, {repr(self.attachment)})"

    def __eq__(self, other):
        # compare each attribute of self against the same attribute of other
        if self.to == other.to and self.frm == other.frm and self.subject == other.subject:
            return True
        return False
e1 = Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'Thanks!', None)
e2 = Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'Thanks!', None)
e3 = Email('returns@example.com', 'somebody@gmail.com', 'refund please?', 'Thanks!', None)
print(e1 == e2)
print(e1 == e3)
__eq__ should check all the important attributes and return True when they are all equal (and False otherwise). The above __eq__ is incomplete. It doesn't check message, for example, which leads to strange behavior:
e1 = Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'I will work hard', None)
e2 = Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'I promise', None)
e1 == e2 # should be False, but won't be because our __eq__ doesn't check message
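One way to close that gap is to compare every attribute at once by packing them into tuples. The following is a sketch rather than the only reasonable design: the _key helper is a name invented here, and returning NotImplemented for non-Email objects is an added convention that lets Python fall back to its default comparison.

```python
class Email:
    def __init__(self, to, frm, subject, message, attachment=None):
        self.to = to
        self.frm = frm
        self.subject = subject
        self.message = message
        self.attachment = attachment

    def _key(self):
        # Hypothetical helper: every attribute that matters for equality.
        return (self.to, self.frm, self.subject, self.message, self.attachment)

    def __eq__(self, other):
        if not isinstance(other, Email):
            return NotImplemented  # let Python try the other operand instead
        return self._key() == other._key()


e1 = Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'I will work hard')
e2 = Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'I promise')
e3 = Email('jobs@example.com', 'somebody@gmail.com', 'please hire me!', 'I promise')

print(e1 == e2)              # messages differ
print(e2 == e3)              # every attribute matches
print(e1 == "not an email")  # different type
```

Note that a class that defines __eq__ without also defining __hash__ becomes unhashable, so these objects can't be used as dict keys or set members unless __hash__ is added too.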
Let's say you have a class for grocery inventory, and you want to sort a list of inventory objects based on value.
class Inventory:
    def __init__(self, item, amount, price):
        self.item = item
        self.amount = amount
        self.price = price

    def __repr__(self):
        return f"Inventory({repr(self.item)}, {self.amount}, {self.price})"

grocery = [Inventory("apples", 10, 0.3), Inventory("oranges", 2, 0.5), Inventory("kiwis", 9, 0.2)]
try:
    grocery.sort()
except Exception as e:
    print(e)
As the above error suggests, we need to implement "<" (less-than, or lt for short) to do sorting:
class Inventory:
    def __init__(self, item, amount, price):
        self.item = item
        self.amount = amount
        self.price = price

    def __repr__(self):
        return f"Inventory({repr(self.item)}, {self.amount}, {self.price})"

    def __lt__(self, other):
        return self.amount*self.price < other.amount*other.price
grocery = [Inventory("apples", 10, 0.3), Inventory("oranges", 2, 0.5), Inventory("kiwis", 9, 0.2)]
grocery.sort()
grocery
The items are sorted from least valuable (1 dollar) to most valuable (3 dollars).
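If a particular ordering is only needed in one place, a key function passed to sorted (or to list.sort) is a common alternative to defining __lt__. Here is a sketch, assuming the same Inventory attributes as above; total_value is a helper name made up for this example:

```python
class Inventory:
    def __init__(self, item, amount, price):
        self.item = item
        self.amount = amount
        self.price = price

    def __repr__(self):
        return f"Inventory({self.item!r}, {self.amount}, {self.price})"


def total_value(inv):
    # Hypothetical helper: the number each object is ranked by.
    return inv.amount * inv.price


grocery = [Inventory("apples", 10, 0.3), Inventory("oranges", 2, 0.5), Inventory("kiwis", 9, 0.2)]

# No __lt__ is needed: sorted() compares the keys, not the objects.
by_value = sorted(grocery, key=total_value)
by_name = sorted(grocery, key=lambda inv: inv.item)

print(by_value)
print(by_name)
```

The trade-off is that __lt__ gives the class one built-in ordering, while key functions allow a different ordering at each call site.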
How do brackets work? obj[lookup] is backed by a special method (called __getitem__) that takes lookup and returns a value. Let's create a Sentence class that lets you grab a word in a sentence.
class Sentence:
    def __init__(self, s):
        self.s = s

    def __getitem__(self, lookup):
        print("calling __getitem__ with " + str(lookup))
        return self.s.split()[lookup]
s = Sentence("The quick brown fox jumps over the lazy dog")
s[3]
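Brackets can carry more than a single integer. With a slice such as s[1:3], Python builds a slice object and passes that as lookup; because this version hands lookup straight to a list, slices and negative indexes happen to work without extra code. A sketch (the print call is dropped to keep the output short):

```python
class Sentence:
    def __init__(self, s):
        self.s = s

    def __getitem__(self, lookup):
        # lookup may be an int (s[3]) or a slice object (s[1:3])
        return self.s.split()[lookup]


s = Sentence("The quick brown fox jumps over the lazy dog")
print(s[3])    # a single word
print(s[1:3])  # a list of words
print(s[-1])   # negative indexes are passed through as well
```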
If we want, we can get clever and take other types. For example, we could interpret the float 3.2 to mean we want the letter at index 2 of the word at index 3 (counting both from 0, of course).
class Sentence:
    def __init__(self, s):
        self.s = s

    def __getitem__(self, lookup):
        print("calling __getitem__ with " + str(lookup))
        # the whole-number part selects the word
        word_idx = int(lookup)
        word = self.s.split()[word_idx]
        if isinstance(lookup, int):
            return word
        # the first decimal digit selects the letter, e.g. 3.2 -> letter 2
        letter_idx = int(round(10*(lookup - word_idx)))
        return word[letter_idx]

    def __len__(self):
        return len(self.s.split())
s = Sentence("The quick brown fox jumps over the lazy dog")
s[3.2]
The type check means the old behavior works too:
s[3]
You might have noticed we implemented __len__ above too. It does what you might guess:
len(s)
One last thing: for loops work. Because Sentence defines __getitem__, Python starts at index 0 and keeps counting up until the lookup raises an IndexError, which the loop catches silently. Check it out (notice that no word is returned when index 9 is attempted):
for w in s:
    print(w)
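To make the hidden exception visible, the loop can be imitated by hand. The function below is a rough sketch of what a for loop does for an object that has __getitem__ but no __iter__; iterate_by_index is a name invented for this illustration, and the real mechanism lives inside the interpreter.

```python
class Sentence:
    def __init__(self, s):
        self.s = s

    def __getitem__(self, lookup):
        return self.s.split()[lookup]


def iterate_by_index(seq):
    """Hypothetical helper: collect seq[0], seq[1], ... until IndexError."""
    items = []
    idx = 0
    while True:
        try:
            items.append(seq[idx])
        except IndexError:
            break  # running off the end is the signal to stop
        idx += 1
    return items


s = Sentence("The quick brown fox jumps over the lazy dog")
by_hand = iterate_by_index(s)
by_loop = [w for w in s]
print(by_hand == by_loop)
```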
Context managers work with the "with" statement in Python. They're useful for making sure some code runs before and after a block, even if there is an exception.
import time
class TimeMe:
    def __enter__(self):
        print("start timer")
        self.t0 = time.time()

    def __exit__(self, exc_type, exc_value, traceback):
        print("stop timer")
        self.t1 = time.time()

    def total(self):
        return self.t1 - self.t0
tm = TimeMe()
with tm:
    total = 1
    for i in range(1000):
        total *= (i+1)
tm.total()
with tm:
    total = 1
    for i in range(100000):
        total *= (i+1)
tm.total()
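The "even if there is an exception" part can be checked directly. The sketch below uses a hypothetical Timer class (a variation on TimeMe, not a replacement for it) with two assumed additions: __enter__ returns self so that "with Timer() as t" works, and __exit__ records whether the block failed.

```python
import time


class Timer:
    """Hypothetical variation on TimeMe for this illustration."""

    def __enter__(self):
        self.t0 = time.perf_counter()
        return self  # this value is what "as t" binds to

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs on normal exit and on exceptions alike.
        self.t1 = time.perf_counter()
        self.failed = exc_type is not None
        return False  # a false value means the exception is not suppressed


events = []
try:
    with Timer() as t:
        raise ValueError("something went wrong inside the block")
except ValueError:
    events.append("caught")

print(t.failed)         # __exit__ saw the exception
print(t.t1 >= t.t0)     # the stop time was still recorded
print(events)
```

Returning a true value from __exit__ would swallow the exception instead, which is rarely what a timer should do.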
File objects are context managers. This is very useful, as __exit__ automatically closes files. Without context managers:
f = open("hi.txt", "w")
f.write("hello")
f.close() # don't forget to close it!
With context managers:
with open("hi.txt", "w") as f:
    f.write("hello")
# f is closed for us!
Another advantage: even if the code inside the with fails (for example, maybe there's not enough drive space to write "hello" to the file), the context manager will close the file in the last example. Not so with the prior example.
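Roughly speaking, the with version behaves like the try/finally pattern below. This is a sketch of the idea rather than the exact machinery; it writes into a temporary directory (an assumption made here so the example leaves no files behind).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "hi.txt")

    f = open(path, "w")
    try:
        f.write("hello")
    finally:
        # Runs whether or not write() raised, much like the file's __exit__.
        f.close()

    # Read the file back to confirm the write went through.
    with open(path) as check:
        content = check.read()

print(f.closed)
print(content)
```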
Special methods are crucial to making new types that are nice to use. For example, pandas Series and DataFrames make heavy use of them. That's why we can filter rows in a DataFrame using df[bool_series]: its __getitem__ is smart enough to do something useful with lookup.
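To get a feel for how that kind of filtering can be built from special methods, here is a toy sketch. Column is a hypothetical class invented for this illustration; it only imitates the idea with plain lists and is not how pandas is implemented.

```python
class Column:
    """Hypothetical toy container; not how pandas is implemented."""

    def __init__(self, values):
        self.values = list(values)

    def __getitem__(self, lookup):
        if isinstance(lookup, list):
            # A list of booleans acts as a mask: keep values where it is True.
            return Column(v for v, keep in zip(self.values, lookup) if keep)
        return self.values[lookup]

    def __gt__(self, threshold):
        # Comparing the whole column yields one boolean per value.
        return [v > threshold for v in self.values]

    def __repr__(self):
        return f"Column({self.values!r})"


prices = Column([0.3, 0.5, 0.2])
mask = prices > 0.25    # calls __gt__
print(mask)
print(prices[mask])     # calls __getitem__ with the list of booleans
print(prices[0])        # calls __getitem__ with an int
```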