Typed Python: Choose Sequence over List

What am I talking about?

I’ve been working with type hints in Python for a few years now. Over time I’ve noticed certain patterns evolving in my code. This is a short post on one of them: being more precise about what a function requires or accepts as input. More specifically, it’s about why I try to default to writing the following:

from typing import Sequence

def do_a_thing(items: Sequence[float]):
    ...

instead of:

def do_a_thing(items: list[float]):
    ...

It’s not a hard rule. There are plenty of situations where list is fine (or even required) but defaulting to a sequence has a number of benefits.

Soft immutability (via a type checker)

If I try and write this code:

def calculate_sum_and_add_ten(items: Sequence[float]):
    items.append(10)
    return sum(items)

then I will get an error from mypy (or my IDE or any other type checker I may be running):

error: "Sequence[float]" has no attribute "append"  [attr-defined]

This means I won’t accidentally mutate a list I pass into the function. If I expect the function to mutate the list, I can communicate that by changing the type hint to a list. This helps make my intent clear: I took a list because I wanted a list (with all its mutability).
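As a minimal sketch of how the two hints communicate intent, here are two small functions side by side. The names sum_plus_ten and append_ten are hypothetical and only exist for this illustration. The first only reads its input, so it asks for a Sequence; the second is meant to mutate its argument, so it asks for a list.

```python
from typing import Sequence


def sum_plus_ten(items: Sequence[float]) -> float:
    # Read-only use: nothing here modifies the caller's data.
    return sum(items) + 10


def append_ten(items: list[float]) -> None:
    # The list hint signals that this function mutates its argument.
    items.append(10)


values = [1.0, 2.5]
total = sum_plus_ten(values)  # values is left untouched
append_ten(values)            # values now ends with 10
```

Keep in mind this is only enforced by the type checker. At runtime the Sequence hint does nothing to stop a mutation, which is why I call it soft immutability.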

Covariance

Most of the time an int can be treated as a float. Your code will treat 5 as effectively being 5.0. So a function which accepts a float can be passed an int without any issues. This breaks down though once you have a function taking a list of floats. If you try and write the following code:

def double_then_sum(items: list[float]):
    return sum(item * 2 for item in items)
    
my_integers: list[int] = [2, 4]
my_doubled_total = double_then_sum(my_integers)

then mypy (or another type checker) will give the error:

error: Argument 1 to "double_then_sum" has incompatible type "list[int]"; expected "list[float]"  [arg-type]
note: "List" is invariant -- see https://mypy.readthedocs.io/en/stable/common_issues.html#variance
note: Consider using "Sequence" instead, which is covariant

You can read a thorough writeup of what’s going on in the page linked by the error, but effectively it’s because the function signature double_then_sum(items: list[float]) means “I accept a list of floats and may add a float to it”.

You can see why this wouldn’t work with this contrived example:

def add_5_point_0(items: list[float]):
    items.append(5.0)
    
my_integers: list[int] = [2, 4]
add_5_point_0(my_integers)
print(my_integers)
# [2, 4, 5.0]
#        ^ this is clearly not an integer

Sequence does not have the same problem because it has no mutating methods such as append. So if I pass the function a list of integers, the type checker can ensure that it stays a list of integers.
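For comparison, here is the same double_then_sum with only the hint changed to Sequence[float]. This is a sketch of the fix the mypy note suggests; the return annotation is my own addition. A type checker should accept the call with a list[int] because Sequence is covariant.

```python
from typing import Sequence


def double_then_sum(items: Sequence[float]) -> float:
    return sum(item * 2 for item in items)


my_integers: list[int] = [2, 4]
# Accepted by the type checker: list[int] is compatible with Sequence[float].
my_doubled_total = double_then_sum(my_integers)
```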

Accepts a wider variety of inputs

Another benefit of Sequence is that it can accept a much wider variety of types (including custom classes written by you). This makes it much easier to write functions that are reusable and compose well together.

Consider my earlier double_then_sum function. But this time I’ve got an input that’s a tuple. This seems like a perfectly valid use-case. There’s no reason why I should have to convert this to a list.

my_integers = (2, 4, 5)
my_doubled_total = double_then_sum(my_integers)

However, mypy says the following:

error: Argument 1 to "double_then_sum" has incompatible type "tuple[int, int, int]"; expected "list[float]"  [arg-type]

A tuple is not a list. But a tuple is a Sequence.
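Here is a rough sketch of what that wider variety looks like once the hint is Sequence[float]. The Readings class is a hypothetical custom container made up for this example; it subclasses Sequence and implements the two methods the base class requires, __getitem__ and __len__. The overloads are there so a type checker accepts the __getitem__ signature.

```python
from typing import Sequence, Union, overload


def double_then_sum(items: Sequence[float]) -> float:
    return sum(item * 2 for item in items)


class Readings(Sequence[float]):
    """Hypothetical read-only container, used only for illustration."""

    def __init__(self, values: Sequence[float]) -> None:
        self._values = tuple(values)

    @overload
    def __getitem__(self, index: int) -> float: ...
    @overload
    def __getitem__(self, index: slice) -> "Readings": ...
    def __getitem__(self, index: Union[int, slice]) -> Union[float, "Readings"]:
        # Slices return a new Readings; plain indexes return a single value.
        if isinstance(index, slice):
            return Readings(self._values[index])
        return self._values[index]

    def __len__(self) -> int:
        return len(self._values)


from_tuple = double_then_sum((2, 4, 5))
from_range = double_then_sum(range(3))
from_custom = double_then_sum(Readings([1.5, 2.5]))
```

The same function now works with a tuple, a range and my own class without any conversion to a list.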

Bonus - I would also consider Collection instead of Sequence

I also often go a step further with specifying the type to indicate exactly what I require. If the order of the items doesn’t really matter to my function then I can hint as a Collection instead.

My function double_then_sum should probably work with sets:

some_set_of_numbers = set([2, 3, 4])
my_doubled_total = double_then_sum(some_set_of_numbers)

but mypy says:

error: Argument 1 to "double_then_sum" has incompatible type "set[int]"; expected "Sequence[int]"  [arg-type]

I can fix this by swapping Sequence for Collection:

from typing import Collection

def double_then_sum(items: Collection[int]):
    return sum(item * 2 for item in items)
    
some_set_of_numbers = set([2, 3, 4])
my_doubled_total = double_then_sum(some_set_of_numbers)

The type checker is then happy with this, as a set is a Collection.
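To show what a Collection hint actually promises, here is a small sketch. The function mean_of_doubles is hypothetical and not from the examples above. It relies only on what Collection provides (a length, iteration and membership tests) and never on indexing or ordering.

```python
from typing import Collection


def mean_of_doubles(items: Collection[float]) -> float:
    # Collection provides len(), iteration and `in`, but no indexing or order.
    if not items:
        return 0.0
    return sum(item * 2 for item in items) / len(items)


from_set = mean_of_doubles({2, 3, 4})
from_tuple = mean_of_doubles((1, 2))
from_dict_values = mean_of_doubles({"a": 1, "b": 3}.values())
```

If I wrote items[0] inside this function, the type checker should complain, in the same way it complains about append on a Sequence.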

Wrapping up

This is a fairly specific example, but generally what I’ve been trying to do is be more intentional about what data types a function actually requires. I’ve found the upside is code that has fewer bugs and is more flexible. It requires a little more thought on my side, as I won’t always just reach for a list or a dict, but I’m quite happy with the results.
