Advanced typing structures - how to express your types better and make your code more robust.

Posted on Wed 23 February 2022 in Python


Static typing has had a rough ride in the Python community. It's finally becoming a production standard, because typed code lets you catch bugs faster. Moreover, I bet that anyone who started using it and had to refactor the code months later can clearly see its benefits.

In this post, we're going to focus on the advanced structures offered by Python's typing module. These structures help you express your typing intent more precisely.

Here's a list of the topics we'll cover in this post: generics, protocols, Callable, overload, and stub files.

Generics

Let's imagine the following scenario:

from typing import Any

def first(seq: list[Any]) -> Any:
    return seq[0]

First of all, try to avoid using Any in your codebase. According to the Python typing docs:

A static type checker will treat every type as being compatible with Any and Any as being compatible with every type.

In the long run, this lets bugs hide in your code. In simple terms, Any disables the type checker for every value it touches. To solve this problem, you can use generics. The code would look like this:

from typing import TypeVar

T = TypeVar('T')
def first(seq: list[T]) -> T:
    return seq[0]

first_item = first([1,2,3])

mypy now knows that first_item has type int and will use this information for further type checking. That's the power of generics. They let you preserve information about types. Additionally, you can restrict TypeVar to specific types.

from typing import TypeVar

T = TypeVar('T', int, float)
def first(seq: list[T]) -> T:
    return seq[0]

first_item = first([1, 2, 3])  # OK
another_first_item = first(["1", "2", "3"])  # mypy reports an error: str is not one of the allowed types

If you need to be less restrictive, you can use T = TypeVar('T', bound=float) to allow float and all of its subtypes.
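As a minimal sketch of a bound TypeVar, the example below assumes a hypothetical float subclass Celsius and a hypothetical helper largest; neither comes from a real library. It shows that the subtype is preserved in the return type instead of being widened to float.

```python
from typing import TypeVar

# Number accepts float and any subclass of float
Number = TypeVar("Number", bound=float)


class Celsius(float):
    """Hypothetical float subclass used only for this illustration."""


def largest(values: list[Number]) -> Number:
    # The element type of the list is also the return type
    return max(values)


# mypy infers Celsius here, not plain float
warmest = largest([Celsius(18.5), Celsius(21.0)])
```

Passing a list of strings to largest would be reported by mypy, because str is not a subtype of float.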

Finally, generics help you solve a common problem that you'll eventually see in any large codebase. Imagine that a developer intends to create a function that adds two numbers of the same type and supports both int and float. Here's an implementation using basic types:

from typing import Union

def add(first: Union[int, float], second: Union[int, float]) -> Union[int, float]:
    return first + second

add(1, 2.1) # Shouldn't be allowed, yet mypy doesn't complain

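Unfortunately, with these type annotations, nothing stops a client from calling add(1, 2.1). Mypy is not able to catch this, and the return type of add becomes imprecise: every call is typed as Union[int, float]. Here's how generics improve this:

from typing import TypeVar

T = TypeVar('T', int, float)

def add(first: T, second: T) -> T:
    return first + second

add(1, 2)      # T is solved as int
add(1.5, 2.5)  # T is solved as float
add(1, 2.1)    # still accepted: mypy treats int as compatible with float, so T is solved as float

This implementation shows the developer's intention more clearly, and the return type now follows the argument types. One caveat: because mypy treats int as compatible with float, the mixed call add(1, 2.1) is not rejected. A constrained TypeVar rejects mixed calls only when the listed types are unrelated, for example str and bytes.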
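To see the same-type rule enforced, here is a minimal sketch with two unrelated types, str and bytes. The concat function and the S type variable are hypothetical names introduced for illustration.

```python
from typing import TypeVar

# S must be exactly str or exactly bytes for a given call
S = TypeVar("S", str, bytes)


def concat(first: S, second: S) -> S:
    return first + second


text = concat("foo", "bar")    # S is solved as str
raw = concat(b"foo", b"bar")   # S is solved as bytes
# concat("foo", b"bar")        # mypy reports an error: str and bytes cannot be mixed
```

The commented-out call is the interesting one: the type checker flags it before the code ever runs.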

Protocols

We have all gone down the road of writing code with deep or multiple inheritance (or worse, we've been asked to take over a codebase that does it). This becomes a problem when we start typing our code, as mypy errors start to scream. Protocol is a great way to add type annotations only for the attributes and methods that really matter to you. Here's an example:

from typing import Protocol

class MyProtocol(Protocol):
    a: str
    def set_a(self, new_value: str) -> None: ...
    def get_a(self) -> str: ...

def some_function_gazylion_modules_away(obj: MyProtocol) -> str:
    obj.set_a("1")
    return obj.get_a()

Now, any time some_function_gazylion_modules_away is used, it expects an object that has an attribute named a and two methods with exactly the same type signatures. It doesn't matter if the object that is passed implements some additional methods, or whether it inherits from MyProtocol at all. Only the characteristics specified in the Protocol definition matter to the type checker.
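The sketch below illustrates this structural matching. Settings is a hypothetical class made up for this example; it does not inherit from MyProtocol and carries an extra method, yet it is accepted.

```python
from typing import Protocol


class MyProtocol(Protocol):
    a: str
    def set_a(self, new_value: str) -> None: ...
    def get_a(self) -> str: ...


class Settings:
    """Hypothetical class: matches MyProtocol without inheriting from it."""

    def __init__(self) -> None:
        self.a = ""

    def set_a(self, new_value: str) -> None:
        self.a = new_value

    def get_a(self) -> str:
        return self.a

    def reset(self) -> None:
        # Extra method that the protocol does not mention
        self.a = ""


def some_function_gazylion_modules_away(obj: MyProtocol) -> str:
    obj.set_a("1")
    return obj.get_a()


result = some_function_gazylion_modules_away(Settings())
```

If Settings lacked get_a, or if set_a took an int, mypy would report the call as incompatible with MyProtocol.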

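Additionally, a Protocol can be checked at runtime with isinstance when you decorate it with runtime_checkable. Keep in mind that the runtime check only verifies that the members exist, not their type signatures.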

from typing import Protocol, runtime_checkable

@runtime_checkable
class MyProtocol(Protocol):
    a: str
    def set_a(self, new_value: str) -> None: ...
    def get_a(self) -> str: ...
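A short sketch of the runtime check follows. Complete and Incomplete are hypothetical classes used only to show both outcomes of isinstance.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class MyProtocol(Protocol):
    a: str
    def set_a(self, new_value: str) -> None: ...
    def get_a(self) -> str: ...


class Complete:
    """Hypothetical class with every member the protocol requires."""

    def __init__(self) -> None:
        self.a = ""

    def set_a(self, new_value: str) -> None:
        self.a = new_value

    def get_a(self) -> str:
        return self.a


class Incomplete:
    """Hypothetical class that has the attribute but neither method."""

    a = ""


is_complete = isinstance(Complete(), MyProtocol)
is_incomplete = isinstance(Incomplete(), MyProtocol)
```

Here is_complete is True and is_incomplete is False, since Incomplete has no set_a or get_a.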

Callable

The typing module offers the Callable type. For an object to be a valid Callable, it must implement the __call__ method (plain functions already do). Callable has the following syntax:

Callable[[<list of input argument types>], <return type>]

Here's how you can use it to type check your decorators:

from time import time, sleep
from typing import Callable

def time_it(func: Callable[[], int]) -> Callable[[], int]:
    def wrapper() -> int:
        start_time = time()
        result = func()
        end_time = time()
        duration = end_time - start_time
        print(f'This function took {duration:.2f} seconds.')
        # Return the wrapped function's result so the -> int annotation holds
        return result

    return wrapper

@time_it
def computation() -> int:
    sleep(10)
    return 10

computation()
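The decorator above only accepts functions that return int. One possible generalisation, sketched here as an assumption rather than the only way to do it, combines Callable with a TypeVar so the return type of the wrapped function is preserved. R and greeting are hypothetical names.

```python
from time import perf_counter
from typing import Callable, TypeVar

# R stands for whatever the decorated function returns
R = TypeVar("R")


def time_it(func: Callable[[], R]) -> Callable[[], R]:
    def wrapper() -> R:
        start = perf_counter()
        result = func()
        print(f"This function took {perf_counter() - start:.2f} seconds.")
        return result

    return wrapper


@time_it
def greeting() -> str:
    return "hello"


# mypy infers str here, because R was solved as str
message = greeting()
```

This version still handles only functions without arguments, like the original example.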

Overload

Sometimes Union is not enough to properly express the behaviour of a function. A common use case is a file-reading function like this:

from typing import BinaryIO, TextIO, Union

def read_file(file: Union[TextIO, BinaryIO]) -> Union[str, bytes]:
    data = file.read()
    return data

These type annotations do not express exactly what is returned when either TextIO or BinaryIO is passed. We already mentioned that generics are one solution to this problem, but there's another one that you might like more. You can fix this by adding overload decorators:

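from typing import BinaryIO, TextIO, Union, overload

@overload
def read_file(file: TextIO) -> str: ...
@overload
def read_file(file: BinaryIO) -> bytes: ...
# Outside of stub files, the overload variants must be followed by the real implementation
def read_file(file: Union[TextIO, BinaryIO]) -> Union[str, bytes]:
    return file.read()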

Now mypy knows exactly which return type belongs to each input type. In my experience, overload annotations can quickly add a lot of repeated code to your production files, which reduces readability. To fix this, you could start using stub files.
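To see the overloads at work, here is a self-contained sketch that calls read_file with in-memory streams from the standard io module. The variable names are only for illustration.

```python
import io
from typing import BinaryIO, TextIO, Union, overload


@overload
def read_file(file: TextIO) -> str: ...


@overload
def read_file(file: BinaryIO) -> bytes: ...


def read_file(file: Union[TextIO, BinaryIO]) -> Union[str, bytes]:
    # Single runtime implementation shared by both overloads
    return file.read()


text = read_file(io.StringIO("hello"))   # mypy infers str
raw = read_file(io.BytesIO(b"hello"))    # mypy infers bytes
```

Without the overloads, both text and raw would be typed as Union[str, bytes], and every caller would have to narrow the type by hand.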

Stub files

Often, we face projects that were developed before mypy was popular. Adding types gradually is the best way to make a project more robust, but let's face the truth: sometimes we just don't have time for it.

Luckily, mypy offers a "workaround" that lets us add types to a project without modifying the original source code: stub files. Let's imagine some production code (or a third-party library) that you can't touch:

cache = {}

def some_function(untyped_kwarg):
    return 

You can create a stub file by adding a .pyi file with the same name next to the module in your project:

from typing import Dict

cache: Dict[int, str]
def some_function(untyped_kwarg: int) -> int: ...

Note: The .pyi file takes precedence if a directory contains both a .py and a .pyi file.
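Stub files are also a natural home for overload variants, since a stub needs no implementation. The sketch below assumes a hypothetical module files.py that defines read_file, with the stub saved next to it as files.pyi.

```python
# files.pyi (hypothetical stub for a module files.py that defines read_file)
from typing import BinaryIO, TextIO, overload

@overload
def read_file(file: TextIO) -> str: ...
@overload
def read_file(file: BinaryIO) -> bytes: ...
```

With this layout, files.py keeps a single unannotated read_file, and the type checker reads the signatures from the stub.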

Summary

Static type hints require a lot of effort and have a steep learning curve, but can eventually save you tons of time.
Hopefully, the structures presented in this post will add new superpowers to your Python stack!

Happy coding!