The evolution of resource cleanup in Python

The software you use was written by developers just like you. They made mistakes just like you do, learned and improved just like you do. But you have a huge advantage: you can learn from the mistakes and discoveries they have already made. I’ve discussed comparing and contrasting a single task across multiple alternative technologies, using resource cleanup idioms of C++, Go and Python as an example. Another technique for gaining technical depth is examining a single technology and seeing how its support for a single task evolved over time. As an example, let’s consider resource cleanup in Python.

To recap the previous post, resource cleanup is a problem any programming language needs to solve: you’ve opened a file, and eventually you will need to close it. However, in the interim your code might return, or throw an exception. You want to have that file cleaned up no matter what, so you don’t leak resources:

def write():
    f = open("myfile", "w")
    # If this throws an exception, e.g. when disk is full,
    # then f.close() will never be run:
    f.write("hello")
    f.close()

In Python the clean up idiom started as an extension to the exception handling syntax:

def write():
    f = open("myfile", "w")
    try:
        f.write("hello")
    finally:
        f.close()

Whether the code in the try block returns, throws an exception or continues, the finally block will always be called.

What problems can we spot in this idiom? Why would the Python developers try to improve on it? Let’s make a list; notice that these are all focused on solving problems with humans, not with computers:

  1. You still have to remember the particular function name to clean up each kind of resource, e.g. close() for files and release() for locks.
  2. It’s repetitive: every single time you open a file you have to call close() on it, meaning more code to write and more code to read.
  3. The cleanup code happens long after the resource is initialized, interrupting your flow of reading the code.

The Python developers eventually came up with a new, improved language feature that solves these problems:

def write():
    with open("myfile", "w") as f:
        f.write("hello")

This solves all three problems:

  1. Each resource knows how to clean itself up.
  2. Clean up is done automatically, no need to explicitly call the method.
  3. Clean up is done at the right time, but without extra code to read.

Can this be improved? I think so. Consider the following example:

>>> with open("/tmp/file", "w") as f:
...     f.write("hello")
... 
5
>>> print(f)
<_io.TextIOWrapper name='/tmp/file' mode='w' encoding='UTF-8'>

Even though the the file has been closed, the resource has been cleaned up, the variable referring to the object persists outside the with block. This is “namespace pollution”, extra unnecessary variables being added that can potentially introduce bugs. Better if f only existed inside the with block.

By examining the improvements to a particular feature you can take advantage of all the hard work the developers put into coming with alternatives: instead of just learning from the latest version you can see how their thinking evolved over time. You can then try to come up with improvements on your own, to exercise your new understanding. While I’ve examined a language feature in this post you can apply the same technique to any form of technology. Pick your favorite database, for example, and a feature that has evolved over time: why did it change, and what could be improved?

The developers who wrote the software you use had to find their solutions the hard way. You should respect their work by building on what they have learned.


You shouldn't have to work evenings or weekends to succeed as a software engineer. Get to a better place by reading The Programmer's Guide to a Sane Workweek.


You might also enjoy:

Write test doubles you can trust using verified fakes
Resource cleanup, compared: Python, Go and C++
Maintainable Python applications: a guide for skeptical Java developers
Object ownership across programming languages