Don't memorize, automate!

Software projects are full of rules that you must memorize:

  • "Do not modify this file, as it is automatically generated."
  • "If you modify the SchemaVersion field in this module you must also edit config/schema_version.ini."
  • "All classes and functions must be documented."

But human beings are very bad at remembering things, especially if they're sleep-deprived parents like me. A moment of distraction and you've forgotten to write an upgrade script for the database schema... and you might only find out after the production upgrade fails. What to do? Automate!

Most of the time we think of testing as testing of the product we're developing. E.g. you might want to make sure your website operates as expected. But sometimes we want to test the software development process. When the instructions in the source code are addressed to you, the developer, that suggests that those are process requirements. The user doesn't care how well-documented your code is, they just want the website to work.

When you see process instructions there are three approaches you can take, from best to worst:

  1. Automate out of existence. If you have a coding standard that requires certain formatting, use a tool that automatically formats your code appropriately. The Go programming language ships with gofmt, Python has third-party utilities like autopep8, etc..
  2. Test for problems and notify the developer when they are found. If your coding standard requires docstrings (or Javadoc comments, etc.) you can't have software create them: a human must do so. You can however test for their absence using lint tools, which can be run as part of your test suite. When you add a missing docstring the lint tool will automatically stop complaining.
  3. Automatically remind yourself of a necessary step. After you've taken the necessary actions you manually indicate you have done so.

Let's look in detail at the third category of automation since it's less obvious when or how to implement it. Imagine you have some source file which needs to be reviewed for legal compliance every time it is changed. In cases like this the necessary action you have to take is very hard to check automatically: how would your software know whether you've talked to the company lawyer or not? You can however remind yourself or whoever is making changes that they need to do so:

$ py.test test_importantfile_changed.py
1 passed in 0.00 seconds

$ echo "I CHANGED THIS" >> importantfile.txt

$ py.test test_importantfile_changed.py
========================= FAILURES =========================
E           AssertionError: 
E           'importantfile.txt' has changed!
E           Please have Legal Compliance approve these
E           changes, then run 'python compliance.py'.
1 failed in 0.00 seconds

$ python compliance.py
Hash updated.

$ py.test test_importantfile_changed.py
1 passed in 0.00 seconds

The implementation works by storing a hash of the file alongside it. Whenever the tests are run the stored hash is compared to a newly calculated hash; if they differ an error is raised. Here's what compliance.py looks like:

from hashlib import md5

SCHEMA_FILE = "importantfile.txt"
HASH_FILE = SCHEMA_FILE + ".md5"

def stored_hash():
    with open(HASH_FILE) as f:
        return f.read()

def calculate_hash():
    hasher = md5()
    with open(SCHEMA_FILE) as f:
        for line in f:
            hasher.update(line)
    return hasher.hexdigest()

if __name__ == '__main__':
    # Update the stored hash if called as script:
    with open(HASH_FILE, "w") as f:
        f.write(calculate_hash())
    print("Hash updated.")

And here's what the test looks like:

from compliance import stored_hash, calculate_hash

def test_schema_changed():
    if stored_hash() != calculate_hash():
        raise AssertionError("""
'importantfile.txt' has changed!
Please have Legal Compliance approve these
changes, then run 'python compliance.py'.
""")

Instead of trying to remember all the little requirements of your development process, you just need to remember one core principle: automation. So automate your process completely, or automate catching problems, or just automate reminders if you must. Automation is what software is for after all.


You shouldn't have to work evenings or weekends to succeed as a software engineer. Get to a better place with the Programmer's Guide to a Sane Workweek.