Maintainable Python Applications: a Guide for Skeptical Java Developers

When you've been writing Java for a while, switching to Python can make you a little anxious. Not only are you learning a new language with new idioms and tools, you're also dealing with a language that has far less built-in safety. No more type checks, no more clear separation between public and private.

It's true that Python is much easier to learn than Java, but it's also much easier to write unmaintainable code in it. Can you really build large-scale, robust and maintainable applications in Python? I think you can, if you do it right.

The suggestions below will help get you started on a new Python project, or improve an existing project that you're joining. You'll need to keep up the best practices you've used in Java, and learn new tools that will help you write better Python.

Tools and Best Practices

Python 2 and 3

Before you start a new Python project you have to choose which version of the language to support: Python 3 is not backwards-compatible with Python 2. Python 2 is only barely being maintained, and will be end-of-lifed in 2020, so that leaves you with only two options with long term viability:

  1. A hybrid language, the intersection of Python 2 and Python 3. This requires you to understand the subtleties of the differences between the two languages. The best guide I've seen to writing this hybrid language is on the Python Future website.
  2. Python 3 only.

Most popular Python libraries now support Python 3, as do most runtime environments. Unless you need to write a library that will be used by both new and legacy applications, it's best to stick to Python 3 only.

However, on OS X you'll need to use Homebrew to install Python 3 (though using Homebrew's Python 2 is also recommended over using the system Python 2). And on Google App Engine you'll need to use the beta Flexible Environment to get Python 3 support.

Static typing

Java enforces types on method parameters, on object attributes, and on variables. To get the equivalent in Python you can use a combination of runtime type checking and static analysis tools.

  • To ensure the attributes on your classes have the correct types you can use the attrs library (which is very useful even if you don't care about type enforcement). attrs only checks types at runtime, so you'll need decent test coverage to exercise those checks.
  • For method parameters and variables, the mypy static type checker, combined with the new Python 3 type annotation syntax, will catch many problems. For Python 2 there is a comment-based syntax as well. The clever folks at Zulip have a nice introductory article about mypy.
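As a rough illustration of the Python 3 annotation syntax, here is a minimal sketch. The Account class and the richest function are hypothetical names invented for this example, and it covers only annotations for a static checker, not runtime validation with attrs.

```python
from typing import List, Optional


class Account:
    """A bank account whose balance is tracked in whole cents."""

    def __init__(self, owner: str, balance_cents: int = 0) -> None:
        self.owner = owner
        self.balance_cents = balance_cents

    def deposit(self, amount_cents: int) -> int:
        """Add money to the account and return the new balance."""
        self.balance_cents += amount_cents
        return self.balance_cents


def richest(accounts: List[Account]) -> Optional[Account]:
    """Return the account with the largest balance, or None for an empty list."""
    if not accounts:
        return None
    return max(accounts, key=lambda account: account.balance_cents)


alice = Account("alice", 500)
alice.deposit(250)

# Python does not enforce annotations at runtime, so the call below would only
# fail once it executed. A static checker such as mypy should flag the str
# argument as incompatible with int without running the code.
# alice.deposit("250")
```

The annotations are ignored when the program runs, so a checker like mypy has to be part of your build or CI for them to catch anything.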

Public, private and interfaces

Python lets you do many things Java wouldn't, everything from metaclasses to replacing a method at runtime. But while these more dynamic capabilities can be quite useful, there's nothing wrong with using them sparingly. For example, while Python allows you to set arbitrary attributes on a passed-in object, usually you shouldn't.

  • As with Java, you typically want to interact with objects using a method-based interface (explicit or implicit), not by randomly mucking with their internals.
  • As with Java code, you want to have a clear separation between public and private parts of your API.
  • And as with Java, you want to be coding to an interface, not to implementation details.

Where Java has explicit and compiler enforced public/private separation, in Python you do this by convention:

  • Private methods and attributes on a class are typically prefixed with an "_".
  • The public interface of a module is declared using __all__, e.g. __all__ = ["MyClass", "AnotherClass"]. __all__ also controls what gets imported when you do from module import *, but wildcard imports are a bad idea. For more details see the relevant Python documentation.

As for interfaces, if you want to explicitly declare them you can use Python's built-in abstract base classes; not quite the same, but they can be used as pseudo-interfaces. Alternatively, the zope.interface package is more powerful and flexible (and the attrs library mentioned above understands it).
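Here is a minimal sketch of an abstract base class used as a pseudo-interface, built on the standard library's abc module. The Storage, MemoryStorage and BrokenStorage classes are hypothetical names for this example.

```python
from abc import ABC, abstractmethod


class Storage(ABC):
    """Pseudo-interface: anything that can save and load data by key."""

    @abstractmethod
    def save(self, key, data):
        """Store data under key."""

    @abstractmethod
    def load(self, key):
        """Return the data stored under key."""


class MemoryStorage(Storage):
    """In-memory implementation, handy for tests."""

    def __init__(self):
        self._items = {}

    def save(self, key, data):
        self._items[key] = data

    def load(self, key):
        return self._items[key]


class BrokenStorage(Storage):
    """Forgets to implement load(), so it cannot be instantiated."""

    def save(self, key, data):
        pass


store = MemoryStorage()
store.save("greeting", b"hello")

try:
    BrokenStorage()
except TypeError as error:
    print("rejected:", error)
```

Unlike a Java interface, the missing method is detected only when BrokenStorage is instantiated, not when the class is defined or compiled, which is part of why these are "not quite the same".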

Tests

Automated tests are important if you want some assurance your code works. Python has a built-in unittest library that is similar to JUnit, but at a minimum you'll want a more powerful test runner.

  • nose is a test runner for the built-in unittest, with many plugins.
  • pytest is a test runner and framework, supporting the built-in unittest library as well as a more succinct style of testing. It also has numerous plugins.
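To make the difference between the two styles concrete, here is a minimal sketch with one JUnit-like unittest class and one pytest-style function. The slugify function under test is a hypothetical example.

```python
import unittest


def slugify(title):
    """Lower-case a title and join its words with hyphens."""
    return "-".join(title.lower().split())


class SlugifyTests(unittest.TestCase):
    """JUnit-like style using the built-in unittest library."""

    def test_lowercases_and_joins_words(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_empty_title_gives_empty_slug(self):
        self.assertEqual(slugify(""), "")


def test_slugify_collapses_whitespace():
    """The more succinct pytest style: a plain function and a bare assert."""
    assert slugify("  Hello   World ") == "hello-world"
```

Assuming the code is saved as test_slugify.py (a hypothetical file name), python -m unittest test_slugify runs the class-based tests, while pytest test_slugify.py should collect both the class and the plain function.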

Other useful tools:

  • Hypothesis lets you write a single function that generates hundreds or thousands of test cases for maximal test coverage.
  • To set up isolated test environments tox is useful; it builds on virtualenv.
  • coverage lets you measure code coverage on your test runs. If you have multiple tox environments, here's a tutorial on combining the resulting code coverage.

More static analysis

In addition to mypy, two other lint tools may prove useful:

  • flake8 is quick, catches a few important bugs, and checks for some standard coding style violations.
  • pylint is much more powerful, slower, and generates massive numbers of false positives. As a result far fewer Python projects use it than flake8. I still recommend using it, but see my blog post on the subject for details on making it usable.
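As a small illustration of the kind of bug a linter finds that the Java compiler would have caught for you, consider the following sketch. The total_price function is hypothetical; F401 and F821 are the codes flake8 uses for unused imports and undefined names.

```python
import os  # flake8 should report F401: imported but unused


def total_price(prices, discount=0):
    """Sum the prices and subtract an optional discount."""
    subtotal = sum(prices)
    if discount:
        # Typo: "subtotl" is not defined anywhere. Python only notices when
        # this branch actually runs; flake8 should report F821 up front.
        return subtotl - discount
    return subtotal


print(total_price([5, 10]))  # works, because the buggy branch is never reached
```

Without a linter, or a test that passes a discount, the typo would sit unnoticed until the first discounted order raised a NameError in production.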

Documentation

You should document your classes and public methods using docstrings. Unless you're using the new type signature syntax you should also document the types of function parameters and results.

Typically Python docstrings are written in reStructuredText format. It's surprisingly difficult to find an example of the standard style, but here's one.
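For reference, here is a minimal sketch of a docstring written with the reStructuredText field lists that Sphinx understands. The transfer function and its dict-based accounts are hypothetical, chosen only to show the fields.

```python
def transfer(source, destination, amount_cents):
    """Move money from one account to another.

    :param dict source: Account to withdraw from; must have a ``"balance"`` key.
    :param dict destination: Account to deposit into.
    :param int amount_cents: Amount to move, in cents. Must be positive.
    :return: The new balance of ``source``.
    :rtype: int
    :raises ValueError: If ``amount_cents`` is not positive or exceeds the
        balance of ``source``.
    """
    if amount_cents <= 0 or amount_cents > source["balance"]:
        raise ValueError("invalid amount: {}".format(amount_cents))
    source["balance"] -= amount_cents
    destination["balance"] += amount_cents
    return source["balance"]
```

The types are spelled out in the docstring here because the function has no annotations; with annotated parameters the type part of each field could be dropped.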

Sphinx is the standard documentation tool for Python, for both prose and generated API docs. It supports reStructuredText API documentation, but also Google-style docstrings.

Editors

A good Python editor or IDE won't be as powerful as the equivalent Java IDE, but it will make your life easier. All of these will do syntax highlighting, code completion, error highlighting, etc.:

  • If you're used to IntelliJ you can use PyCharm.
  • If you're used to Eclipse you can use PyDev.
  • Elpy is a great Emacs mode for Python.
  • Not certain what your best bet is for vim, but python-mode looks plausible.

Writing maintainable Python

In the end, writing maintainable Python is very much like writing maintainable Java. Python has more flexibility, but also more potential for abuse, so Python expects you to be a responsible adult.

You can choose to write bad code, but if you follow the best practices you learned from Java you won't have to. And the tools I've described above will help catch any mistakes you make along the way.

