TDD with pytest

This week I had the honour of starting off a series about test-driven development and the tools we use for it at our company. So I gave a tutorial introducing py.test. The slides are here and the following is the textual version of it. You don't need to look at the slides to follow the text below.

TDD the big picture

I realized I don't actually know why one should write tests, and especially why one should do test-driven development. I know the benefits of tests; I experience them once or twice a day at work. But what is the ultimate argument for TDD? Is there a killer argument for it?

why tests

  • Tests prevent bugs from recurring. Nothing is more depressing than fixing the same bug again after it has recurred.
  • Tests prove that a solution works. When the tests resemble reality (hint: go easy on the mocking), they are a good way to show that the chosen solution works. Some frameworks even allow repeated execution and time measurements to show that the new solution is faster than the old one.
  • Tests can check for complicated error combinations. What if it's a five-step interaction and wrong input to the third step breaks the last step? Write a test to assure graceful handling, then make that test pass, instead of clicking through those five steps manually ten times until the bug is fixed.
  • Time spent fixing regressions is better spent writing tests in the first place. Some people don't want to write tests because that time is supposedly better spent inventing new features. But then they spend twice that time debugging the bugs QA found. Better to write the tests before or directly after implementing the feature (while it's fresh in memory).
  • Tests provide better documentation than explicit tutorials or how-tos. Developers commenting code are rare, developers writing extensive API documentation are rarer, and developers adding extensive how-tos and tutorials are only found once a year. But the ultimate resource for seeing how a function/class/module is used is looking at the tests and what they do. And while traditional documentation tends to become outdated, (passing) tests are always up-to-date with the implementation. So by writing tests you are writing tests and documentation at the same time.
  • Tests are the first customer. Sometimes code is written 'according to the specs', but when integrating it into the bigger picture everything falls apart. Just as the QA department is the first customer of your product, tests are the first customer of your code.

test everything

One argument I heard:

“I don't know what it should look like, I can't start with the test!”

Let's take that apart:

  • If you don't know what it should be, why are you starting to write code? Developing software is actually a lot about thinking. And no matter whether you are thinking about the problem top-down or bottom-up, starting with the tests lets you write a usage scenario for your code before you write the code. So first you think about what it should look like, then you formulate in a test how it should look, then you write the actual implementation.
  • Tests aren't unchangeable. When you did TDD by the book and realize that the scenario you developed in the tests is wrong, well, change the tests. That's actually the first refactoring cycle: write tests, implement the code. Realize the tests are wrong, rewrite the tests, then rewrite the code.
  • Writing tests with incomplete knowledge can result in more tests, especially tests for bad paths. When you start developing a solution for a problem, you might not know all the relevant facts beforehand. So while the tests evolve, you will have intermediate states that feed incomplete or wrong data to your algorithm. Leave these tests in and make them check for sane behaviour of your code when given wrong input. Then you have tests not only for the good code paths but also for the bad ones, as the sketch below shows.
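
A minimal sketch of such a bad-path test, assuming a hypothetical parse_age function (the name and its error behaviour are made up for illustration):

import pytest

def parse_age(value):
    # hypothetical function under test: parse a user-supplied age
    age = int(value)  # raises ValueError for non-numeric input
    if age < 0:
        raise ValueError('age must be non-negative')
    return age

def test_parse_age_rejects_bad_input():
    # the 'bad path': wrong input should fail loudly, not silently
    with pytest.raises(ValueError):
        parse_age('not a number')
    with pytest.raises(ValueError):
        parse_age('-5')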

Set up the env

Let's get practical and set up the environment. The tools that set the stage:

  • python + virtualenv + pip: the dream team of python
  • virtualenv:
    • It creates encapsulated python environments.
    • Only the selected python version and the chosen packages are available inside.
    • It keeps the system environment from being cluttered with one-off packages.
    • Allows easy switching between python versions for tests.
  • pip:
    • The package installer for python.
    • Downloads and installs packages from pypi.python.org (or its mirrors).
    • Resolves dependencies between packages and their versions.

virtualenv

Create a virtual environment:

  $> virtualenv .venv

Activate the virtual environment:

  $> . .venv/bin/activate

Install requirements with pip:

  (.venv)$> pip install pytest

the 'requirements.txt'

A file to gather required dependencies and versions (current at the time of writing):

  (.venv)$> cat requirements.txt
  pytest==2.7.0
  pytest-xdist==1.11
  (.venv)$>

Install all requirements with pip:

  (.venv)$> pip install -r requirements.txt

py.test

py.test is a test framework for python. While being backward-compatible with unittest, it has:

  • better test-discovery
  • better assertions
  • better output
  • better execution
  • lots of extensions (localserver, coverage, bdd, mock)

finding tests

With py.test tests are found by:

  • files test_<something>.py
  • classes class Test<something>(object): inside the files
  • functions def test_<something>(): either in the test classes or directly in the files
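
For example, a single (made-up) file that matches all three conventions at once:

# content of test_example.py -- the file name matches test_<something>.py

def test_at_module_level():
    # found because the function name starts with test_
    assert 'py' in 'py.test'

class TestSomething(object):
    # found because the class name starts with Test
    def test_inside_class(self):
        assert 1 + 1 == 2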

write py.test tests

A first test in test_problem.py:

def test_true():
    assert True

and run it:

(.venv)$> py.test
========== test session starts ==========
platform linux2 -- Python 2.7.6 -- py-1.4.26 -- pytest-2.6.4
plugins: xdist
collected 1 items

test_problem.py .

========== 1 passed in 0.06 seconds =======

Run it continuously and verbosely:

(.venv)$> py.test -f -v

-f = follow:

  • Runs all (selected) tests and watches for changes in the code (repeats until aborted with CTRL+C).
  • When no test failed before, runs all tests.
  • When some tests failed before, runs only the failed tests.
  • Once all previously failing tests pass again, runs all tests again.

-v = verbose: prints the test names and 'PASSED' / 'FAILED' instead of just '.' / 'F'.

make it fail

Now make the test fail by replacing the assert True with x = 5; assert x == 3:

[...]
collected 1 items

test_problem.py::test_true FAILED

========== FAILURES ==========
__________ test_true __________

    def test_true():
        x = 5
>       assert x == 3
E       assert 5 == 3

test_problem.py:4: AssertionError
========== 1 failed in 0.09 seconds ==========

Notice how py.test outputs the code of the failing assertion and also introspects the values of the variables in the assert statement.
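
This introspection also works for richer comparisons. In a small made-up example like the following, py.test would show exactly which dictionary values differ on failure:

def test_dict_compare():
    expected = {'word': 'Fizz', 'count': 3}
    actual = {'word': 'Buzz', 'count': 3}
    # on failure py.test prints a diff of the two dictionaries
    assert actual == expected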

pytest.mark.parametrize

Parameters of py.test test functions have a special meaning: they load fixtures (see pytest.org/latest/fixture.html).
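
A minimal (made-up) fixture example: py.test sees the parameter named numbers and injects the fixture's return value:

import pytest

@pytest.fixture
def numbers():
    # py.test calls this and passes the result to every test
    # that has a parameter named 'numbers'
    return [1, 2, 3]

def test_sum(numbers):
    assert sum(numbers) == 6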

Unless they are given via parametrize ;-)

import pytest

@pytest.mark.parametrize(
    'input1, input2, equal',
    [
        (1, 1, True),
        (-1, 1, False),
    ]
)
def test_compare(input1, input2, equal):
    assert (input1 == input2) == equal

Run it:

(.venv)$> py.test
[...]
collected 3 items

test_problem.py::test_true PASSED
test_problem.py::test_compare[1-1-True] PASSED
test_problem.py::test_compare[-1-1-False] PASSED

========== 3 passed in 0.11 seconds ==========

hands on

After this quick overview and introduction it's time for some more advanced tasks. These were done individually and in pairs on the laptops in the workshop and can be solved within an hour.

Let's assume a product owner gives super-small incremental tasks. Write the tests first, then fix the code to fulfill the tests.

divide by three - part I

  When I give it the number 3
  Then it should return "Fizz"

  When I give it the number 6
  Then it should return "Fizz"

  When I give it the number 4
  Then it should return "4"

Task:

  • Write these as three tests (transform the gherkin above into py.test tests).
  • Implement the function to make the tests pass ;-)
  • Then rewrite them as one parametrized test.
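
One possible shape for the parametrized version; the fizz function name and its deliberately literal implementation are assumptions, just enough to make these three tests pass:

import pytest

def fizz(number):
    # the simplest thing that satisfies part I
    if number in (3, 6):
        return 'Fizz'
    return str(number)

@pytest.mark.parametrize('number, expected', [
    (3, 'Fizz'),
    (6, 'Fizz'),
    (4, '4'),
])
def test_fizz(number, expected):
    assert fizz(number) == expected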

divide by three - part II

  When I give it a number divisible by 3
  Then it should return "Fizz"

(Why didn't you say so before?)

Task:

  • Write a test that feeds mod-3 numbers and checks for "Fizz".
  • Write a test that feeds non-mod-3 numbers and checks for the original number.
  • Fix the function to make the tests pass.

divide by five

  When I give it a number divisible by 5
  Then it should return "Buzz"

Task:

  • Write a test that feeds mod-5 numbers and checks for "Buzz".
  • Make the tests pass.
  • Write a test that feeds non-mod-5 numbers and checks for the original number.
    • Notice anything with 15, 30, 45, 60 and so on?
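
A sketch of why that hint matters; the fizzbuzz name and implementation are assumptions. Multiples of 15 appear in both parametrizations, and the two expectations conflict:

import pytest

def fizzbuzz(number):
    # a straightforward implementation after parts I and II
    if number % 3 == 0:
        return 'Fizz'
    if number % 5 == 0:
        return 'Buzz'
    return str(number)

@pytest.mark.parametrize('number', [3, 6, 9, 15])
def test_mod3_returns_fizz(number):
    assert fizzbuzz(number) == 'Fizz'

@pytest.mark.parametrize('number', [5, 10, 15])
def test_mod5_returns_buzz(number):
    assert fizzbuzz(number) == 'Buzz'

# fizzbuzz(15) cannot be 'Fizz' and 'Buzz' at the same time, so one of
# these tests must fail -- the specs conflict and need another round
# with the product owner.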

Finally

This is where this tutorial stops. Hopefully you learned as much from reading this as we did when I gave the tutorial at our office.

How to change a whitelist into a blacklist

At work we had an interesting requirement a few days ago: we display user input in descriptions and comments and want to allow both markdown and html. The markdown rendering is done in javascript on the client side. But sanitizing the input to fix the html (users do forget to close tags) and disallowing certain tags like script is done on the server side before it hits the clients' browsers.

Bleach

So we came across bleach, an html sanitizer. And bleach is great! It takes a list of allowed tags and dictionaries of allowed attributes per tag, and more, like filtering the allowed classes and styles per element. It even accepts a function to decide whether an attribute is allowed for a given tag.

Sadly, it didn't accept a function to decide whether a tag is allowed at all. So I went to GitHub, forked the repo and created a patch/pull request.

My patch allowed passing bleach.clean a function to check for the allowed tags. I also wrote unittests for passing functions as allowed_attributes and allowed_tags. Unfortunately the maintainer rejected my pull request, since it makes it very easy to change the behaviour of bleach from a whitelist to a blacklist. Well, duh, that was the intention. But I do understand and respect that decision!

So is there another way to solve our little problem?

Whitelist in bleach

Let's take a look at how bleach does the whitelisting. Don't worry, it's really easy: bleach uses the given list of allowed tags as allowed_elements:

if element in allowed_elements:
    # Do the rest of the parsing

But what if allowed_elements isn't a list? What if it's a custom object that just happens to implement __contains__()?

Blacklist the script tag instead of whitelisting everything else

Let's write a little blacklist object that inverts the behaviour of __contains__:

import bleach

class BlackList(object):
    def __contains__(self, value):
        # invert the whitelist check: every tag is allowed
        # except the ones listed here
        return value not in ['script']

html = bleach.clean(html, tags=BlackList())

Done. We changed the whitelist of bleach.clean() into a blacklist that allows all tags except the script tag.

Our version is a little more advanced: it also takes a list argument to __init__ to set an extendable list of forbidden tags.
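
Our exact code isn't shown here, but a minimal sketch of that variant could look like this:

import bleach

class BlackList(object):
    def __init__(self, forbidden=None):
        # extendable list of forbidden tags, 'script' by default
        self.forbidden = set(forbidden or ['script'])

    def __contains__(self, value):
        return value not in self.forbidden

html = bleach.clean('<b>ok</b><iframe src="x"></iframe>',
                    tags=BlackList(['script', 'iframe']))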

Of course this depends on the way bleach works internally, which might change in future versions. But that is one of the reasons we have unittests. Not only do they protect against developers changing one part of the code and breaking a completely different corner they didn't think of; unittests also protect against changed behaviour in your dependencies...

Update 2014-10-14: Gave a small talk (slides) about this at the Leipzig Python User Group.

Update of my Atom-build

Atom.io

A while back I posted my build of the Atom editor as a deb package. It seems some people actually downloaded it. So here is an update:

atom-0.125.0-amd64.deb Revision a00b3b2cc925dbbfb064fd0b50f0d0895a495548 from trunk

Update: On Ubuntu it's better and easier to use the PPA from WebUpd8.

As always without warranty, support or anything similar. You may say thanks if you like this service;-)

Since recently, the system used to build this package is Ubuntu 14.04.