Mocking in Python tests

For a while now, I've felt unsure about mocking in Python. It has helped me test in the past, but I've also seen a lot of smart people talk or write about its pitfalls.

Recently, I had to test some functionality around calls to Twilio, so I decided to investigate to see if I could get a better understanding of the tradeoffs involved.

Blog Posts and Conference Talks

This post really clarified things for me, particularly this part, which follows a section that sets up some mocks for testing a couple of API calls (emphasis mine):

And you can imagine adding a few more tests, perhaps one that checks that we do the date-to-isoformat conversion correctly, maybe one that checks we can handle multiple lines. Three tests, one mock each, we’re ok.

The trouble is that it never stays quite that simple does it?

It hit me that my confusion with all of these blog posts came from the fact that, the times I've used mocks, things have stayed that simple (I've also used them in other cases where they provide value, but more on that later). This isn't to say that the post is wrong. In fact, I agree with the author's point that when things get more complicated, mocks can cause trouble.

Another great example of this is an old (but not outdated) talk from PyCon 2012 in which two engineers from Google talk about how they evolved their codebase to deal with the issues caused by a large number of mocks spread across it. Ned Batchelder has a good summary of the talk here. If you are already having trouble with mocks, or the number of mocks in your test suite is growing, this is a great talk for seeing some possible solutions.

Hynek Schlawack has an interesting post about mocking (with a lot of great links for further reading) that covers another pattern I've used: instead of mocking the service you are calling, wrap the call(s) to that service in a function so you can more easily control it in tests. I've found this pattern especially helpful if you are calling a service with different sets of parameters.

I wrote about AWS S3's presigned URL API a few years ago. Back then I only had one way to call the function, as you can see in the example code. Since then, I've added a few files to download, so I'm now calling that API with a few different parameters. I wrapped that API call in a helper function to make testing easier and more robust.
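Here is a minimal sketch of what that wrapper pattern can look like. It is not the code from my project: the names presigned_download_url and download_links, the "reports" bucket, and the example.test URLs are all made up for illustration. The only real API it assumes is boto3's generate_presigned_url client method, and the test never touches it.

```python
from unittest import mock


def presigned_download_url(s3_client, bucket, key, expires_in=3600):
    """Hypothetical wrapper: the one place that talks to the S3 client."""
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )


def download_links(s3_client, bucket, keys, get_url=presigned_download_url):
    """Code under test: builds one link per file by calling the wrapper."""
    return {key: get_url(s3_client, bucket, key) for key in keys}


def test_download_links_calls_wrapper_once_per_key():
    # Replace the wrapper rather than boto3 itself, so the test needs
    # no AWS credentials and no knowledge of boto3's internals.
    fake_get_url = mock.Mock(
        side_effect=lambda client, bucket, key: f"https://example.test/{key}"
    )

    links = download_links(None, "reports", ["a.csv", "b.csv"], get_url=fake_get_url)

    assert links == {
        "a.csv": "https://example.test/a.csv",
        "b.csv": "https://example.test/b.csv",
    }
    assert fake_get_url.call_count == 2
```

Because every caller goes through the wrapper, adding another file to download means another call to the wrapper with a different key, not another mock of the S3 client.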

Conclusion for External Web APIs

What I've taken away from all of this is that mocks are like most solutions in technology: they are not a panacea, but they also aren't worthless or always bad. I think microservices are an appropriate analogy here. They can solve some interesting problems that come with scaling (headcount as well as requests), but their existence doesn't make a monolithic codebase a bad choice. So the next time you are looking to mock something in Python, consider how much you need to mock. If the answer is "a call or two", mocks might be helpful. If the answer is "a lot, and it keeps growing", you might want to look into one of the solutions above.

Use Cases Beyond Web APIs

A lot of the discussion I see around mocks is about mocking external services so you aren't reaching out to a third-party service in your tests. But there is another use case I've found a lot of value in: skipping long-running algorithms so you can test the code around them. In a previous job, I helped write a machine learning library, and many of its functions were algorithms that took significant time to run (especially compared to how long we'd like a test to take). Mocking these let us split our test suite in two: unit tests that covered the code around the long-running functions, which we ran on CI, and longer-running tests that we ran locally and in nightly test runs.
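As a rough illustration of that split (this is not code from that library; Trainer, fit, and fit_and_summarize are hypothetical names, and the sleep stands in for a slow algorithm), a unit test can patch out the slow step and still exercise everything around it:

```python
import time
from unittest import mock


class Trainer:
    def fit(self, rows):
        """Stand-in for a slow algorithm (assumed, for illustration only)."""
        time.sleep(30)
        return {"weights": [0.1, 0.2]}

    def fit_and_summarize(self, rows):
        """The surrounding code we actually want to unit test."""
        if not rows:
            raise ValueError("no training data")
        model = self.fit(rows)
        return {"n_rows": len(rows), "n_weights": len(model["weights"])}


def test_fit_and_summarize_skips_slow_fit():
    rows = [[1, 2], [3, 4]]
    canned_model = {"weights": [1.0, 2.0, 3.0]}

    # Patch only the slow step; validation and summarizing still run for real.
    with mock.patch.object(Trainer, "fit", return_value=canned_model) as fake_fit:
        summary = Trainer().fit_and_summarize(rows)

    fake_fit.assert_called_once_with(rows)
    assert summary == {"n_rows": 2, "n_weights": 3}
```

A test like this is fast enough for CI, while a separate, slower test can call the real fit locally or in a nightly run.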