A few months ago, we released a Python SDK for Dagger. This SDK makes it easy to develop CI/CD pipelines in Python and run them on any OCI-compatible container runtime. Since the release, we’ve been continuously updating the SDK by adding new features and making changes to improve the code quality. In this blog post, I’ll discuss the “why” and “how” of one of our most recent improvements: adopting Ruff as the linter for our Python code.
Ruff is a new Python linter that is very fast because it’s built in Rust. In its inaugural post, “Python tooling could be much, much faster”, Charlie Marsh (Ruff’s creator) goes into more detail on how it works, its tradeoffs, and its implications. I think it’s worth a read.
Charlie was one of our early Python SDK testers before we launched. At the time I noticed he built a new tool and checked it out. I found it interesting but, as an early project, it certainly didn’t cross my mind to use it in Dagger then.
A few weeks ago, I came across a tweet from Charlie noting that Pandas was moving to Ruff. That caught my attention because Pandas is a very well-known and popular Python project. I took another look and saw that several other popular open-source projects were using it as well, some of which I’ve used, such as FastAPI, Pydantic, Sphinx, and Hatch.
There were other signs that pointed to Ruff’s growing popularity, including a tweet where PyPI (the Python package repository) reported that the project had become critical (i.e., a project in the top one percent of downloads over the previous six months).
I thought it was time to give Ruff a try in Dagger, not because I was looking to solve a specific problem, but because I was curious. During experimentation, I found enough benefits to make the switch.
The Adopt Ruff PR demonstrates both a reduction in dependencies and numerous code improvements to satisfy the linter. Below, I’ll expand a bit more on what I found were the main benefits.
The main advantage of Ruff is its speed, but that wasn’t the deciding factor: our codebase is small, so linting time wasn’t a problem for us. Linting approximately 6500 lines of code in 40+ files used to take over 3 seconds. With Ruff, it takes approximately 0.03 seconds. That’s roughly 100x faster.
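To make the arithmetic concrete, here is a minimal sketch of how such a comparison could be measured and computed. The helper names `time_command` and `speedup` are hypothetical, and the numbers are the rounded figures quoted above rather than a fresh benchmark.

```python
import subprocess
import time


def time_command(cmd):
    """Return the wall-clock seconds taken to run cmd once."""
    start = time.perf_counter()
    # Output is discarded; only the elapsed time matters here.
    subprocess.run(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return time.perf_counter() - start


def speedup(before_seconds, after_seconds):
    """Return how many times faster the second measurement is."""
    return before_seconds / after_seconds


# Rounded figures from the post: over 3 seconds before, about 0.03 after.
print(f"{speedup(3.0, 0.03):.0f}x")  # prints "100x"
```

On a machine with both tools installed, something like `speedup(time_command(["flake8", "."]), time_command(["ruff", "check", "."]))` would give a rough comparison. A single run is noisy, so averaging several runs would be more reliable.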
The speed boost unexpectedly improved my productivity, because Ruff is fast enough to lint while I’m editing code and provide immediate feedback. I used to run the linter in the terminal before committing, but Ruff has good documentation on supporting multiple environments and IDEs, which led me to set up ruff-lsp. Now, I get the warnings directly in my IDE, in real time.
[Figure: diff of the lint dependencies and commands in the Adopt Ruff PR]
Reducing dependencies always feels good. This was a major advantage for us.
Like many Python projects, Dagger was using flake8 and isort to lint our codebase, as well as autoflake to automatically fix a few linting violations during formatting. Flake8 has a pluggable architecture, which means you need to install quite a few plugins if you want to cover more ground, adding to your dependencies.
Ruff has near feature parity with these tools and is implementing many popular flake8 plugins in its codebase, making them “batteries included”. So, your project has fewer dependencies, you get many more linting rules for free, and it all still runs faster.
With flake8, and even Ruff, you usually select the specific rules you want to use for linting. With Dagger, I decided to take the opposite approach: excluding the rules I didn’t want and allowing everything else.
This means that when Ruff is updated (which usually adds checks as more flake8 plugins are implemented), linting is likely to fail. But I like it that way: GitHub’s Dependabot automatically notifies me that there’s code that could probably be improved by following a best practice or by avoiding a security issue or potential bug. That’s what linters are for.
Compared to the stagnation we had before, with only a short list of linter rules enabled, this is an easy win for continually improving the codebase.
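The difference between opting in and opting out can be sketched with a small model of prefix-based rule selection. This is a simplified illustration, not Ruff’s actual implementation: the two rule catalogues are made up for the example (the codes themselves are real ones), the function name is hypothetical, and here any ignore match simply wins over any select match.

```python
# Hypothetical catalogue of rule codes known to one linter release.
RULES_V1 = {"E501", "F401", "F841", "I001"}
# A later release that implements more plugins (contents are illustrative).
RULES_V2 = RULES_V1 | {"S101", "B006"}


def enabled_rules(available, select, ignore):
    """Return the rule codes enabled by prefix-based select/ignore lists.

    "ALL" matches every available rule; other entries match by prefix.
    Simplification: an ignore match always beats a select match.
    """

    def matches(code, selectors):
        return any(s == "ALL" or code.startswith(s) for s in selectors)

    return {
        code
        for code in available
        if matches(code, select) and not matches(code, ignore)
    }


# Opt-in: upgrading the linter enables nothing new.
opt_in_new = enabled_rules(RULES_V2, ["E", "F"], []) - enabled_rules(
    RULES_V1, ["E", "F"], []
)

# Opt-out: upgrading automatically enables the newly implemented rules.
opt_out_new = enabled_rules(RULES_V2, ["ALL"], ["E501"]) - enabled_rules(
    RULES_V1, ["ALL"], ["E501"]
)

print(sorted(opt_in_new))   # []
print(sorted(opt_out_new))  # ['B006', 'S101']
```

In a Ruff configuration, the opt-out style corresponds to selecting `ALL` and listing the unwanted rules under `ignore`. The exact settings and their precedence rules are best checked against the documentation for the Ruff version in use.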
Ruff is released frequently. This is good, but it adds a maintenance burden. I get a new Dependabot PR for it almost daily but can’t review them every day. Dependabot closes the previous PR and opens a new one, so there’s only ever one to deal with at any time, but I don’t like to ignore it in the meantime. I’m thinking about changing the schedule frequency from daily to weekly and reserving a little bit of time each week to review all updates.
We still use black to format code. It’s also tied to the lint step because I consider it a linting violation if black has changes to make. So in both the lint and format steps, we’re only using Ruff and black now. It would be great to only use Ruff for both steps. It seems others want it too and that some experimentation has started on this, so it may become a reality soon.
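As a rough sketch of what “black is part of the lint step” means in practice, the lint step can run both tools and fail if either reports a problem. The command lists and the `run_lint` helper below are assumptions for illustration, not Dagger’s actual pipeline code. `black --check` exits with a non-zero status when files would be reformatted, which is what turns a formatting difference into a lint failure.

```python
import subprocess

# Assumed commands; adjust paths and flags to the project and tool versions.
LINT_COMMANDS = [
    ["ruff", "check", "."],
    ["black", "--check", "--diff", "."],
]


def run_lint(commands=LINT_COMMANDS, runner=subprocess.call):
    """Run every command and return 0 only if all of them succeed.

    `runner` takes a command list and returns its exit code; it is
    injectable so the logic can be exercised without the real tools.
    """
    # Run all commands (no short-circuit) so every tool gets to report.
    exit_codes = [runner(cmd) for cmd in commands]
    return 1 if any(code != 0 for code in exit_codes) else 0
```

Calling `sys.exit(run_lint())` from a script would make a CI job fail whenever Ruff finds a violation or black has changes to make.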
Why not use a more performant language to write tooling for Python? Ruff started as a proof of concept to question the status quo. On one hand, building it in Rust limits contributions from Pythonistas. On the other hand, as a user of the tool, I appreciate the fast execution and the reduction in dependencies while still getting many more rules in return.
I’m looking forward to seeing how this project evolves and whether it inspires even more performant tools to emerge.