More and more applications are running on Kubernetes, but application delivery is still a challenge. CI/CD operations are typically expressed in YAML or shell scripts; the more complex they are, the harder they are to decipher and debug. Delivery pipelines often perform unnecessary rebuilds and test reruns even when nothing has changed, resulting in long wait times and slow deployments.
Dagger solves these problems by giving platform engineers a rich API and native-language SDKs in which to express CI/CD operations, and intelligent caching to significantly accelerate delivery pipelines. When these features are combined with Kubernetes' scalability and orchestration capabilities, they not only make the delivery platform more efficient, but they also improve the overall experience for both development and platform teams.
If you're interested in running Dagger on Kubernetes, this blog post provides important background information to help you get started. It also explains the recommended architecture pattern and components, together with links to more detailed documentation.
TIP: Before going deeper, this may be a good time to review the basics of Dagger.
By default, every Dagger SDK uses the CLI to seamlessly auto-provision the Dagger Engine as a Docker container. This is automatic and requires only that Docker is available locally, where the SDK runs. While this default behavior optimizes for developer convenience, and it's a good starting point on any CI/CD platform where Docker is available, taking control of where the Dagger Engine runs has numerous benefits, starting with removing the Docker dependency.
On a CI/CD platform, the default behavior is the same as when running locally: the Dagger SDK uses the CLI to auto-provision the Dagger Engine as a Docker container. If Docker is not available on the runner, auto-provisioning fails.
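One way to take control is to point the SDK at an engine you run yourself, via the `_EXPERIMENTAL_DAGGER_RUNNER_HOST` environment variable. As the name suggests, this variable is experimental, and its accepted schemes may change between releases; the pod name, namespace, and container below are placeholders, not values from this post:

```shell
# Point the Dagger CLI/SDK at an already-running engine instead of
# letting it auto-provision one. The pod name, namespace and container
# here are placeholders for your own deployment.
export _EXPERIMENTAL_DAGGER_RUNNER_HOST="kube-pod://dagger-engine-abcde?namespace=dagger&container=dagger-engine"
# Other schemes include docker-container://, tcp:// and unix://.
```

With this set, any SDK run on that CI runner connects to the existing engine and benefits from its warm cache.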
One of the most important reasons to use Dagger is persistent caching that works out of the box.
A Dagger pipeline is composed of discrete operations. The result of each operation is persisted locally, as a file on disk. These files are effectively OCI image layers. Data from mounts is also stored on a local path where the Dagger Engine runs. When running locally, this is great, since caching is fast and reliable (most of us have SSDs, some even NVMe drives). However, when running on a CI/CD platform, each run starts with a clean state, because the runners are ephemeral.
Reusing the same Dagger Engine state across CI/CD runs yields the same great caching benefits that are the norm when running pipelines locally, on workstations. And when you combine the Dagger Engine's caching powers with everything Kubernetes provides in terms of auto-scaling and orchestration, you end up with a fast, scalable, and resource-efficient CI system!
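The reason unchanged operations cost nothing is that the cache is content-addressed: each result is keyed by a digest of the operation and its inputs, so identical work is served from disk instead of re-executed. The real engine implements this on top of BuildKit; the toy Python sketch below (not Dagger code) just illustrates the principle:

```python
import hashlib
import json

class OpCache:
    """Toy content-addressed cache: results are keyed by a digest of the
    operation name and its inputs, so identical operations are never
    re-executed. Illustration only; the real engine uses BuildKit."""

    def __init__(self):
        self.store = {}    # digest -> persisted result
        self.executions = 0

    def _key(self, op, inputs):
        payload = json.dumps({"op": op, "inputs": inputs}, sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

    def run(self, op, inputs, fn):
        key = self._key(op, inputs)
        if key not in self.store:     # cache miss: execute and persist
            self.store[key] = fn(inputs)
            self.executions += 1
        return self.store[key]        # cache hit: reuse stored result

cache = OpCache()
build = lambda inputs: f"image built from {inputs['src']}"

cache.run("build", {"src": "sha256:aaa"}, build)  # executes
cache.run("build", {"src": "sha256:aaa"}, build)  # served from cache
cache.run("build", {"src": "sha256:bbb"}, build)  # source changed: executes
print(cache.executions)  # 2
```

On an ephemeral CI runner, `cache.store` is thrown away after every run, which is exactly why persisting the engine's state on the node pays off.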
The minimum required components are:
Kubernetes cluster, consisting of support nodes and runner nodes.
Certificate manager, required by the runner controller for admission control.
Runner controller, responsible for managing CI runners in response to CI job requests.
Dagger Engine on each runner node, running alongside one or more CI runners.
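To make the last component concrete, a DaemonSet is one natural way to run a Dagger Engine on every runner node, with its state on the node's disk so the cache survives individual CI jobs. The manifest below is a minimal sketch, not a production configuration: the image tag, node label, namespace, and host path are all placeholder assumptions.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: dagger-engine
  namespace: dagger          # placeholder namespace
spec:
  selector:
    matchLabels:
      app: dagger-engine
  template:
    metadata:
      labels:
        app: dagger-engine
    spec:
      nodeSelector:
        node-role/runner: "true"   # placeholder label for runner nodes
      containers:
        - name: dagger-engine
          image: registry.dagger.io/engine:v0.9.0  # pin to your CLI/SDK version
          securityContext:
            privileged: true       # the engine needs this to run containers
          volumeMounts:
            - name: engine-state
              mountPath: /var/lib/dagger   # engine state, i.e. the cache
      volumes:
        - name: engine-state
          hostPath:
            path: /var/lib/dagger
```

Because the engine state lives on the node rather than in the ephemeral runner pod, every CI job scheduled onto that node starts with a warm cache.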
This architecture can be further optimized:
Here's an example of what this architecture might look like:
If you're interested in implementing this architecture in your own Kubernetes cluster, we recommend the following resources:
Lastly, the Dagger Team will be at KubeCon NA 2023. We’ve been hard at work preparing a bunch of new stuff, so come by our booth (N37) to see some demos of our newest tech, or just come tell us about your Dagger experience and what you’d like to see from us going forward. If you want some dedicated time with us, for example, to discuss how Dagger can help you speed up your pipelines, reach out to us and we’ll schedule some in-person time with you. We can’t wait to see you!