Best practices for container compliance

How ensuring continuous compliance through automated policy enforcement can help keep your infrastructure, and your containers, stable and secure.
Part of Issue 17, May 2021


When you’re setting up your infrastructure to deploy that first demo service container, it’s easy to neglect the enforcement of best practices in the interest of moving fast. Before you know it, critical workloads are running, but best practices aren’t being followed. This can lead to that one resource-hungry service consuming all the memory and bringing down other services, or a third-party library exploiting privileged user access to sinister ends. 

But there’s a solution to these scenarios: continuous compliance. By investing in compliance from your infrastructure’s inception, you’ll benefit from a more secure and predictable environment from day one. Best of all, you can enforce these practices through automation, reducing developer toil. And if you already have critical workloads running in your infrastructure, it’s never too late to start leveling up.

Building policy guardrails

“Continuous compliance” means complying with best practices for running containers in a, well, continuous manner. If you’re deploying regularly—multiple times a month, week, or day—you can’t rely on quarterly compliance assessments to catch weaknesses quickly enough. You need a benevolent overlord, if you will, to continuously watch those containers. 

The key to continuous compliance is to implement best practices as code, transforming them into policies. A policy is a best practice that’s been encoded in a machine-parsable, auditable, and reproducible form; instead of a manual checklist of best practices, you’ll create a repository of policies, which form the guardrails for your container infrastructure. 
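As a rough illustration of what “best practices as code” can look like, the sketch below encodes two practices as small, testable checks. It is not tied to any particular policy engine. The spec fields (memory_limit, privileged) and the policy names are hypothetical, and real policies are usually written in a policy engine’s own language rather than in Python.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical, simplified container description; real specs have more fields.
ContainerSpec = Dict[str, object]


@dataclass(frozen=True)
class Policy:
    name: str
    rationale: str
    # Returns a violation message, or None when the spec complies.
    check: Callable[[ContainerSpec], Optional[str]]


def require_memory_limit(spec: ContainerSpec) -> Optional[str]:
    if not spec.get("memory_limit"):
        return "no memory limit set"
    return None


def forbid_privileged(spec: ContainerSpec) -> Optional[str]:
    if spec.get("privileged", False):
        return "privileged mode is not allowed"
    return None


# The "repository of policies": versioned, reviewable, and reproducible.
POLICIES: List[Policy] = [
    Policy("require-memory-limit",
           "One service should not be able to starve the others of memory.",
           require_memory_limit),
    Policy("forbid-privileged",
           "Third-party code should not inherit privileged access.",
           forbid_privileged),
]


def evaluate(spec: ContainerSpec) -> List[Tuple[str, str]]:
    """Return (policy name, message) for every policy the spec violates."""
    violations = []
    for policy in POLICIES:
        message = policy.check(spec)
        if message is not None:
            violations.append((policy.name, message))
    return violations
```

Because each policy carries a name and a rationale alongside its check, a violation report can say which rule was broken and why the rule exists.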

Creating policies for your organization will be an iterative process, likely involving the teams implementing your build and deployment pipelines, container infrastructure, and security strategy. Start small, consulting industry recommendations like Sysdig’s “Container and Kubernetes Security Checklist” and StackRox’s “Docker Container Security 101” to create a list of must-have items, and add to it over time as the process matures. Augment these with best practices derived from your organization’s unique requirements.

Next, set up a mechanism to automatically enforce these policies. Automation is essential to continuous compliance, ensuring policies are regularly and uniformly enforced. It removes the possibility of human error, and can save your org countless hours of repeated manual effort.

What these steps look like in practice depends largely on your choice of tooling and the specifics of your infrastructure, such as how you deploy your containers or which container orchestrator you’re using. In the next section, we’ll look at a few flexible process and tooling recommendations. 

Policy enforcement, step by step

A container image can be deployed in several ways: through a service’s build and deployment pipeline, through manual operation by a human, or through external event triggers, such as scheduled jobs. These pathways suggest two possible surfaces for your policy checkpoints: internally, inside the container infrastructure, and externally, in the build and deployment pipeline.

For maximum coverage, aim to implement policy enforcement at both of these checkpoints. Start by integrating policy enforcement into your container infrastructure; this ensures that no matter what pathway an image takes, it’s subject to these checks. If you’re pressed for time, you can choose to address checks in the build and deployment pipelines at a later stage. (Keep in mind that relying only on infrastructure checks can be more cumbersome: because enforcement happens late in the deployment cycle, a violation may force you to revert a pull request or make multiple rounds of changes.)

In your container infrastructure

To enforce policies in your container infrastructure, you’ll want to set up a dedicated software component: a policy engine. Whatever software you use, it should be tightly integrated with the infrastructure so it can independently decide whether to allow a container to be deployed. 

If you use Kubernetes, Gatekeeper is an example of an off-the-shelf policy engine you can set up in your cluster. It configures admission controller webhooks in the cluster, evaluates incoming workloads against the configured policies, and prohibits workloads that violate them from running. For example, a container trying to run as a root user or using an untagged image won’t be allowed to start. Organizations that don’t use Kubernetes may need to invest in creating a custom solution using dedicated policy management software such as Open Policy Agent, which also works with Kubernetes (Gatekeeper is built on it).
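To make the admission decision concrete, here is a minimal sketch of the kind of check a policy engine might run when a pod is submitted. It is not how Gatekeeper is implemented or configured. The function names are hypothetical, and the input is assumed to be a pod manifest already parsed into a Python dict.

```python
from typing import List, Tuple


def image_is_tagged(image: str) -> bool:
    """True if the image reference carries an explicit tag or digest."""
    if "@" in image:  # pinned by digest, e.g. app@sha256:...
        return True
    # Only look after the last "/" so a registry port (host:5000/app)
    # is not mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    _, sep, tag = last_segment.partition(":")
    return bool(sep and tag)


def admit(pod: dict) -> Tuple[bool, List[str]]:
    """Decide whether a pod may run; returns (allowed, reasons for denial)."""
    reasons = []
    pod_spec = pod.get("spec", {})
    pod_user = pod_spec.get("securityContext", {}).get("runAsUser")
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        # A container-level runAsUser takes precedence over the pod-level one.
        user = container.get("securityContext", {}).get("runAsUser", pod_user)
        # Simplification: an unset runAsUser is not flagged here, although
        # the image itself may still default to root.
        if user == 0:
            reasons.append(f"{name}: runs as root (runAsUser is 0)")
        if not image_is_tagged(container.get("image", "")):
            reasons.append(f"{name}: image has no tag or digest")
    return (not reasons, reasons)
```

Returning the reasons alongside the decision matters in practice: a rejected deployment is much easier to fix when the rejection says which container broke which rule.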

If you’re rolling out policies to existing infrastructure, consider running your policy engine in audit mode first. In this mode, the engine reports policy-violating workloads currently running in your cluster without blocking them. Once you’ve fixed the violations, likely by changing how the Docker image is built or how the deployment configuration is written, you can redeploy your workloads and switch the engine from audit mode to enforcement.
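The difference between auditing and enforcing can be sketched in a few lines. This is an illustration only: the mode names, the decide function, and the logger name are hypothetical and don’t correspond to any specific engine’s settings.

```python
import logging
from typing import List

logger = logging.getLogger("policy-engine")  # hypothetical logger name

AUDIT = "audit"
ENFORCE = "enforce"


def decide(workload: str, violations: List[str], mode: str) -> bool:
    """Return True if the workload may run under the given mode."""
    if mode not in (AUDIT, ENFORCE):
        raise ValueError(f"unknown mode: {mode!r}")
    if not violations:
        return True
    # Violations are always recorded, whatever the mode.
    for violation in violations:
        logger.warning("%s violates policy: %s", workload, violation)
    # Audit mode only reports; enforce mode also blocks.
    return mode == AUDIT
```

The same violations are logged in both modes, so the audit phase produces the to-do list you work through before flipping the switch.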

Containers can have very short life spans, which poses an additional compliance challenge. If a malicious workload, like a crypto miner, only “wakes up” every few minutes or hours, it might not be around by the time you realize something’s going on and want to inspect it. I recommend adopting a solution that records container activity and retains the logs so you can query them at a later time if needed. Solutions such as Falco go a step further, detecting suspicious behavior at runtime and raising alerts that you can act on or wire up to an automated response.

In your build pipeline

To ensure compliance in your build pipeline, implement static checking of Dockerfiles. Static checkers like Hadolint or Conftest will check for adherence to best practices and other custom policies, such as using a trusted base image or the presence of a specific label instruction. Since static checks run before the container image has been built, the result is a fast, automatic feedback loop.
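To give a feel for what such a static check does, here is a deliberately naive sketch that looks for the two example policies mentioned above. It is not how Hadolint or Conftest work internally. The trusted registry prefix and the required label are hypothetical placeholders.

```python
from typing import List

# Hypothetical values; substitute your organization's own.
TRUSTED_PREFIXES = ("registry.example.com/base/",)
REQUIRED_LABELS = {"maintainer"}


def check_dockerfile(text: str) -> List[str]:
    """Return a list of problems found in the Dockerfile text.

    Deliberately naive: it does not handle line continuations or
    every form of quoting that the Dockerfile syntax allows.
    """
    problems = []
    labels = set()
    stages = set()  # names introduced by "FROM ... AS <name>"
    for lineno, raw in enumerate(text.splitlines(), start=1):
        parts = raw.split()
        if not parts or parts[0].startswith("#"):
            continue
        instruction = parts[0].upper()
        if instruction == "FROM":
            # Skip flags such as --platform=...
            args = [p for p in parts[1:] if not p.startswith("--")]
            if not args:
                continue
            image = args[0]
            trusted = (image == "scratch" or image in stages
                       or image.startswith(TRUSTED_PREFIXES))
            if not trusted:
                problems.append(f"line {lineno}: untrusted base image {image}")
            if len(args) >= 3 and args[1].upper() == "AS":
                stages.add(args[2])
        elif instruction == "LABEL":
            for pair in parts[1:]:
                if "=" in pair:
                    labels.add(pair.split("=", 1)[0].strip('"'))
    for label in sorted(REQUIRED_LABELS - labels):
        problems.append(f"missing required label: {label}")
    return problems
```

A build step could call check_dockerfile on the file’s contents and fail when the returned list is non-empty, which gives developers feedback before any image is built.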

Next, scan the container image using a vulnerability scanner such as Clair or Snyk, and configure the pipeline so that if the scanner finds a potential security issue, the image isn’t pushed to the registry. You’ll also want to enable container image scanning in the registry itself; most hosted and open-source registries offer this as a built-in feature. Finally, make sure you set up notifications so you’re alerted to, and can act on, any issues.
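The “don’t push if the scan fails” rule is usually a small gate between the scan step and the push step. The sketch below shows one possible shape for that gate. The findings format (a list of dicts with id and severity keys) is an assumption made for illustration and is not the output format of Clair or Snyk; a real pipeline would first translate the scanner’s report into something like it.

```python
from typing import Dict, List

# Lowest to highest; unknown severities are treated as the highest.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def blocking_findings(findings: List[Dict[str, str]],
                      fail_on: str = "high") -> List[Dict[str, str]]:
    """Return the findings at or above the fail_on severity."""
    threshold = SEVERITY_RANK[fail_on]
    worst = max(SEVERITY_RANK.values())
    return [
        f for f in findings
        if SEVERITY_RANK.get(str(f.get("severity", "")).lower(), worst) >= threshold
    ]


def gate_exit_code(findings: List[Dict[str, str]], fail_on: str = "high") -> int:
    """Return 0 if the image may be pushed, 1 if the build should fail."""
    blockers = blocking_findings(findings, fail_on)
    for finding in blockers:
        print(f"BLOCKING {finding.get('id', '?')}: "
              f"severity {finding.get('severity', 'unknown')}")
    return 1 if blockers else 0
```

Treating an unrecognized severity as blocking is a conservative choice: a report the gate can’t interpret stops the push instead of slipping through.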

In your deployment pipeline

As with your build pipeline, static checking for Kubernetes manifests in your deployment pipeline will result in a fast feedback loop. These checks, using a tool like Conftest, should look for things like a valid service account in the deployment specification or the presence of a valid resource requirements specification. 
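A minimal sketch of such a manifest check follows, assuming the Deployment manifest has already been parsed into a Python dict (parsing YAML is left out because it needs a third-party library). With Conftest, the rules would be written in its policy language instead. The specific rules here are assumptions made for illustration: rejecting the default service account, and requiring both CPU and memory under requests and limits.

```python
from typing import List


def check_deployment(manifest: dict) -> List[str]:
    """Return a list of problems found in a parsed Deployment manifest."""
    problems = []
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})

    # Assumed rule: every deployment names a dedicated service account.
    account = pod_spec.get("serviceAccountName")
    if not account or account == "default":
        problems.append("serviceAccountName must name a dedicated service account")

    # Assumed rule: every container declares CPU and memory requests and limits.
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        resources = container.get("resources", {})
        for section in ("requests", "limits"):
            values = resources.get(section, {})
            for resource in ("cpu", "memory"):
                if resource not in values:
                    problems.append(f"{name}: missing resources.{section}.{resource}")
    return problems
```

A deployment pipeline could run this against every changed manifest and refuse to apply the change when any problems are returned.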

If you’re using a continuous deployment solution, perform these checks before you apply a change. This is often done using a hooking mechanism, in which a custom script can be executed before the application of the manifests. If you’ve adopted the GitOps model for managing your infrastructure code, you’d typically run these checks before a pull request is merged. 

Compliance is around the corner

Like cost tracking, compliance is often an afterthought when it comes to an organization’s infrastructure. But just as you shouldn’t wait for a financial squeeze to figure out why you’re paying so much on your cloud bill, you shouldn’t wait for an access exploit to implement solid policy enforcement mechanisms. 

The move to containers has inspired engineers and organizations to develop well-documented and version control–friendly solutions for automated policy enforcement, and a lot of these tools are open source, including many of the ones mentioned in this article. By unlocking continuous compliance through automated policy enforcement, your team can learn about drifts from a compliant state more quickly, and action can be taken without manual steps. This means no more noncompliant containers in your infrastructure—a shift that should continuously benefit your organization.

About the author

Amit Saha is a senior site reliability engineer at Atlassian based in Sydney, Australia. He’s the author of two books, including Doing Math with Python, and several other publications.

