Policies are rules that ensure the appropriate, efficient, and secure use of an organization's information technology resources. We can distinguish two types of policies: organizational policies and technical policies.
For example, an organizational policy might be “All deployments must be validated by the QA and security teams before deployment in production”. Firewall policies, on the other hand, are examples of technical policies.
Policy As Code (PaC) is the process of translating policies into a piece of code. Both organizational and technical policies can be converted into PaC.
The benefits of Policy as Code are similar to those of Infrastructure as Code:
Policy as Code’s operating principle is quite simple. It relies on a main component that we will call a policy engine. This engine is in charge of running a query on a set of policies and data to provide an answer.
Queries generally ask whether the input data conforms to the policy, or request the set of policies that were or were not validated. The policy engine's answer may then be used to authorize an action or to generate a report on policy compliance.
The policy engine usually defines the policy language used to write policies. In practice, some data is often passed along with the query, while external data sources are queried to handle more complex policy validation.
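The flow described above can be sketched in a few lines of Python. This is only an illustration of the principle, not how any real engine is implemented: the `Policy` class, the `evaluate` and `is_allowed` functions, and the sample rule are hypothetical names chosen for this example, and real engines such as OPA or Sentinel use their own policy language rather than Python callables.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical rule signature: (input_data, external_data) -> compliant or not.
Rule = Callable[[dict[str, Any], dict[str, Any]], bool]


@dataclass(frozen=True)
class Policy:
    name: str
    rule: Rule


def evaluate(policies: list[Policy], input_data: dict[str, Any],
             external_data: dict[str, Any]) -> dict[str, bool]:
    """Run every policy against the query input and return one result per policy."""
    return {p.name: p.rule(input_data, external_data) for p in policies}


def is_allowed(results: dict[str, bool]) -> bool:
    """Authorize the action only if every policy is satisfied."""
    return all(results.values())


# Sample policy: a deployment must have been approved by an allowed team.
approved_by_allowed_team = Policy(
    name="approved-by-allowed-team",
    rule=lambda inp, ext: inp.get("approved_by") in ext.get("allowed_teams", []),
)

results = evaluate(
    [approved_by_allowed_team],
    input_data={"approved_by": "qa"},                    # passed with the query
    external_data={"allowed_teams": ["qa", "security"]}, # fetched from another system
)
```

The `results` dictionary can feed a compliance report, while `is_allowed(results)` gives the yes/no answer needed to authorize an action.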
Open Policy Agent (OPA) is an open-source policy agent backed by the Cloud Native Computing Foundation. It uses its own policy language: Rego.
Another agent worth mentioning is Sentinel, which integrates smoothly with Terraform Cloud and Vault. However, although its language is easy to read and understand, this solution suffers from a closed-source ecosystem, and its language may not be as extensive as OPA’s.
Kyverno is a policy agent for Kubernetes focused on security. In a previous article, we presented its capabilities and how it can help secure a Kubernetes cluster.
An interesting use case for Policy As Code is policy validation in the CD pipeline. In this example, we will see how to force the use of a GitOps approach on Terraform Cloud, using the Sentinel policy engine integrated into the platform.
Let’s consider a policy composed of two rules:
The first rule can be enforced by using Terraform Cloud and delegating to it the rights to deploy infrastructure. To validate our second requirement, we will need a policy, and that’s when Sentinel comes into play.
Sentinel gives you the opportunity to validate policies between the `plan` and the `apply` of code on Terraform Cloud. It can be set up to block the apply operation if the policies are not validated.
To translate our policy into code, we will use the Sentinel language, a dedicated language for Sentinel. The Sentinel integration in Terraform Cloud exposes some data sources we can use in our policy code.
The [tfrun](<https://developer.hashicorp.com/terraform/cloud-docs/policy-enforcement/sentinel/import/tfrun>) object contains metadata about the workspace that is being applied. We can query its `vcs_repo` attribute to validate that a repository is linked to the workspace that is requested to be applied:
```sentinel
# tfrun contains information about the workspace that has been planned
import "tfrun"
import "strings"

# Checks if plan has been done on Terraform Cloud
is_remote_run = rule {
    tfrun.workspace.execution_mode == "remote"
}

# Checks if workspace is linked with a vcs repository
is_using_vcs = rule {
    tfrun.workspace.vcs_repo is defined
}

# Main rule: the run must be remote and the workspace must be linked to a vcs repository
main = rule {
    is_remote_run and is_using_vcs
}
```
The goal here is to give you an idea of the possibilities offered by Policy As Code. We won't go into detail about writing or adding policies to Terraform Cloud. If you'd like to deploy this example, I'll let you refer to the vendor's tutorials, which are pretty well done.
When applying a workspace not linked to a repository, the `main` rule returns `false`. Sentinel is then able to block the planned operation on this workspace.
Note that the policy here is running in advisory mode. In this case, Sentinel does not block the apply operation but gives visibility into whether the policies are respected.
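Terraform Cloud lets a Sentinel policy run at one of three enforcement levels: advisory, soft-mandatory, or hard-mandatory. The sketch below models how a failed check translates into a blocking decision at each level. The `can_apply` function and its `overridden` flag are hypothetical names used for illustration; they are not part of any Terraform Cloud API.

```python
from enum import Enum


class EnforcementLevel(Enum):
    ADVISORY = "advisory"
    SOFT_MANDATORY = "soft-mandatory"
    HARD_MANDATORY = "hard-mandatory"


def can_apply(policy_passed: bool, level: EnforcementLevel,
              overridden: bool = False) -> bool:
    """Decide whether the apply may proceed after the policy check."""
    if policy_passed:
        return True
    if level is EnforcementLevel.ADVISORY:
        # The failure is reported but never blocks the run.
        return True
    if level is EnforcementLevel.SOFT_MANDATORY:
        # The run is blocked unless an authorized user overrides the failure.
        return overridden
    # Hard-mandatory: a failed policy always blocks the run.
    return False
```

With this model, a workspace without a linked repository still gets applied under `ADVISORY`, which matches the behaviour described above.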
The value of policy engines lies partly in their ability to query external systems to feed their decision-making. To complete the previous example, it would be possible to enrich the information on the VCS repository used by querying the Terraform Cloud and GitHub APIs. This would make it possible to ensure that the repository is located in the company's GitHub organization or that a code quality workflow has been applied to the code.
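As a rough illustration of the organization check, here is a minimal Python sketch. It assumes the repository is identified by a string of the form `owner/repo` and that the company's organization is called `acme`; both are assumptions made for this example. It only inspects the identifier locally, whereas a real policy would query the relevant APIs for authoritative data.

```python
from typing import Optional


def repo_in_organization(identifier: Optional[str], organization: str) -> bool:
    """Return True if an 'owner/repo' identifier belongs to the given organization."""
    if not identifier or "/" not in identifier:
        return False
    owner, _, repo = identifier.partition("/")
    # GitHub organization names are case-insensitive, so compare in lowercase.
    return bool(repo) and owner.lower() == organization.lower()


# "acme" is a hypothetical organization name.
compliant = repo_in_organization("acme/infrastructure", "acme")
```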
By combining data sources, you’ll be able to build complex policies that closely match your business policies and your company processes. For example, to implement organizational processes that require validation by an individual, it may be useful to query ticket management systems such as Jira or Trello.
However, interconnecting a large number of systems can cause problems if one of the data sources becomes unavailable. The policy engine may then be unable to reach a decision on a policy, which could disrupt the workflow of the ops team.
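One way to reason about this failure mode is to make the "cannot decide" outcome explicit and choose deliberately whether it blocks the workflow. The sketch below is a hypothetical illustration: `decide`, `is_blocked`, and the `fetch_ticket_approved` callable stand in for a lookup against a ticketing system and do not correspond to any real engine's API.

```python
from enum import Enum
from typing import Callable


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    UNDETERMINED = "undetermined"


def decide(fetch_ticket_approved: Callable[[], bool]) -> Decision:
    """Evaluate a policy that depends on an external lookup."""
    try:
        approved = fetch_ticket_approved()
    except (ConnectionError, TimeoutError):
        # The data source is unreachable: report it instead of guessing.
        return Decision.UNDETERMINED
    return Decision.ALLOW if approved else Decision.DENY


def is_blocked(decision: Decision, fail_open: bool) -> bool:
    """Fail-open lets undetermined decisions through; fail-closed blocks them."""
    if decision is Decision.UNDETERMINED:
        return not fail_open
    return decision is Decision.DENY
```

Failing closed is the safer default for security-critical policies, while failing open keeps the ops team unblocked at the cost of a temporarily unenforced rule. Either way, the choice is made on purpose rather than left to an unhandled error.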
As we’ve just seen, we might end up facing some pitfalls when implementing Policy As Code. In this last section, I’ll advise you on how to avoid them based on our experience at Padok.
Start small: A common pitfall when implementing Policy As Code is to try to enforce every policy as soon as possible. Instead, it is recommended to start with a small perimeter and focus on the most relevant policies.
Several Policy As Code implementations allow setting policies in audit mode, preventing them from actually blocking any process they are tasked to validate. It is usually a good practice to start with this mode and iterate to improve developer experience and fix unexpected issues.
Use specialized tools as external data sources: Reinventing the wheel is another common pitfall when implementing Policy As Code. For instance, if you want to validate code quality in your policy, it may be a good idea to query the results of a dedicated tool such as Checkov rather than re-implementing its checks.
Test your policies: It is essential to anticipate that a data source required for the policy may not be available. The policy engine itself may no longer be available or may contain errors. Testing your policies is highly recommended to reduce the risk related to these issues.
Provide bypass procedure: In case of policy engine failure or during a production incident, it may become vital to bypass the policy engine. It is important that this case is anticipated, and a procedure must be defined to ensure the traceability of bypassing actions.
Provide intelligible output and train your team to understand it: The biggest challenge when implementing Policy As Code is getting teams to adopt new and potentially more restrictive practices. The policy engine’s output must be verbose enough to help teams understand the conditions under which the policies are validated. Setting up policies in non-blocking mode may also be sufficient to give visibility into security practices without blocking the ops team's workflow.
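To illustrate the advice on testing policies and providing intelligible output, here is a small Python sketch. It mirrors the logic of the earlier Sentinel example in plain Python so that it can be unit tested. The `workspace_violations` function, the dictionary layout, and the messages are assumptions made for this example, not the way Sentinel policies are actually tested.

```python
import unittest
from typing import Any


def workspace_violations(workspace: dict[str, Any]) -> list[str]:
    """Return one human-readable message per violated rule (empty if compliant)."""
    violations = []
    if workspace.get("execution_mode") != "remote":
        violations.append("execution_mode must be 'remote' so that runs happen on Terraform Cloud")
    if workspace.get("vcs_repo") is None:
        violations.append("the workspace must be linked to a VCS repository")
    return violations


class WorkspacePolicyTest(unittest.TestCase):
    def test_remote_workspace_with_vcs_passes(self):
        workspace = {"execution_mode": "remote", "vcs_repo": {"identifier": "acme/infra"}}
        self.assertEqual(workspace_violations(workspace), [])

    def test_workspace_without_vcs_is_reported(self):
        workspace = {"execution_mode": "remote", "vcs_repo": None}
        self.assertEqual(len(workspace_violations(workspace)), 1)

    def test_local_execution_is_reported(self):
        workspace = {"execution_mode": "local", "vcs_repo": {"identifier": "acme/infra"}}
        self.assertEqual(len(workspace_violations(workspace)), 1)

    def test_missing_fields_fail_rather_than_crash(self):
        # Incomplete input data should yield violations, not an exception.
        self.assertEqual(len(workspace_violations({})), 2)
```

Saved in a file, these tests can be run with `python -m unittest`. Returning messages rather than a bare boolean gives teams something actionable when a policy fails.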
You should now have a pretty clear idea of what Policy As Code is. I hope that the use cases presented and the best practices have been able to guide you in implementing your system.