Companies sometimes forget about policy compliance when planning and beginning their digital transformation. However, it’s a very important area, and keeping an eye on it can save you from serious problems. Consider a situation like this: you’re already on your DevOps transformation journey. Your company moves most of its IT infrastructure to the cloud and starts focusing on cloud-native solutions. Your development teams are working closely with the Operations department, and a few projects are scheduled to go live in a few days. All that is missing is the compliance and security sign-off from the Change Advisory Board (CAB).
Then you find out that the application doesn’t meet the security requirements. Critical parts of your infrastructure are open to the public Internet, and you forgot to prepare an application monitoring dashboard to detect faults. You’re forced to delay the release, implement the required compliance rules and wait for the next CAB meeting.
From your perspective, you were following the DevOps principles and used the infrastructure as code paradigm with a continuous delivery pipeline. So, how is this possible? Well, having your infrastructure “codified” doesn’t mean it’s being constantly checked for compliance. You can, however, implement a process called policy as code for exactly that purpose. This article shows what it is and how you can implement it in your projects.
Policy and Compliance as Code
Normally, the policy authors would create a PDF document and distribute it around the company or project. Their colleagues can then read these policies and abide by them. Before a release or a change, the compliance and security team is asked to verify whether the system is compliant. This procedure isn’t exactly efficient. Auditing the application and changing it to meet the policies so late in the development process is hard and error-prone. It’s also not in line with the DevOps principle of automating as much as possible. The proposed solution to this problem is to automate the policy validation process.
Policy as code is a concept where, instead of writing a PDF document, you translate your policies into machine-readable definition files and use them to check and enforce that the state of the system meets those policies. This allows you to run validations on a fixed schedule without human interaction, or even to “shift left” by embedding them in your continuous delivery pipeline. Either way, you can verify that the cloud infrastructure you provisioned meets the declared policies.
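To make the idea concrete, here is a minimal sketch in Python of what “machine-readable policy” can mean. It is not tied to any particular tool: the `Policy` class, the `evaluate` function and the resource dictionaries are all hypothetical names made up for this illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical, simplified view of a resource: a plain dict of settings.
Resource = dict


@dataclass(frozen=True)
class Policy:
    """A policy is a name plus a machine-checkable rule."""
    name: str
    description: str
    is_compliant: Callable[[Resource], bool]


def evaluate(policies: list[Policy], resources: list[Resource]) -> list[str]:
    """Return one message per (policy, resource) pair that fails the rule."""
    violations = []
    for policy in policies:
        for resource in resources:
            if not policy.is_compliant(resource):
                violations.append(f"{resource['id']}: violates {policy.name}")
    return violations


# The sentence "load balancers must only serve HTTPS" expressed as code.
https_only = Policy(
    name="https-only",
    description="Load balancers must only listen on HTTPS",
    is_compliant=lambda r: r.get("protocol") == "HTTPS",
)

resources = [
    {"id": "lb-public", "protocol": "HTTP"},
    {"id": "lb-internal", "protocol": "HTTPS"},
]

print(evaluate([https_only], resources))  # → ['lb-public: violates https-only']
```

Because the rule is ordinary code, the same check can run on a schedule or inside a pipeline without anyone opening a document.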
The policies can cover multiple categories:
- Compliance policies – GDPR is a good example of a regulation that affects nearly every industry. Apart from that, individual industries have their own regulations and regulators. For the health sector it’s HIPAA and BfArM, for finance it’s FCA, BaFin and FINMA, and for retail it’s OFT. From the IT perspective, these rules affect how data is stored and transferred, and how transparent the creation of a product has to be.
- Security policies – IT security policies that protect your systems. These could be an HTTPS-only policy or a rule to only allow connections from a specific IP range.
- Cost optimisation policies – used to reduce the cost of your IT infrastructure. For example, you could shut down unused development environments outside business hours, or identify services that are underutilised and can be downgraded.
- Good practices – other policies that ensure durability or high availability. For example, larger infrastructure changes should require manual approval in the pipeline.
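As an illustration of the security category above, the “only allow connections from a specific IP range” rule could be sketched as follows. This is only an assumption-laden example: the `10.0.0.0/8` allow-list is a placeholder for a corporate range, and the rule dictionaries with `port` and `cidr` keys are a made-up shape, not the format of any specific cloud provider.

```python
import ipaddress

# Assumed allow-list; 10.0.0.0/8 is only a placeholder for a corporate range.
ALLOWED_RANGES = [ipaddress.ip_network("10.0.0.0/8")]


def is_allowed_source(cidr: str) -> bool:
    """True if the ingress CIDR lies entirely inside an allowed range."""
    source = ipaddress.ip_network(cidr, strict=False)
    return any(
        # subnet_of() raises TypeError when IPv4 and IPv6 are mixed,
        # so the version check has to come first.
        source.version == allowed.version and source.subnet_of(allowed)
        for allowed in ALLOWED_RANGES
    )


def open_ingress_rules(rules: list[dict]) -> list[dict]:
    """Return the firewall rules whose source is outside the allow-list."""
    return [rule for rule in rules if not is_allowed_source(rule["cidr"])]


# Hypothetical firewall rules.
rules = [
    {"port": 443, "cidr": "10.20.0.0/16"},
    {"port": 22, "cidr": "0.0.0.0/0"},
]

print(open_ingress_rules(rules))  # → [{'port': 22, 'cidr': '0.0.0.0/0'}]
```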
Figure 1. An example Open Policy Agent policy that detects untagged resources in a Terraform plan. Such a policy could be used to ensure cloud resources are properly tagged, so we know who each resource belongs to.
Reactive Policy Checks
When you start implementing policy as code, you will probably already have some cloud infrastructure in place. In this case, the best option is to use tools that continuously validate the existing infrastructure against the policies. Cloud providers often supply such tools: AWS offers AWS Config and Amazon Inspector, and on Azure you can use Azure Policy. There are also open source tools that work across clouds, such as Cloud Custodian.
Let’s implement an example policy in Cloud Custodian. Our goal is to enable deletion protection for all production AWS Relational Database Service (RDS) instances, so we can avoid accidental data loss. When we detect a production RDS instance without deletion protection, we fix the non-compliant instance by enabling it.
    policies:
      - name: enable-rds-deletion-protection-on-prod
        resource: rds
        filters:
          - DeletionProtection: false
          - "tag:Environment": "production"
        actions:
          - type: modify-db
            update:
              - property: 'DeletionProtection'
                value: true
            immediate: true
Executing this policy with Cloud Custodian will detect RDS instances that have the “Environment” tag set to “production” and deletion protection disabled. It will then automatically enable deletion protection on all of them. You could run such a policy on a daily schedule to ensure that none of your production databases can be removed by accident.
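To show what the two filters select, here is a small Python sketch of the same logic. It is not how Cloud Custodian is implemented; it only assumes instance records shaped like the RDS API output (`DBInstanceIdentifier`, `DeletionProtection`, `TagList`), and the sample instances and function names are invented for the illustration.

```python
def tag_value(instance: dict, key: str):
    """Look up a tag in the TagList structure used by the RDS API."""
    for tag in instance.get("TagList", []):
        if tag["Key"] == key:
            return tag["Value"]
    return None


def needs_deletion_protection(instance: dict) -> bool:
    """Mirror both filters: production tag and protection switched off."""
    return (
        tag_value(instance, "Environment") == "production"
        and instance.get("DeletionProtection") is False
    )


def plan_remediation(instances: list[dict]) -> list[str]:
    """Return the identifiers a remediation step would have to modify.

    A real remediation would call the RDS ModifyDBInstance API for each
    identifier; this sketch only reports them.
    """
    return [
        i["DBInstanceIdentifier"] for i in instances if needs_deletion_protection(i)
    ]


# Hypothetical inventory.
instances = [
    {"DBInstanceIdentifier": "orders-prod", "DeletionProtection": False,
     "TagList": [{"Key": "Environment", "Value": "production"}]},
    {"DBInstanceIdentifier": "orders-dev", "DeletionProtection": False,
     "TagList": [{"Key": "Environment", "Value": "development"}]},
    {"DBInstanceIdentifier": "billing-prod", "DeletionProtection": True,
     "TagList": [{"Key": "Environment", "Value": "production"}]},
]

print(plan_remediation(instances))  # → ['orders-prod']
```

Only the unprotected production instance is selected; the development database and the already protected one are left alone.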
Shift the Policy Checks Left!
Having a good feedback loop from your production systems and operations teams to the development teams is an important aspect of the DevOps culture. Likewise, it’s a good idea to improve the feedback loop from your security and compliance teams to the development teams. You can achieve this by embedding policy checks in your integration and delivery pipeline. Developers are then notified about every policy violation each time they commit code to the repositories, which lets the team avoid policy fixes late in the project. This practice also encourages cooperation between the compliance and development teams and brings them together.
We’re going to use Open Policy Agent (OPA) for this case. OPA is a policy as code engine that provides a policy language called Rego and a binary that evaluates policies against input data. The input could be a Terraform plan, a Kubernetes resource or any other JSON document. The project is increasingly popular in the cloud-native community and is hosted by the Cloud Native Computing Foundation, where it reached graduated status in 2021.
Let’s look at an example Open Policy Agent policy, which takes a Terraform plan as input and ensures that all provisioned resources have the two required tags.
    # Note: this listing uses pre-1.0 Rego syntax; OPA 1.0 and later expect
    # the `contains` and `if` keywords in rule heads.
    package policies.terraform

    # Helper libraries from the same policy bundle (not shown here).
    import data.libraries.terraform
    import data.libraries.common

    # Tags that every taggable resource must carry.
    required_tags = {
        "Project",
        "Environment",
    }

    # Collects one message per (resource, missing tag) pair.
    missing_tags_errors[msg] {
        tags := terraform.taggable_resources[res].values.tags
        resource_tags := { tag | tags[tag] }

        required_tag := required_tags[_]
        not common.contains(resource_tags, required_tag)

        msg := sprintf("Error in resource %v. Missing required tag %v", [res.address, required_tag])
    }

    # The plan is approved only when no resource is missing a tag.
    approve {
        count(missing_tags_errors) == 0
    }
This policy verifies that all taggable resources managed by Terraform carry the “Project” and “Environment” tags. A person can easily understand what the policy does by simply reading the code. With such a definition in place, you can embed it into your continuous delivery pipeline, so the application teams get feedback every time they commit code and can apply fixes at the earliest stage of development.
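For readers who don’t know Rego, the same check can be sketched in Python. This is an approximation, not a translation of the helper libraries, which the article doesn’t show. It assumes the plan was exported as JSON and that resources sit under `planned_values.root_module` (including `child_modules`), and it treats any resource with a `tags` attribute as taggable.

```python
REQUIRED_TAGS = {"Project", "Environment"}


def planned_resources(module: dict):
    """Yield resources from a module and, recursively, its child modules."""
    yield from module.get("resources", [])
    for child in module.get("child_modules", []):
        yield from planned_resources(child)


def missing_tags_errors(plan: dict) -> list[str]:
    """Return one message per resource and missing required tag."""
    errors = []
    root = plan.get("planned_values", {}).get("root_module", {})
    for res in planned_resources(root):
        values = res.get("values", {})
        if "tags" not in values:
            continue  # assumption: no tags attribute means not taggable
        present = set(values["tags"] or {})  # tags may be null in a plan
        for tag in sorted(REQUIRED_TAGS - present):
            errors.append(
                f"Error in resource {res['address']}. Missing required tag {tag}"
            )
    return errors


# Hypothetical, heavily trimmed plan document.
plan = {
    "planned_values": {
        "root_module": {
            "resources": [
                {"address": "aws_s3_bucket.cloudtrail",
                 "values": {"tags": {"Project": "audit"}}},
                {"address": "aws_iam_role.ci",
                 "values": {"tags": {"Project": "audit",
                                     "Environment": "production"}}},
            ]
        }
    }
}

print(missing_tags_errors(plan))
# → ['Error in resource aws_s3_bucket.cloudtrail. Missing required tag Environment']
```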
    # Evaluate OPA policies (the query path follows the package name, policies.terraform)
    $ opa eval --format pretty -d '.' --input terraform.tfplan.json "data.policies.terraform.missing_tags_errors"
    [
      "Error in resource aws_cloudtrail.organization. Missing required tag Project",
      "Error in resource aws_cloudtrail.organization. Missing required tag Environment",
      "Error in resource aws_s3_bucket.cloudtrail. Missing required tag Project",
      "Error in resource aws_s3_bucket.cloudtrail. Missing required tag Environment"
    ]
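In a pipeline, the list of messages has to be turned into a pass or fail decision. One possible way to do this, sketched below, is to run the evaluation with `--format json` and let a small script decide the exit code. The sketch assumes the JSON output has a top-level `result` list whose entries carry `expressions` with a `value` field; the function names and the sample document are made up for illustration.

```python
import json


def violations_from_opa(raw_json: str) -> list[str]:
    """Extract violation messages from JSON-formatted evaluation output."""
    document = json.loads(raw_json)
    messages = []
    for result in document.get("result", []):
        for expression in result.get("expressions", []):
            value = expression.get("value")
            if isinstance(value, list):
                messages.extend(str(item) for item in value)
    return messages


def exit_code(messages: list[str]) -> int:
    """Non-zero exit code fails the pipeline step when violations exist."""
    return 1 if messages else 0


# Hypothetical output for a plan with one untagged resource.
sample = json.dumps({
    "result": [{
        "expressions": [{
            "value": [
                "Error in resource aws_s3_bucket.cloudtrail. Missing required tag Project"
            ]
        }]
    }]
})

messages = violations_from_opa(sample)
for message in messages:
    print(message)
print("exit code:", exit_code(messages))  # → exit code: 1
```

In a real pipeline step, the script would read the evaluation output from standard input and pass the result of `exit_code` to `sys.exit`, so that any violation stops the delivery.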
Bring Compliance and Security to DevOps
The security and compliance teams are often overlooked in the DevOps transformation. Everyone talks about continuous delivery, infrastructure as code and other DevOps principles. However, compliance policies can become a pain point that halts the release of your project.
Policy and compliance as code can help you embed the compliance and security process into your continuous integration and delivery pipelines, so that your infrastructure and products are checked against the policy requirements on every change. Bringing the compliance and security teams to the table and “codifying” the policy validation process can help you shorten the delivery pipeline, and hence deliver business value faster, while keeping your cloud deployments safe and secure. With this solution, no CAB meeting will be too difficult to handle, and some may not even be required!
Running reactive compliance checks allows you to detect misconfigured resources or unsecured IT systems that already exist. In our experience, the best approach is to combine reactive compliance checks with proactive pipelines that have policy checks embedded.