Throughout this article, we will use the vocabulary associated with Kyverno resources: Policy, Rule, ...
Kyverno is a policy engine for Kubernetes. It allows you to validate, mutate, and generate Kubernetes resources, and to verify image signatures.
Kyverno runs as a dynamic admission controller in the Kubernetes cluster.
The Kyverno webhook receives requests from the API server during the "validating admission" and "mutating admission" steps:
A Kyverno Policy is composed of the following fields (for more info: `kubectl explain policy.spec`):

- `rules`: one or more rules that define the policy
- `background` (optional): if `true`, the policy is also applied to existing resources; otherwise it applies only to new resources
- `validationFailureAction`: the action mode of the policy: `audit` or `enforce`

A Rule contains the following fields (for more info: `kubectl explain policy.spec.rules`):

- `match`: to select the resources
- `exclude` (optional): to exclude resources from the selection
- `mutate`, `validate`, `generate`, or `verifyImages`: depending on the type of policy, allows you to mutate, validate, or generate a resource, or to verify the signature of an image (in beta)

Kyverno has 2 modes of operation (`validationFailureAction`):

- `audit`: non-compliant resources are admitted, but violations are reported in Policy Reports
- `enforce`: non-compliant resources are rejected by the admission webhook
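To make these fields concrete, here is a minimal example policy (a sketch; the policy name and the `team` label key are hypothetical) that audits Pods missing a label:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-team-label   # hypothetical name
spec:
  validationFailureAction: audit   # report violations without blocking
  background: true                 # also scan existing resources
  rules:
    - name: check-team-label
      match:
        all:
          - resources:
              kinds:
                - Pod
      validate:
        message: "The label `team` is required on Pods."
        pattern:
          metadata:
            labels:
              team: "?*"   # any non-empty value
```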
Policy Reports are Kubernetes resources that can be listed simply:
kubectl get policyreport -A
For a given namespace, we can list policy violations with the command:
kubectl describe polr polr-ns-default | grep "Result: \\+fail" -B10
Kyverno can be installed on clusters via a simple Helm chart. Nothing could be simpler; that's the power of Kubernetes:
helm repo add kyverno https://kyverno.github.io/kyverno/
helm repo update
helm install kyverno kyverno/kyverno --namespace kyverno --create-namespace --values values.yaml
Here are the important points to consider in the chart's `values.yaml`:
---
# 3 replicas for High Availability
replicaCount: 3
# Necessary in EKS with custom Network CNI plugin
# https://cert-manager.io/docs/installation/compatibility/#aws-eks
hostNetwork: true
config:
  webhooks:
    # Exclude namespaces from scope
    - namespaceSelector:
        matchExpressions:
          - key: kubernetes.io/metadata.name
            operator: NotIn
            values:
              - kube-system
              - kyverno
              - calico-system
    # Exclude objects from scope
    - objectSelector:
        matchExpressions:
          - key: webhooks.kyverno.io/exclude
            operator: DoesNotExist
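With the `objectSelector` above, any individual object can be taken out of Kyverno's scope by adding the exclusion label. The value does not matter, since the selector only checks that the key exists. For example, on a hypothetical Pod:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: debug-pod   # hypothetical
  labels:
    # The presence of this key is enough: the webhook's
    # DoesNotExist selector will skip this object
    webhooks.kyverno.io/exclude: "true"
spec:
  containers:
    - name: debug
      image: busybox
```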
Some remarks about the installation:

- The `kube-system` and `kyverno` namespaces are whitelisted so as not to block the deployment of critical Kubernetes resources (kube-proxy, weave, ...).

A list of simple examples is provided in the Kyverno documentation.
I'd like to present a slightly more advanced use case: dynamic RBAC rights management. Here is the use case we encountered. We set up on-the-fly development environments in Kubernetes at a customer's site.
We allowed developers, via a Gitlab CI job, to test their applications in environments created on the fly. These environments are in dedicated namespaces also created on the fly.
How do you provide the associated Gitlab runner with RBAC rights to namespaces that don't yet exist? Unfortunately, Kubernetes does not allow this via RBAC, but with Kyverno, it is very simple.
All you need to do is:

- create a ServiceAccount for the Gitlab runner with the right to create namespaces
- use a Kyverno ClusterPolicy to automatically generate, in each namespace created by this ServiceAccount, a RoleBinding granting it admin rights
Here are the implementation details:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gitlab-runner-ephemeral-env
  labels:
    app: gitlab-runner-ephemeral-env
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: gitlab-runner-ephemeral-env
  labels:
    app: gitlab-runner-ephemeral-env
rules:
  - apiGroups: ["*"]
    resources: ["namespaces"]
    verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: gitlab-runner-ephemeral-env
  labels:
    app: gitlab-runner-ephemeral-env
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: gitlab-runner-ephemeral-env
subjects:
  - kind: ServiceAccount
    name: gitlab-runner-ephemeral-env
    namespace: gitlab
This ServiceAccount is then granted `cluster-admin` rights on each namespace it creates, via a Kyverno ClusterPolicy:
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-rbac-rules-env-volee
  annotations:
    policies.kyverno.io/title: Add RBAC permissions for ephemeral environments.
    policies.kyverno.io/category: Multi-Tenancy
    policies.kyverno.io/subject: RBAC
    policies.kyverno.io/description: >-
      Add RBAC rules when a namespace is created by a specific gitlab runner
      (gitlab-runner-env-volee), useful for ephemeral environments.
spec:
  background: false
  rules:
    - name: create-rbac
      match:
        resources:
          kinds:
            - Namespace
        subjects:
          - kind: ServiceAccount
            name: gitlab-runner-ephemeral-env
            namespace: gitlab
      generate:
        kind: RoleBinding
        name: ephemeral-namespace-admin
        # Generate the RoleBinding in the newly created namespace
        namespace: "{{request.object.metadata.name}}"
        synchronize: true
        data:
          subjects:
            - kind: ServiceAccount
              name: gitlab-runner-ephemeral-env
              namespace: gitlab
          roleRef:
            kind: ClusterRole
            name: cluster-admin
            apiGroup: rbac.authorization.k8s.io
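Once this policy is in place, when the runner's ServiceAccount creates a namespace (say, a hypothetical `review-1234`), Kyverno generates a RoleBinding equivalent to:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ephemeral-namespace-admin
  namespace: review-1234   # the freshly created namespace (hypothetical name)
subjects:
  - kind: ServiceAccount
    name: gitlab-runner-ephemeral-env
    namespace: gitlab
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
```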
In this part, I will detail several problems encountered when implementing Kyverno. Besides the fact that Kyverno is a SPOF for all the namespaces it monitors, policies are quite complicated to write and debug. Not to mention that Kyverno can have side effects with other tools like ArgoCD.
Overall, Kyverno policies can be quite difficult to write. The documentation has many examples, but the whole mechanism of filtering and mutating resources can be a bit confusing at first.
Let's take a live example. We want to disallow the `privileged: true` parameter except for two types of pods (as shown in the following diagram):

- pods in the `debug` namespace
- pods in the `gitlab` namespace whose name starts with `runner`
Following the documentation, we are tempted to write the following policy:
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-privileged-containers
  annotations:
    policies.kyverno.io/category: Pod Security Standards (Baseline)
    policies.kyverno.io/severity: medium
    policies.kyverno.io/subject: Pod
    policies.kyverno.io/description: >-
      Privileged mode disables most security mechanisms and must not be allowed. This policy
      ensures Pods do not call for privileged mode.
spec:
  validationFailureAction: audit
  background: true
  rules:
    - name: priviledged-containers
      match:
        resources:
          kinds:
            - Pod
      exclude:
        any:
          - resources:
              namespaces:
                - "debug"
          # Whitelisting
          - resources:
              namespaces:
                - "gitlab"
              names:
                - "runner-*"
      validate:
        message: >-
          Privileged mode is disallowed. The fields spec.containers[*].securityContext.privileged
          and spec.initContainers[*].securityContext.privileged must not be set to true.
        pattern:
          spec:
            =(initContainers):
              - =(securityContext):
                  =(privileged): "false"
            containers:
              - =(securityContext):
                  =(privileged): "false"
This policy does not work: the exclusion is never applied. After some research, here is the fix to apply:
18,20c18,21
< resources:
< kinds:
< - Pod
---
> all:
> - resources:
> kinds:
> - Pod
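Applied to the policy, the corrected `match` block therefore reads:

```yaml
match:
  all:
    - resources:
        kinds:
          - Pod
```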
There is no indication in the documentation of a change in behavior between these two ways of filtering resources. Debugging a policy that doesn't work is not easy... fortunately, the community is active, and someone quickly proposed the solution on Slack.
From experience, one should always be careful with Webhook Mutation, which can be confusing for DevOps teams. Kubernetes Webhook Mutations inherently induce a difference between the specified resources and the resources actually deployed on the cluster.
If an Ops is not aware of the existence of these mutations, they can waste a lot of time understanding why a particular resource appears or has certain attributes.
Similarly, if a cluster has too many mutation policies, there may be incompatibilities between policies, or side effects that are difficult to identify.
I recommend using Webhook Mutations sparingly and documenting them very clearly. This can be extremely useful (e.g. adding the address of an HTTP proxy as an environment variable for all pods in a namespace), but it is best to avoid abusing it if possible.
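As an illustration of the HTTP proxy example above, such a mutation policy could look like the following sketch (the policy name, namespace, and proxy address are hypothetical):

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-http-proxy   # hypothetical name
spec:
  rules:
    - name: inject-proxy-env
      match:
        all:
          - resources:
              kinds:
                - Pod
              namespaces:
                - legacy-apps   # hypothetical namespace
      mutate:
        patchStrategicMerge:
          spec:
            containers:
              # The (name) anchor applies the patch to every container
              - (name): "*"
                env:
                  - name: HTTP_PROXY
                    value: "http://proxy.internal:3128"   # hypothetical proxy
```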
We have also encountered some difficulties with Kubernetes clusters whose CD is managed via ArgoCD.
When a Kyverno policy is created that targets a resource that deploys containers, such as Pods, Kyverno intelligently modifies the `rules` so that the policy takes into account all types of Kubernetes resources that deploy containers.
For example, if we create this policy:
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-registries
spec:
  validationFailureAction: enforce
  rules:
    - name: validate-registries
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Images may only come from our internal enterprise registry."
        pattern:
          spec:
            containers:
              - image: "registry.domain.com/*"
Kyverno will modify the policy on the fly via a Webhook Mutation like this:
spec:
  background: true
  failurePolicy: Fail
  rules:
    - match:
        any:
          - resources:
              kinds:
                - Pod
      name: validate-registries
      validate:
        message: Images may only come from our internal enterprise registry.
        pattern:
          spec:
            containers:
              - image: registry.domain.com/*
    - match:
        any:
          - resources:
              kinds:
                - DaemonSet
                - Deployment
                - Job
                - StatefulSet
      name: autogen-validate-registries
      validate:
        message: Images may only come from our internal enterprise registry.
        pattern:
          spec:
            template:
              spec:
                containers:
                  - image: registry.domain.com/*
    - match:
        any:
          - resources:
              kinds:
                - CronJob
      name: autogen-cronjob-validate-registries
      validate:
        message: Images may only come from our internal enterprise registry.
        pattern:
          spec:
            jobTemplate:
              spec:
                template:
                  spec:
                    containers:
                      - image: registry.domain.com/*
  validationFailureAction: enforce
What happens if the Kyverno policy was created via Argo? Argo detects a difference between the YAML file of the declared policy and the resource actually deployed in the cluster. A constant back and forth then begins between Argo and Kyverno, each modifying the Kyverno policy in turn.
To indicate to Argo that these changes are not to be taken into account, it is sufficient to use the `ignoreDifferences` keyword in the Argo application:
ignoreDifferences:
  # Kyverno auto-generates rules to make policies smarter. We want ArgoCD to
  # ignore the auto-generated rules.
  # For more information: https://kyverno.io/docs/writing-policies/autogen/
  - group: kyverno.io
    kind: ClusterPolicy
    jqPathExpressions:
      - .spec.rules[] | select( .name | startswith("autogen-") )
Now you know what Kyverno is, how to install it, and how to use it to secure your Kubernetes cluster! Once again, use Webhook Mutation sparingly, test your policies well in audit mode beforehand, and don't hesitate to contact the community in case of problems.