In my previous post on managing AWS ElastiCache (Valkey) clusters with Crossplane and GitOps, I showed how we can stand up clusters entirely through
declarative YAML. One of the biggest wins of that approach is that everything becomes code.
That means we can encode our company’s best design patterns as policies and automatically apply them across every cluster—past, present, and future.
Why Kyverno?
There are many ways to validate and enforce Crossplane resources. We chose Kyverno because it speaks Kubernetes-native YAML, it’s easy to read,
and it supports both soft (audit) and hard (enforce) modes. Our rollout strategy:
- Start in Audit mode to see which existing stacks violate standards without breaking deploys.
- Fix the drift (massage the clusters/stacks) until everything passes.
- Flip to Enforce to block future non-compliant changes.
If you’re new to Crossplane and want to understand how we provision Valkey clusters declaratively, I recommend reading the step-by-step Crossplane + GitOps guide, How to Manage AWS Valkey Clusters with Crossplane and GitOps.
Design Choice: One Validation per Rule
When we first built our cluster policy, we wanted a simple way to surface a clear, actionable list of problems.
We landed on a structure where we can have multiple rules, but each rule performs exactly one validation.
If we need another validation, we create another rule. This gives us:
- Custom error messages that read like a to-do item for developers.
- Cleaner visualization in dashboards (see Policy Reporter below).
- Modular maintenance—toggle, tune, or extend rules independently.
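For contrast, here is a sketch of what a bundled rule might look like. The rule name validate-encryption-combined and its message are hypothetical, invented for illustration; only the two forProvider fields come from our policy. Kyverno reports one result per rule, so a failure here could not tell the developer which of the two settings is missing.

```yaml
# Hypothetical anti-pattern, shown only for contrast with the
# one-validation-per-rule layout. This is a fragment of spec.rules,
# not a complete ClusterPolicy.
- name: validate-encryption-combined   # hypothetical rule name
  match:
    any:
    - resources:
        kinds:
        - elasticache.aws.upbound.io/v1beta2/ReplicationGroup
  validate:
    # One message has to cover both checks, so it cannot say which one failed.
    message: "Encryption at rest and in transit must be enabled"
    pattern:
      spec:
        forProvider:
          atRestEncryptionEnabled: true
          transitEncryptionEnabled: true
```

Splitting this into two rules costs a few extra lines of YAML but gives each check its own name, message, and row in the report.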
Visualizing Compliance with Policy Reporter
We tested a few UIs and found the Policy Reporter UI makes it easiest to review policy findings.
It’s effectively a categorized to-do list by policy, namespace, and resource. Developers (and DBAs) can quickly drill into
just the ElastiCache items and see exactly what needs to be fixed.
Helpful Policy Annotations
We use Kyverno’s annotations to improve grouping and filtering in reports:
```yaml
metadata:
  name: replicationgroup-policy
  annotations:
    policies.kyverno.io/title: ReplicationGroup Policy
    policies.kyverno.io/category: ElastiCache
    policies.kyverno.io/severity: medium
    policies.kyverno.io/subject: ReplicationGroup Crossplane
```
With a dedicated category (e.g., ElastiCache), DBAs can filter the Policy Reporter UI to the most relevant findings for them.
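To make the link between annotations and the UI concrete, here is a rough sketch of the kind of report entry Policy Reporter reads. It assumes Kyverno writes per-resource reports using the wgpolicyk8s.io/v1alpha2 API and that ReplicationGroup is cluster-scoped. The exact shape varies by Kyverno version, and the resource name, report name, and counts are made up for illustration.

```yaml
# Illustrative sketch only: the values are invented and the fields
# may differ between Kyverno versions.
apiVersion: wgpolicyk8s.io/v1alpha2
kind: ClusterPolicyReport
metadata:
  name: 3f1c2a7e-0000-0000-0000-000000000000   # hypothetical; assumed to derive from the resource UID
scope:
  apiVersion: elasticache.aws.upbound.io/v1beta2
  kind: ReplicationGroup
  name: orders-cache                            # hypothetical resource name
results:
- source: kyverno
  policy: replicationgroup-policy
  rule: validate-encryption
  category: ElastiCache     # taken from policies.kyverno.io/category
  severity: medium          # taken from policies.kyverno.io/severity
  result: fail
  scored: true
  # Kyverno may wrap the rule's message with extra context such as the failing path.
  message: "Encryption at rest must be enabled"
summary:
  pass: 0
  fail: 1
  warn: 0
  error: 0
  skip: 0
```

The category and severity on each result are what make the filtering in the UI possible, which is why we set them on every policy.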
Full Example: ClusterPolicy for Crossplane ReplicationGroup
Below is a simple, end-to-end example that validates several core standards. We start in Audit mode
to gather findings without blocking deploys. After all clusters pass, switch to Enforce
to prevent non-compliant changes going forward.
```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: replicationgroup-policy
  annotations:
    policies.kyverno.io/title: ReplicationGroup Policy
    policies.kyverno.io/category: ElastiCache
    policies.kyverno.io/severity: medium
    policies.kyverno.io/subject: ReplicationGroup Crossplane
    policies.kyverno.io/description: "Validates Crossplane ReplicationGroup resources for ElastiCache to ensure compliance with organizational standards, security requirements, and cost optimization guidelines. This includes enforcing approved instance types, encryption requirements, and other configuration standards. The resource for the policy can be found at https://github.ancestry.com/infrastructure/containers-applicationbases/tree/master/applications/kyverno/templates/deploy/policies"
spec:
  validationFailureAction: Audit
  background: true
  rules:
  - name: validate-replication-group
    match:
      any:
      - resources:
          kinds:
          - elasticache.aws.upbound.io/v1beta2/ReplicationGroup
    validate:
      message: "ReplicationGroup validation failed: Instance type must be in the r6g family. Current instance type is '{{ request.object.spec.forProvider.nodeType }}'"
      pattern:
        spec:
          forProvider:
            nodeType: "cache.r6g.*"
  - name: validate-encryption
    match:
      any:
      - resources:
          kinds:
          - elasticache.aws.upbound.io/v1beta2/ReplicationGroup
    validate:
      message: "Encryption at rest must be enabled"
      pattern:
        spec:
          forProvider:
            atRestEncryptionEnabled: true
  - name: validate-transit-encryption
    match:
      any:
      - resources:
          kinds:
          - elasticache.aws.upbound.io/v1beta2/ReplicationGroup
    validate:
      message: "Encryption in transit must be enabled"
      pattern:
        spec:
          forProvider:
            transitEncryptionEnabled: true
  - name: validate-management-policy
    match:
      any:
      - resources:
          kinds:
          - elasticache.aws.upbound.io/v1beta2/ReplicationGroup
    validate:
      message: "ReplicationGroup is still in Observe mode and not being managed by Crossplane"
      pattern:
        spec:
          managementPolicies: "!Observe"
```
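To see a rule fire, consider a sketch of a resource that would trip the instance-type check. The name orders-cache, the region, and the node type are hypothetical, and the other fields a real ReplicationGroup needs are left out on purpose.

```yaml
# Hypothetical, deliberately incomplete ReplicationGroup used only to
# illustrate a finding. Do not apply as-is.
apiVersion: elasticache.aws.upbound.io/v1beta2
kind: ReplicationGroup
metadata:
  name: orders-cache            # hypothetical name
spec:
  forProvider:
    region: us-east-1           # hypothetical region
    nodeType: cache.m5.large    # does not match "cache.r6g.*", so validate-replication-group fails
    # atRestEncryptionEnabled and transitEncryptionEnabled are omitted here.
    # A plain pattern requires the field to be present, so those two
    # rules would report failures as well.
```

In Audit mode this resource would still be admitted. The finding would show up in the report with the rule's message and the current value, cache.m5.large, substituted in.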
Switching from Audit to Enforce
Once your dashboards show green across the board, flip the policy to hard enforcement by changing a single field:
```yaml
spec:
  validationFailureAction: Enforce
```
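If flipping the whole policy at once feels too abrupt, newer Kyverno releases also accept a failure action on each rule. The sketch below assumes a Kyverno version that supports validate.failureAction at the rule level (1.13 and later, as far as we know); check the docs for your version before relying on it. Here only the encryption-at-rest rule is enforced while the management-policy rule stays in audit.

```yaml
# Sketch, assuming a Kyverno version with rule-level validate.failureAction.
# Fragment of spec.rules; everything except failureAction matches the full policy.
- name: validate-encryption
  match:
    any:
    - resources:
        kinds:
        - elasticache.aws.upbound.io/v1beta2/ReplicationGroup
  validate:
    failureAction: Enforce   # block non-compliant changes for this rule only
    message: "Encryption at rest must be enabled"
    pattern:
      spec:
        forProvider:
          atRestEncryptionEnabled: true
- name: validate-management-policy
  match:
    any:
    - resources:
        kinds:
        - elasticache.aws.upbound.io/v1beta2/ReplicationGroup
  validate:
    failureAction: Audit     # keep reporting without blocking
    message: "ReplicationGroup is still in Observe mode and not being managed by Crossplane"
    pattern:
      spec:
        managementPolicies: "!Observe"
```

This pairs well with the one-validation-per-rule layout, since each standard can be promoted to enforcement once its findings reach zero.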
Developer Experience Tips
- Name rules by intent (validate-encryption, validate-transit-encryption, etc.) so it’s obvious what failed.
- Write messages like tickets: tell the developer exactly what to fix and (if useful) echo the current value using variables like {{ request.object.spec.forProvider.nodeType }}.
- Keep rules atomic (one validation per rule) for better UX in Policy Reporter and simpler maintenance.
- Batch remediation by category (e.g., all ElastiCache issues) so DBAs and app teams can focus on what they own.
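As a sketch of a ticket-style message, the rule below rewords the instance-type message to name the exact field to change and echo the current value. The wording is only an illustration, not the message from the policy above. The || 'unset' fallback is a JMESPath expression meant to keep the variable from failing to resolve when nodeType is missing.

```yaml
# Sketch of a ticket-style message; the wording is illustrative.
# Fragment of spec.rules.
- name: validate-replication-group
  match:
    any:
    - resources:
        kinds:
        - elasticache.aws.upbound.io/v1beta2/ReplicationGroup
  validate:
    # Says what to change, where, and what the value is today.
    # The JMESPath fallback prints "unset" when the field is absent.
    message: "Set spec.forProvider.nodeType to a cache.r6g.* instance type. Current value: {{ request.object.spec.forProvider.nodeType || 'unset' }}"
    pattern:
      spec:
        forProvider:
          nodeType: "cache.r6g.*"
```

The fallback is best kept to string fields like this one. On a boolean field, a value of false would also fall through to the default and hide the real setting.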
Wrap-Up
With Crossplane defining ElastiCache (Valkey) as YAML and Kyverno validating those definitions, we get a clear, automated path to standardization.
Start in Audit to learn, remediate the drift, then move to Enforce for durable guardrails—no more snowflake clusters.
If you haven’t seen how we provision these clusters in the first place, check out the companion post:
How to Manage AWS Valkey Clusters with Crossplane and GitOps.