Chaos Mesh 1.0: Chaos Engineering Platform

Introduction
Most outages aren’t triggered by a single catastrophic bug. They’re triggered by ordinary failures (a node reboot, a flaky network link, a disk that starts timing out) happening at the worst possible time. Chaos engineering is a disciplined way to rehearse those failures before production forces them on you.
Chaos Mesh 1.0, released on September 20, 2020, is a big step toward making that practice repeatable on Kubernetes: a single platform for injecting faults, running controlled experiments, and learning which assumptions in your systems are actually true.
How to start safely
- Keep the blast radius small: select a single namespace and a narrow label selector before you ever target shared dependencies.
- Treat experiments like deployments: schedule them, review them, and add clear rollback steps (even for “simple” pod-kill tests).
- Measure a hypothesis: define what “healthy” means (latency, error rate, SLO) so you can tell resilience from luck.
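
As one way to put these guidelines into a manifest, the sketch below scopes a PodChaos experiment to a single namespace and one label, affects one pod at a time, and bounds each injection with a duration. The namespace `staging-sandbox`, the label `app: checkout-canary`, the object name, and the timing values are hypothetical placeholders, and the field layout assumes the `chaos-mesh.org/v1alpha1` API used elsewhere in this post.

```yaml
# Sketch only: names, labels, and timings are illustrative assumptions.
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: small-blast-radius-example   # hypothetical name
  namespace: staging-sandbox         # hypothetical namespace
spec:
  action: pod-failure        # make the pod unavailable for a while instead of deleting it
  mode: one                  # affect a single matching pod per run
  duration: "30s"            # each injection ends on its own after 30 seconds
  selector:
    namespaces:
      - staging-sandbox      # one namespace only
    labelSelectors:
      app: checkout-canary   # narrow label selector
  scheduler:
    cron: "@every 10m"       # leave time between runs to compare metrics
```

Deleting the object (`kubectl delete podchaos small-blast-radius-example -n staging-sandbox`) is probably the simplest rollback step to write down next to an experiment like this.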
Fault Injection Capabilities
- Pod chaos enables killing, stopping, or restarting pods to test application resilience.
- Network chaos simulates network failures, latency, and packet loss.
- I/O chaos injects file system and disk I/O faults.
- Time chaos manipulates system time to test time-dependent behaviors.
- Kernel chaos injects kernel-level faults for advanced testing scenarios.
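
For illustration, a network-latency experiment could look roughly like the following. This is a sketch under the same `v1alpha1` API as the PodChaos example later in this post. The `app: my-app` label, the object name, and the latency, jitter, and schedule values are placeholders, and it is worth checking the field names against the CRD reference for the version you install.

```yaml
# Sketch only: values are illustrative, not recommendations.
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: network-delay-example   # hypothetical name
spec:
  action: delay            # add latency rather than dropping or corrupting packets
  mode: one                # target a single matching pod
  selector:
    namespaces:
      - default
    labelSelectors:
      app: my-app
  delay:
    latency: "100ms"       # base delay added to outgoing packets
    jitter: "10ms"         # random variation around the base delay
    correlation: "25"      # how strongly each delay depends on the previous one (percent)
  duration: "30s"          # how long each injection lasts
  scheduler:
    cron: "@every 5m"      # repeat the injection every five minutes
```

Starting with `action: delay` rather than a full partition tends to surface timeout and retry problems while keeping the service reachable.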
Experiment Management
- Web UI provides an intuitive interface for creating and managing chaos experiments.
- Scheduling support enables running experiments on schedules or triggers.
- Experiment templates simplify creating common chaos scenarios.
- Multi-cluster support enables running experiments across multiple clusters.
Observability
- Metrics integration exposes detailed chaos experiment metrics for Prometheus.
- Event tracking provides comprehensive logs of all chaos operations.
- Dashboard integration with Grafana provides visualization of chaos experiments.
- Alerting support enables notifications when experiments complete or fail.
Getting Started
Install Chaos Mesh and apply its CRDs:

    curl -sSL https://mirrors.chaos-mesh.org/latest/install.sh | bash
    kubectl apply -f https://mirrors.chaos-mesh.org/latest/crd.yaml

Create a chaos experiment:

    apiVersion: chaos-mesh.org/v1alpha1
    kind: PodChaos
    metadata:
      name: pod-kill-example
    spec:
      action: pod-kill
      mode: one
      selector:
        namespaces:
          - default
        labelSelectors:
          app: my-app
      scheduler:
        cron: "@every 2m"

Summary
| Aspect | Details |
|---|---|
| Release Date | September 20, 2020 |
| Headline Features | Comprehensive fault injection, experiment management, observability |
| Why it Matters | Provides a platform for testing and improving application resilience through chaos engineering |
Chaos Mesh 1.0 gives teams a single Kubernetes-native platform for injecting faults, managing experiments, and observing the results, so that resilience can be tested deliberately rather than discovered during an outage.