Chaos Mesh 2.0: Comprehensive Chaos Engineering Platform

Introduction

Chaos Mesh 2.0 was released on October 12, 2021.

The release is aimed at teams that run Kubernetes day to day and want to know in advance how their workloads behave when something fails. It extends the available experiment types, improves observability, and integrates more closely with Kubernetes workflows.


Enhanced Chaos Experiments

  • Network chaos improvements provide more sophisticated network failure injection including bandwidth limits, packet loss, and latency.
  • Pod chaos enhancements enable more granular pod failure scenarios with better control over failure timing.
  • I/O chaos experiments validate storage and filesystem resilience under various failure conditions.
  • Time chaos tests validate application behavior during clock skew and time manipulation scenarios.
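The experiment types listed above are each defined as a Kubernetes manifest. The following is a minimal sketch of one manifest per type, not a recommended configuration. The resource names, the app: nginx label, the /var/lib/data volume path, and all numeric values are assumptions chosen for illustration.

```yaml
# Network chaos: cap bandwidth for all matching pods.
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: bandwidth-limit-sketch     # hypothetical name
spec:
  action: bandwidth
  mode: all
  selector:
    namespaces:
      - default
    labelSelectors:
      app: nginx
  bandwidth:
    rate: "1mbps"                  # illustrative rate
    limit: 20971520                # bytes that may queue
    buffer: 10000                  # bytes that may be sent at once
  duration: "60s"
---
# Pod chaos: make one matching pod unavailable for 30 seconds.
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: pod-failure-sketch         # hypothetical name
spec:
  action: pod-failure
  mode: one
  selector:
    namespaces:
      - default
    labelSelectors:
      app: nginx
  duration: "30s"
---
# I/O chaos: delay half of the file operations under a mounted volume.
apiVersion: chaos-mesh.org/v1alpha1
kind: IOChaos
metadata:
  name: io-latency-sketch          # hypothetical name
spec:
  action: latency
  mode: one
  selector:
    namespaces:
      - default
    labelSelectors:
      app: nginx
  volumePath: /var/lib/data        # hypothetical mount point in the target container
  path: "/var/lib/data/**/*"
  delay: "100ms"
  percent: 50
  duration: "60s"
---
# Time chaos: shift the clock seen by one matching pod back by ten minutes.
apiVersion: chaos-mesh.org/v1alpha1
kind: TimeChaos
metadata:
  name: clock-skew-sketch          # hypothetical name
spec:
  mode: one
  selector:
    namespaces:
      - default
    labelSelectors:
      app: nginx
  timeOffset: "-10m"
  duration: "60s"
```

Each document can be applied on its own. Keeping a duration on every experiment means the fault is lifted without manual cleanup.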

Kubernetes-Native Integration

  1. Custom resources enable chaos experiments to be defined and managed as Kubernetes resources.
  2. Controller architecture provides reliable experiment execution with automatic retry and recovery.
  3. RBAC integration enables fine-grained access control for chaos experiment execution.
  4. Namespace isolation ensures chaos experiments are scoped to appropriate namespaces.
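Because experiments are custom resources in the chaos-mesh.org API group, access to them can be restricted with ordinary Kubernetes RBAC objects. The sketch below assumes a hypothetical service account named chaos-operator that may manage experiments only in the default namespace. The account, role, and binding names are illustrative.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: chaos-operator             # hypothetical account
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: chaos-experiment-manager   # hypothetical role
  namespace: default
rules:
  # Read access to pods so that selectors can be inspected.
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
  # Full management of Chaos Mesh resources, limited to this namespace.
  - apiGroups: ["chaos-mesh.org"]
    resources: ["*"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: chaos-experiment-manager-binding
  namespace: default
subjects:
  - kind: ServiceAccount
    name: chaos-operator
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: chaos-experiment-manager
```

Using a Role rather than a ClusterRole is the design choice that keeps this account from creating experiments in other namespaces.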

Observability & Dashboard

  • Web UI provides an intuitive interface for creating, scheduling, and monitoring chaos experiments.
  • Experiment status tracking provides real-time visibility into experiment execution and results.
  • Metrics integration exposes detailed chaos experiment metrics for Prometheus and Grafana.
  • Event logging records all chaos experiment events for audit and troubleshooting.
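For the metrics integration, a Prometheus scrape job is one way to collect the controller's metrics. The fragment below is a sketch under stated assumptions: it assumes Chaos Mesh runs in the chaos-testing namespace, that the controller service is named chaos-mesh-controller-manager, and that metrics are served on port 10080 at /metrics. Verify all three against your installation before using it.

```yaml
# Fragment of prometheus.yml. The target address is an assumption, not a confirmed default.
scrape_configs:
  - job_name: chaos-mesh-controller-manager
    metrics_path: /metrics
    static_configs:
      - targets:
          - chaos-mesh-controller-manager.chaos-testing.svc:10080
```

Once the job is scraping, the collected series can be charted in Grafana alongside application metrics to correlate injected faults with their effects.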

Safety & Reliability

  • Experiment scoping limits chaos experiments to specific pods, namespaces, or resource selectors.
  • Safeguards prevent chaos experiments from affecting critical system components.
  • Automatic recovery ensures systems return to normal state after experiments complete.
  • Dry-run mode previews experiment effects without actually injecting failures.
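Scoping is expressed through the selector and mode fields of each experiment. The sketch below narrows a pod failure to half of the running pods that carry a given label in a single namespace. The staging namespace, the app: checkout label, and the resource name are hypothetical. The Namespace annotation only takes effect if namespace filtering has been enabled at install time (for example through the Helm value controllerManager.enableFilterNamespace), which is assumed here.

```yaml
# Opt a namespace in to chaos injection (assumes namespace filtering is enabled).
apiVersion: v1
kind: Namespace
metadata:
  name: staging                    # hypothetical namespace
  annotations:
    chaos-mesh.org/inject: enabled
---
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: scoped-pod-failure-sketch  # hypothetical name
  namespace: staging
spec:
  action: pod-failure
  mode: fixed-percent              # affect a percentage of the matching pods
  value: "50"
  selector:
    namespaces:
      - staging
    labelSelectors:
      app: checkout                # hypothetical label
    podPhaseSelectors:
      - Running
  duration: "30s"
```

With namespace filtering on, namespaces without the annotation, such as kube-system, are not eligible targets even if a selector would otherwise match pods in them.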

Advanced Features

  • Workflow orchestration enables complex multi-stage chaos experiments with dependencies.
  • Scheduled experiments support recurring chaos tests for continuous resilience validation.
  • Experiment templates simplify creation of common chaos experiment patterns.
  • Multi-cluster support enables chaos testing across distributed Kubernetes deployments.
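Scheduling and workflow orchestration are also declared as resources. The following sketch shows a Schedule that runs a short network delay every hour, and a Workflow that runs a network delay, pauses, and then kills a pod. Names, the cron expression, deadlines, and the app: nginx label are assumptions for illustration.

```yaml
# Recurring experiment: inject 10ms of latency at minute 5 of every hour.
apiVersion: chaos-mesh.org/v1alpha1
kind: Schedule
metadata:
  name: hourly-delay-sketch        # hypothetical name
spec:
  schedule: "5 * * * *"
  historyLimit: 2                  # keep the last two runs
  concurrencyPolicy: Forbid        # skip a run if the previous one is still active
  type: NetworkChaos
  networkChaos:
    action: delay
    mode: one
    selector:
      namespaces:
        - default
      labelSelectors:
        app: nginx
    delay:
      latency: "10ms"
    duration: "30s"
---
# Multi-stage experiment: the three child templates run one after another.
apiVersion: chaos-mesh.org/v1alpha1
kind: Workflow
metadata:
  name: serial-workflow-sketch     # hypothetical name
spec:
  entry: entry
  templates:
    - name: entry
      templateType: Serial
      deadline: 240s
      children:
        - delay-step
        - pause-step
        - kill-step
    - name: delay-step
      templateType: NetworkChaos
      deadline: 30s
      networkChaos:
        action: delay
        mode: one
        selector:
          namespaces:
            - default
          labelSelectors:
            app: nginx
        delay:
          latency: "10ms"
    - name: pause-step
      templateType: Suspend        # wait before the next stage
      deadline: 10s
    - name: kill-step
      templateType: PodChaos
      deadline: 30s
      podChaos:
        action: pod-kill
        mode: one
        selector:
          namespaces:
            - default
          labelSelectors:
            app: nginx
```

The Serial template gives the stages an explicit order. The Suspend step leaves time for the workload to settle between faults.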

Getting Started

Install Chaos Mesh with the installation script:

curl -sSL https://mirrors.chaos-mesh.org/latest/install.sh | bash

Create a network chaos experiment:

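apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: network-delay-example
spec:
  action: delay            # inject latency into network traffic
  mode: one                # target one randomly chosen matching pod
  selector:
    namespaces:
    - default
    labelSelectors:
      app: nginx
  delay:
    latency: "10ms"        # delay added to each packet
    correlation: "100"     # correlation with the previous delay, in percent
    jitter: "0ms"          # variation around the latency value
  duration: "30s"          # the fault is removed after 30 seconds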

Summary

Aspect | Details
Release Date | October 12, 2021
Headline Features | Enhanced experiment types, Kubernetes-native integration, improved observability, safety controls
Why it Matters | Provides a comprehensive platform for validating Kubernetes system resilience through controlled chaos experiments

Chaos Mesh 2.0 gives teams the tools to find and fix system weaknesses proactively, before they surface as production incidents.