ELK Stack

The ELK Stack (Elasticsearch, Logstash, Kibana) is a widely used log management and analytics platform for collecting, storing, searching, and visualizing logs from Kubernetes clusters. Each component serves a distinct role in the log pipeline.

What is the ELK Stack?

The ELK Stack consists of three main components:

  • Elasticsearch - Distributed search and analytics engine (data store)
  • Logstash - Data processing pipeline (log collection and transformation)
  • Kibana - Visualization and dashboarding (UI for exploring data)

graph TB
  A[Log Sources] --> B[Logstash]
  B --> C[Elasticsearch]
  C --> D[Kibana]
  E[Kubernetes Pods] --> B
  F[Node Logs] --> B
  G[Application Logs] --> B
  B --> B1[Parse]
  B --> B2[Transform]
  B --> B3[Enrich]
  C --> C1[Index Data]
  C --> C2[Search]
  C --> C3[Store]
  D --> D1[Visualize]
  D --> D2[Dashboards]
  D --> D3[Discover]
  style A fill:#e1f5ff
  style B fill:#e8f5e9
  style C fill:#fff4e1
  style D fill:#f3e5f5

Architecture Components

Elasticsearch

Elasticsearch is a distributed, RESTful search and analytics engine:

  • Distributed storage - Horizontally scalable cluster
  • Full-text search - Powerful search capabilities
  • JSON document store - Flexible schema (NoSQL)
  • RESTful API - Easy integration
  • Real-time indexing - Near real-time search
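
As a sketch of the document model and REST API, a log entry can be indexed and searched back with two calls from Kibana Dev Tools (the demo-logs index name here is arbitrary):

PUT /demo-logs/_doc/1
{
  "message": "user login succeeded",
  "@timestamp": "2024-01-01T12:00:00Z"
}

GET /demo-logs/_search
{
  "query": {
    "match": { "message": "login" }
  }
}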

Logstash

Logstash is a server-side data processing pipeline:

  • Input plugins - Collect data from various sources
  • Filter plugins - Parse, transform, and enrich data
  • Output plugins - Send data to destinations (Elasticsearch, etc.)
  • Pipeline processing - Parse and structure logs
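
Every pipeline configuration follows the same three-section layout; a minimal sketch that echoes stdin back to stdout:

input {
  stdin { }
}

filter {
  # parse, transform, and enrich events here
}

output {
  stdout { codec => rubydebug }
}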

Kibana

Kibana provides visualization and exploration:

  • Discover - Explore and search data
  • Visualize - Create charts and graphs
  • Dashboard - Combine visualizations
  • Dev Tools - Query and debug Elasticsearch

ELK Stack Architecture in Kubernetes

graph TB
  A[Application Pods] --> B[Log Files]
  B --> C[Fluent Bit/Fluentd]
  C --> D[Logstash]
  E[Node Logs] --> C
  F[System Logs] --> C
  D --> G[Elasticsearch Cluster]
  G --> H[Kibana]
  I[Filebeat] --> D
  J[Beats] --> D
  style C fill:#e1f5ff
  style D fill:#e8f5e9
  style G fill:#fff4e1
  style H fill:#f3e5f5

Common Patterns

Pattern 1: Fluent Bit → Elasticsearch → Kibana

  • Fluent Bit collects logs
  • Sends directly to Elasticsearch
  • Kibana visualizes data
  • Simpler setup, less processing

Pattern 2: Fluent Bit → Logstash → Elasticsearch → Kibana

  • Fluent Bit collects logs
  • Logstash processes and enriches
  • Elasticsearch stores data
  • Kibana visualizes
  • More processing power

Installation

# Add Elastic Helm repository
helm repo add elastic https://helm.elastic.co
helm repo update

# Install Elasticsearch
helm install elasticsearch elastic/elasticsearch \
  --namespace logging \
  --create-namespace \
  --set replicas=3 \
  --set minimumMasterNodes=2

# Install Logstash
helm install logstash elastic/logstash \
  --namespace logging \
  --set replicas=2

# Install Kibana
helm install kibana elastic/kibana \
  --namespace logging \
  --set replicas=1

Manual Deployment

Elasticsearch Deployment

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
  namespace: logging
spec:
  serviceName: elasticsearch
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
        env:
        # seed-host discovery so the three replicas form a single cluster;
        # single-node discovery would conflict with replicas: 3
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: cluster.name
          value: elasticsearch
        - name: discovery.seed_hosts
          value: elasticsearch
        - name: cluster.initial_master_nodes
          value: "elasticsearch-0,elasticsearch-1,elasticsearch-2"
        - name: ES_JAVA_OPTS
          value: "-Xms512m -Xmx512m"
        - name: xpack.security.enabled
          value: "false"
        ports:
        - containerPort: 9200
          name: http
        - containerPort: 9300
          name: transport
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
      volumes:
      - name: data
        emptyDir: {}  # demo only; use volumeClaimTemplates in production (see Best Practices)
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: logging
spec:
  selector:
    app: elasticsearch
  ports:
  - port: 9200
    targetPort: 9200
    name: http
  clusterIP: None

Logstash Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: logstash
  namespace: logging
spec:
  replicas: 2
  selector:
    matchLabels:
      app: logstash
  template:
    metadata:
      labels:
        app: logstash
    spec:
      containers:
      - name: logstash
        image: docker.elastic.co/logstash/logstash:8.11.0
        ports:
        - containerPort: 5044
          name: beats
        volumeMounts:
        - name: config
          mountPath: /usr/share/logstash/pipeline
      volumes:
      - name: config
        configMap:
          name: logstash-config
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash-config
  namespace: logging
data:
  logstash.conf: |
    input {
      beats {
        port => 5044
      }
    }
    
    filter {
      if [fields][container_name] {
        mutate {
          add_field => { "container_name" => "%{[fields][container_name]}" }
        }
      }
    }
    
    output {
      elasticsearch {
        hosts => ["http://elasticsearch:9200"]
        index => "kubernetes-logs-%{+YYYY.MM.dd}"
      }
    }

Kibana Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: logging
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: docker.elastic.co/kibana/kibana:8.11.0
        env:
        - name: ELASTICSEARCH_HOSTS
          value: http://elasticsearch:9200
        ports:
        - containerPort: 5601
          name: http
---
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: logging
spec:
  selector:
    app: kibana
  ports:
  - port: 5601
    targetPort: 5601
  type: ClusterIP

Log Collection Setup

Using Fluent Bit with ELK

Fluent Bit can send logs directly to Elasticsearch:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush         1
        Log_Level     info
        Daemon        off
        Parsers_File  parsers.conf

    [INPUT]
        Name              tail
        Path              /var/log/pods/**/*.log
        Parser            cri
        Tag               kubernetes.*
        Refresh_Interval  5

    [FILTER]
        Name                kubernetes
        Match               kubernetes.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Merge_Log           On
        Keep_Log            Off

    [OUTPUT]
        Name               es
        Match              *
        Host               elasticsearch.logging.svc.cluster.local
        Port               9200
        Index              kubernetes-logs
        # Elasticsearch 8.x removed mapping types; suppress them instead of
        # sending the old Type field
        Suppress_Type_Name On
  # mounting the ConfigMap at /fluent-bit/etc replaces the bundled
  # parsers.conf, so the cri parser must be provided here as well
  parsers.conf: |
    [PARSER]
        Name        cri
        Format      regex
        Regex       ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L%z
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
spec:
  selector:
    matchLabels:
      name: fluent-bit
  template:
    metadata:
      labels:
        name: fluent-bit
    spec:
      # the kubernetes filter needs a ServiceAccount bound to a ClusterRole
      # with get/list/watch on pods (RBAC manifests not shown here)
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:2.2.0  # pin a version rather than :latest
        volumeMounts:
        - name: config
          mountPath: /fluent-bit/etc
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: config
        configMap:
          name: fluent-bit-config
      - name: varlog
        hostPath:
          path: /var/log

Using Filebeat

Filebeat can collect logs and send to Logstash:

apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: logging
data:
  filebeat.yml: |
    filebeat.inputs:
    - type: container
      paths:
        - /var/log/containers/*.log
      processors:
        - add_kubernetes_metadata:
            host: ${NODE_NAME}
            matchers:
            - logs_path:
                logs_path: "/var/log/containers/"
    
    output.logstash:
      hosts: ["logstash.logging.svc.cluster.local:5044"]
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: logging
spec:
  selector:
    matchLabels:
      name: filebeat
  template:
    metadata:
      labels:
        name: filebeat
    spec:
      containers:
      - name: filebeat
        image: docker.elastic.co/beats/filebeat:8.11.0
        env:
        # filebeat.yml above references ${NODE_NAME}
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        volumeMounts:
        - name: config
          mountPath: /usr/share/filebeat/filebeat.yml
          subPath: filebeat.yml
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: config
        configMap:
          name: filebeat-config
      - name: varlog
        hostPath:
          path: /var/log

Kibana Configuration

Accessing Kibana

# Port forward to access Kibana
kubectl port-forward -n logging svc/kibana 5601:5601

Access at: http://localhost:5601

Creating Data Views (Index Patterns)

  1. Go to Stack Management > Data Views (called Index Patterns before Kibana 8)
  2. Click Create data view
  3. Enter pattern: kubernetes-logs-*
  4. Select timestamp field: @timestamp
  5. Save the data view
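
If no matching indices show up, you can confirm from Dev Tools that data has actually been indexed:

GET _cat/indices/kubernetes-logs-*?v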

Creating Visualizations

Example: Log Volume Over Time

  1. Go to Visualize
  2. Create new visualization
  3. Select Line chart
  4. Choose index pattern: kubernetes-logs-*
  5. X-axis: Date Histogram on @timestamp
  6. Y-axis: Count
  7. Save visualization
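
The same chart data can be fetched directly as an aggregation; a sketch using a 5-minute bucket (adjust fixed_interval as needed):

GET /kubernetes-logs-*/_search
{
  "size": 0,
  "aggs": {
    "logs_over_time": {
      "date_histogram": {
        "field": "@timestamp",
        "fixed_interval": "5m"
      }
    }
  }
}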

Example: Top Error Messages

  1. Create Data Table visualization
  2. Add bucket: Terms on message.keyword
  3. Add filter: level: ERROR
  4. Sort by: Count (descending)
  5. Limit: 10
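
The equivalent query, assuming the documents carry level and message fields as in the Logstash parsing examples below:

GET /kubernetes-logs-*/_search
{
  "size": 0,
  "query": { "match": { "level": "ERROR" } },
  "aggs": {
    "top_messages": {
      "terms": { "field": "message.keyword", "size": 10 }
    }
  }
}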

Creating Dashboards

  1. Go to Dashboard
  2. Click Create dashboard
  3. Add saved visualizations
  4. Arrange and configure
  5. Save dashboard

Logstash Pipeline Configuration

Parsing JSON Logs

filter {
  if [message] =~ /^\{/ {
    json {
      source => "message"
    }
  }
}

Adding Kubernetes Metadata

filter {
  if [kubernetes] {
    mutate {
      add_field => {
        "pod_name" => "%{[kubernetes][pod_name]}"
        "namespace" => "%{[kubernetes][namespace]}"
        "container_name" => "%{[kubernetes][container_name]}"
      }
    }
  }
}

Grok Parsing

filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:message}" }
    # without overwrite, grok appends the captured text to the existing
    # message field (producing an array) instead of replacing it
    overwrite => [ "message" ]
  }
}

Elasticsearch Queries

Queries can be run from Kibana Dev Tools or with curl against port 9200.

Basic Search

GET /kubernetes-logs-*/_search
{
  "query": {
    "match": {
      "message": "error"
    }
  }
}

Filter by Namespace

GET /kubernetes-logs-*/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "kubernetes.namespace": "production" } },
        { "match": { "level": "ERROR" } }
      ]
    }
  }
}

Time Range Query

GET /kubernetes-logs-*/_search
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-1h",
        "lte": "now"
      }
    }
  }
}

Best Practices

1. Index Lifecycle Management

Set up ILM to manage indices:

PUT /_ilm/policy/kubernetes-logs-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50GB",
            "max_age": "7d"
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
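
The policy only takes effect once an index template references it; a sketch (the rollover alias kubernetes-logs is an assumption matching the index naming above):

PUT /_index_template/kubernetes-logs-template
{
  "index_patterns": ["kubernetes-logs-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "kubernetes-logs-policy",
      "index.lifecycle.rollover_alias": "kubernetes-logs"
    }
  }
}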

2. Resource Limits

Set appropriate resource limits:

resources:
  requests:
    memory: "2Gi"
    cpu: "1000m"
  limits:
    memory: "4Gi"
    cpu: "2000m"

3. Persistent Storage

Use persistent volumes for Elasticsearch:

volumeClaimTemplates:
- metadata:
    name: data
  spec:
    accessModes: ["ReadWriteOnce"]
    resources:
      requests:
        storage: 100Gi

4. Cluster Sizing

  • Elasticsearch: 3+ nodes for HA
  • Logstash: Scale based on log volume
  • Kibana: 1-2 replicas

5. Security

Enable security features for production:

  • Enable X-Pack security
  • Use TLS for transport
  • Configure authentication
  • Set up RBAC

6. Monitoring

Monitor ELK Stack health:

  • Elasticsearch cluster health
  • Logstash pipeline performance
  • Kibana query performance
  • Index sizes and growth

Troubleshooting

Elasticsearch Cluster Health

# Check cluster health
kubectl exec -n logging elasticsearch-0 -- \
  curl -s http://localhost:9200/_cluster/health

# Check indices
kubectl exec -n logging elasticsearch-0 -- \
  curl -s http://localhost:9200/_cat/indices?v

Logstash Pipeline Issues

# Check Logstash logs
kubectl logs -n logging -l app=logstash

# Test pipeline configuration (Logstash runs as a Deployment, so exec via it)
kubectl exec -n logging deploy/logstash -- \
  logstash --config.test_and_exit -f /usr/share/logstash/pipeline

No Data in Kibana

  1. Check index pattern matches indices
  2. Verify data is in Elasticsearch
  3. Check timestamp field configuration
  4. Verify time range in Discover
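
Step 2 can be checked quickly from Dev Tools; a non-zero count means documents are arriving:

GET /kubernetes-logs-*/_count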

See Also