GKE Cluster Setup
Creating a GKE cluster involves setting up the Google Cloud infrastructure (VPC network, IAM roles), creating the cluster control plane, configuring worker nodes (Standard mode) or enabling Autopilot mode, and connecting your local kubectl to the cluster. This guide covers the complete setup process, from prerequisites through cluster creation, authentication, and post-setup configuration.
Prerequisites
Before creating a GKE cluster, ensure you have:
Google Cloud Account Requirements
- Google Cloud Project - Active project with billing enabled
- IAM Permissions - Ability to create clusters, node pools, and manage resources
- Service Quotas - Sufficient service quotas for Compute Engine, VPC, and GKE
- Region Selection - Choose a Google Cloud region where GKE is available
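A quick preflight check can catch missing prerequisites early. A minimal sketch, assuming you have at least viewer access to the project; on older gcloud versions the billing command may require the beta component:
# Confirm the project exists and billing is enabled
gcloud projects describe YOUR_PROJECT_ID
gcloud billing projects describe YOUR_PROJECT_ID
# Spot-check Compute Engine quotas in the target region
gcloud compute regions describe us-central1 --format="table(quotas.metric,quotas.usage,quotas.limit)"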
Local Tools
Install these tools on your local machine:
kubectl:
# macOS
brew install kubectl
# Linux
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
# Verify installation
kubectl version --client
gcloud CLI:
# macOS
brew install --cask google-cloud-sdk
# Linux
curl https://sdk.cloud.google.com | bash
exec -l $SHELL
# Verify installation
gcloud --version
Google Cloud Authentication:
# Login to Google Cloud
gcloud auth login
# Set default project
gcloud config set project YOUR_PROJECT_ID
# Verify authentication
gcloud auth list
Enable Required APIs:
# Enable GKE API
gcloud services enable container.googleapis.com
# Enable Compute Engine API (required for nodes)
gcloud services enable compute.googleapis.com
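To confirm both APIs are active before proceeding, you can list the enabled services (each command prints a row if the API is enabled):
# Verify the APIs are enabled
gcloud services list --enabled --filter="config.name=container.googleapis.com"
gcloud services list --enabled --filter="config.name=compute.googleapis.com"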
Understanding GKE Components
Before creating a cluster, understand what gets created:
GKE Cluster:
- Control plane managed by Google Cloud
- API endpoint for cluster access
- Cluster configuration and version
VPC Network and Networking:
- VPC network for cluster isolation
- Subnets across zones
- Firewall rules for traffic control
- Route tables for traffic routing
Service Accounts:
- Cluster Service Account - Permissions for cluster components
- Node Service Account - Permissions for worker nodes to access GCP services
Node Pool (Standard Mode):
- Managed Instance Group for worker nodes
- Compute Engine VM instances running Kubernetes node components
Autopilot Mode:
- Fully managed nodes
- Automatic provisioning and scaling
- No node pool management required
Creating a Cluster
There are three main ways to create a GKE cluster:
Method 1: gcloud CLI (Recommended)
gcloud CLI is the official tool for GKE cluster creation:
Simple Cluster Creation (Standard Mode):
# Create cluster with default settings
gcloud container clusters create my-cluster \
--zone us-central1-a \
--num-nodes 3 \
--machine-type n1-standard-2
This single command creates:
- GKE cluster
- Default node pool with 3 nodes
- VPC network (if not specified)
- Firewall rules
- kubeconfig configuration
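Creation typically takes a few minutes; you can confirm the cluster reached RUNNING status before continuing:
# Check creation progress and final status
gcloud container clusters list --zone us-central1-a
gcloud container clusters describe my-cluster \
--zone us-central1-a \
--format="value(status)"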
Simple Cluster Creation (Autopilot Mode):
# Create Autopilot cluster
gcloud container clusters create-auto my-autopilot-cluster \
--region us-central1
Advanced Cluster Configuration:
# Create cluster with custom configuration
gcloud container clusters create production-cluster \
--zone us-central1-a \
--num-nodes 3 \
--machine-type n1-standard-2 \
--enable-autoscaling \
--min-nodes 1 \
--max-nodes 10 \
--enable-autorepair \
--enable-autoupgrade \
--network my-vpc \
--subnetwork my-subnet \
--enable-private-nodes \
--enable-private-endpoint \
--enable-shielded-nodes \
--enable-binary-authorization \
--workload-pool PROJECT_ID.svc.id.goog
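Note that --enable-private-endpoint disables the public API endpoint entirely, so kubectl only works from inside the VPC (via a bastion host or VPN); omit that flag if you need direct access from your workstation. See Private Clusters below.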
Method 2: Google Cloud Console
Creating via the Google Cloud Console provides a visual interface:
Navigate to GKE:
- Go to Google Cloud Console → Kubernetes Engine → Clusters → Create
Choose Cluster Mode:
- Standard mode (manage nodes yourself)
- Autopilot mode (fully managed)
Configure Cluster (Standard):
- Cluster name
- Kubernetes version
- Location (zone or region)
- VPC network and subnet
- Private cluster options
- Security options
Configure Node Pool (Standard):
- Machine type and size
- Number of nodes
- Auto-scaling configuration
- Node labels and taints
Create Cluster:
- Click Create
- Wait for cluster creation
Configure kubectl:
- Click Connect
- Run the provided command
Method 3: Terraform
For infrastructure as code, use Terraform:
# main.tf
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}

provider "google" {
  project = "my-project-id"
  region  = "us-central1"
}

resource "google_container_cluster" "primary" {
  name     = "my-cluster"
  location = "us-central1-a"

  # Remove the default node pool and manage a dedicated pool instead
  remove_default_node_pool = true
  initial_node_count       = 1

  network    = google_compute_network.vpc.name
  subnetwork = google_compute_subnetwork.subnet.name

  # Private clusters must be VPC-native, so reference the subnet's secondary ranges
  ip_allocation_policy {
    cluster_secondary_range_name  = "pods"
    services_secondary_range_name = "services"
  }

  private_cluster_config {
    enable_private_nodes    = true
    enable_private_endpoint = true
    master_ipv4_cidr_block  = "172.16.0.0/28"
  }

  workload_identity_config {
    workload_pool = "my-project-id.svc.id.goog"
  }

  # Both blocks are needed to enforce network policies
  network_policy {
    enabled = true
  }

  addons_config {
    network_policy_config {
      disabled = false
    }
  }
}

resource "google_container_node_pool" "primary_nodes" {
  name       = "general-pool"
  location   = "us-central1-a"
  cluster    = google_container_cluster.primary.name
  node_count = 3

  autoscaling {
    min_node_count = 1
    max_node_count = 10
  }

  management {
    auto_repair  = true
    auto_upgrade = true
  }

  node_config {
    preemptible  = false
    machine_type = "n1-standard-2"

    workload_metadata_config {
      mode = "GKE_METADATA"
    }

    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]
  }
}

resource "google_compute_network" "vpc" {
  name                    = "gke-vpc"
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "subnet" {
  name          = "gke-subnet"
  ip_cidr_range = "10.0.0.0/24"
  region        = "us-central1"
  network       = google_compute_network.vpc.id

  # Secondary ranges referenced by ip_allocation_policy (CIDR sizes are illustrative)
  secondary_ip_range {
    range_name    = "pods"
    ip_cidr_range = "10.4.0.0/14"
  }

  secondary_ip_range {
    range_name    = "services"
    ip_cidr_range = "10.8.0.0/20"
  }
}
Apply with:
terraform init
terraform plan
terraform apply
Standard vs Autopilot Clusters
Standard Mode
Standard mode gives you full control over node configuration:
Characteristics:
- You manage node pools and nodes
- Full control over machine types and sizes
- Manual or automatic node scaling
- Node lifecycle management
- Lower cost for large, predictable workloads
Use When:
- Need specific node configurations
- Have predictable workloads
- Want full control over nodes
- Need custom node images or configurations
Autopilot Mode
Autopilot mode provides fully managed nodes:
Characteristics:
- Google Cloud manages nodes automatically
- Pay only for requested resources (CPU, memory)
- Automatic scaling and optimization
- Enhanced security defaults
- No node pool management
Use When:
- Want simplified operations
- Have variable workloads
- Prefer pay-per-pod pricing
- Don’t need specific node configurations
Cluster Configuration Options
Kubernetes Version
Choose a Kubernetes version supported by GKE:
# List available versions
gcloud container get-server-config --zone us-central1-a
# Create cluster with specific version
gcloud container clusters create my-cluster \
--zone us-central1-a \
--cluster-version 1.28.0-gke.100
Version Considerations:
- Use a recent stable version for new features
- Check GKE version support lifecycle
- Consider upgrade path when choosing version
- Test version compatibility with your applications
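Rather than pinning an exact version, you can let GKE manage upgrades automatically by creating the cluster on a release channel (rapid, regular, or stable):
# Create a cluster that follows the regular release channel
gcloud container clusters create my-cluster \
--zone us-central1-a \
--release-channel regular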
VPC Network Configuration
GKE requires a VPC network with specific configuration:
Subnet Requirements:
- At least one subnet for nodes
- Secondary (alias) IP ranges with sufficient addresses for pods and Services
IP Address Planning:
- Size node, pod, and Service ranges before cluster creation; secondary ranges are difficult to change later
- GKE reserves a /24 block of pod IPs per node by default, so pod ranges must accommodate the maximum node count
VPC-Native Networking:
- Pods get IP addresses from secondary IP ranges
- No overlay networks required
- Direct VPC connectivity
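As an illustration, the command below creates a subnet with dedicated secondary ranges for pods and Services; the range names and CIDR sizes are examples to adapt to your address plan:
# Subnet with secondary ranges for pods and Services
gcloud compute networks subnets create gke-subnet \
--network my-vpc \
--region us-central1 \
--range 10.0.0.0/24 \
--secondary-range pods=10.4.0.0/14,services=10.8.0.0/20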
Private Clusters
Configure private clusters for enhanced security:
# Create private cluster
gcloud container clusters create private-cluster \
--zone us-central1-a \
--enable-private-nodes \
--enable-private-endpoint \
--master-ipv4-cidr 172.16.0.0/28 \
--network my-vpc \
--subnetwork my-subnet
Private Cluster Features:
- Private nodes (no external IPs)
- Private endpoint (API server only accessible from VPC)
- Enhanced security
- Requires VPN or bastion host for access
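If the public endpoint stays enabled, master authorized networks can restrict API access to trusted ranges (the CIDR below is a placeholder); with --enable-private-endpoint, only internal ranges may be authorized:
# Limit API server access to a trusted CIDR
gcloud container clusters update private-cluster \
--zone us-central1-a \
--enable-master-authorized-networks \
--master-authorized-networks 203.0.113.0/24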
Initial Node Pool Setup (Standard Mode)
After creating the cluster, configure node pools:
# Create node pool
gcloud container node-pools create general-pool \
--cluster my-cluster \
--zone us-central1-a \
--num-nodes 3 \
--machine-type n1-standard-2 \
--enable-autoscaling \
--min-nodes 1 \
--max-nodes 10 \
--enable-autorepair \
--enable-autoupgrade
Node Pool Configuration Options:
- Machine types and sizes
- Minimum, maximum, and initial node count
- Auto-scaling configuration
- Auto-repair and auto-upgrade
- Preemptible or Spot VMs for cost savings (see the sketch after this list)
- Labels and taints
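As a sketch combining several of these options, the following creates a Spot VM pool with a label and matching taint so only workloads that tolerate interruption schedule onto it; the names and values are illustrative:
# Spot node pool for fault-tolerant, cost-sensitive workloads
gcloud container node-pools create spot-pool \
--cluster my-cluster \
--zone us-central1-a \
--spot \
--num-nodes 2 \
--node-labels workload-type=batch \
--node-taints workload-type=batch:NoSchedule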
Autopilot Configuration
Autopilot clusters don’t require node pool configuration:
# Create Autopilot cluster (Workload Identity is enabled automatically)
gcloud container clusters create-auto my-autopilot-cluster \
--region us-central1 \
--release-channel regular
Autopilot Features:
- Automatic node provisioning
- Automatic scaling
- Enhanced security defaults
- Pay-per-pod pricing
- No node management
Cluster Authentication
Configure kubectl to access your cluster:
Get Cluster Credentials
# Get cluster credentials
gcloud container clusters get-credentials my-cluster \
--zone us-central1-a
# For regional clusters
gcloud container clusters get-credentials my-cluster \
--region us-central1
This updates ~/.kube/config with cluster credentials.
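You can confirm which context kubectl now points at:
# Show the context added by get-credentials
kubectl config current-context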
Verify Access
# Test cluster access
kubectl get nodes
# Should show your worker nodes (Standard mode)
NAME STATUS ROLES AGE VERSION
gke-my-cluster-default-pool-xxx-yyy Ready <none> 5m v1.28.0-gke.100
gke-my-cluster-default-pool-xxx-zzz Ready <none> 5m v1.28.0-gke.100
gke-my-cluster-default-pool-xxx-aaa Ready <none> 5m v1.28.0-gke.100
Cloud IAM Integration
GKE integrates with Google Cloud IAM for authentication:
# Grant IAM permissions
gcloud projects add-iam-policy-binding PROJECT_ID \
--member="user:[email protected]" \
--role="roles/container.developer"
# Grant cluster admin
gcloud projects add-iam-policy-binding PROJECT_ID \
--member="user:[email protected]" \
--role="roles/container.clusterAdmin"
IAM Roles:
- container.viewer - View clusters
- container.developer - Create and update resources
- container.clusterAdmin - Full cluster administration
- container.admin - Full project administration
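To audit what a user already holds, one common pattern is to flatten the project IAM policy and filter by member (the email is a placeholder):
# List roles bound to a specific user
gcloud projects get-iam-policy PROJECT_ID \
--flatten="bindings[].members" \
--filter="bindings.members:user:[email protected]" \
--format="table(bindings.role)"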
Post-Setup Configuration
After cluster creation, configure essential components:
Enable Workload Identity
Enable Workload Identity for pod-level authentication:
# Enable Workload Identity on cluster
gcloud container clusters update my-cluster \
--zone us-central1-a \
--workload-pool PROJECT_ID.svc.id.goog
# Enable Workload Identity on node pool
gcloud container node-pools update default-pool \
--cluster my-cluster \
--zone us-central1-a \
--workload-metadata=GKE_METADATA
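Enabling the feature is only half the setup: a Kubernetes ServiceAccount must also be bound to a Google service account before pods can authenticate as it. A minimal sketch with placeholder account names (my-ksa in the default namespace, gsa-name in the project):
# Allow the KSA to impersonate the Google service account
gcloud iam service-accounts add-iam-policy-binding \
gsa-name@PROJECT_ID.iam.gserviceaccount.com \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:PROJECT_ID.svc.id.goog[default/my-ksa]"
# Annotate the KSA with the Google service account it maps to
kubectl annotate serviceaccount my-ksa \
--namespace default \
iam.gke.io/gcp-service-account=gsa-name@PROJECT_ID.iam.gserviceaccount.com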
Configure Network Policy
Enable network policies for pod-to-pod isolation:
# Step 1: enable the network policy addon
gcloud container clusters update my-cluster \
--zone us-central1-a \
--update-addons NetworkPolicy=ENABLED
# Step 2: turn on enforcement (this recreates the node pools)
gcloud container clusters update my-cluster \
--zone us-central1-a \
--enable-network-policy
Enable Binary Authorization
Enable Binary Authorization for container image verification:
# Enable Binary Authorization
gcloud container clusters update my-cluster \
--zone us-central1-a \
--enable-binary-authorization
Set Up Monitoring
Enable Cloud Operations (monitoring and logging):
# Enable monitoring
gcloud container clusters update my-cluster \
--zone us-central1-a \
--monitoring=SYSTEM,WORKLOAD
Best Practices
- Use Regional Clusters - For high availability across zones
- Enable Auto-Repair and Auto-Upgrade - For automatic node maintenance
- Enable Workload Identity - For secure pod-to-GCP authentication
- Use Private Clusters - For enhanced security in production
- Configure Auto-Scaling - For cost optimization
- Use Preemptible or Spot VMs - For cost savings on non-critical workloads
- Enable Network Policy - For pod-to-pod isolation
- Enable Binary Authorization - For container image security
- Set Resource Quotas - To prevent resource exhaustion
- Use Release Channels - For automatic version management
Common Issues
Insufficient IP Addresses
Problem: Pods can’t get IP addresses
Solution:
- Increase secondary IP range size
- Create additional secondary ranges
- Use larger subnet CIDR
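For example, one way to expand pod capacity is to add a secondary range to the existing subnet and point a new node pool at it; the range name and CIDR below are illustrative:
# Add another secondary range to the subnet
gcloud compute networks subnets update gke-subnet \
--region us-central1 \
--add-secondary-ranges pods-2=10.12.0.0/14
# Create a node pool that draws pod IPs from the new range
gcloud container node-pools create extra-pool \
--cluster my-cluster \
--zone us-central1-a \
--pod-ipv4-range pods-2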
Cluster Creation Fails
Problem: Cluster creation times out or fails
Solution:
- Check IAM permissions
- Verify service quotas
- Check VPC network configuration
- Review Cloud Logging for errors
kubectl Access Denied
Problem: Can’t access cluster with kubectl
Solution:
- Verify cluster credentials are updated
- Check Cloud IAM permissions
- Verify you’re authenticated: gcloud auth list
- Check private endpoint configuration
Node Pool Creation Fails
Problem: Node pool creation times out or fails
Solution:
- Check service account permissions
- Verify firewall rules
- Check subnet configuration
- Review Cloud Logging for errors
Next Steps
After cluster setup:
- Networking - Configure VPC-native networking
- Storage - Set up Persistent Disk and Filestore
- Security - Configure Workload Identity and security
- Node Management - Manage and optimize node pools
See Also
- GKE Overview - Understanding GKE architecture
- Cluster Operations - General Kubernetes cluster management
- Troubleshooting - Common issues and solutions