Introduction / Why This Is Needed
In Kubernetes, a pod is the smallest deployable unit, and when things go wrong, debugging becomes critical. The kubectl debug command allows you to launch a temporary ephemeral container inside a problematic pod to inspect the file system, processes, and network settings without modifying the pod's configuration. This guide will show you how to effectively use kubectl debug for diagnostics, starting from checking the pod's status to gathering key data.
Prerequisites / Preparation
Before you begin, ensure that:
- kubectl (version 1.20+) is installed and configured with access to your Kubernetes cluster (verify with kubectl cluster-info).
- Your account has permission to add ephemeral containers in the target namespace (the pods/ephemeralcontainers subresource, plus pods/attach for interactive sessions).
- You know the name and namespace of the pod you need to debug.
- Familiarity with basic kubectl commands (get, describe, logs) is recommended.
Step-by-Step Guide
Step 1: Check Pod Status and Gather Information
First, get general information about the pod to understand its state:
kubectl get pods -n <namespace>
Pay attention to the STATUS field (e.g., Running, CrashLoopBackOff, Error) and RESTARTS.
Then, describe the pod for details on events, containers, and conditions:
kubectl describe pod <pod-name> -n <namespace>
In the output, look for the Events, Containers, and Conditions sections. Common issues: Failed events, ImagePullBackOff, OOMKilled.
Finally, check the container logs. If the pod has multiple containers, use --all-containers:
kubectl logs <pod-name> -n <namespace> --all-containers=true
If the pod is restarting, add --previous to see logs from the previous container instance:
kubectl logs <pod-name> -n <namespace> --previous --all-containers=true
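The Step 1 checks can be bundled into one helper so they always run in the same order. This is a minimal sketch: the function name pod_triage, the section labels, and the --tail=100 limit are illustrative choices, not part of kubectl.

```shell
# Hypothetical helper: pod_triage <pod-name> <namespace>
# Runs the status, describe, and log checks from Step 1 and labels each section.
pod_triage() {
  pod="$1"
  ns="$2"
  if [ -z "$pod" ] || [ -z "$ns" ]; then
    echo "usage: pod_triage <pod-name> <namespace>" >&2
    return 2
  fi

  echo "== status =="
  kubectl get pod "$pod" -n "$ns" -o wide

  echo "== describe =="
  kubectl describe pod "$pod" -n "$ns"

  echo "== current logs =="
  kubectl logs "$pod" -n "$ns" --all-containers=true --tail=100

  echo "== previous logs =="
  # This call fails when no container has restarted yet, so report and continue.
  kubectl logs "$pod" -n "$ns" --all-containers=true --previous --tail=100 \
    || echo "(no previous container instance)"
}
```

Redirecting the output to a file, for example pod_triage my-pod my-namespace > triage.txt, keeps a snapshot to compare against after the fix.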
Step 2: Launch a Debug Container with kubectl debug
The kubectl debug command adds an ephemeral container to an existing pod. An ephemeral container is a temporary container that shares the pod's network and IPC namespaces with the primary containers but runs from its own image and file system. It gets its own PID namespace unless you pass --target or the pod was created with shareProcessNamespace: true.
Use an image with diagnostic tools. busybox is suitable for basic commands (sh, ps, netstat). For network debugging, you can use nicolaka/netshoot.
kubectl debug -it <pod-name> -n <namespace> --image=busybox
Options:
- -it: interactive mode with a TTY.
- --image: the image for the debug container.
- --target=<container-name>: join the process (PID) namespace of a specific container in the pod, so the debug container can see that container's processes. The container runtime must support this.
- --share-processes: applies only together with --copy-to, where it enables process namespace sharing in the copied pod. It is not used when adding an ephemeral container to a running pod.
Example that targets a specific container so its processes are visible:
kubectl debug -it <pod-name> -n <namespace> --image=busybox --target=app-container
After running the command, you will be inside the debug container's shell.
Step 3: Diagnostics Inside the Debug Container
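Inside the debug container you see the debug image's own file system, not the application's. If you started the session with --target, the target container's processes are visible and its root file system is reachable through /proc/<pid>/root (typically /proc/1/root, provided your user is allowed to read it). Run the following commands to gather information:
- Processes: check which processes are running. With --target, the list includes the target container's processes.
    ps aux
    top    # if installed
- File System: examine the structure, permissions, and presence of critical files.
    df -h                              # disk usage
    ls -la /proc/1/root/               # root file system of the target container
    ls -la /proc/1/root/app            # e.g., application directory
    cat /proc/1/root/etc/os-release    # OS information of the target container
- Network: check network connections and port availability. The network namespace is shared, so localhost reaches the application.
    netstat -tuln                      # listening ports
    wget -qO- http://localhost:8080    # if the application listens on port 8080 (busybox ships wget, not curl)
    ip addr                            # network interfaces
- Environment Variables: env shows the debug container's variables; the application's variables are in /proc.
    env
    tr '\0' '\n' < /proc/1/environ     # environment of the target's main process
- Application Logs: if logs are written to a file, read it.
    tail -f /proc/1/root/var/log/app.log
- Additional: if the image contains utilities (e.g., strace, lsof), use them for deeper diagnostics.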
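The checks in this step can be collected in one pass by pasting a small helper into the debug shell. The sketch below is written for POSIX sh so it should work in busybox; the function names section and collect_diag are hypothetical, and each tool is skipped with a note when the image does not ship it.

```shell
# Hypothetical helpers for use inside the debug container (POSIX sh).
# section <title> <command> [args...]: print a header, then run the command if it exists.
section() {
  title="$1"
  shift
  echo "== $title =="
  if command -v "$1" >/dev/null 2>&1; then
    "$@" 2>&1 || echo "(command failed: $*)"
  else
    echo "(not installed: $1)"
  fi
}

# collect_diag: run the process, disk, network, and environment checks in order.
collect_diag() {
  section "processes" ps aux
  section "disk usage" df -h
  section "listening ports" netstat -tuln
  section "interfaces" ip addr
  section "environment" env
}
```

Running collect_diag > /tmp/diag.txt keeps the output in one file. Note that the environment section can contain secrets, so treat the file accordingly.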
Step 4: Analyze Cluster Logs and Events
Besides container logs, check cluster events related to the pod. Events can indicate scheduling issues, resource failures, or security errors.
kubectl get events -n <namespace> --field-selector involvedObject.name=<pod-name> --sort-by='.lastTimestamp'
Look for events with type Warning and messages like Failed or Unhealthy.
If the pod crashed, ensure you reviewed logs from the previous run (as in Step 1). Also, check kubelet logs on the node where the pod runs (requires node access).
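To narrow the event list to warnings only, the field selector can also filter on the event type, and the node that hosts the pod (needed for the kubelet logs) can be read from the pod spec. A minimal sketch, with hypothetical function names warning_events and pod_node:

```shell
# Hypothetical helper: warning_events <pod-name> <namespace>
# Lists only Warning events for the pod, oldest first.
warning_events() {
  pod="$1"
  ns="$2"
  kubectl get events -n "$ns" \
    --field-selector "involvedObject.name=$pod,type=Warning" \
    --sort-by=.lastTimestamp
}

# Hypothetical helper: pod_node <pod-name> <namespace>
# Prints the name of the node the pod is scheduled on.
pod_node() {
  kubectl get pod "$1" -n "$2" -o jsonpath='{.spec.nodeName}'
}
```

How you read the kubelet logs on that node depends on how the node is managed, so that step is left out here.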
Step 5: End Debugging and Clean Up
After gathering the necessary data, exit the debug container:
exit
# or Ctrl+D
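Exiting the shell stops the ephemeral container, but it cannot be removed from the pod. It stays listed in the pod spec (visible under Ephemeral Containers in kubectl describe pod) until the pod itself is deleted or replaced, for example when a Deployment performs a rolling update. Ephemeral containers are never restarted and do not appear as separate containers in kubectl get pods, so there is nothing to clean up manually.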
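To see which ephemeral containers are recorded on a pod after a debugging session, the pod spec can be queried directly. A small sketch, assuming the hypothetical function name list_ephemeral; the field .spec.ephemeralContainers is part of the pod API.

```shell
# Hypothetical helper: list_ephemeral <pod-name> <namespace>
# Prints one line per ephemeral container: name, then image, separated by a tab.
list_ephemeral() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{range .spec.ephemeralContainers[*]}{.name}{"\t"}{.image}{"\n"}{end}'
}
```

An empty result means no debug container has been added to that pod.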
If you want to delete the entire pod immediately (e.g., after test debugging when the issue is fixed), run:
kubectl delete pod <pod-name> -n <namespace>
But be cautious: if the pod is managed by a controller (Deployment, StatefulSet), it will be recreated. Ensure that changes (e.g., fixing the image) have already been applied.
Verify the Result
Successful debugging means you have identified the root cause. For example:
- Found an error in application logs (e.g., Connection refused).
- Discovered the container lacks file access due to incorrect permissions.
- Saw a process crashing due to memory shortage (OOM).
- Identified missing or incorrect application configuration.
After fixing (updating the Deployment, changing a ConfigMap, increasing resources), restart the pod and ensure it transitions to Running without restarts. Monitor logs in real-time:
kubectl logs -f <pod-name> -n <namespace>
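The "Running without restarts" check can also be scripted. The sketch below waits for the Ready condition and then prints the restart count of each container; the function name verify_pod and the 120s default timeout are assumptions. If the pod is managed by a controller, pass the name of the new pod, since the old name no longer exists after a restart.

```shell
# Hypothetical helper: verify_pod <pod-name> <namespace> [timeout]
# Waits until the pod is Ready, then prints the restart counts of its containers.
verify_pod() {
  pod="$1"
  ns="$2"
  timeout="${3:-120s}"
  kubectl wait --for=condition=Ready "pod/$pod" -n "$ns" --timeout="$timeout" || return 1
  echo "restart counts:"
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{.status.containerStatuses[*].restartCount}'
  echo
}
```

A restart count that keeps growing between two runs indicates the fix did not hold.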
Potential Issues
Access Denied (Forbidden)
If you receive Error from server (Forbidden): pods "<pod-name>" is forbidden, your user or service account lacks sufficient permissions. Ephemeral containers require the pods/ephemeralcontainers permission. Contact your cluster administrator to add a role, for example:
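apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: <namespace>
  name: pod-debugger
rules:
  # kubectl debug reads the pod, then patches its ephemeralcontainers subresource
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list"]
  - apiGroups: [""]
    resources: ["pods/ephemeralcontainers"]
    verbs: ["patch", "update"]
  # needed to attach to the interactive (-it) session
  - apiGroups: [""]
    resources: ["pods/attach"]
    verbs: ["create", "get"]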
Then bind the role to your account.
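Binding and verifying can be done from the command line by a cluster administrator. The sketch below assumes the Role is named pod-debugger as above; the binding name pod-debugger-binding and the function names are hypothetical. kubectl debug adds the container by patching the pods/ephemeralcontainers subresource, so that is the permission the check asks about. The --as flag relies on impersonation rights, which regular users usually do not have.

```shell
# Hypothetical helper: bind_debug_role <user> <namespace>
# Binds the pod-debugger Role to a user in the given namespace.
bind_debug_role() {
  user="$1"
  ns="$2"
  kubectl create rolebinding pod-debugger-binding \
    --role=pod-debugger --user="$user" -n "$ns"
}

# Hypothetical helper: can_debug <user> <namespace>
# Asks the API server whether the user may patch pods/ephemeralcontainers.
can_debug() {
  user="$1"
  ns="$2"
  kubectl auth can-i patch pods --subresource=ephemeralcontainers \
    -n "$ns" --as="$user"
}
```

can_debug prints yes or no, which makes it easy to confirm the binding before retrying kubectl debug.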
Debug Image Unavailable
If busybox or another debug image cannot be pulled, check:
- Image availability: run docker pull busybox locally.
- ImagePullSecret settings in the namespace or pod if a private registry is used.
- Use the full image name, including the registry: kubectl debug -it <pod-name> --image=registry.example.com/busybox:latest.
Pod Does Not Support Ephemeral Containers
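Ephemeral containers are enabled by default from Kubernetes 1.23 (beta) and have been stable since 1.25. On older clusters the EphemeralContainers feature gate must be enabled explicitly. Check your client and server versions:
kubectl version
If the server version is below 1.23 and the feature gate is off, kubectl debug cannot add ephemeral containers. Alternatives: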
- Temporarily modify the Deployment to add a sidecar container with debugging tools.
- Use kubectl exec in a running container (if it is still alive).
- Launch a new pod with the same volumes and namespace for diagnostics.
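For the last alternative, kubectl debug can create the diagnostic pod itself with --copy-to, which copies the pod and adds a regular debug container to the copy, so it does not depend on ephemeral container support. A minimal sketch; the function name debug_copy and the "-debug" suffix for the copy are illustrative choices.

```shell
# Hypothetical helper: debug_copy <pod-name> <namespace> [image]
# Creates a copy of the pod named <pod-name>-debug with an extra debug container
# and a shared process namespace, then attaches to it.
debug_copy() {
  pod="$1"
  ns="$2"
  image="${3:-busybox}"
  kubectl debug "$pod" -n "$ns" -it --image="$image" \
    --copy-to="${pod}-debug" --share-processes
}
```

The copy is an independent pod, so delete it with kubectl delete pod once you are done.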
Container Name Conflict
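By default, kubectl debug generates a unique name for the debug container (debugger-<suffix>). If you set the name explicitly and the pod already has a container with that name, the command will fail. Choose a different name with --container (-c):
kubectl debug -it <pod-name> --image=busybox --container=debugger-2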
Debug Container Cannot See Primary Container Processes
By default, an ephemeral container has its own PID namespace and cannot see the primary container's processes. To see them, start the debug container with --target=<container-name>, which joins that container's process namespace (the container runtime must support it). The --share-processes flag does not help here because it only applies together with --copy-to. Processes are also visible if the pod was created with shareProcessNamespace: true, which you can check in the pod spec:
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.spec.shareProcessNamespace}'
If this prints nothing or false and your runtime does not support --target, use kubectl exec directly to view the primary container's processes, provided the container is still running.