Debugging Kubernetes Pods: A Step-by-Step Guide with kubectl debug

Introduction / Why This Is Needed

In Kubernetes, a pod is the smallest deployable unit, and when things go wrong, debugging becomes critical. The kubectl debug command lets you launch a temporary ephemeral container inside a problematic pod to inspect processes, network settings, and files without restarting the pod or changing its existing containers. This guide shows how to use kubectl debug for diagnostics, from checking the pod's status to gathering key data.

Prerequisites / Preparation

Before you begin, ensure that:

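kubectl (version 1.20+) is installed and configured with access to your Kubernetes cluster (kubectl cluster-info). The cluster itself should run Kubernetes 1.23 or later, where ephemeral containers are enabled by default.
Your account has permissions to add ephemeral containers in the target namespace (the pods/ephemeralcontainers subresource, plus pods/attach for an interactive session).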
You know the pod name and namespace you need to debug.
Familiarity with basic kubectl commands (get, describe, logs) is recommended.

Step-by-Step Guide

Step 1: Check Pod Status and Gather Information

First, get general information about the pod to understand its state:

kubectl get pods -n <namespace>

Pay attention to the STATUS field (e.g., Running, CrashLoopBackOff, Error) and RESTARTS.

Then, describe the pod for details on events, containers, and conditions:

kubectl describe pod <pod-name> -n <namespace>

In the output, look for the Events, Containers, and Conditions sections. Common issues: Failed events, ImagePullBackOff, OOMKilled.

Finally, check the container logs. If the pod has multiple containers, use --all-containers:

kubectl logs <pod-name> -n <namespace> --all-containers=true

If the pod is restarting, add --previous to see logs from the previous container instance:

kubectl logs <pod-name> -n <namespace> --previous --all-containers=true
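
The Step 1 commands can be bundled so that their output is saved and compared later. This is a minimal sketch, not a kubectl feature: the function name collect_pod_info and the output file names are assumptions chosen for illustration.

```shell
# collect_pod_info is a hypothetical helper, not part of kubectl.
# Usage: collect_pod_info <namespace> <pod-name> [output-dir]
collect_pod_info() {
  local ns="$1"
  local pod="$2"
  local out="${3:-./pod-debug-$pod}"
  mkdir -p "$out" || return 1

  # Same commands as in Step 1, with output redirected to files.
  kubectl get pod "$pod" -n "$ns" -o wide > "$out/get.txt" 2>&1
  kubectl describe pod "$pod" -n "$ns" > "$out/describe.txt" 2>&1
  kubectl logs "$pod" -n "$ns" --all-containers=true > "$out/logs.txt" 2>&1

  # --previous returns an error when no container has restarted yet,
  # so a failure here is not treated as fatal.
  kubectl logs "$pod" -n "$ns" --all-containers=true --previous \
    > "$out/logs-previous.txt" 2>&1 || true

  # Print the directory so the caller can locate the files.
  echo "$out"
}
```

Called as collect_pod_info my-namespace my-pod, it writes four text files into ./pod-debug-my-pod.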

Step 2: Launch a Debug Container with kubectl debug

The kubectl debug command adds an ephemeral container to an existing pod. An ephemeral container is a temporary container that shares the pod's network and IPC namespaces but runs from its own image, with its own root file system. It also has its own PID namespace unless you pass --target or the pod spec sets shareProcessNamespace: true.

Use an image with diagnostic tools. busybox is suitable for basic commands (sh, ps, netstat). For network debugging, you can use nicolaka/netshoot.

kubectl debug -it <pod-name> -n <namespace> --image=busybox

Options:

-it: interactive mode with TTY.
--image: the image for the debug container.
--target=<container-name>: join the process (PID) namespace of the named container, so that its processes are visible from the debug container. This requires support from the container runtime.
--share-processes: applies only together with --copy-to, where it enables process namespace sharing in a copy of the pod. It is not used when adding an ephemeral container to a running pod.

Example with a target container:

kubectl debug -it <pod-name> -n <namespace> --image=busybox --target=app-container

After running the command, you will be inside the debug container's shell.

Step 3: Diagnostics Inside the Debug Container

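Inside the debug container, the root file system is that of the debug image, while the network is shared with the rest of the pod. If you started the session with --target, you can also see the target container's processes and reach its file system through /proc/<pid>/root (typically /proc/1/root, since the target's main process is usually PID 1 in its own namespace). Run the following commands to gather information:

Processes: check which processes are running. If you used --target, you will see the target container's processes.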
```
ps aux
top  # if installed
```

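File System: examine the structure, permissions, and presence of critical files. Plain paths such as / belong to the debug image. The target container's files are under /proc/1/root (requires --target, and access may be denied if the two containers run as different users).

df -h  # disk usage as seen from the debug container
ls -la /proc/1/root/  # root file system of the target container
ls -la /proc/1/root/app  # e.g., application directory
cat /proc/1/root/etc/os-release  # OS information in the target container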

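Network: check network connections and port availability. The debug container shares the pod's network namespace, so localhost reaches the application.

netstat -tuln  # listening ports
wget -qO- http://localhost:8080  # if the application listens on port 8080; busybox ships wget rather than curl
ip addr  # network interfaces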

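Environment Variables: env prints the debug container's own variables. To see the variables the application process was started with, read its environ file (requires --target).
```
env
tr '\000' '\n' < /proc/1/environ
```

Application Logs: if logs are written to a file in the target container, read it through /proc/1/root.
```
tail -f /proc/1/root/var/log/app.log
```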
Additional: if the image contains utilities (e.g., strace, lsof), use them for deeper diagnostics.
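
The checks above can be gathered into one function to paste into the debug shell. This is a sketch under stated assumptions: the session was started with --target, so the application's main process is visible (usually as PID 1), and the debug user is allowed to read /proc/<pid>/root. The function name inspect_target is hypothetical.

```shell
# inspect_target is a hypothetical helper to paste into the debug shell.
# Usage: inspect_target [pid]   (pid defaults to 1)
inspect_target() {
  local pid="${1:-1}"
  local root="/proc/$pid/root"

  echo "== command line of PID $pid =="
  # /proc/<pid>/cmdline separates arguments with NUL bytes.
  cat "/proc/$pid/cmdline" 2>/dev/null | tr '\000' ' '
  echo

  echo "== environment of PID $pid =="
  # /proc/<pid>/environ is NUL-separated as well.
  cat "/proc/$pid/environ" 2>/dev/null | tr '\000' '\n'

  echo "== target file system: OS release =="
  cat "$root/etc/os-release" 2>/dev/null || echo "(not readable)"

  echo "== target file system: top-level entries =="
  ls -la "$root/" 2>/dev/null || echo "(not readable)"
}
```

If the pod uses shareProcessNamespace: true instead of --target, PID 1 is typically the pod's infrastructure container, so find the application's PID with ps first and pass it as the argument.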

Step 4: Analyze Cluster Logs and Events

Besides container logs, check cluster events related to the pod. Events can indicate scheduling issues, resource failures, or security errors.

kubectl get events -n <namespace> --field-selector involvedObject.name=<pod-name> --sort-by='.lastTimestamp'

Look for events with type Warning and messages like Failed or Unhealthy.

If the pod crashed, ensure you reviewed logs from the previous run (as in Step 1). Also, check kubelet logs on the node where the pod runs (requires node access).
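
To narrow the event list to warnings and to find out which node's kubelet logs are relevant, the two queries can be wrapped as follows. This is a sketch: the function names pod_warnings and pod_node are assumptions chosen for illustration.

```shell
# pod_warnings and pod_node are hypothetical helper names.

# Show only Warning events for one pod, oldest first.
# Usage: pod_warnings <namespace> <pod-name>
pod_warnings() {
  local ns="$1"
  local pod="$2"
  # Field selectors are combined with commas; type=Warning drops Normal events.
  kubectl get events -n "$ns" \
    --field-selector "involvedObject.name=$pod,type=Warning" \
    --sort-by='.lastTimestamp'
}

# Print the node the pod is scheduled on (where its kubelet logs live).
# Usage: pod_node <namespace> <pod-name>
pod_node() {
  local ns="$1"
  local pod="$2"
  kubectl get pod "$pod" -n "$ns" -o jsonpath='{.spec.nodeName}'
}
```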

Step 5: End Debugging and Clean Up

After gathering the necessary data, exit the debug container:

exit
# or Ctrl+D

Exiting the shell stops the ephemeral container, but an ephemeral container cannot be removed from a pod once added. Its entry stays in the pod spec in a terminated state (visible under Ephemeral Containers in kubectl describe pod) until the pod itself is deleted or replaced, for example during a rolling update of its Deployment. A stopped ephemeral container runs no processes, so no manual cleanup is needed.
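
Ephemeral containers are recorded in the pod spec under spec.ephemeralContainers, so you can list the debug containers that have been added to a pod. A minimal sketch, with the hypothetical function name list_ephemeral_containers:

```shell
# list_ephemeral_containers is a hypothetical helper name.
# Usage: list_ephemeral_containers <namespace> <pod-name>
list_ephemeral_containers() {
  local ns="$1"
  local pod="$2"
  # Prints one ephemeral container name per line; prints nothing
  # if the pod has never had a debug container attached.
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{range .spec.ephemeralContainers[*]}{.name}{"\n"}{end}'
}
```

The current state of each one (running or terminated) appears in the Ephemeral Containers section of kubectl describe pod.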

If you want to delete the entire pod immediately (e.g., after test debugging when the issue is fixed), run:

kubectl delete pod <pod-name> -n <namespace>

But be cautious: if the pod is managed by a controller (Deployment, StatefulSet), it will be recreated. Ensure that changes (e.g., fixing the image) have already been applied.

Verify the Result

Successful debugging means you have identified the root cause. For example:

Found an error in application logs (e.g., Connection refused).
Discovered the container lacks file access due to incorrect permissions.
Saw a process crashing due to memory shortage (OOM).
Identified missing or incorrect application configuration.

After fixing (updating the Deployment, changing a ConfigMap, increasing resources), restart the pod and ensure it transitions to Running without restarts. Monitor logs in real-time:

kubectl logs -f <pod-name> -n <namespace>
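
The check for "Running without restarts" can also be scripted. The sketch below waits for the Ready condition and then prints the restart count of each container. The function name verify_pod and the default timeout of 120s are assumptions. For a pod managed by a Deployment, pass the name of the replacement pod, since the fixed pod gets a new name.

```shell
# verify_pod is a hypothetical helper name.
# Usage: verify_pod <namespace> <pod-name> [timeout]
verify_pod() {
  local ns="$1"
  local pod="$2"
  local timeout="${3:-120s}"

  # Block until the pod reports Ready, or fail after the timeout.
  kubectl wait --for=condition=Ready "pod/$pod" -n "$ns" --timeout="$timeout" || return 1

  # Print "<container-name> <restart-count>" for each container.
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.restartCount}{"\n"}{end}'
}
```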

Potential Issues

Access Denied (Forbidden)

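If you receive Error from server (Forbidden): pods "<pod-name>" is forbidden, your user or service account lacks sufficient permissions. Adding an ephemeral container requires the patch verb on the pods/ephemeralcontainers subresource. An interactive session also needs create on pods/attach and read access to pods. Contact your cluster administrator to add a role along these lines:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: <namespace>
  name: pod-debugger
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["pods/ephemeralcontainers"]
  verbs: ["patch", "update"]
- apiGroups: [""]
  resources: ["pods/attach"]
  verbs: ["create"]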

Then bind the role to your account.
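
Binding the role and checking the result can be done from the command line. This is a sketch for a cluster administrator: the binding name pod-debugger-binding and both function names are assumptions, and the role name pod-debugger is taken from the example above.

```shell
# bind_pod_debugger and can_debug are hypothetical helper names.

# Bind the pod-debugger Role to a user in one namespace.
# Usage: bind_pod_debugger <namespace> <user>
bind_pod_debugger() {
  local ns="$1"
  local user="$2"
  kubectl create rolebinding pod-debugger-binding \
    --role=pod-debugger --user="$user" -n "$ns"
}

# Ask the API server whether a user may patch ephemeral containers.
# Usage: can_debug <namespace> <user>
can_debug() {
  local ns="$1"
  local user="$2"
  # --as impersonates the user, which itself requires admin-level rights.
  kubectl auth can-i patch pods --subresource=ephemeralcontainers \
    -n "$ns" --as="$user"
}
```

can_debug prints yes or no. To check your own account instead, run the same kubectl auth can-i command without --as.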

Debug Image Unavailable

If the busybox or other image cannot be pulled, check:

Image availability: docker pull busybox locally.
ImagePullSecret settings in the namespace or pod if a private registry is used.
Use the full image name, including the registry: kubectl debug -it <pod-name> --image=registry.example.com/busybox:latest.

Pod Does Not Support Ephemeral Containers

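Ephemeral containers are enabled by default starting with Kubernetes 1.23 and are stable since 1.25. On older clusters (1.16 to 1.22) the feature is alpha and works only if the EphemeralContainers feature gate is turned on. Check your client and server versions (recent kubectl releases no longer accept the --short flag):

kubectl version

If the server version is below 1.23 and the feature gate is off, kubectl debug cannot add ephemeral containers. Alternatives: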

Temporarily modify the Deployment to add a sidecar container with debugging tools.
Use kubectl exec in a running container (if it is still alive).
Launch a new pod with the same volumes and namespace for diagnostics.
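
For the last alternative, kubectl debug can create the new pod itself: with --copy-to it copies the pod and adds a debug container to the copy, and --share-processes enables process namespace sharing in that copy. A minimal sketch, where the function name debug_pod_copy and the -debug name suffix are assumptions:

```shell
# debug_pod_copy is a hypothetical helper name.
# Usage: debug_pod_copy <namespace> <pod-name> [image]
debug_pod_copy() {
  local ns="$1"
  local pod="$2"
  local image="${3:-busybox}"
  # Creates a copy named <pod-name>-debug with an extra debug container
  # that shares the process namespace with the copied containers.
  kubectl debug "$pod" -n "$ns" -it \
    --image="$image" \
    --copy-to="${pod}-debug" \
    --share-processes
}
```

The copy is a regular pod that no controller manages, so delete it when you are done: kubectl delete pod <pod-name>-debug -n <namespace>.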

Container Name Conflict

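By default, kubectl generates a name for the debug container. If you choose a name yourself and the pod already has a container with that name (including an ephemeral container left over from an earlier debug session), the command will fail. Specify a unique name with --container (-c):

kubectl debug -it <pod-name> --image=busybox --container=debugger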

Debug Container Cannot See Primary Container Processes

By default, an ephemeral container has its own PID namespace and cannot see the primary container's processes. To see them, pass --target=<container-name>, which places the debug container in that container's process namespace. This depends on container runtime support. Alternatively, all containers in a pod share one PID namespace if the pod spec sets shareProcessNamespace: true. Check the pod's configuration:

kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.spec.shareProcessNamespace}'

If the output is empty or false and your runtime does not support --target, use kubectl exec directly if the container is still running, or debug a copy of the pod with --copy-to and --share-processes.
