What the CrashLoopBackOff Error Means
CrashLoopBackOff is a Kubernetes pod status in which a container repeatedly fails (exits with a non-zero code) and the kubelet keeps restarting it without success. After each failed attempt, the delay before the next restart grows exponentially (10s, 20s, 40s, and so on, up to a cap of five minutes), which is where the "BackOff" in the name comes from.
The error appears in the output of `kubectl get pods`:

```
NAME                    READY   STATUS             RESTARTS   AGE
my-app-5d89d7c8b9-xyz   0/1     CrashLoopBackOff   5          2m
```
This is not a Kubernetes error itself, but an indicator that the container cannot start reliably. The problem almost always lies in the pod configuration or the application itself.
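To see the status in action, a minimal pod whose container exits immediately will cycle into CrashLoopBackOff. This is an illustrative sketch; the pod and container names are made up:

```yaml
# Demo pod that exits with code 1 on startup; the kubelet restarts it
# with growing backoff delays, producing the CrashLoopBackOff status.
apiVersion: v1
kind: Pod
metadata:
  name: crashloop-demo
spec:
  restartPolicy: Always        # the default; required for the restart loop
  containers:
    - name: crasher
      image: busybox:1.36
      command: ["sh", "-c", "echo 'failing on purpose'; exit 1"]
```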
Common Causes
- Application error — the application crashes on startup (e.g., due to incorrect arguments, missing environment variables, code errors).
- Insufficient resources — the container lacks memory (OOMKilled) or CPU, causing a forced termination.
- Incorrect startup command — the pod manifest specifies an invalid `command` or `args`, or the image lacks an ENTRYPOINT.
- Image issues — non-existent image tag, authentication error in a private registry, or a corrupted image.
- Unavailable volumes — a volume (PersistentVolumeClaim) is not mounted, unavailable, or has incorrect permissions.
- Insufficient privileges — the container lacks permissions (SecurityContext, ServiceAccount) to access resources (e.g., a port <1024).
- Port conflict — the application tries to listen on a port already occupied by another process in the container or on the host.
- Initialization problems — init containers failed and did not provide the resources the main container needs.
Method 1: Diagnosis via Logs and Pod Description
The first and most crucial step is to gather information about the error.
- Find the pod name:

  ```bash
  kubectl get pods
  ```

  Locate the pod with the `CrashLoopBackOff` status and note its full name (e.g., `my-app-5d89d7c8b9-xyz`).

- View logs from the previous container instance:

  ```bash
  kubectl logs <pod-name> --previous
  ```

  The `--previous` flag shows logs from the container instance that already terminated. If the pod has multiple containers, specify the container name with `-c <container-name>`.

- Examine the pod's Events:

  ```bash
  kubectl describe pod <pod-name>
  ```

  In the output, find the `Events` section. Common errors:
  - `OOMKilled` — memory shortage.
  - `Failed` or `ErrImagePull` — image problems.
  - `FailedMount` — volume mounting error.
  - `Exited` with an error code — application error.

- If the application logs to stdout/stderr, the logs will appear in `kubectl logs`. If logs are written to a file inside the container, try connecting to the container (if possible):

  ```bash
  kubectl exec -it <pod-name> -- /bin/sh
  ```

  However, with `CrashLoopBackOff` the container may be unavailable for `exec`. In that case, use `--previous` for logs, or temporarily modify the manifest so the container stays alive (e.g., run `sleep infinity`).
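The `sleep infinity` trick can be sketched as a temporary override in the container spec; revert it after debugging (the container name and image follow the examples in this article):

```yaml
spec:
  containers:
    - name: my-app
      image: my-image:latest
      # Temporarily replace the real entrypoint so the container
      # stays running and `kubectl exec` works for debugging.
      command: ["sleep", "infinity"]
```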
Method 2: Checking and Increasing Resources (Memory/CPU)
A frequent cause is the container running out of memory, causing the kernel to kill the process (OOMKilled).
- In `kubectl describe pod <pod-name>`, look for:

  ```
  Limits:
    memory: 256Mi
  Requests:
    memory: 128Mi
  ```

  If the `memory` limit is too low (e.g., 128Mi) for a Java application or database, increase it.

- Modify the pod manifest (Deployment/StatefulSet):

  ```yaml
  spec:
    containers:
      - name: my-app
        resources:
          limits:
            memory: "512Mi"
            cpu: "500m"
          requests:
            memory: "256Mi"
            cpu: "250m"
  ```

  Apply the changes: `kubectl apply -f deployment.yaml`

- If the issue is CPU, increase `limits.cpu` and `requests.cpu`. Note that CPU limits can cause throttling, but not OOMKilled.

- For diagnostics, you can temporarily remove the limits (not in production!) to check whether the error disappears:

  ```yaml
  resources:
    limits: {}
    requests: {}
  ```
Method 3: Checking the Startup Command and Arguments
If the application requires specific arguments or environment variables that aren't provided, it may crash.
- Check which command is executed in the container:

  ```bash
  kubectl describe pod <pod-name> | grep -A5 "Command:"
  kubectl describe pod <pod-name> | grep -A5 "Args:"
  ```

  If the command or arguments are incorrect, the container will terminate immediately.

- Example problem: the Dockerfile specifies `ENTRYPOINT ["java", "-jar", "app.jar"]`, but the pod manifest overrides it with `command: ["python"]` — this will cause an error.

- Fix the manifest:

  ```yaml
  spec:
    containers:
      - name: my-app
        image: my-image:latest
        command: ["java"]          # optional, if you want to override ENTRYPOINT
        args: ["-jar", "app.jar"]  # optional, if you want to override CMD
        env:
          - name: DB_HOST
            value: "postgres"
  ```

  Ensure the command exists in the image (check the Dockerfile).

- If the application depends on environment variables, add all required ones in the `env` section and check for typos.
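If the required environment variables hold credentials or shared settings, they can be referenced from a ConfigMap or Secret instead of being hard-coded. A sketch, assuming a ConfigMap named `my-app-config` and a Secret named `my-app-secrets` already exist (both names are hypothetical):

```yaml
spec:
  containers:
    - name: my-app
      image: my-image:latest
      env:
        - name: DB_HOST
          valueFrom:
            configMapKeyRef:
              name: my-app-config   # hypothetical ConfigMap
              key: db_host
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: my-app-secrets  # hypothetical Secret
              key: db_password
```

Note that a missing ConfigMap or Secret key surfaces as a container creation error in `kubectl describe pod`, which makes the root cause easier to spot than a silent startup crash.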
Method 4: Checking the Image and Its Availability
Image issues are one of the most common causes of CrashLoopBackOff.
- Verify the image exists and is accessible:

  ```bash
  kubectl describe pod <pod-name> | grep -i "image"
  ```

  Look for events like `ErrImagePull` or `ImagePullBackOff`.

- Ensure the image tag is correct:
  - Avoid using `latest` in production (it can change).
  - Check that the image exists in the registry: `docker pull <image:tag>` (if you have access).

- If the image is in a private registry, ensure you have created an `imagePullSecret`:

  ```bash
  kubectl get secret
  ```

  If the secret doesn't exist, create it:

  ```bash
  kubectl create secret docker-registry regcred \
    --docker-server=<registry-server> \
    --docker-username=<username> \
    --docker-password=<password>
  ```

  And add it to the pod/deployment:

  ```yaml
  spec:
    imagePullSecrets:
      - name: regcred
  ```

- Check that the image is compatible with the node's architecture. For example, an `amd64` image won't run on an `arm64` node.
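To avoid scheduling an image onto a node with an incompatible CPU architecture, the pod can be pinned with a `nodeSelector`. A sketch using the standard `kubernetes.io/arch` label, which the kubelet sets on every node:

```yaml
spec:
  nodeSelector:
    kubernetes.io/arch: amd64   # schedule only on amd64 nodes
  imagePullSecrets:
    - name: regcred             # secret created in the step above
  containers:
    - name: my-app
      image: my-image:latest
```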
Method 5: Checking Volumes and SecurityContext
If the container fails to mount a volume or access a resource due to permissions, it will crash.
- Check for `FailedMount` events:

  ```bash
  kubectl describe pod <pod-name> | grep -A10 "Events"
  ```

  Errors like `MountVolume.SetUp` or `failed to mount volumes` indicate issues with the PVC or volume configuration.

- Ensure the PVC exists and is in `Bound` status:

  ```bash
  kubectl get pvc
  ```

- Review the SecurityContext:
  - If the container runs as `root` (UID 0) but the application requires a non-privileged user, configure `runAsUser`.
  - If you need to listen on a port below 1024, `CAP_NET_BIND_SERVICE` is required:

    ```yaml
    securityContext:
      capabilities:
        add: ["NET_BIND_SERVICE"]
    ```

  - Verify that `fsGroup` is set correctly for volume access.

- Example fix for a volume:

  ```yaml
  spec:
    containers:
      - name: my-app
        volumeMounts:
          - name: data
            mountPath: /data
    volumes:
      - name: data
        persistentVolumeClaim:
          claimName: my-pvc
  ```

  Ensure `my-pvc` exists and has sufficient capacity.
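The permission-related settings above can be combined in one manifest. A sketch assuming the application runs as UID 1000 and needs group write access to the mounted volume (the UID and GID values are illustrative):

```yaml
spec:
  securityContext:
    runAsUser: 1000     # run as a non-privileged user
    fsGroup: 2000       # group ownership applied to mounted volumes
  containers:
    - name: my-app
      image: my-image:latest
      securityContext:
        capabilities:
          add: ["NET_BIND_SERVICE"]   # only if a port below 1024 is needed
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-pvc
```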
Prevention
- Test the image locally before deployment:

  ```bash
  docker run --rm -it <image:tag> <command>
  ```

  Ensure the container starts and doesn't crash.

- Specify explicit commands and arguments in the manifest, even if they match the Dockerfile. This eliminates implicit dependencies.

- Configure reasonable resource limits based on monitoring (e.g., via `kubectl top pod`). Start with a 20-30% buffer.

- Use readiness/liveness probes — they won't prevent CrashLoopBackOff, but they help distinguish startup issues from runtime health problems.

- Enable logging to stdout/stderr and collect it via a DaemonSet (Fluentd, Filebeat) — this simplifies diagnostics.

- Validate manifests with `kubectl apply --dry-run=client` before applying.

- For critical applications, consider using `restartPolicy: OnFailure` (for Jobs) or configuring `backoffLimit` in a Job to limit restart attempts.
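For an HTTP service, the probes mentioned above might look like this. The paths, port, and timings are illustrative and should be tuned to the application's actual startup time:

```yaml
spec:
  containers:
    - name: my-app
      image: my-image:latest
      # Allows a slow-starting app up to 30 x 5s = 150s before
      # liveness checks begin, avoiding premature restarts.
      startupProbe:
        httpGet:
          path: /healthz        # hypothetical health endpoint
          port: 8080
        failureThreshold: 30
        periodSeconds: 5
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 10
      readinessProbe:
        httpGet:
          path: /ready          # hypothetical readiness endpoint
          port: 8080
        periodSeconds: 5
```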
💡 Tip: If the application only crashes in Kubernetes but works locally, compare the environments: variables, volumes, network, node architecture. Often, the issue stems from differences between `docker run` and a Kubernetes pod.