Debugging workloads¶
If you have problems getting your pods running you should check out the official documentation from Kubernetes:
Info
Applications in the context above is not the NAIS applications.
Debugging a NAIS applications resource is done with kubectl describe application $app_name
.
kubectl
debug¶
You can run an ephemeral container in a pod using the kubectl debug
command.
The following example starts a shell in a new ephemeral container in the my-pod-name
pod using the nais debug image:
kubectl debug -it my-pod-name --image="europe-north1-docker.pkg.dev/nais-io/nais/images/debug:latest" --profile=restricted
Once the ephemeral container is created, you will be presented with a shell prompt where you can try debugging the issue.
image capabilities
The specified --image
cant have more capabilities than the pod it is attached to and must be able to run as non-root.
kubectl
attach¶
If you want to attach to a running process inside a running container you need to use kubectl attach
.
You can read more about the command over at kubernetes.io/docs.
# Switch to raw terminal mode; sends stdin to 'bash' in ruby-container from pod mypod
# and sends stdout/stderr from 'bash' back to the client
kubectl attach mypod -c ruby-container -i -t
Debugging Memory Leaks¶
If you experience memory leaks in Java processes you can get heap dumps either automatically on OOM or on-demand.
Automatically on OOM¶
Set JAVA_OPTS
to -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp
The /tmp
volume is maintained through restarts, so if your app is restarting because of OOM, the heap dumps can be retrieved from there.
Manually on-demand¶
You can use jmap
to create a heap dump of a running Java process.
Find a pod and exec jmap
in it (assuming PID 1 is the Java process):
Getting the heap dump¶
You can use kubectl cp
to get the files from the pod to your local computer:
You can inspect the heap dumps with tools like JProfiler, VisualVM or IntelliJ.
FAQ¶
I get an HTTP 503 Service Unavailable error when visiting the ingress for my application, why?¶
Answer
This indicates that your application is not ready to serve traffic. This is usually due to one of the following:
- The application is not deployed to the cluster
- The application is not up and running. This can be caused by a problem with the application itself, for example:
- The application doesn't respond to any configured health checks
- The application only has a single pod or replica, and that pod is not running
- The application is configured incorrectly (e.g. has missing required dependencies, has the wrong image, etc.)
My application gets an HTTP 504 Gateway Timeout error when attempting to communicate with another application, why?¶
Answer
If you're using service discovery, ensure that the access policies for both applications are correctly set up.
Otherwise, ensure that the other application is running and responding to requests in a timely manner (see also ingress customization for timeout configuration).