In Kubernetes, probes and health checks are mechanisms used to determine the availability and readiness of applications running within the cluster. They are essential for ensuring the stability and reliability of the deployed services. Let's explore each concept:
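Probes are diagnostics that the kubelet performs periodically to assess the health of a container. There are three types: a liveness probe, which checks whether the container is still running properly and restarts it if not; a readiness probe, which checks whether the container is ready to receive traffic; and a startup probe, which checks whether the application inside the container has started.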
Probes are defined per container in the pod specification using the livenessProbe, readinessProbe, and startupProbe fields. Each probe has configurable parameters such as the probe mechanism (HTTP, TCP, gRPC, or exec), the path, the port, and the timeout. Kubernetes runs the probes periodically according to these settings and acts on the results.
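As a rough illustration of those fields, the sketch below puts all three probes on one container, each using a different mechanism. The pod name, image, port, paths, and command are placeholders made up for illustration and are not part of the example later in this post. The timing values are arbitrary.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo                    # hypothetical name
spec:
  containers:
  - name: app
    image: example.com/my-app:1.0     # placeholder image
    ports:
    - containerPort: 8080
    startupProbe:                     # other probes wait until this succeeds
      httpGet:
        path: /healthz/startup        # assumed endpoint
        port: 8080
      periodSeconds: 5
      failureThreshold: 30            # up to 5 * 30 = 150 seconds to start
    livenessProbe:                    # on failure the container is restarted
      tcpSocket:
        port: 8080
      periodSeconds: 20
      timeoutSeconds: 2
    readinessProbe:                   # on failure the pod stops receiving Service traffic
      exec:
        command: ["cat", "/tmp/ready"]  # assumed marker file
      periodSeconds: 10
```

Any field left out falls back to its Kubernetes default: periodSeconds is 10, timeoutSeconds is 1, and failureThreshold is 3.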
Health checks are mechanisms used to monitor the overall health and performance of a Kubernetes cluster. They ensure that the cluster components, including the control plane and worker nodes, are functioning correctly. Health checks can be classified into two categories:
Kubernetes provides built-in health check mechanisms for cluster components, and it's also possible to implement custom health checks using tools like Prometheus, Grafana, or other monitoring solutions.
By combining probes and health checks, Kubernetes ensures that applications and the underlying cluster infrastructure are in a healthy state, providing better reliability and resilience.
For the example, I will look specifically at setting up a pod with a livenessProbe and a readinessProbe. As I am predominantly a dotnet engineer, I have used a custom Docker image that implements health checks in a dotnet minimal API. You can check out the source code of this application at https://github.com/reggieray/dotnet-health-checks, and you can read how health checks work in dotnet in this Dotnet health checks blog post.
To run the example I had the following tools/software installed:
I also set up an alias for kubectl using the following PowerShell command:
Set-Alias -Name k -Value kubectl
Make sure minikube is up and running:
minikube start
Create a pod.yaml file with the following contents. As mentioned previously, it uses a custom-built image that I created specifically for demonstrating health checks; the source code is linked above.
I have defined a livenessProbe and a readinessProbe. Also pay attention to the environment variables, as they determine whether the health checks return healthy or unhealthy results. To start with, I have set them to values that make both checks return healthy.
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: healthz
  name: healthz
  namespace: default
spec:
  containers:
  - image: matthewregis/public:dotnet-health-checks
    imagePullPolicy: IfNotPresent
    name: healthz
    ports:
    - containerPort: 80
    livenessProbe:
      httpGet:
        path: /healthz/live
        port: 80
      initialDelaySeconds: 15
      periodSeconds: 20
    readinessProbe:
      httpGet:
        path: /healthz/ready
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
    env:
    - name: HealthCheck__MyCustomStartUpHealthCheck
      value: "True"
    - name: HealthCheck__UriCheck
      value: "https://matthewregis.dev"
    resources: {}
Apply the pod.yaml.
> k apply -f pod.yaml
pod/healthz created
Create a service that exposes the pod so we can manually test the API and its endpoints. Create a service.yaml file with the following:
apiVersion: v1
kind: Service
metadata:
  name: healthz
spec:
  selector:
    app: healthz
  ports:
  - name: http
    port: 80
    targetPort: 80
  type: NodePort
Then apply the service
> k apply -f service.yaml
service/healthz created
Create a tunnel with minikube
> minikube service healthz
|-----------|---------|-------------|---------------------------|
| NAMESPACE | NAME | TARGET PORT | URL |
|-----------|---------|-------------|---------------------------|
| default | healthz | http/80 | http://192.168.49.2:30904 |
|-----------|---------|-------------|---------------------------|
🏃 Starting tunnel for service healthz.
|-----------|---------|-------------|------------------------|
| NAMESPACE | NAME | TARGET PORT | URL |
|-----------|---------|-------------|------------------------|
| default | healthz | | http://127.0.0.1:52625 |
|-----------|---------|-------------|------------------------|
🎉 Opening service default/healthz in default browser...
❗ Because you are using a Docker driver on windows, the terminal needs to be open to run it.
This opened http://127.0.0.1:52625/ for me. To test the endpoints, I changed the path in the URL to each health endpoint, such as /healthz/live and /healthz/ready.
All the endpoints worked as expected.
You can also verify that the pod is working as expected by looking at its logs and events:
> k logs healthz
info: Microsoft.Hosting.Lifetime[14]
Now listening on: http://[::]:80
info: Microsoft.Hosting.Lifetime[0]
Application started. Press Ctrl+C to shut down.
info: Microsoft.Hosting.Lifetime[0]
Hosting environment: Production
info: Microsoft.Hosting.Lifetime[0]
Content root path: /app
warn: Microsoft.AspNetCore.HttpsPolicy.HttpsRedirectionMiddleware[3]
Failed to determine the https port for redirect.
info: System.Net.Http.HttpClient.UriCheck.LogicalHandler[100]
Start processing HTTP request GET https://matthewregis.dev/
### output shortened for brevity ###
> k describe pod healthz
### output shortened for brevity ###
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 7m52s default-scheduler Successfully assigned default/healthz to minikube
Normal Pulled 7m52s kubelet Container image "matthewregis/public:dotnet-health-checks" already present on machine
Normal Created 7m52s kubelet Created container healthz
Normal Started 7m51s kubelet Started container healthz
### if you have minikube dashboard, you can also use this to verify ###
> minikube dashboard
Remove the pod
> k delete pod healthz
pod "healthz" deleted
Update the following section of the pod definition with a URL that will fail. In my case, I removed the .dev at the end.
    - name: HealthCheck__UriCheck
      value: "https://matthewregis"
Re-apply the pod
> k apply -f pod.yaml
pod/healthz created
Now you should see that, because the livenessProbe returns an unhealthy result, Kubernetes repeatedly restarts the container to try to get it back into a healthy state.
> k describe pod healthz
### output shortened for brevity ###
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m54s default-scheduler Successfully assigned default/healthz to minikube
Normal Pulled 54s (x3 over 2m54s) kubelet Container image "matthewregis/public:dotnet-health-checks" already present on machine
Normal Killing 54s (x2 over 114s) kubelet Container healthz failed liveness probe, will be restarted
Normal Created 53s (x3 over 2m54s) kubelet Created container healthz
Normal Started 53s (x3 over 2m54s) kubelet Started container healthz
Warning Unhealthy 14s (x8 over 2m34s) kubelet Liveness probe failed: Get "http://10.244.0.153:80/healthz/live": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
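The "Client.Timeout exceeded" message means the probe request did not get a response within the probe's timeout. The pod.yaml above does not set timeoutSeconds or failureThreshold, so the Kubernetes defaults apply: a 1 second timeout and a restart after 3 consecutive failures. With periodSeconds at 20, that is roughly a minute of failed probes per restart, which is consistent with the Killing events above. As a sketch, the same livenessProbe with those defaults written out would look like this:

```yaml
    livenessProbe:
      httpGet:
        path: /healthz/live
        port: 80
      initialDelaySeconds: 15
      periodSeconds: 20
      timeoutSeconds: 1      # Kubernetes default, written out explicitly
      failureThreshold: 3    # Kubernetes default: restart after 3 consecutive failures
```

These are the values to adjust if a health endpoint is slow but otherwise healthy. Whether a longer timeout would change the outcome for this particular failing URL is not something shown here.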
> k delete pod healthz
pod "healthz" deleted
> k delete svc healthz
service "healthz" deleted