Application failure injection is a form of chaos engineering where we artificially increase the error rate of certain services in a microservice application to see what impact that has on the system as a whole. Traditionally, you would need to add some kind of failure injection library into your service code in order to do application failure injection. Thankfully, the service mesh gives us a way to inject application failures without needing to modify or rebuild our services at all.
We can easily inject application failures by using the Traffic Split API of the Service Mesh Interface. This allows us to do failure injection in a way that is implementation agnostic and works across service meshes.
We will do this by first deploying a new service that only returns errors: a simple NGINX server configured to respond only with HTTP 500.
We will then create a traffic split that tells the service mesh to redirect a percentage of a service's traffic to this error service. For example, by sending 20% of a service's traffic to the error service, we inject an artificial 20% error rate into that service.
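The arithmetic behind this can be sketched with a short simulation (hypothetical, entirely outside the mesh): if each request is routed to a backend in proportion to its weight, and one backend always fails, the overall error rate equals that backend's share of the traffic.

```python
import random

def route(backends, rng):
    """Pick a backend in proportion to its weight, the way a
    traffic split chooses where to send each request."""
    services, weights = zip(*backends.items())
    return rng.choices(services, weights=weights)[0]

# 80% of traffic to the real service, 20% to an always-failing one.
backends = {"books": 0.8, "error-injector": 0.2}
rng = random.Random(42)  # seeded for reproducibility

errors = sum(route(backends, rng) == "error-injector" for _ in range(10_000))
print(errors / 10_000)  # roughly 0.2
```

The observed error rate converges on the error backend's traffic share, which is exactly the effect the traffic split produces in the mesh.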
We will deploy Linkerd's Books application for this part of the demo.
Use Meshery to deploy the Books application: on the Linkerd adapter's page, enter `default` in the Namespace field, then click the Sample Application card and select Books Application from the list. Alternatively, inject Linkerd into the sample application from the command line:
```shell
linkerd inject https://run.linkerd.io/booksapp.yml | kubectl apply -f -
```
One of the services in this application has already been configured with an artificial error rate; let's remove it first:
```shell
kubectl edit deploy/authors
```
Remove the following lines:
```yaml
- name: FAILURE_RATE
  value: "0.5"
```
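These lines enable the "traditional" approach mentioned at the start: the failure injection lives in the service's own code, toggled by an environment variable. A hypothetical sketch of that pattern (the booksapp itself is not written in Python; this is illustrative only):

```python
import os
import random

# In-code failure injection, the approach the service mesh lets us avoid:
# the app reads FAILURE_RATE and fails that fraction of requests itself.
FAILURE_RATE = float(os.environ.get("FAILURE_RATE", "0"))

def handle_request(rng=random):
    """Return an HTTP status code, failing FAILURE_RATE of the time."""
    if rng.random() < FAILURE_RATE:
        return 500  # injected failure
    return 200      # normal response
```

The drawback is clear: changing the error rate means editing the deployment (or rebuilding the image), whereas a traffic split changes nothing inside the service.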
Now, if you check `linkerd stat`, the success rate should be 100%:
```shell
linkerd stat deploy
```
Now we will create our error service: an NGINX server pre-configured to respond only with the HTTP 500 status code.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: error-injector
  labels:
    app: error-injector
spec:
  selector:
    matchLabels:
      app: error-injector
  replicas: 1
  template:
    metadata:
      labels:
        app: error-injector
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
        - containerPort: 80
          name: nginx
          protocol: TCP
        volumeMounts:
        - name: nginx-config
          mountPath: /etc/nginx/nginx.conf
          subPath: nginx.conf
      volumes:
      - name: nginx-config
        configMap:
          name: error-injector-config
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: error-injector
  name: error-injector
spec:
  clusterIP: None
  ports:
  - name: service
    port: 7002
    protocol: TCP
    targetPort: nginx
  selector:
    app: error-injector
  type: ClusterIP
---
apiVersion: v1
data:
  nginx.conf: |2

    events {
      worker_connections 1024;
    }

    http {
      server {
        location / {
          return 500;
        }
      }
    }
kind: ConfigMap
metadata:
  name: error-injector-config
```
After deploying the error service, we will create a TrafficSplit resource that directs 20% of the books service's traffic to it.
```yaml
apiVersion: split.smi-spec.io/v1alpha3
kind: TrafficSplit
metadata:
  name: fault-inject
spec:
  service: books
  backends:
  - service: books
    weight: 800m
  - service: error-injector
    weight: 200m
```
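The `800m` and `200m` weights use Kubernetes-style "milli" notation (thousandths). SMI treats backend weights as relative, so only the ratio matters; a quick sketch of the normalization, assuming the mesh divides each weight by the sum:

```python
# TrafficSplit backend weights in milli-units: "800m" = 800, "200m" = 200.
weights = {"books": 800, "error-injector": 200}

# Normalize: each backend receives weight / total of the traffic.
total = sum(weights.values())
shares = {svc: w / total for svc, w in weights.items()}
print(shares)  # {'books': 0.8, 'error-injector': 0.2}
```

So `400m`/`100m` or `8`/`2` would produce the same 80/20 split.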
You can now see a 20% error rate for calls from webapp to books:
```shell
linkerd routes deploy/webapp --to service/books
```
You can also see the errors in a web browser:
```shell
kubectl port-forward deploy/webapp 7000 & open http://localhost:7000
```
If you refresh the page a few times, you will see an Internal Server Error for roughly one in five requests.
To remove the failure injection, simply delete the TrafficSplit resource:

```shell
kubectl delete trafficsplit/fault-inject
```
You can also remove the Books application from the Meshery Dashboard by clicking on the trash icon in the Sample Application card on the Linkerd adapter's page.