Fault Injection

Fault Injection using SMI in Linkerd

Application failure injection is a form of chaos engineering where we artificially increase the error rate of certain services in a microservice application to see what impact that has on the system as a whole. Traditionally, you would need to add some kind of failure injection library into your service code in order to do application failure injection. Thankfully, the service mesh gives us a way to inject application failures without needing to modify or rebuild our services at all.

Using SMI Traffic Split API to inject errors

We can easily inject application failures by using the Traffic Split API of the Service Mesh Interface. This allows us to do failure injection in a way that is implementation agnostic and works across service meshes.

We will do this by first deploying a new service that only returns error responses. We will use a simple NGINX service that has been configured to return only HTTP 500 responses.

We will then create a traffic split that instructs the service mesh to send a percentage of a service's traffic to the error service instead. For example, sending 20% of the service's traffic to the error service injects an artificial 20% error rate into that service.
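A traffic split expresses these shares as relative weights on each backend. As a minimal sketch of the arithmetic, the hypothetical helper below (backend_share is not part of Linkerd or SMI) converts a backend's weight into the percentage of traffic it receives, using 800 and 200 to stand for the 800m and 200m weights in the TrafficSplit manifest later in this chapter.

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only: print the percentage of traffic
# a backend receives, given its weight and the sum of all backend weights.
# Weights are passed as plain integers (200 stands for "200m").
backend_share() {
  local weight=$1 total=$2
  # Integer arithmetic: multiply before dividing so small weights
  # are not truncated to zero.
  echo $(( weight * 100 / total ))
}

backend_share 200 1000   # error service: prints 20
backend_share 800 1000   # original service: prints 80
```

Only the ratio between the weights matters here, so the share of the error service is its weight divided by the total of both weights.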

Deploy Linkerd Books Application

We will deploy the Linkerd Books application for this part of the demo.

Use Meshery to deploy the Books application:

  • In Meshery, navigate to the Linkerd adapter's management page from the left nav menu.
  • On the Linkerd adapter's management page, please enter default in the Namespace field.
  • Then, click the (+) icon on the Sample Application card and select Books Application from the list.

Inject Linkerd into the sample application:

linkerd inject https://run.linkerd.io/booksapp.yml | kubectl apply -f -

One of the services in this application has already been configured with a failure rate. Let's remove it by editing the authors deployment:

kubectl edit deploy/authors

Remove the following lines:

- name: FAILURE_RATE
  value: "0.5"

Now, if you check linkerd stat, the success rate should be 100%:

linkerd stat deploy

Create the errored service

Now we will create our error service. NGINX is pre-configured to respond only with the HTTP 500 status code.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: error-injector
  labels:
    app: error-injector
spec:
  selector:
    matchLabels:
      app: error-injector
  replicas: 1
  template:
    metadata:
      labels:
        app: error-injector
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
        - containerPort: 80
          name: nginx
          protocol: TCP
        volumeMounts:
        - name: nginx-config
          mountPath: /etc/nginx/nginx.conf
          subPath: nginx.conf
      volumes:
      - name: nginx-config
        configMap:
          name: error-injector-config
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: error-injector
  name: error-injector
spec:
  clusterIP: None
  ports:
  - name: service
    port: 7002
    protocol: TCP
    targetPort: nginx
  selector:
    app: error-injector
  type: ClusterIP
---
apiVersion: v1
data:
  nginx.conf: |2

    events {
        worker_connections 1024;
    }

    http {
        server {
            location / {
                return 500;
            }
        }
    }
kind: ConfigMap
metadata:
  name: error-injector-config
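
One way to deploy these resources is to save the manifest to a file and apply it with kubectl. The sketch below assumes the manifest was saved as error-injector.yaml (a hypothetical file name) and wraps the two commands in a hypothetical function, deploy_error_injector, so that nothing runs until you call it.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: apply the error service manifest and wait for it.
# The default file name error-injector.yaml is an assumption; pass another
# path as the first argument if you saved the manifest elsewhere.
deploy_error_injector() {
  local manifest=${1:-error-injector.yaml}
  # Create or update the Deployment, Service, and ConfigMap.
  kubectl apply -f "$manifest"
  # Block until the Deployment's replica has rolled out.
  kubectl rollout status deploy/error-injector
}
```

Calling deploy_error_injector with no arguments applies error-injector.yaml to the namespace of your current kubectl context.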

After deploying the error service, we will create a TrafficSplit resource that directs 20% of the traffic destined for the books service to the error service.

apiVersion: split.smi-spec.io/v1alpha3
kind: TrafficSplit
metadata:
  name: fault-inject
spec:
  service: books
  backends:
  - service: books
    weight: 800m
  - service: error-injector
    weight: 200m
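
To experiment with a different error rate, the same manifest can be generated from a single percentage. The following is a sketch only: render_split is a hypothetical helper, it assumes a whole-number percentage between 0 and 100, and it reuses the resource names and API version from the manifest above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a TrafficSplit manifest that sends the given
# percentage of books traffic to error-injector and the rest to books.
render_split() {
  local pct=$1
  local err=$(( pct * 10 ))      # e.g. 20% -> 200 (written as 200m)
  local ok=$(( 1000 - err ))     # remainder stays with the real service
  cat <<EOF
apiVersion: split.smi-spec.io/v1alpha3
kind: TrafficSplit
metadata:
  name: fault-inject
spec:
  service: books
  backends:
  - service: books
    weight: ${ok}m
  - service: error-injector
    weight: ${err}m
EOF
}

# Print the manifest for a 20% error rate (same weights as above).
render_split 20
```

The output can be piped straight into kubectl, for example render_split 50 | kubectl apply -f -, to raise the injected error rate to 50%.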

You can now see a 20% error rate for calls from webapp to books:

linkerd routes deploy/webapp --to service/books

You can also see the errors in a web browser. Start the port-forward in the background, then open the page:

kubectl port-forward deploy/webapp 7000 &
open http://localhost:7000

If you refresh the page a few times, you will see an Internal Server Error.
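
Instead of refreshing by hand, you can approximate the observed error rate from the command line. The sketch below defines a hypothetical helper, error_rate, that reads one HTTP status code per line and prints the percentage of 5xx responses. The commented curl loop is an assumption about how you might feed it: it presumes the port-forward to localhost:7000 is still running and that webapp surfaces the injected failure as a 5xx status.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read HTTP status codes (one per line) from stdin and
# print the percentage that are 5xx, rounded down to a whole number.
error_rate() {
  awk '{ total++; if ($1 >= 500) errors++ }
       END { if (total) printf "%d\n", errors * 100 / total; else print 0 }'
}

# Example with canned status codes: one failure in five requests prints 20.
printf '200\n500\n200\n200\n200\n' | error_rate

# Assumed usage against the port-forwarded webapp (not run here):
# for i in $(seq 1 50); do
#   curl -s -o /dev/null -w '%{http_code}\n' http://localhost:7000
# done | error_rate
```

With a small number of requests the measured rate will fluctuate around the configured 20%, so a larger sample gives a more stable figure.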

Cleanup

kubectl delete trafficsplit/fault-inject
  • Remove the Books application from the Meshery dashboard by clicking the trash icon on the Sample Application card on the Linkerd adapter's management page.
