Wednesday 24 July 2024

Load testing with Azure Pipelines

 

Introduction

Performance issues can quickly turn expensive. Causes range from unexpected traffic surges to non-performant code acting as a bottleneck to misconfigured network components. Integrating load and performance testing into your CI pipeline lets you learn about performance degradations early, in most cases even before they have any impact on your users in the production environment.

In this guide, we will be using k6 and Azure Pipelines to get started quickly and effortlessly.

k6 is a free and open-source tool for load and performance testing of APIs, microservices, and websites. It provides users with an easy-to-use JavaScript interface for writing load and performance tests as code, effectively allowing developers to fit it into their everyday workflow and toolchain without the hassle of point-and-click GUIs.

Azure Pipelines is a continuous integration (CI) and continuous delivery (CD) service that is part of Microsoft's Azure DevOps offering. It can be used to continuously test and build your source code, as well as deploy it to any target you specify. Like k6, it is configured through code and markup.

Writing your first performance test

It is usually a good idea to resist the urge to design for the end goal right away. Instead, start small by picking an isolated but high-value part of the system under test, and once you've got the hang of it, iterate and expand. Our test will consist of three parts:

  1. An HTTP request against the endpoint we want to test.
  2. A couple of load stages that will control the duration and amount of virtual users.
  3. A performance goal, or service level objective, expressed as a threshold.

Creating the test script

During execution, each virtual user will loop over whatever function we export as our default as many times as possible until the duration is up. This won't be an issue right now, as we've yet to configure our load, but to avoid flooding the system under test later, we'll add a sleep to make it wait for a second before continuing.

import http from 'k6/http';
import { sleep } from 'k6';

export default function () {
  const res = http.get('https://test.k6.io');
  sleep(1);
}

Configuring the load

In this guide, we will create a k6 test that simulates a progressive ramp-up from 0 to 15 virtual users (VUs) over ten seconds, holds at 15 VUs for 20 seconds, and finally ramps back down to 0 VUs over ten seconds.

import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  stages: [
    { duration: '10s', target: 15 },
    { duration: '20s', target: 15 },
    { duration: '10s', target: 0 },
  ],
};

export default function () {
  const res = http.get('https://test.k6.io');
  sleep(1);
}

With that, our test should now average somewhere around ten requests per second: 15 VUs each complete roughly one iteration per second because of the one-second sleep, and the ramp-up and ramp-down periods pull the average down. As the endpoint we're testing uses SSL, each request will also be preceded by an OPTIONS request, making the total average displayed by k6 around 20 requests per second.

Expressing our performance goal

A core prerequisite to excel in performance testing is to define clear, measurable service level objectives (SLOs) to compare against. SLOs are a vital aspect of ensuring the reliability of your system under test.

As these SLOs often, either individually or as a group, make up a service level agreement with your customer, it's critical to make sure you fulfill the agreement or risk having to pay expensive penalties.

Thresholds allow you to define clear criteria of what is considered a test success or failure. Let’s add a threshold to our options object:

import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  stages: [
    { duration: '10s', target: 15 },
    { duration: '20s', target: 15 },
    { duration: '10s', target: 0 },
  ],
  thresholds: {
    http_req_duration: ['p(95)<250'],
  },
};

export default function () {
  const res = http.get('https://test.k6.io');
  sleep(1);
}

In this case, the 95th percentile response time must be below 250 ms. If the response time is higher, the test will fail. Failing a threshold will lead to a non-zero exit code, which in turn will let our CI tool know that the step has failed and that the build requires further attention.

ℹ️ Threshold flexibility

Thresholds are extremely versatile, and you can set them to evaluate just about any numeric aggregation of the metrics k6 collects. The most common are maximum or 95th/99th percentile response times. For additional details, check out our documentation on Thresholds.
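For example, a single options object can combine several thresholds. Below is a minimal sketch; http_req_failed is a standard built-in k6 metric for the failed-request rate and is not part of the script above:

export const options = {
  thresholds: {
    // 95th and 99th percentile response-time limits
    http_req_duration: ['p(95)<250', 'p(99)<500'],
    // fail the test if more than 1% of requests fail
    http_req_failed: ['rate<0.01'],
  },
};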

Before proceeding, let's add the script file to git and commit the changes to your repository. If you have k6 installed on your local machine, you could run your test locally in your terminal using the command: k6 run loadtest.js.

Running your test in Azure Pipelines

To be able to use the marketplace extension in our pipeline, we first need to install it in our Azure DevOps organization. This can be done directly from the marketplace listing. Once that is done, we're all set for creating our pipelines configuration.

⚠️ Permission to install marketplace extensions required

The recommended approach is to use the marketplace extension. If you for some reason can't, for instance because you lack permission to install extensions, using the Docker image as described further down works just as well.

Local Execution

# azure-pipelines.yml
pool:
  vmImage: 'ubuntu-latest'

steps:
  - task: k6-load-test@0
    inputs:
      filename: 'YOUR_K6_TEST_SCRIPT.js'

We will run our tests on a virtual machine running the latest available version of Ubuntu. To run our actual load test, only one step is required.

The task for this step runs the k6-load-test extension we just installed. Additionally, we supply it with the filename of our test script. This is the equivalent of executing k6 run YOUR_K6_TEST_SCRIPT.js on the pipeline virtual machine.

If your test is named test.js and is placed in the project root, the inputs key may be skipped altogether.
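Assuming those defaults, a minimal pipeline sketch could be reduced to:

# azure-pipelines.yml
# minimal sketch, assuming the test script is test.js in the project root
pool:
  vmImage: 'ubuntu-latest'

steps:
  - task: k6-load-test@0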

Do not forget to add azure-pipelines.yml and commit the changes to your repository. After pushing your code, head over to the Azure dashboard and visit the job triggered by the git push. Below is the succeeded job:

succeeded job

Cloud Execution

Cloud execution can be useful in these common cases:

  • You want to run a test from one or multiple geographic locations (load zones).
  • You want to run a high-load test that needs more compute resources than the CI server provides.
  • You want automatic analysis of the results.

Running your tests in our cloud is transparent for the user, as no changes are needed in your scripts. Just add cloud: true to your pipeline configuration.

To make this script work, you need to get your account token from Grafana Cloud k6 and add it as a variable. The Azure Pipeline command will pass your account token to the Docker instance as the K6_CLOUD_TOKEN environment variable, and k6 will read it to authenticate you to Grafana Cloud k6 automatically.

create variable
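Putting the two pieces together, the pipeline configuration could look roughly like this; the cloud input mirrors the "add cloud: true" instruction above, and the secret variable is assumed to be named K6_CLOUD_TOKEN as described:

# azure-pipelines.yml
# sketch of cloud execution; assumes a secret pipeline variable named K6_CLOUD_TOKEN
pool:
  vmImage: 'ubuntu-latest'

steps:
  - task: k6-load-test@0
    inputs:
      filename: 'YOUR_K6_TEST_SCRIPT.js'
      cloud: true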

By default, the cloud service will run the test from N. Virginia (Ashburn). But we will add some extra code to our previous script to select another load zone:

export const options = {
  // ...
  ext: {
    loadimpact: {
      name: 'test.k6.io',
      distribution: {
        loadZoneLabel1: {
          loadZone: 'amazon:ie:dublin',
          percent: 100,
        },
      },
    },
  },
};

This will create 100% of the virtual users in Ireland. If you want to know more about the different load zone options, see the Load Zones documentation.
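Splitting the load across more than one region works the same way. Here is a sketch assuming the Ashburn and Dublin load zone identifiers, with the traffic divided evenly:

export const options = {
  // ...
  ext: {
    loadimpact: {
      name: 'test.k6.io',
      distribution: {
        // half of the virtual users from each region
        ashburn: { loadZone: 'amazon:us:ashburn', percent: 50 },
        dublin: { loadZone: 'amazon:ie:dublin', percent: 50 },
      },
    },
  },
};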

Now, we recommend testing how to trigger a cloud test from your machine. Execute the following commands in the terminal:

$ k6 login cloud
$ k6 cloud loadtest.js

With that done, we can now go ahead and git add, git commit, and git push the changes we have made and initiate the CI job.

By default, the Azure Pipelines task will print the URL to the test result in Grafana Cloud k6, which you may use to navigate directly to the results view and perform any manual analysis needed.

cloud url

Grafana Cloud k6 results

We recommend defining your performance thresholds in the k6 test itself, as shown in the earlier step. If you have configured your thresholds properly and your test passes, there should be nothing to worry about.

If the test fails, you will want to visit the test result on the cloud service to find its cause.

Variations

Scheduled runs (nightlies)

It is common to run some load tests during the night when users do not access the system under test, for example to isolate larger tests from other types of testing, or to periodically generate a performance report.

Below is the first example extended with a scheduled trigger that runs at midnight (UTC) every day, but only if the code on master has changed since the last run.

# azure-pipelines.yml
pool:
  vmImage: 'ubuntu-latest'

steps:
  - task: k6-load-test@0
    inputs:
      filename: 'YOUR_K6_TEST_SCRIPT.js'

schedules:
  - cron: '0 0 * * *'
    displayName: Daily midnight build
    branches:
      include:
        - master

Using the official docker image

If you, for any reason, don't want to use the marketplace extension, you can almost as easily use the official Docker image instead.

# azure-pipelines.yml
pool:
  vmImage: 'ubuntu-18.04'

steps:
  - script: |
      docker pull grafana/k6
    displayName: Pull k6 image
  - script: |
      docker run -i \
        -e K6_CLOUD_TOKEN=$(K6_CLOUD_TOKEN) \
        -v `pwd`:/src \
        grafana/k6 \
        cloud /src/loadtest.js
    displayName: Run cloud test
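If you only want to run the test locally on the build agent, without a cloud token, the same approach works with the run command instead of cloud. A minimal sketch:

# azure-pipelines.yml
pool:
  vmImage: 'ubuntu-18.04'

steps:
  - script: |
      docker run -i \
        -v `pwd`:/src \
        grafana/k6 \
        run /src/loadtest.js
    displayName: Run local test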

Using the Azure Load Testing service

What is load testing, and why is it important?

You have designed, planned, and built your application, and it's now operating in your environment. Next up is understanding how the application behaves during expected, and perhaps unexpected, load.

Load testing is generally the practice of testing the expected usage of your application when multiple users access it simultaneously.

For example, you designed a new web app to serve customers or end-users with specific tasks. Perhaps it's an order form, booking system, an entire SaaS product suite, or something else. How do you know that it can handle simultaneous user load as it scales up and your service grows, or if you have traffic spikes during specific events?

In these situations, a good plan for load testing can help you identify bottlenecks before the end-users discover them.

I often see Azure solutions deployed without regard for resilience, reliability, operational excellence, and scalability. In many cases, applications are deployed and seem to work, but the issues often come when there is an increased load.

I've identified common pitfalls in many solutions, specific to Azure solutions in this case, but relevant regardless of tech stack.

  • The system doesn't automatically scale as demand increases, leading to CPU and memory exhaustion.
  • The system doesn't handle retries with exponential back-off, causing it to drop important calls or events and eventually leading to application crashes or unexpected behavior (see the sketch below).
  • Connectivity to dependent services like Azure Key Vault isn't done according to best practices, leading to throttling and exceptions.
  • Resources or state leak across multiple user sessions, eventually causing inconsistencies and crashes.

There are many more common pitfalls when designing your solutions, but this list should hint at some of the common real-world challenges I have faced.
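To illustrate the back-off point, here is a minimal sketch of a retry helper in plain JavaScript. It is not tied to any particular SDK (callService is a hypothetical placeholder for any dependent call), and production code would typically rely on the retry policies built into the Azure client libraries:

// minimal illustrative sketch of retrying a call with exponential back-off
// callService is a hypothetical placeholder for any call to a dependent service
async function withBackoff(callService, maxAttempts = 5, baseDelayMs = 200) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await callService();
    } catch (err) {
      if (attempt === maxAttempts - 1) throw err; // give up after the last attempt
      const delayMs = baseDelayMs * 2 ** attempt; // 200, 400, 800, 1600 ms ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}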

With proper load testing, you can identify several of these bottlenecks, ensure the system scales accordingly and can handle the load, and verify that dependent services are used correctly and do not cause dependency failures as the system scales up.

Beyond the pitfalls above, there are a few other things that load testing helps with:

  • Test your scalability. Can you handle an increase in traffic? How quickly?
  • Test how your application performs under stress. Do your application and the dependent services work as expected?
  • Test your signals and monitoring! The signals, logs, and monitoring data will significantly increase when the load increases. Can you still make sense of this information, or do you need to adjust how you log and monitor your system?
  • Verify that you get your alerts. Can you get the alerts you expect for unexpected outages, dependency failures, scalability issues, resource exhaustion (CPU/Memory/Disk IO, SNAT port exhaustion, and more)?
  • Does your load balancer work? Is traffic successfully balanced between instances and regions?
  • Measure your server-side metrics. See how the services you use to support your solution behave and any flags being raised during load.
  • Measure your client-side metrics. Track response times, the number of requests per second, and how many users are being loaded onto the system; this helps in correlating the data with what happens on the service side.

I often see "fire and forget" deployments, demo-code, and solutions that work well on a development machine or even in a QA system with minimal users and load. When the solutions hit prime-time and get battle-tested, the results usually look different.

While this is not an exhaustive list of things you should check, I hope it paints a picture of the importance of load testing your applications.

What is the Azure Load Testing service?

I have set the scene. You know why load testing is essential and how it can help you understand how your system behaves under stress and heavy load.

Let's take a moment to explore the Azure Load Testing service and how it helps us achieve a more reliable solution.

If you want to tune in and listen to a podcast about this service:
Episode 115 - We took the Azure Load Testing service for a spin
- Ctrl+Alt+Azure Podcast

Microsoft published an excellent overview to understand how the Azure Load Testing service works.

Azure Load Testing service functionality - from Microsoft Docs

The critical points for why I'm currently evaluating the Azure Load Testing service:

  • Ease of use.
  • Take any JMeter script and plug it into the load test.
  • Measure both the client-side and server-side metrics during load tests.
  • Configurable with virtual users and simulated load.
  • Integrate with Azure Pipelines or GitHub.
  • Can use managed identities in Azure.
  • Supports many Azure resource types for server-side monitoring.

The documentation is very elaborate, and therefore I don't need to repeat what's already been stated there.

Let's set it up

The documentation around creating Azure Load Testing service accounts and setting up tests is elaborate. I'll just run through the basics to show how easy this is to get started with - then you can take it further in your organization and turn the volume up.

Create the Azure Load Testing service

There is now a service called "Azure Load Testing" in the Azure Portal that you can provision into your Azure resource groups.

Azure Load Testing, as seen in the Azure Portal.

Per usual, define your resources and create them in any available locations.

Create a new Azure Load Testing resource in your given resource group and location from the Azure Portal.

Provisioning is done; that's the easiest part. Next, when the service is created, you need to explicitly configure some new permissions.

Grant the appropriate Load Test role to your user(s)

To work with and create new load tests in the Azure Load Testing service, you need to assign at least one of the Load Test roles to your users or groups.

If you have not been granted the Load Test roles, you may see the following message:

You are not authorized to use this resource. You need to have the Load Test Owner, Load Test Contributor, or Load Test Reader role. To assign Azure roles, you need to have Microsoft.Authorization/roleAssignments/write permissions such as User Access Administrator or Owner. In case the role was granted recently, please refresh the page and try again. Learn more

Go to your new Azure Load Testing service and select "Access Control (IAM)", and then add any of the desired "Load Test" roles.

There are currently three available roles, and the names are self-explanatory.

  • Load Test Reader
  • Load Test Contributor
  • Load Test Owner

Since I want to perform all operations, settings, and further configurations in my Azure Load Testing instance, I will assign the Owner role to myself.
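If you prefer scripting over the portal, the same role assignment can be done with the Azure CLI. A sketch with placeholder values for the assignee and the resource scope:

az role assignment create \
  --assignee "user@example.com" \
  --role "Load Test Owner" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.LoadTestService/loadtests/<load-testing-resource>"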

The Azure Load Testing service requires one of the Load Test roles to be assigned to a group or user.

Quickly reviewing the permissions on the resource should now show that you have one of the Load Test roles assigned.

The Azure Load Test service has roles assigned to some users.

Create a new test in the Azure Load Testing service

To set up a new load test, you have one main requirement: a JMeter test file.

I will not cover how to create JMeter files here, as there are many resources available for that purpose; for instructions on how to use JMeter, see the official Apache JMeter documentation. Instead, let's review how to create the test and eventually see the results.

Click "Create test" to start configuring a new load test.

Create a new test in the Azure Load Testing service.

Follow the instructions per the wizard, and upload your recorded JMeter script, which will be in the .jmx format.

Upload an Azure Load Testing service script based on JMeter (jmx script).

Some notes about secrets and sensitive parameters

In the next step of the wizard, you can add parameters. Parameters can include environment variables or secrets needed for your script to execute, such as usernames/passwords, tokens, or API keys required for a web request.

To use secrets in your JMX script, use the GetSecret custom function (typically referenced as ${__GetSecret(<secret name>)}) to pull out the value of the secret while the test is executing. You need to set this up on the JMeter side when designing your scripts.

There is also a built-in integration to Azure Key Vault to parameterize load tests with secrets from the Key Vault.

Moving on, configure the load. The load setting determines how many test engines to spin up for the test. The number of engines multiplied by the number of threads defined in the JMX script gives the total number of threads, which is a great way to scale up the load testing. For example, a JMX script configured with 250 threads running on 4 engine instances would simulate 1,000 concurrent threads.

Define the number of engine instances for the Azure Load Testing service.

Next, you need to define the test criteria. These are metrics-based rules that determine whether the test passes or fails, for example failing the run if the average response time or the error percentage exceeds a limit you set.

Define test criteria for the Azure Load Testing service.

Connect monitoring

When you reach the "Monitoring" step, it becomes more interesting again. We can connect any back-end service in Azure from this page, like Application Insights or App diagnostics, or metrics from Azure Storage Accounts.

The Azure Load Testing service has a great USP (Unique Selling Point) in that it directly integrates with many of the supporting services to our web apps. Using the integrations to monitor these services during load gives us a better understanding of how the overall system is impacted - not just the web application metrics.

Connect monitoring services to the Azure Load Testing service.

Connect the test to any of the monitoring resources.

Review and create

Creating the load test brings you to the review step and eventually generates the test. New tests usually start running within a few moments, and you can then watch the results as the test progresses and puts your system under load.

The Azure Load Testing service is creating new tests.

Review the results

When the load tests have finished, we can see the test's metrics, including errors and performance, as well as all the server-side metrics from the connected services.

The connected monitoring is the true benefit of using Azure Load Testing. I have used many load testing platforms in the past, but I never got immediate insights from all my dependencies.