Performance testing is the act of ensuring your application is ready for whatever amount of load customers throw at it. This non-functional testing type is typically executed by a dedicated performance engineer before new code is deployed to production, though some companies rely on functional testing engineers to execute performance testing as well. In this post, I'd like to give you a jumping-off point if you're starting your performance testing journey.
Types of Performance Testing
There are multiple subtypes of performance testing that can be executed, depending on your goal. To start, it may be best to focus on one of these (like load testing) and work your way through the other types over time.
Load Testing
This is the basic starting place for many companies. Load testing measures your system's performance under normal conditions, tracking response time and system stability as load increases.
Best Used For: Companies starting performance testing that want to ensure everyday load doesn’t overburden their system.
Endurance Testing (Soak Test)
This testing type ensures system stability under normal peak volumes, but over an extended period of time. While this may seem very similar to load testing, here you're checking for issues, such as memory leaks, that only surface over a long run.
Best Used For: Measuring longevity to uncover system failures caused by issues such as memory leaks.
Break-Point Testing (Stress Test)
Break-point testing is meant to push your system to the point that it fails. The application is given more volume (whether transactions or virtual users) than it can handle. The goal of this test is to understand at what point your application will fail and to ensure your application can recover after the failure.
Best Used For: Ensuring readiness for a large volume day.
Spike Testing
Spike testing is similar to a stress test, but instead of sustaining a higher-than-normal load on the application over an extended period of time, a spike test quickly increases the load, bringing it up and down repeatedly.
Best Used For: A company planning to run radio or TV ads throughout an event. Spikes in marketing will hopefully lead to spikes in application traffic.
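To make the shape of a spike test concrete, here's a small Python sketch that builds a stepped user-count profile alternating between a baseline load and a spike; the user counts are made up for illustration:

```python
def spike_profile(baseline, spike, cycles, hold_steps=3, spike_steps=1):
    """Build a stepped virtual-user profile that repeatedly jumps from a
    baseline load up to a spike and back down, as in a spike test."""
    profile = []
    for _ in range(cycles):
        profile += [baseline] * hold_steps + [spike] * spike_steps
    return profile

# Two marketing spikes: hold at 50 users, jump to 500, drop back, repeat
print(spike_profile(50, 500, 2))  # -> [50, 50, 50, 500, 50, 50, 50, 500]
```

A load tool would then step through this profile, holding each user count for a fixed interval.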
Scalability Testing
Scalability testing gradually increases virtual user load or data volume while monitoring application performance. Alternatively, the load may stay constant while resources such as CPUs and memory are changed to understand their effect.
Best Used For: Using system environmental factors to change the outcome of application performance.
Volume Testing (Flood Test)
Volume testing, also known as flood testing, populates the database with a large amount of data and monitors system behavior under it.
Best Used For: An organization might use this to validate site readiness while thousands of users are downloading data from it.
Useful Tools for Performance Testing
There are several performance testing tools on the market today. Being able to script your user flows so you know you’re testing the right scenarios, and quickly being able to see degradation, are two really important factors with performance testing. Here are a few choice tools that will help make scripting, execution, and analyzing easier:
CloudTest: Akamai makes this product and I’ve personally used it. It’s robust, user-friendly, and allows you to execute any of the testing types above.
NeoLoad: While I’ve not personally used this tool (yet), I have sat through a detailed demo of it. It looks very user-friendly and enables engineers to quickly script scenarios and start testing. Sometimes the barrier to entry is being able to quickly script the flows you need to cover.
LoadRunner: Historically a popular choice by many, LoadRunner allows you to run the full gamut of tests. I used this tool several years ago and was impressed by its ease of use.
JMeter: This is a free, open-source Apache tool designed to load test functional behavior and measure performance. A nice option for those on a budget.
Fiddler: This is a free web debugging proxy tool, similar to Charles Proxy, that allows you to track HTTP traffic between the internet and your computer and manipulate requests. Alexander has a nice post about how to set up web performance tests with Fiddler.
Taurus: Its authors describe its purpose as hiding "the complexity of performance and functional tests with an automation-friendly convenience wrapper." Taurus relies on JMeter, Gatling, Locust.io, Grinder, and Selenium WebDriver as its underlying tools.
If you’re looking for more free options, visit Joe Colantonio and review his list of 14 free performance testing tools.
Questions to Ask Before You Start Testing
Understanding what everyone considers acceptable for system performance is paramount to your testing. Here are some questions you should answer with your team before you even begin writing the first test:
- What is an acceptable amount of time for a page to load?
- What is an acceptable amount of time for a transaction (think payment) to go through?
- What is an acceptable response time for actions on your web pages? Does that change depending on which action the user is performing?
- What is the volume of people on your site on a normal day?
- What is the volume of people and actions on your site on your peak days?
- What volume of actions hits external APIs through third parties?
- What is your user's journey? Start small, with your top three user paths.
- What environment will you be testing in? Ideally, you have a distinct load test environment.
- What does your server or database activity typically look like?
- How long are users going to be on a page before they make a decision? (Think time)
So What Am I Measuring Exactly?
There are multiple measurements and metrics to track when performance testing. To track performance over time, it is important to set a baseline so you can quickly spot degradation in your application’s performance. Let’s start with a couple of standard definitions that will help us talk through metrics:
Baseline: This is the normal level of performance your application is expected to perform at. Deviations from this baseline should be tracked carefully. We will talk more about using and maintaining baselines later in the post.
Benchmark: These are the agreed-upon response and load times for your system. This sets the standard for your application’s performance and allows you to know when to sound the alarm. I would recommend working with your business partners or product team to define these. They are sometimes referred to as SLAs (service level agreements).
Virtual Users (VU): These concurrent users simulate the behavior of real users that visit your app. To calculate the number of VU you need to use during your test, try this calculation: Virtual Users = (Hourly Sessions x Average Session Duration in seconds) / 3,600
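The formula above is easy to wrap in a small Python helper; the traffic figures in the example are hypothetical:

```python
def virtual_users(hourly_sessions, avg_session_seconds):
    """Virtual Users = (Hourly Sessions x Average Session Duration in seconds) / 3,600."""
    return round(hourly_sessions * avg_session_seconds / 3600)

# Example: 6,000 sessions per hour, averaging 90 seconds each
print(virtual_users(6000, 90))  # -> 150
```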
Next, let’s dive into the measurements and metrics you should be tracking as a part of your testing efforts:
- Render Time: The amount of time it takes for a page to completely load and be ready for the user to interact.
- Response Time: The amount of time it takes to send a request and receive a response.
- Wait Time: When a page begins loading, this is the time it takes to get that first byte after the request is sent.
- Error Rate: The percentage of requests that result in errors (errors received divided by total requests made)
- Transaction Failure Rate: The percentage of transactions that fail (failed requests divided by total requests made)
- Throughput: Amount of bandwidth used throughout the test, measured in kilobytes per second
- Requests per Second: How many requests are handled, per second
- CPU Utilization: How much time the CPU needs to process the requests
- Memory Utilization: How much memory is needed to process the requests
- Database Locks: Locks on tables and databases are necessary in some scenarios, but it's important to ensure they aren't degrading performance.
- Garbage Collection: Measures how effectively unused memory is reclaimed and returned to the application.
- Bandwidth: Measures bits per second used by a network interface
- Hit Ratios: The percentage of SQL statements served from cached data instead of I/O operations. This is a solid starting place to look if you're experiencing bottlenecks.
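A few of these metrics, error rate, average response time, and requests per second, fall out of simple arithmetic over a request log. A minimal Python sketch, with entirely made-up sample data:

```python
# Hypothetical request log: (HTTP status code, response time in seconds)
requests = [(200, 0.31), (200, 0.28), (500, 0.12), (200, 0.45), (503, 0.09)]

total = len(requests)
errors = sum(1 for status, _ in requests if status >= 400)

error_rate = 100 * errors / total                   # percent of requests that errored
avg_response = sum(t for _, t in requests) / total  # mean response time in seconds
test_duration = 2.0                                 # seconds this (tiny) test ran
rps = total / test_duration                         # requests per second

print(f"error rate: {error_rate:.0f}%")        # -> error rate: 40%
print(f"avg response: {avg_response:.2f}s")    # -> avg response: 0.25s
print(f"throughput: {rps:.1f} req/s")          # -> throughput: 2.5 req/s
```

A real load tool computes these for you on a dashboard, but knowing the arithmetic helps you sanity-check what the dashboard shows.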
If you’re just starting, page load times, render time, and response times are three areas that I would consider homing in on first. Any of the tools above should have these data points available in a dashboard to monitor during your execution.
Setting a Baseline and Re-Baselining
Before you start testing and presenting data to stakeholders, ensure you know what your baseline is. A baseline can be defined as “normal” for your application. To capture a baseline, run a 10-15 minute load test. From that test, document what your “normal” results look like. The documentation should incorporate response times for each call and page load times.
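As a sketch of what that documentation might capture, here's a small Python helper, using invented sample data, that summarizes per-call response times from a short run into a mean and a 95th-percentile value:

```python
import math
import statistics

def baseline_stats(samples):
    """Summarize per-call response times (in seconds) from a short load run
    into a baseline: mean and 95th-percentile response time per call."""
    baseline = {}
    for call, times in samples.items():
        ordered = sorted(times)
        p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
        baseline[call] = {
            "mean": statistics.mean(ordered),
            "p95": ordered[p95_index],
        }
    return baseline

# One slow outlier pulls the p95 well above the mean
stats = baseline_stats({"GET /home": [0.20, 0.25, 0.22, 0.90, 0.24]})
print(stats)
```

Percentiles matter here because a handful of slow outliers can hide behind a healthy-looking average.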
As you begin performance testing more frequently, it is important to know when to re-baseline and when to leave the baseline alone. You don't want to update your baseline too often, because seeing degradation over time will become difficult. However, not updating it enough will fail to account for new features that are important to the user journey, or new hardware meant to speed things up. There are more statistical techniques for deciding when to re-baseline; for this beginner's guide, however, here are the questions I'd ask when considering it:
- Have there been any architectural improvements?
- Have there been any significant changes to the user flows?
- Has there been any replacement of code/upgrading of plugins?
- Has there been a change to hardware?
- Are you seeing a significant positive shift in performance? (10% or more)
If the answer to any of these is yes, you may consider re-baselining.
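The checklist above can be condensed into a small decision helper. This is only a sketch; the 10% figure mirrors the guideline above, and the inputs are the answers you'd gather with your team:

```python
def should_rebaseline(architecture_changed, user_flows_changed,
                      code_or_plugins_replaced, hardware_changed,
                      baseline_response, current_response,
                      improvement_threshold=0.10):
    """Return True if any checklist answer, or a positive performance shift
    of 10% or more (lower response time is better), suggests re-baselining."""
    improvement = (baseline_response - current_response) / baseline_response
    return (architecture_changed or user_flows_changed or
            code_or_plugins_replaced or hardware_changed or
            improvement >= improvement_threshold)

# A 0.50s baseline dropping to 0.42s is a 16% improvement
print(should_rebaseline(False, False, False, False, 0.50, 0.42))  # -> True
```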
Best Practices to Keep in Mind
Here are some best practices to keep in mind:
- Set up automation in your load environment to ensure consistent stability. Running that automation, prior to your performance test, will validate your environment is stable. Executing your performance testing in an unstable environment could lead to skewed results.
- Ensure your scenarios are designed and scripted in a way that reflects the actions users take most. It is important that you apply load in realistic ways.
- Make certain your load environment closely reflects production, including your setups such as load balancers, server counts, configurations, and firewalls.
- Ensure your think time closely matches user actions. Setting it to zero doesn’t mimic a user. There is no magic number here as every page might differ depending on user actions.
- If you’re not using a dedicated load testing tool, be sure to clear your browser cache and cookies before executing a test. Cached data might keep requests from actually being sent and responses from being received, skewing your results.
- Make sure you’re monitoring your tests for script errors during execution. Script errors might result in skewed outcomes and should be resolved quickly. Some scripting errors might require you to stop your test, fix them, and start a fresh test. Running a “calibration” test prior to your actual test will help validate your scripts before you’re in the middle of the load test.
- Ensure performance testing is a collaborative effort. Include developers, as well as infrastructure folks, to champion performance testing as a shared and monitored effort by everyone.
- Ideally, performance testing is a part of your development process. Whether agile or waterfall, performance testing should be completed prior to new code being released.
- Targeting the right load will help ensure you aren’t stressing your application beyond reality. If you aren’t specifically trying to find your break-point, ensure you’re testing at scale or realistically exceeding expected loads.
- Just as a sprinter doesn’t reach her top speed in a second, or stop abruptly after the finish line, a load test should plan for ramp-up and ramp-down time. Any measurements you take should come from the window between the ramp-up and the ramp-down.
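Two of these practices, randomized think time and measuring only between ramp-up and ramp-down, can be sketched in a few lines of Python; the timing values below are illustrative:

```python
import random

def think_time(base_seconds, jitter=0.3):
    """Randomize think time around a per-page base so virtual users don't
    fire requests in lockstep; never returns zero."""
    low = base_seconds * (1 - jitter)
    high = base_seconds * (1 + jitter)
    return max(0.1, random.uniform(low, high))

def steady_state(samples, ramp_up_end, ramp_down_start):
    """Keep only (timestamp, value) measurements taken between ramp-up
    and ramp-down, where results are representative."""
    return [value for timestamp, value in samples
            if ramp_up_end <= timestamp < ramp_down_start]

# Response times recorded at various seconds into a short test
samples = [(5, 0.9), (65, 0.3), (120, 0.32), (290, 0.31), (330, 1.4)]
print(steady_state(samples, ramp_up_end=60, ramp_down_start=300))  # -> [0.3, 0.32, 0.31]
```

Note how the slow readings at 5s and 330s (during ramp-up and ramp-down) are excluded; including them would make the steady-state numbers look worse than they are.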