# Load Test Concepts
Before running your first load test, you need to understand a few key concepts. Load testing is straightforward once you grasp how virtual users, think time, and load profiles work together to simulate real traffic.
## What is Load Testing?
Load testing simulates multiple users accessing your web application simultaneously to measure how it performs under realistic traffic conditions. The questions it answers are the ones that keep operations teams up at night: How many concurrent users can the application handle? Which pages become slow under load? Does the application crash when overloaded, or does it degrade gracefully? What's the maximum throughput in requests per second? What are the bandwidth requirements at peak traffic? And what's the baseline performance before anyone starts optimizing?
## Virtual Users
A virtual user (VU) is a simulated user that executes your test case exactly as a real person would. It sends HTTP/HTTPS requests, parses responses, extracts dynamic values (session IDs, CSRF tokens), follows redirects, handles cookies automatically, waits for realistic think time between requests, and loops through the test case to simulate repeated sessions.
From your web server's perspective, virtual users look exactly like real users. That's the whole point.
### Virtual Users vs User Identities
It's important to distinguish between two related but different concepts. Virtual Users (VUs) are the number of simultaneous active users executing test cases at any given moment. User Identities are the total pool of unique credentials (usernames, passwords, etc.) available for those virtual users to draw from.
You might have 100 virtual users but 10,000 user identities in a dataset. As each virtual user completes a test case iteration, it grabs the next identity from the dataset and starts a new session. This simulates realistic user churn where different people log in and out throughout the test.
Your license limits the maximum number of simultaneous virtual users, not the total number of user identities.
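The rotation behavior can be sketched in a few lines of Python. This is a hypothetical illustration of the concept, not Load Tester's actual implementation; the class and method names are invented:

```python
from itertools import cycle

class IdentityPool:
    """Illustrative sketch: a shared pool of user identities.

    A small number of virtual users draws from a much larger pool of
    credentials; when the pool is exhausted, it wraps around.
    """
    def __init__(self, identities):
        self._next = cycle(identities)

    def next_identity(self):
        # Each completed test-case iteration draws the next identity
        # and starts a fresh session with it.
        return next(self._next)

# 100 VUs, 10,000 identities: every iteration simulates a different person.
pool = IdentityPool([(f"user{i:04d}", "secret") for i in range(10_000)])
first_login = pool.next_identity()
```

Over a long test, each virtual user cycles through many identities, which is what produces the realistic login/logout churn described above.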
### Concurrency and Realistic Load
Virtual users create concurrency: multiple requests hitting your application at the same time. One user accessing your site 1,000 times sequentially is not a load test. One hundred users accessing your site simultaneously is.
Concurrency reveals bottlenecks that sequential testing simply cannot expose: database connection pool exhaustion, thread pool saturation, memory pressure from simultaneous sessions, and lock contention in shared resources. These problems don't exist with one user. They appear at 50. They become critical at 500.
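The pool-exhaustion failure mode can be demonstrated with a toy connection pool. This is a deliberately simplified sketch (pool size, timings, and names are illustrative): sequential requests never contend for the pool, while simultaneous requests immediately exhaust it:

```python
import threading
import time

POOL_SIZE = 10  # e.g. a database connection pool
pool = threading.Semaphore(POOL_SIZE)
rejected = 0
rejected_lock = threading.Lock()

def handle_request(work_s=0.0):
    """Fail fast if no pooled connection is free, as many servers do under load."""
    global rejected
    if not pool.acquire(blocking=False):
        with rejected_lock:
            rejected += 1
        return
    try:
        time.sleep(work_s)  # hold the connection while "working"
    finally:
        pool.release()

# One user, 1,000 sequential requests: the pool is never contended.
for _ in range(1000):
    handle_request()
sequential_rejected = rejected  # stays at 0

# 100 simultaneous users: only 10 connections exist, so most requests fail.
rejected = 0
threads = [threading.Thread(target=handle_request, args=(1.0,)) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The same server code that handles 1,000 sequential requests flawlessly starts rejecting work the moment requests overlap — which is exactly why load tests must be concurrent.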
## Virtual Browsers vs Real Browsers
There are two fundamentally different ways to simulate users in a load test, and the choice between them shapes everything else about how the test runs.
### Virtual Browsers (HTTP-Level Simulation)
A virtual browser replays the HTTP and HTTPS traffic that was captured during recording. It doesn't execute JavaScript. It doesn't render the DOM. It doesn't open a window. It sends the network traffic a browser would have sent, and it interprets the responses it gets back well enough to extract dynamic values and follow the workflow. From the server's point of view, the traffic looks like a real browser.
This is what Load Tester does today, and it's what most of the industry uses for load testing at scale. The reason is mostly arithmetic. Per virtual user, a virtual browser uses a few megabytes of memory and a tiny amount of CPU, roughly two orders of magnitude lighter than a real browser instance.
How many virtual users you can actually drive from a single cloud instance depends entirely on the test case, though. A test case where each user clicks once every thirty seconds and the pages are small might run a thousand virtual users on one machine. A test case with large pages, heavy JSON payloads, and a click every two seconds might cap out at a hundred virtual users on the same machine before memory, bandwidth, or CPU on the load generator saturates. "It depends" is the honest answer.
That's part of why Load Tester load-balances across multiple load generators. The system distributes virtual users across the engines you've allocated based on each engine's measured Capacity (visible per-engine in the Engines View during a live test; two engines in the same test routinely show different Capacity numbers because regions differ in hardware and network conditions). The default policy, Distribute evenly (respect capacity), allocates more users to engines with more headroom and fewer to engines with less. The point is to keep any single load generator from being pushed to where its own resource consumption distorts the measurements it's making. An overloaded load generator reports slow response times that have nothing to do with the server under test, and you would not know that unless you happened to notice the generator was the bottleneck. Spreading the work across machines is a cheap insurance policy against bad measurement data.
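The "Distribute evenly (respect capacity)" idea can be sketched as a simple proportional allocation. This is an illustrative model of the policy described above, not Load Tester's actual algorithm; the function name is invented:

```python
def distribute_users(total_vus, capacities):
    """Sketch: allocate virtual users in proportion to each engine's
    measured Capacity, so engines with more headroom carry more load."""
    total_cap = sum(capacities)
    shares = [total_vus * c / total_cap for c in capacities]
    alloc = [int(s) for s in shares]
    # Hand leftover users to the engines with the largest fractional share.
    leftovers = total_vus - sum(alloc)
    order = sorted(range(len(capacities)),
                   key=lambda i: shares[i] - alloc[i], reverse=True)
    for i in order[:leftovers]:
        alloc[i] += 1
    return alloc

# Two engines in different regions with different measured capacities:
allocation = distribute_users(1000, [600, 400])  # → [600, 400]
```

The effect is that no single generator is pushed toward saturation faster than its peers, which is the measurement-quality insurance the section describes.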
### Real Browsers (Browser-Level Simulation)
A real browser is exactly what it sounds like: a full Chrome or Firefox instance, one per virtual user, driven through Selenium or the Chrome DevTools Protocol. It executes JavaScript, renders pages, fires every client-side fetch, runs every analytics tag, and reports every paint event. It's the most faithful possible reproduction of a real user's session.
The tradeoff is overhead. Each browser instance consumes roughly 500 MB to 2 GB of RAM and significant CPU. A reasonable 8-core load-generator machine can host maybe 10 to 30 real-browser virtual users before its own resource consumption starts to skew the measurements. A 10,000-user real-browser load test needs hundreds of beefy machines and a corresponding budget.
Because of that overhead, the industry uses virtual browsers for almost all scale testing. Real browsers are the right tool when heavy client-side JavaScript changes what server requests get made in ways that virtual scripting can't model, when you're measuring user-experience metrics like Time to Interactive or Largest Contentful Paint rather than server capacity, or when the application is essentially a JavaScript app and HTTP-level scripting becomes prohibitively complex. For server-capacity load testing, virtual browsers are the better tool.
For deeper background on the tradeoffs, see Load Testing with Virtual vs Real Browsers on the Web Performance blog.
### Why Our HTTP Stack Produces More Accurate Timing
Load Tester's virtual browser has a measurement-accuracy advantage that's worth understanding. We rewrote the HTTP protocol stack from the byte level up specifically so it doesn't instantiate Java objects during the load test itself. The protocol stack runs without allocating new objects per request, which means the JVM's garbage collector doesn't have to run mid-test. Timing measurements are correspondingly more accurate, because they aren't being perturbed by GC pauses that would otherwise add unpredictable milliseconds (and in pathological cases, seconds) to response-time numbers.
This is the kind of difference that doesn't matter at low percentiles but matters a lot at p95 and p99, where GC pauses on other tools' virtual browsers turn into spurious "slow responses" that have nothing to do with the server you're trying to measure. Other load testing tools built on standard Java HTTP libraries inherit the JVM's default allocation behavior, which makes their high-percentile latency measurements inherently noisier.
### Real Browser Load Generation in v7.1
We're targeting full real-browser load generation for the Load Tester v7.1 release. Until then, browser-driven workflows are modeled at the HTTP level by recording the resulting network traffic and replaying it through virtual browsers. For most server-capacity testing, that's the right choice anyway. Virtual browsers are faster, cheaper, and (on Load Tester specifically) measure timing more accurately than real-browser alternatives. The 7.1 release is for the cases where you genuinely need browser-level fidelity, not as a replacement for the virtual-browser path.
## Think Time
Think time is the simulated delay between user actions, representing the time a real user spends reading content, filling out forms, or deciding what to click next.
### Why Think Time Matters
Without think time, virtual users send requests as fast as the network allows, creating an unrealistic spike load that doesn't reflect how real people use your application. Your server gets overwhelmed by a request volume that would never happen in production.
With realistic think time, virtual users send requests at human-like intervals (typically 5-30 seconds between pages), the load pattern matches real-world traffic, and you can accurately measure application capacity.
### Think Time Configuration
Think time is configured per-page in your test case. You can use the recorded think time (the actual delays from your recording session), set a fixed think time (a constant delay between all pages), set a random range (say 5-15 seconds, randomly chosen per page), or disable think time entirely for stress testing maximum request throughput.
**Recommended Think Time**

For realistic load tests, use a 10-15 second average think time between pages. This closely models real user behavior for most web applications.
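The four think-time options above can be sketched as a single sampling function. This is an illustrative model, not Load Tester's configuration API; the parameter names are invented:

```python
import random

def think_time(mode="range", fixed=10.0, low=5.0, high=15.0, recorded=None):
    """Sketch of the per-page think-time options described above."""
    if mode == "none":       # stress testing: maximum request throughput
        return 0.0
    if mode == "fixed":      # constant delay between all pages
        return fixed
    if mode == "recorded":   # replay the delay captured during recording
        return recorded
    return random.uniform(low, high)  # random range, chosen per page

delay = think_time(mode="range")  # somewhere between 5 and 15 seconds
```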
## Load Profiles
A load profile defines how many virtual users are active at each point during your load test. Load tests typically use a ramp-up pattern rather than starting all users simultaneously.
### Ramp-Up Pattern

A typical load test follows this pattern:

```mermaid
xychart-beta
    title "Virtual Users Over Time"
    x-axis "Time (minutes)" [0, 5, 10, 25, 40, 55, 70, 75, 80]
    y-axis "Virtual Users" 0 --> 120
    line [0, 50, 100, 100, 100, 100, 100, 50, 0]
```
Phases:
Ramp-Up gradually increases virtual users from 0 to your target (say, 0 to 100 over 10 minutes). This avoids shocking the system with instant peak load, lets you observe performance degradation at each user level, and reflects reality: real traffic grows gradually, not all at once.
Steady State holds a constant virtual user count (100 users for 60 minutes). This is where you measure sustained performance under target load, and where memory leaks, resource exhaustion, and gradual degradation reveal themselves. It's the primary analysis period for capacity testing.
Ramp-Down gradually decreases virtual users back to 0, allowing graceful shutdown of user sessions. This phase is optional; some tests simply end at steady state.
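The three phases can be expressed as a function mapping elapsed time to a target user count. This is an illustrative sketch of the profile in the chart (10-minute ramp-up, 60-minute steady state, 10-minute ramp-down to 100 users), not Load Tester's internals:

```python
def active_vus(t_min, peak=100, ramp_up=10, steady=60, ramp_down=10):
    """Sketch: target virtual-user count at minute t for a ramp/steady/ramp profile."""
    if t_min < ramp_up:                       # ramp-up: 0 -> peak
        return round(peak * t_min / ramp_up)
    if t_min < ramp_up + steady:              # steady state at peak
        return peak
    end = ramp_up + steady + ramp_down
    if t_min < end:                           # ramp-down: peak -> 0
        return round(peak * (end - t_min) / ramp_down)
    return 0                                  # test finished
```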
### Load Levels

Instead of a smooth ramp, you can use stepped load levels:

```mermaid
xychart-beta
    title "Stepped Load Levels"
    x-axis "Level (each held for ~10 minutes)" ["25 VUs", "50 VUs", "75 VUs", "100 VUs"]
    y-axis "Virtual Users" 0 --> 120
    bar [25, 50, 75, 100]
```
Each level holds for a fixed duration (say, 10 minutes), giving you time to analyze performance at each specific user count. Use stepped levels when you want to identify exact capacity limits ("performance degrades at 75 users"), when you're doing capacity planning and need metrics at specific user counts, or when you want clean before-and-after comparisons at each load level.
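A stepped profile is just a step function over elapsed time. As an illustrative sketch of the chart above (four levels, each held for 10 minutes; the function name is invented):

```python
def stepped_vus(t_min, levels=(25, 50, 75, 100), hold=10):
    """Sketch: VU count for stepped load levels, each held for `hold` minutes."""
    step = int(t_min // hold)
    return levels[step] if step < len(levels) else 0  # 0 after the last level
```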
## Test Case vs Load Test
These two concepts are related but distinct:
### Test Case
A test case is a recorded sequence of HTTP requests representing a single user's workflow ("Login, browse products, add to cart, checkout"). You create it by recording a browser session, configure it with ASM correlation, authentication, and datasets, validate it by running a single-user replay, and it represents one user's journey through your application.
### Load Test
A load test executes that test case with multiple virtual users simultaneously according to a load profile. It requires a validated test case (one that replays successfully), is configured with virtual user count, ramp-up timing, and think time, runs on load engines (local or cloud), and measures application performance under concurrent load.
The workflow is sequential and each step depends on the previous one: record test case, validate replay, configure load profile, run load test, analyze results.
## Load Test Metrics
During a load test, Load Tester collects comprehensive metrics:
### Transaction Metrics (Per-Page)
- Response Time: Time from request sent to full response received
- Page Size: Total bytes transferred (HTML + resources)
- Success Rate: Percentage of successful requests (HTTP 200-299)
- Error Rate: Percentage of failed requests (HTTP 400-599 errors)
### Aggregate Metrics (Whole Test)
- Throughput: Total requests per second across all virtual users
- Bandwidth: Data transferred per second (KB/s or MB/s)
- Errors: Total error count and error types
- Virtual User Level: Active user count at each moment
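The per-page and aggregate metrics above can all be derived from raw request samples. This sketch shows the arithmetic (the function and field names are illustrative, not Load Tester's reporting API):

```python
def summarize(samples, duration_s):
    """Sketch: derive the metrics above from raw samples.

    Each sample is (status_code, response_time_s, bytes_transferred).
    """
    n = len(samples)
    times = sorted(t for _, t, _ in samples)
    return {
        "success_rate": sum(1 for c, _, _ in samples if 200 <= c < 300) / n,
        "error_rate": sum(1 for c, _, _ in samples if 400 <= c < 600) / n,
        "throughput_rps": n / duration_s,
        "bandwidth_kb_s": sum(b for _, _, b in samples) / duration_s / 1024,
        "p95_response_s": times[int(0.95 * (n - 1))],  # simple nearest-rank p95
    }

# 19 fast successes and one slow server error over a 10-second window.
stats = summarize([(200, 0.1, 1024)] * 19 + [(500, 0.5, 1024)], duration_s=10)
```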
### Server Metrics (Optional)

When using the Server Monitoring Agent:

- CPU Usage: Processor utilization on application servers
- Memory Usage: RAM consumption and garbage collection activity
- Disk I/O: Read/write operations per second
- Network I/O: Bytes sent/received at network interfaces
## Test Execution Architecture
Load tests can be executed in different configurations depending on your needs:
### Local Load Testing

Single Machine: Run the Load Tester and execute virtual users on your laptop/desktop.

```mermaid
graph LR
    subgraph host["Your Computer"]
        LT["Load Tester<br/>20–50 VUs"]
    end
    LT -->|HTTP requests| WS["Web Server<br/>Under Test"]
    classDef host fill:#f5f5f5,stroke:#999,stroke-width:1px,color:#333
    classDef component fill:#fff,stroke:#333,stroke-width:1.5px,color:#333
    classDef server fill:#fff,stroke:#c0392b,stroke-width:1.5px,color:#333
    class LT component
    class WS server
```
Best for: Development testing, small-scale validation, quick smoke tests.
Limitations: Tops out around 50 virtual users because your workstation is doing everything: running the UI, generating load, and collecting results.
### Distributed Load Testing (On-Premises)

Multiple Load Engines: Deploy Load Engines on separate servers to scale virtual user capacity.

```mermaid
graph TD
    LT["Load Tester<br/>Controller"]
    E1["Engine 1<br/>250 VUs"]
    E2["Engine 2<br/>250 VUs"]
    E3["Engine 3<br/>250 VUs"]
    E4["Engine 4<br/>250 VUs"]
    WS["Web Server<br/>Under Test"]
    LT --> E1
    LT --> E2
    LT --> E3
    LT --> E4
    E1 --> WS
    E2 --> WS
    E3 --> WS
    E4 --> WS
    classDef controller fill:#fff,stroke:#333,stroke-width:1.5px,color:#333
    classDef engine fill:#f5f5f5,stroke:#666,stroke-width:1.5px,color:#333
    classDef server fill:#fff,stroke:#c0392b,stroke-width:1.5px,color:#333
    class LT controller
    class E1,E2,E3,E4 engine
    class WS server
```
Best for: Large-scale testing (100 to 10,000+ virtual users), testing from specific network locations.
Requires: Load Engine licenses and network access between Load Tester and the engine machines.
### Cloud Load Testing (AWS)

AWS-Based Engines: Launch Load Engines as EC2 instances on-demand, then terminate them after the test.

```mermaid
graph TD
    LT["Load Tester<br/>Controller"]
    subgraph aws["AWS Cloud"]
        E1["EC2 Engine<br/>500 VUs"]
        E2["EC2 Engine<br/>500 VUs"]
    end
    WS["Web Server<br/>Under Test"]
    LT -->|Launch engines| E1
    LT -->|Launch engines| E2
    E1 --> WS
    E2 --> WS
    classDef controller fill:#fff,stroke:#333,stroke-width:1.5px,color:#333
    classDef engine fill:#f5f5f5,stroke:#666,stroke-width:1.5px,color:#333
    classDef server fill:#fff,stroke:#c0392b,stroke-width:1.5px,color:#333
    class LT controller
    class E1,E2 engine
    class WS server
```
Best for: Multi-region testing, large-scale tests without infrastructure investment, testing from different geographic locations.
Advantages: No upfront hardware costs. You launch engines in any AWS region (US East, EU West, Asia Pacific), scale to thousands of virtual users on-demand, and pay only for actual test duration at hourly EC2 rates.
#### One-Button Cloud Setup
The non-obvious value here is what Load Tester does automatically. From the operator's side, running a distributed load test in AWS is one button. Click Run Load Test with cloud engines configured, and Load Tester:
- Provisions the EC2 instances in the regions and sizes you specified
- Installs the load generator software on each instance (you don't touch it, you don't even see it happen)
- Uploads every test case to every engine, including the multiple test cases running simultaneously in a mixed-workload test
- Distributes the datasets so each engine has the slice of usernames, passwords, search terms, and other per-user data it needs
- Load-balances virtual users across the engines based on each engine's actual capacity, so no single generator gets pushed to the point where its own resource consumption distorts measurements (the bad-data problem from the Virtual Browsers section)
- Terminates the instances when the test finishes, so you only pay for the minutes you actually ran
Load balancing across generators: solved. Software distribution: solved. Dataset distribution: solved. Multi-test-case orchestration: transparent. The complicated parts of running a hundred-load-generator distributed test happen below the surface. You don't manage them, you don't see them, you don't have to be right about them.
This is the actual win of cloud load testing in Load Tester. EC2 instances on demand are the visible part. The orchestration that makes a hundred-engine test as easy as a single-engine test is the real product.
See Cloud Load Testing for detailed AWS configuration.
## LAN vs WAN Testing
### LAN-Based Testing (Inside Your Network)
Setup: Load engines run on the same network as your application servers.
Advantages: Isolates server performance from network latency, avoids bandwidth charges from your hosting provider, provides the fastest test execution, and is ideal for identifying server bottlenecks without network noise clouding the picture.
Use when testing development or staging environments, measuring raw server capacity, debugging performance issues, or optimizing backend performance.
### WAN-Based Testing (Over the Internet)
Setup: Load engines run outside your network, accessing your application over the public internet.
Advantages: Includes realistic network latency, tests CDN performance, validates load balancer behavior, and simulates real user experience from different geographic regions.
Disadvantages: Network performance can mask server issues, test traffic counts toward your hosting provider's bandwidth limits (and those charges add up), and tests run slower than LAN.
Use when testing production from real user locations, validating CDN effectiveness, or measuring end-user response times.
**Bandwidth Costs**

WAN-based load tests can generate massive bandwidth usage (e.g., 100 users × 1 MB/page × 60 pages/hour = 6 GB/hour). Coordinate with your hosting provider to avoid surprise bandwidth charges.
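The back-of-envelope estimate in the note generalizes to a one-line formula, sketched here (assuming 1 GB = 1000 MB for a rough estimate; the function name is invented):

```python
def bandwidth_gb_per_hour(vus, mb_per_page, pages_per_hour):
    """Sketch: rough WAN bandwidth estimate for a load test."""
    return vus * mb_per_page * pages_per_hour / 1000  # MB/hour -> GB/hour

# The example from the note: 100 users x 1 MB/page x 60 pages/hour.
estimate = bandwidth_gb_per_hour(100, 1, 60)  # 6.0 GB/hour
```

Running the numbers for your own page sizes and pacing before the test is a cheap way to avoid a surprise hosting bill.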
### Recommended Testing Approach
Start with a small-scale LAN test to establish baseline performance. Then run a LAN-based load test to find server capacity limits. Finally, if needed, run a WAN-based test to validate end-user experience from real network locations. This approach minimizes bandwidth costs while still providing comprehensive performance data.
## Next Steps
Now that you understand the concepts, the path forward is: record a test case to define your user workflow, configure it for successful replay, validate with a baseline replay, configure a load test with virtual users and a load profile, run it, and analyze the results to identify bottlenecks.
Related Topics:

- Configuring a Load Test
- Cloud Load Testing
- Running a Load Test
- Understanding Metrics
- Server Monitoring