Datasets & Data-Driven Testing

A common problem when setting up a load testing configuration is figuring out how much test data you need. You know you need usernames and passwords, but how many? And why does using unique data for each user matter when you could just reuse the same username over and over?

The answer: real applications behave differently when multiple users share the same credentials simultaneously. Session conflicts, cache collisions, and server-side optimizations all produce unrealistic results when every virtual user logs in as the same person. Data-driven testing with datasets ensures you're measuring real-world performance, not artificial behavior caused by test artifacts.

This guide covers creating datasets, calculating how many rows you need, editing data efficiently, and using JavaScript data sources for dynamic generation.


Why Data-Driven Testing Matters

The Problem with Shared Data

When you record a test case, you authenticate once as yourself. When you run a load test with 1000 virtual users, you have three options:

| Approach | What Happens | Result |
| --- | --- | --- |
| All users share same credentials | 1000 concurrent sessions with username "jsmith" | Server may invalidate earlier sessions, rate-limit the account, or behave unpredictably. NOT realistic. |
| Users cycle through small dataset | 1000 users share 10 credentials (100 users per credential) | Better than option 1, but still artificial. Many applications don't handle 100 concurrent sessions for the same user gracefully. |
| Each user gets unique credentials | 1000 users use 1000 different credentials | Realistic simulation. Each virtual user behaves like an independent real user. Server behavior matches production. |

Use enough unique test data that each virtual user has its own identity during the test.


What Datasets Provide

A dataset is a collection of tabular data (like a spreadsheet) that Load Tester uses to dynamically change the actions of a test case.

Datasets contain:

  • Fields (columns): The types of data you need (e.g., username, password, email, product_id)
  • Rows: The actual values for each field (e.g., user1, P@ssw0rd1, user1@example.com, 12345)

Example dataset (usernames and passwords):

| username | password |
| --- | --- |
| alice_smith | AlicePass123 |
| bob_jones | BobSecret456 |
| carol_white | CarolKey789 |
| dave_brown | DaveAuth012 |
| eve_green | EveToken345 |

When you link this dataset to your test case's username/password fields, each virtual user gets a different row. User 1 logs in as alice_smith, User 2 as bob_jones, and so on.


Understanding Dataset Configuration

Before creating a dataset, you should understand three critical configuration options that determine how Load Tester uses your data.

Lifespan: How Long Does a Row Last?

Lifespan controls how long a virtual user sticks with the same row of data before fetching the next row.

| Lifespan | Behavior | When to Use |
| --- | --- | --- |
| Virtual User | Virtual user uses the same row for all test case iterations | User identity should persist across multiple test case iterations. Most common for username/password datasets. |
| Test Case | Virtual user gets a new row for each test case iteration | Each iteration should use different data (e.g., different product searches each iteration). |
| Web Page | Virtual user gets a new row for each web page | Multiple pages use dataset values, and each page should get fresh data. |
| URL | Virtual user gets a new row for each transaction (URL) | Every HTTP request should use different data (rare). |
| Single Use | Every dataset value used gets a new row | Extremely rare; usually overkill. |

Most common settings:

  • User credentials (username/password): Virtual User lifespan (user keeps same identity for entire test)
  • Form field data (product search, customer info): Test Case lifespan (new data each iteration) or Web Page lifespan (new data for each form)

Reusable: What Happens When Rows Run Out?

Reusable determines whether Load Tester can start over at the beginning of the dataset when all rows have been used.

| Setting | Behavior | Impact |
| --- | --- | --- |
| Reusable = ON (default) | When all rows are used, start over from row 1 | Virtual users will reuse data. Common for load testing. |
| Reusable = OFF | When all rows are used, the load test fails with an error | Ensures every virtual user gets unique data, but requires enough rows for the entire test. |

When to enable Reusable:

  • You have limited test data (e.g., 100 credentials) but need to test with 1000 users
  • Data reuse is acceptable for your application (most applications)

When to disable Reusable:

  • Compliance requirement: Each virtual user MUST have unique data (e.g., financial regulations)
  • Data gets consumed: Application permanently modifies or deletes data during the test (e.g., draining an inventory)
  • You have more than enough rows for your test

Load Test Will Fail If Rows Run Out

If Reusable = OFF and your dataset doesn't have enough rows, the load test will terminate with an error when the last row is used. Always calculate row requirements carefully (see How Many Rows Do You Need? below).


Sharable: Can Multiple Users Use the Same Row?

Sharable determines whether multiple virtual users can simultaneously use the same row from a dataset.

| Setting | Behavior | Impact |
| --- | --- | --- |
| Sharable = OFF (default) | Each row can only be used by one virtual user at a time | Ensures unique data per user. Dataset must have at least as many rows as concurrent virtual users. |
| Sharable = ON | Multiple virtual users can use the same row simultaneously | Allows smaller datasets, but defeats the purpose of data-driven testing if many users share the same credentials. |

When to enable Sharable:

  • Dataset contains non-identity data (e.g., product IDs, search terms) where multiple users searching for the same product is realistic
  • You're testing with more virtual users than available dataset rows and reuse is acceptable

When to disable Sharable (default):

  • Dataset contains user identity data (username/password, certificates)
  • You want to ensure each virtual user has unique data at any given moment

Sharable Requires Reusable

If a dataset is not reusable, it cannot be sharable. This makes sense: if you've disabled row reuse entirely, allowing simultaneous sharing would contradict that constraint.


Creating a Dataset

Load Tester provides two ways to create datasets: start with an empty dataset and fill it manually, or import existing data from an external file (CSV, text).

Method 1: Create Empty Dataset

When to use: Small datasets (<50 rows), or you'll use the Fill feature to generate data automatically.

Step 1: Open New Dataset dialog

  1. In the Navigator view, locate the Datasets folder in your repository
  2. Right-click on the Datasets folder (or any existing dataset)
  3. Select: New → Dataset

Step 2: Define dataset structure

  1. Enter dataset name: e.g., UserCredentials
  2. Click: Add button
  3. Enter field names (column names), one per line, pressing Enter after each:
     • username
     • password
  4. Click: OK

Load Tester creates the dataset with one row of sample data and opens the Dataset Editor.


Method 2: Import from External File

When to use: Large datasets (hundreds or thousands of rows), or data generated by external tools (database export, scripts).

Supported formats:

  • CSV (comma-separated values): Most common, works with Excel/Google Sheets
  • TSV (tab-separated values): Alternative to CSV
  • Custom delimiters: Any character can be used as field separator

Step 1: Prepare your data file

Create a CSV or text file with your data:

username,password,email
alice_smith,AlicePass123,alice@example.com
bob_jones,BobSecret456,bob@example.com
carol_white,CarolKey789,carol@example.com
dave_brown,DaveAuth012,dave@example.com
eve_green,EveToken345,eve@example.com

Step 2: Open Import Dataset dialog

  1. In the Navigator view, right-click on Datasets folder
  2. Select: Import

Step 3: Configure import settings

  1. Choose the file to import: Browse to your CSV/text file
  2. Import into: Select New dataset (or choose existing dataset to replace its data)
  3. Field separator: Choose comma for CSV files (or tab, space, semicolon, or custom)
  4. Trim whitespace: Leave enabled (default) to automatically remove leading/trailing spaces
  5. Use first row as field names: Enable if your file has a header row (recommended)
  6. Parse escaped characters: Enable if your data contains escaped newlines (\r\n) or special characters

Step 4: Preview and import

  • Preview section shows the first 10 rows with current settings
  • Verify the preview looks correct (columns align properly, field names are detected)
  • Click: OK to import

The dataset is created and the Dataset Editor opens with your imported data.

Preview Before Importing

The Preview section updates dynamically as you change import settings. Always verify the preview looks correct before clicking OK. If columns don't align properly, adjust the field separator or "Use first row" setting.


How Many Rows Do You Need?

Calculating the number of dataset rows you need comes down to three numbers:

  1. Test duration: How long the load test runs (e.g., 60 minutes)
  2. Test case duration: How long one iteration of the test case takes (e.g., 13 minutes)
  3. Concurrent users: Maximum number of virtual users during the test (e.g., 4)

You set the first and third in the load test configuration; Load Tester calculates the second for you.


Step 1: Find Test Duration

Test duration is set by you in the Load Test configuration:

  1. Open Load Test Editor (double-click load test in Navigator)
  2. Look for: Duration field (usually in minutes)
  3. Note the value: e.g., 60 minutes

Step 2: Find Test Case Duration

Load Tester calculates test case duration automatically in the Test Case Editor:

  1. Open Test Case Editor (double-click test case in Navigator)
  2. Look at the bottom of the editor: Test case duration is displayed
  3. Note the value: e.g., 13 minutes

Estimate Conservatively

As a rule of thumb, round to the nearest minute and round DOWN if possible. This number is not always accurate, as it does not include:

  • Test case looping (Restart Options in Test Case Properties)
  • Page processors that refresh a page repeatedly while waiting for a result
  • Think time variations

In general, it's a good idea to lowball the test case duration estimate, as the test will only fail if you come up short. Having too many dataset rows is never a problem for the test.


Step 3: Find Maximum Concurrent Users

Maximum users is set in the Load Test configuration:

  1. Open Load Test Editor
  2. Look for: Maximum Users field (calculated from user ramp settings)
  3. Note the value: e.g., 4 users

Calculation #1: The Simplest Case (No Ramping)

If all users in a test started at the very beginning and continued to the very end, the calculation is simple:

Rows Needed = Users × (Test Duration ÷ Test Case Duration)

Example:

  • Users: 4
  • Test duration: 60 minutes
  • Test case duration: 13 minutes

4 × (60 ÷ 13) = 4 × 5 (rounded up from 4.6) = 20 rows

Visual representation:

User 1: |████████████████████████████████████████████████| (60 min)
User 2: |████████████████████████████████████████████████|
User 3: |████████████████████████████████████████████████|
User 4: |████████████████████████████████████████████████|
        0                                              60 min

All four users run the entire 60 minutes, each completing ~5 iterations (60 ÷ 13 = 4.6, rounded up).

Total rows needed: 4 users × 5 iterations = 20 rows
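The rectangle calculation above is easy to script. A minimal sketch (JavaScript purely for illustration; the function name is our own):

```javascript
// Calculation #1: rows needed when all users run for the entire test.
// Both durations must be in the same unit (minutes here).
function rowsNoRamp(users, testMinutes, testCaseMinutes) {
  // Round iterations up: a partially completed iteration still consumes a row.
  const iterations = Math.ceil(testMinutes / testCaseMinutes); // 60 / 13 -> 5
  return users * iterations;
}
```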


Calculation #2: The Normal Case (Ramping)

Most load tests ramp up during the course of the test instead of starting all users at the beginning. A typical load test looks like this:

User 1: |████████████████████████████████████████████████| (60 min)
User 2:          |███████████████████████████████████████| (45 min)
User 3:                   |███████████████████████████████| (30 min)
User 4:                            |██████████████████████| (15 min)
        0                                              60 min

Users start at different times (ramping), but all run until the test ends.

Estimating rows in this case is just like calculating the area of a triangle in high school geometry: divide the previous calculation by two.

Rows Needed = (Users × (Test Duration ÷ Test Case Duration)) ÷ 2

Example (same numbers as before):

(4 × (60 ÷ 13)) ÷ 2 = (4 × 5) ÷ 2 = 20 ÷ 2 = 10 rows

As long as the user ramp is even and terminates within one test case duration of the end of the test, this estimate will be accurate.
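The triangle estimate can be sketched the same way: compute the rectangle, then halve it (function name is our own, for illustration only):

```javascript
// Calculation #2: rows needed for an even ramp that runs to the end of the
// test. The "triangle" is half the rectangle from Calculation #1.
function rowsWithRamp(users, testMinutes, testCaseMinutes) {
  const iterations = Math.ceil(testMinutes / testCaseMinutes);
  return Math.ceil((users * iterations) / 2); // round up to be safe
}
```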


Calculation #3: Ramp + Hold (Complex Load Profile)

What about a test where you ramp up to a certain number of users, then hold at that level for an extended period?

In such a case, split the test into two parts for the purposes of the estimate:

  1. Ramping period: Use Calculation #2 (triangle area)
  2. Holding period: Use Calculation #1 (rectangle area)

Visual representation:

User 1: |████████████████████████████████████████████████| (60 min)
User 2:    |█████████████████████████████████████████████| (55 min)
User 3:        |█████████████████████████████████████████| (50 min)
User 4:            |█████████████████████████████████████| (45 min)
        0      10                                     60 min
              Ramp         Hold at 4 users

Ramp period (0-10 minutes):

  • Users ramp from 0 to 4 over 10 minutes
  • Use Calculation #2 (triangle): (4 × (10 ÷ 13)) ÷ 2 = (4 × 1) ÷ 2 = 2 rows

Hold period (10-60 minutes):

  • All 4 users run for 50 minutes (60 - 10 = 50)
  • Use Calculation #1 (rectangle): 4 × (50 ÷ 13) = 4 × 4 = 16 rows

Total: 2 + 16 = 18 rows

You can use this technique to estimate for any test, even one that ramps unevenly: simply divide up the test sections into ramping periods and load periods, estimate those, and then add them all together.
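Combining the two pieces into one helper makes the split explicit (a sketch with our own function name; triangle for the ramp, rectangle for the hold):

```javascript
// Calculation #3: ramp + hold. Triangle estimate for the ramping period,
// rectangle estimate for the holding period, summed.
function rowsRampPlusHold(users, rampMinutes, holdMinutes, testCaseMinutes) {
  const rampRows = Math.ceil((users * Math.ceil(rampMinutes / testCaseMinutes)) / 2);
  const holdRows = users * Math.ceil(holdMinutes / testCaseMinutes);
  return rampRows + holdRows;
}
```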


Padding: Protecting Against Load Engine Imbalance

Finally, add padding to protect against load engine imbalance.

When one load engine has significantly more virtual users than the others and the datasets are divided evenly among the engines, that engine risks running out of data before the others. The fix is simple: add at least 25% to your estimate.

Example:

  • Calculation #2 estimate: 10 rows
  • With 25% padding: 10 × 1.25 = 12.5 → round up to 13 rows

Padding Recommendation

We generally recommend adding at least 25% to your dataset row estimate. This provides a safety margin for:

  • Load engine imbalance (one engine gets more users than others)
  • Test case duration variations (some iterations take longer than average)
  • Ramp timing inaccuracies (users don't start exactly when you expect)

Having too many rows is never a problem. It's much better to have 50% extra than to have the load test fail halfway through because you ran out of data.
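The padding step is a one-liner worth keeping next to the estimate (sketch; the default of 25% matches the recommendation above):

```javascript
// Add safety padding (default 25%) to a row estimate and round up.
function padRows(rows, padding = 0.25) {
  return Math.ceil(rows * (1 + padding));
}
```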


Editing Datasets

After creating a dataset (empty or imported), you can edit it using the Dataset Editor.

Adding and Removing Rows

To add a single row:

  1. Double-click the last cell in the last row
  2. Press Enter
  3. A new row is created with sample data
  4. Fill in values and press Enter to add another row, or Tab to finish editing

To add multiple rows at once:

  1. Click: Add button (toolbar icon: + with row indicator)
  2. Enter number of rows to add
  3. Click OK

To remove rows:

  1. Select rows to remove (click row number on left, or drag to select multiple)
  2. Click: Remove button (toolbar icon: − with row indicator)
  3. Rows are deleted immediately

Easier Dataset Editing (v4.2+)

Before Load Tester 4.2, deleting rows required either:

  • Selecting each row individually and manually deleting it
  • Exporting the dataset to Excel, removing data, and re-importing

With Load Tester 4.2 and later, you can simply:

  • Highlight all the rows you want to remove
  • Click the "remove dataset row" icon

Adding rows is just as easy: simply click the "add dataset rows" icon.


Adding and Removing Fields (Columns)

To add a field:

  1. Click: Edit Fields... button
  2. In the Edit Dataset Fields dialog, type the new field name at the bottom of the list and press Enter to add it
  3. Click OK

The new field appears as the last column in the dataset with empty values.

To rename a field:

  1. Click: Edit Fields... button
  2. Select the field in the list
  3. Click: Rename button (or double-click field name)
  4. Enter new name
  5. Click OK

To remove a field:

  1. Click: Edit Fields... button
  2. Select the field in the list
  3. Click: Remove button
  4. Click OK

All values for that field are deleted from all rows.


Editing Cell Values

To edit a single cell:

  1. Double-click the cell
  2. Type the new value
  3. Press Enter to move to the next cell below, or Tab to move to the next cell right
  4. Press ESC to cancel changes

To copy/paste data:

  • Select cells (click and drag, or Shift+click)
  • Ctrl+C (Cmd+C on macOS) to copy
  • Ctrl+V (Cmd+V on macOS) to paste

Filling Fields with Generated Data

Instead of manually entering hundreds of rows of data, Load Tester can automatically generate random or sequential values to fill a field.

Step 1: Select the Field to Fill

  1. Click the column heading (field name) to select the entire field
  2. Click: Fill... button

The Fill Dataset Field dialog opens.


Step 2: Choose Generation Method

Load Tester provides three generation types:

| Method | Description | Best For |
| --- | --- | --- |
| Random | Generate random alphabetic or numeric strings | Usernames, passwords, random IDs, test data variation |
| Sequence | Generate sequences of numeric strings | User IDs, order numbers, sequential identifiers |
| List | Select strings from pre-populated lists | Common names, email domains, product categories |

Step 3: Configure Generation Settings

For Random or Sequence:

  • Quantity: Number of values to generate (defaults to total rows in dataset)
  • Width: Length of each generated value (e.g., 8 characters for passwords)
  • Data Type: Alphabetic, Numeric, or Alphanumeric

For List:

  • Select list: Choose from pre-populated lists (e.g., common first names, last names, email domains)

Step 4: Preview and Apply

  1. Click: Generate Values button
  2. Preview appears on the right showing the first several generated values
  3. If satisfied, click OK to save values into the dataset
  4. If not, adjust settings and regenerate

Example: Generating 1000 random usernames:

  • Method: Random
  • Quantity: 1000
  • Width: 10
  • Data Type: Alphabetic

Result: jkdfhgkslp, mnbvcxzaqw, poiuytrewq, etc.
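For intuition, the Random/Alphabetic fill behaves roughly like the sketch below. This is not Load Tester's actual generator, just an illustration of fixed-width random alphabetic strings:

```javascript
// Illustration of a Random/Alphabetic fill: fixed-width random lowercase
// strings (not Load Tester's actual implementation).
function randomAlphabetic(width) {
  const letters = "abcdefghijklmnopqrstuvwxyz";
  let out = "";
  for (let i = 0; i < width; i++) {
    out += letters[Math.floor(Math.random() * letters.length)];
  }
  return out;
}

// Quantity: 1000, Width: 10
const usernames = Array.from({ length: 1000 }, () => randomAlphabetic(10));
```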


Advanced: JavaScript Data Sources

Sometimes you need to generate dynamic data during the load test that can't be pre-populated in a dataset: a unique UUID for each request, a fresh timestamp, a computed hash.

Load Tester supports JavaScript data sources that execute during the test to provide these dynamic values.

Use Case: Generating Dynamic UUIDs

Problem: Your application uses UUIDs in URL path elements:

http://mysite/path1/123e4567-e89b-12d3-a456-426614174000

Each request needs a unique UUID. You can't pre-generate UUIDs in a dataset because you don't know how many requests will be made during the test.

Solution: Use a JavaScript data source to generate UUIDs dynamically.


Step 1: Select the Field to Configure

  1. Click on the transaction with the UUID in the URL
  2. Open Fields View: Window → Show View → Fields View
  3. Switch to PATH view mode: Use the dropdown on the right to select PATH
  4. Locate the UUID path element in the Fields View

Step 2: Configure Script Datasource

  1. Double-click the path element with the UUID
  2. Field Assignment dialog opens
  3. Datasource: Select Script from the dropdown

Step 3: Write JavaScript Function

In the Script editor, enter:

function getValue(user_state) {
    return java.util.UUID.randomUUID();
}

How this works:

  • The return value from getValue(user_state) is substituted every time this URL is called during the load test
  • You can use any valid JavaScript
  • JavaScript can call Java functions using the java. prefix
  • In this case, we're calling java.util.UUID.randomUUID() to generate a UUID

Step 4: Test and Apply

  1. The JavaScript dialog dynamically executes the script as you type
  2. Results box shows the return value (you should see a UUID like a1b2c3d4-...)
  3. If the result looks correct, click OK

Now, during the load test, every time this URL is requested, a new UUID is generated dynamically.

JavaScript Interop with Java

This technique is surprisingly powerful and can help with all sorts of tricky situations in complex web applications. JavaScript's standard library is modest, but Java's is enormous, and you can call any Java function from JavaScript using the java. prefix.

Examples:

  • java.util.UUID.randomUUID() - Generate UUIDs
  • java.lang.System.currentTimeMillis() - Get current timestamp
  • java.lang.Math.random() - Generate random numbers


Using Datasets in Test Cases

After creating a dataset, you need to link it to fields in your test case so the values are actually used.

Automatic Linking (User Identity Wizard)

For username/password datasets, the User Identity Wizard automatically links the dataset to login fields:

  1. Right-click test case in Navigator → Properties
  2. Navigate to: User Identity tab
  3. Select: Use dataset for credentials
  4. Choose dataset: Select your dataset from the dropdown
  5. Map fields:
     • Username field: Select dataset column containing usernames
     • Password field: Select dataset column containing passwords
  6. Click OK

Manual Linking (Fields View)

For any field in your test case (not just username/password), you can manually link a dataset using the Fields View:

  1. Open test case in Test Case Editor
  2. Open Fields View: Window → Show View → Fields View
  3. Select the transaction containing the field you want to configure
  4. Locate the field in Fields View (e.g., search_term, product_id)
  5. Double-click the field to open Field Assignment dialog
  6. Datasource: Select Dataset from dropdown
  7. Choose dataset: Select the dataset from the list
  8. Choose field: Select the dataset column (field) to use
  9. Click OK

Now when the test case runs, Load Tester will substitute values from the dataset into that field.


Reloading Datasets from External Files

If you imported a dataset from an external file (CSV, text), you can easily re-import the data after modifying the original file. This is useful when:

  • You're generating test data from a database and need to refresh it
  • You're editing the data in Excel and want to reload changes into Load Tester

Automatic Reload

  1. Open Dataset Editor (double-click dataset in Navigator)
  2. Click: Reload button

Load Tester automatically re-imports the dataset using the same settings (file path, separator, etc.) that were used for the original import.

If the file location changed or the import settings need adjustment:

  1. Click: ... button (next to Reload)
  2. The Import Dataset dialog opens with the original settings pre-filled
  3. Adjust settings as needed (choose new file, change separator, etc.)
  4. Click OK to re-import

Troubleshooting Datasets

Load Test Fails: "Dataset rows exhausted"

Symptom: Load test terminates with an error message about running out of dataset rows.

Cause: Your dataset doesn't have enough rows for the load test, and Reusable is disabled.

Solution:

Option 1: Enable Reusable (recommended)

  1. Open Dataset Editor (double-click dataset in Navigator)
  2. Check: Reusable checkbox
  3. Save: Ctrl+S (Cmd+S on macOS)

Option 2: Add more rows to the dataset

  • Calculate rows needed using the formulas in How Many Rows Do You Need?
  • Add rows using the Dataset Editor or re-import a larger file

Load Test Fails: "Dataset rows conflict"

Symptom: Load test fails with errors about dataset row conflicts or concurrent access.

Cause: Your dataset has Sharable disabled, but you're trying to run more concurrent users than available rows.

Example: 100 concurrent users, but dataset only has 50 rows and Sharable = OFF.

Solution:

Option 1: Enable Sharable (if data reuse is acceptable)

  1. Open Dataset Editor
  2. Check: Sharable checkbox
  3. Save

Option 2: Add more rows to the dataset

  • Ensure dataset has at least as many rows as maximum concurrent virtual users
  • Use the Fill feature to quickly generate additional rows

Dataset Import Preview Looks Wrong

Symptom: When importing a dataset, the preview shows columns misaligned or data in wrong fields.

Likely causes:

  1. Wrong field separator: You selected "comma" but the file uses tabs (or vice versa)
  2. First row setting incorrect: You enabled "Use first row as field names" but the first row contains data (or vice versa)

Solution:

  • Try different field separator: Comma, Tab, Semicolon, Space
  • Toggle "Use first row as field names" and check the preview
  • Open the file in a text editor to verify what separator is actually used

JavaScript Data Source Not Working

Symptom: JavaScript datasource shows an error, or the generated value is blank/incorrect.

Likely causes:

  1. Syntax error in JavaScript: Missing semicolon, unclosed brace, etc.
  2. Wrong Java class path: java.util.UUID works, but UUID.randomUUID() doesn't (missing java. prefix)
  3. Function doesn't return a value: Missing return statement

Solution:

  • Check the Results box in the JavaScript editor; it shows errors and return values
  • Test incrementally: Start with a simple script like return "test"; and add complexity
  • Use Java interop correctly: Always use java. prefix for Java functions

Ask the AI to Help with Datasets

If you're struggling with dataset configuration:

I need to create a dataset with 500 unique usernames and passwords for
a load test with 100 concurrent users running for 30 minutes. My test
case takes about 5 minutes per iteration. How many rows do I need, and
what lifespan/reusable settings should I use?

The AI can:

  • Calculate exact row requirements based on your test parameters
  • Recommend lifespan and reusable/sharable settings
  • Guide you through dataset creation and configuration
  • Help troubleshoot dataset-related errors during load tests
  • Explain JavaScript data source syntax for dynamic generation

Best Practices

1. Use Datasets for User Identity

Why: Real applications behave differently when 1000 users all log in as the same person. Session conflicts, server-side caching, and rate limiting can all produce unrealistic results.

How: Create a dataset with enough usernames/passwords for all virtual users, with Reusable = ON (so users cycle through credentials if needed).


2. Add 25% Padding to Row Estimates

Why: Load engine imbalance, test case duration variations, and ramp timing inaccuracies can cause the test to use more rows than calculated.

How: Calculate rows needed using the formulas above, then multiply by 1.25 and round up.


3. Use Fill Feature for Large Datasets

Why: Manually entering 1000 rows of test data is tedious and error-prone.

How: Create a dataset with the right fields, then use Fill... to generate random or sequential values automatically.


4. Import Real Data When Possible

Why: Real customer data (anonymized/sanitized) produces more realistic load test results than randomly generated strings.

How: Export data from your production database (after removing PII), import into Load Tester as CSV. Use representative product IDs, search terms, and transaction patterns.


5. Test Dataset Configuration Before Load Testing

Why: Dataset configuration errors (wrong lifespan, too few rows, sharable issues) are easier to diagnose with a small replay than a full load test.

How:

  1. Create dataset and link to test case
  2. Run a small replay (2-5 virtual users)
  3. Verify each virtual user gets different data (check Fields View during replay)
  4. Only then: Scale up to full load test

Next Steps

After configuring datasets, you're ready to customize other aspects of your test case:

For using datasets during load tests:

For authentication datasets specifically: