Under the hood

AI agents finding test cases and generating test steps

A web app is typically composed of user flows. A user flow lets a user accomplish a certain goal. “Sign-in with email & password” is an example. To perform a sign-in with email & password, a certain sequence of steps - interactions - is required. Our AI agents mimic human users (i.e., clicks input fields, signs up for newsletter) to navigate apps, interpret app intent, and identify all relevant user flows. We are recording and storing the interaction chain of a test case in an intermediate representation.

Playwright generates test code

After we record and store each test case’s interaction chain, we generate the corresponding Playwright code deterministically on the fly immediately prior to test execution. The interaction chain can be examined in the test detail view and the Playwright trace viewer.

playwright & AI diagram in Octomind tests

Use of AI agents and Playwright in Octomind end-to-end tests

Work in progress: AI auto-maintenance

We will follow a playbook to find out if a test failure is caused by a behavioral change of your user flows, the test code itself or a bug in your code. In the case of a behavioral change, we pinpoint failing interactions, and deploy the AI Agent to detect new desired interaction that will allow us to achieve the test case’s goals. This feature is under active development and not publicly accessible yet.

Issue pinpointing

When a test case fails, we help you quickly understand what went wrong. We are providing a set of tools to help you understand the issue.

Snapshots at the time of test failure
traces via Playwright trace viewer
open source Debugtopus tooling, which lets you run our tests localy, so you can set breakpoints to step through the code.

See more details in the maintain tests section.

Debugtopus interaction diagram

Parallel test runs for shorter runtime

Browser tests are not super fast since they are simulating a real user. To provide test results as fast as possible we parallelize test execution to the max.
To do so, we are fully cloud based and we scale instances up and down as needed. We are also working on techniques to separate test execution to avoid side effects for better scaling. Octomind tests are run in parallel, so your test suite will complete in 20 minutes or less, regardless of size.

Flakiness

Test flakiness is the biggest problem of browser tests. Fighting flakiness is an active focus of research on our end. Some of the strategies we follow are:

measure	status
Active interaction timing to handle varying response times	work in progress
Smart learning based retries to understand flakiness level	open
AI based analysis of unexpected circumstances to handle temporary pop-ups, toasts and similar stuff which would otherwise break the test	open
AI based regeneration in case of permanent user flow changes	open
Coding rules and best practices of how to write tests in a way to avoid common pitfalls.	work in progress

Get started

Build tests

Run tests

Maintain tests

Manage tests

Account

Advanced

More info

Data governance

AI agents finding test cases and generating test steps

Playwright generates test code

Work in progress: AI auto-maintenance

Issue pinpointing

Parallel test runs for shorter runtime

Flakiness

Get started

Build tests

Run tests

Maintain tests

Manage tests

Account

Advanced

More info

Data governance

​AI agents finding test cases and generating test steps

​Playwright generates test code

​Work in progress: AI auto-maintenance

​Issue pinpointing

​Parallel test runs for shorter runtime

​Flakiness

AI agents finding test cases and generating test steps

Playwright generates test code

Work in progress: AI auto-maintenance

Issue pinpointing

Parallel test runs for shorter runtime

Flakiness