Under the hood
If you are interested in how we make the tests work, this is your place.
AI agents finding test cases and generating test steps
A web app is typically composed of user flows. A user flow lets a user accomplish a certain goal. “Sign-in with email & password” is an example.
To perform a sign-in with email & password, a certain sequence of steps - interactions
- is required.
Our AI agents mimic human users (i.e., clicks input fields, signs up for newsletter) to navigate apps, interpret app intent, and identify all relevant user flows. We are recording and storing the interaction chain of a test case in an intermediate representation.
Playwright generates test code
After we record and store each test case’s interaction chain, we generate the corresponding Playwright code deterministically on the fly immediately prior to test execution.
The interaction chain can be examined in the test detail view and the Playwright trace viewer.
Use of AI agents and Playwright in Octomind end-to-end tests
Work in progress: AI auto-maintenance
We will follow a playbook to find out if a test failure is caused by a behavioral change of your user flows, the test code itself or a bug in your code. In the case of a behavioral change, we pinpoint failing interactions, and deploy the AI Agent to detect new desired interaction that will allow us to achieve the test case’s goals.
This feature is under active development and not publicly accesible yet.
Issue pinpointing
When a test case fails, we help you quickly understand what went wrong. We are providing a set of tools to help you understand the issue.
- Snapshots at the time of test failure
- traces via Playwright trace viewer
- open source Debugtopus tooling, which lets you run our tests localy, so you can set breakpoints to step through the code.
See more details in Debug your code.
Debugtopus interaction diagram
Parallel test runs for shorter runtime
Browser tests are not super fast since they are simulating a real user. To provide test results as fast as possible we parallelize test execution to the max.
To do so, we are fully cloud based and we scale instances up and down as needed. We are also working on techniques to separate test execution to avoid side effects for better scaling.
Octomind tests are run in parallel, so your test suite will complete in 20 minutes or less, regardless of size.
Flakiness
Test flakiness is the biggest problem of browser tests. Fighting flakiness is an active focus of research on our end. Some of the strategies we follow are:
measure | status |
---|---|
Active interaction timing to handle varying response times | work in progress |
Smart learning based retries to understand flakiness level | open |
AI based analysis of unexpected circumstances to handle temporary pop-ups, toasts and similar stuff which would otherwise break the test | open |
AI based regeneration in case of permanent user flow changes | open |
Coding rules and best practices of how to write tests in a way to avoid common pitfalls. | work in progress |