Why tests fail
What happens when Octomind tests fail? Why are some tests yellow or red?
This four-part guide will show you how to diagnose, debug, and fix Octomind test failures. We will throw in a few best practices for good measure, too.
We will start by covering the different types of test failures, then dive into using Octomind’s test case diagnosis and debugging tools.
Failure types
Test cases fail for one of two reasons:
- Test creation failure (YELLOW alert): The AI agent (the Agent) failed to implement the required steps while generating the test case, and it needs your review and/or help.
- Active test failure (RED alert): An active test case step did not complete as expected.
This is a high-level overview of the test failure types. We will follow with general strategies for fixing them.
Test creation failure
Test case creation failures are usually the result of:
- test generation that was blocked from completing due to missing data (e.g. a CAPTCHA prompt during newsletter signup, an invalid discount code)
- a step that was blocked (e.g. a timeout on the search results page)
- a prompt that prevented the Agent from successfully implementing each step (e.g. a prompt lacking sufficiently detailed step-by-step instructions)
- incomplete HTML on the test target, missing details such as labels, accessibility attributes (aria-*), test IDs, placeholders, text, input values, target links (href), select options, and content (see the markup sketch below)
A test step of an auto-generated login test failed due to CAPTCHA, 12/2024
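To see why those HTML details matter, here is a minimal sketch in Playwright syntax, the framework Octomind tests are built on. The markup, test ID, page URL, and field names are hypothetical; the point is that each attribute gives the Agent a stable hook for locating the element:

```typescript
import { test, expect } from '@playwright/test';

test('agent-friendly markup exposes stable hooks', async ({ page }) => {
  await page.goto('https://example.com/signup'); // hypothetical test target

  // Markup like <input type="text"> with no label, placeholder, or test id
  // gives the Agent nothing reliable to anchor on. Markup like this does:
  //
  //   <input type="email" placeholder="Email address"
  //          aria-label="Email address" data-testid="signup-email">
  //   <button type="submit">Sign up</button>

  await page.getByTestId('signup-email').fill('user@example.com');    // data-testid
  await expect(page.getByLabel('Email address')).toBeVisible();       // aria-label
  await expect(page.getByPlaceholder('Email address')).toBeVisible(); // placeholder
  await page.getByRole('button', { name: 'Sign up' }).click();        // role + text
});
```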
Tests that fail to be created are disabled and excluded from test report runs. Once the test case is fixed and passes successfully, it will be automatically set to active and added to test report runs.
Toggle set to active automatically for successful and to inactive for failed test generation, 12/2024
For cases where the missing data can be provided, you can revise the prompt to include the required value (e.g. the discount code) in the visual editor.
Entering correct test discount code in visual locator picker, 12/2024
If a step couldn’t complete (e.g. a selector timeout), you can review the steps generated from the prompt to identify why it may have failed. For example, the search field could not be found because its placeholder text had changed from “Search here…” to “Search…”.
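Here is a minimal sketch of that exact failure in Playwright syntax; the URL, search term, and placeholder values are illustrative:

```typescript
import { test } from '@playwright/test';

test('search for a product', async ({ page }) => {
  await page.goto('https://shop.example.com'); // hypothetical test target

  // Step as originally generated: times out once the UI changes,
  // because no input with this placeholder exists anymore.
  // await page.getByPlaceholder('Search here…').fill('socks');

  // Step updated to match the current markup:
  await page.getByPlaceholder('Search…').fill('socks');
  await page.keyboard.press('Enter');
});
```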
If the prompt fails to generate any steps, or the steps are incomplete, first review the prompt to see if it can be improved with more precise instructions, then click regenerate steps. Sometimes, an improved prompt and the Agent trying again is all that’s needed.
Improving prompt when test generation fails, 12/2024
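For illustration, compare a vague prompt with a more precise one; the shop flow and the discount code “SAVE10” are made up:

```text
Too vague:
  Test that discount codes work.

More precise:
  1. Open the shop start page and add any product to the cart.
  2. Go to the cart and enter the discount code "SAVE10" in the coupon field.
  3. Click "Apply" and verify the order total is reduced by 10%.
```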
If the target HTML is incomplete, the Agent will not be able to identify the element you expect it to click on. In that case, it will try out other options, which most likely are not a fit. If this happens, help the Agent along by modifying the step where it went wrong; you can use the visual locator picker to do so. Once you have selected the right interaction element, click regenerate steps, keeping all steps including the one you just modified. Most likely, this is all it needs.
The Agent will provide a reason and recommendations for fixing the failure, but reach out to support if you need help resolving the issue. See the example below.
Agent providing the reason for failure, 12/2024
Active test failure
While an active test case failure indicates that the functionality under test is no longer working as expected, it doesn’t necessarily mean the feature itself is broken.
End-to-end test failures often arise from intentional UI changes, where previous locators no longer find the targeted element (e.g. a link is now a button) or the expected text of an element has changed (e.g. the page title).
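Sketched in Playwright terms, both kinds of change look like this in a generated test; the page, element names, and titles are hypothetical:

```typescript
import { test, expect } from '@playwright/test';

test('proceed to checkout', async ({ page }) => {
  await page.goto('https://shop.example.com/cart'); // hypothetical page

  // Old step: passed while "Checkout" was rendered as an <a> link.
  // After a redesign turned it into a <button>, the role no longer
  // matches and the locator finds nothing:
  // await page.getByRole('link', { name: 'Checkout' }).click();

  // Updated step matching the new element:
  await page.getByRole('button', { name: 'Checkout' }).click();

  // A changed-text failure looks similar: the element is found,
  // but the assertion no longer matches the new copy.
  // await expect(page).toHaveTitle('Checkout');      // old page title
  await expect(page).toHaveTitle('Secure Checkout');  // new page title
});
```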
Octomind’s test run timeline (in a test report detail) is usually sufficient for diagnosing the cause of most failed (red) tests. If not, there are numerous options for debugging, which we’ll explore in the following sections.
Timeline of a test run in test report detail, 12/2024
Timeline carousel of a test run in test report detail, 12/2024
Test case status
Octomind makes it easy to stay on top of the state of your test cases. The test cases page lists all tests, with filters based on status or a text search.
Test case list where tests with steps needing review are highlighted, 12/2024
The test reports page also lists the failed tests for each run.
Test report run with passed and failed active tests, 12/2024
There are other edge cases that can cause tests to fail; we have listed the most common ones here.
Now you can start debugging your tests. Here is how.