Why we finally allowed arbitrary waits in our tests

For years we had a firm rule: no arbitrary sleeps in Octomind tests. Whenever someone asked for them, we pushed back. A hard-coded wait only papers over real bugs - and how do you even choose the “right” number? Too short and the test still flakes; too long and the whole suite drags while the bug stays hidden.

We felt pretty proud of that stance… until we broke it. So what changed?

The users who changed our minds

Two customers arrived with the same head-scratcher: bugs on the page under test that only broke test automation, while rage-clicking human users just brushed them off.

When the first customer's test landed on the page under test for the first time, Playwright did the obvious thing: it waited for DOMContentLoaded and clicked the “Accept cookies” button to dismiss the overlay. Closing that overlay coincided with the end of the page’s first hydration cycle, so everything was ready and the click succeeded.

The trouble appeared on the next navigation. Because the cookie banner is a one-shot component, it never shows up again - so there was nothing left to absorb the time the framework needs to hydrate the freshly loaded page. During that window the DOM looked complete in the markup, but the JavaScript listeners that make it interactive weren’t attached yet - any click fired into limbo.

Humans barely notice: they click a button once; if nothing happens, they click again - and by then hydration has finished and the second click works. Automation, however, isn’t so forgiving - Playwright clicks once and expects success. From the test runner’s point of view the button is “unclickable,” and the entire suite becomes flaky.

Digging in, we found the culprit: the site uses Nuxt with the nuxt-delay-hydration plug-in. To win Lighthouse points, the plug-in deliberately delays hydration, leaving a half-alive DOM that ignores clicks. Great for scores, terrible for test runners.

In other words, the real bug isn’t the flaky test; it’s that the page lets the user interact before it’s actually ready. The app should either finish hydration faster or block pointer events until it’s done. But when the dev team has “bigger fish to fry” and testers still need reliable automation, that’s where a well-placed, deterministic wait comes in.
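For the record, the app-side fix can be tiny. Here’s a sketch of the “block pointer events” idea, assuming the framework exposes a global hydration promise like the one described below; the class name and wiring are illustrative, not what the site actually ships:

// Illustrative only: keep the page click-proof until hydration has finished.
// Assumes a CSS rule like `.pre-hydration { pointer-events: none; }` and a
// global hydration promise (see window._$delayHydration below).
document.documentElement.classList.add('pre-hydration');
const hydrated: Promise<unknown> = (window as any)._$delayHydration ?? Promise.resolve();
hydrated.then(() => document.documentElement.classList.remove('pre-hydration'));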

Waiting - with intent

Here’s the surprise twist: the plug-in also exposes a lifesaver - window._$delayHydration, a promise that resolves when hydration finishes. Instead of guessing a timeout, we could:

await page.waitForFunction(() => window._$delayHydration);

No arbitrary sleep, no hidden bugs - just a deterministic gate that says: “OK, the page is ready; click away.” We wrapped that in a “wait for” step and shipped it.
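For the curious, stripped of product plumbing, that step boils down to something like the helper below - a sketch rather than our exact implementation; the helper name, the timeout, and the defensive re-check of the global are assumptions:

// Rough sketch of the deterministic gate: wait until the hydration promise
// injected by nuxt-delay-hydration exists and has resolved.
import type { Page } from '@playwright/test';

async function waitForHydration(page: Page, timeout = 10_000): Promise<void> {
  await page.waitForFunction(
    async () => {
      const hydration = (window as any)._$delayHydration;
      if (!hydration) return false; // promise not injected (yet) - keep polling
      await hydration;              // settles once hydration is done
      return true;
    },
    undefined,   // no argument for the predicate
    { timeout }, // fail loudly instead of hanging forever
  );
}

// usage, right before the click that used to flake:
// await waitForHydration(page);
// await page.getByRole('button', { name: 'Submit' }).click();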

Now frameworks that surface a hydration promise get first-class support, our users get reliable tests, and we still sleep well at night.

“Fine - let’s hide that bug for you”

Our second case came from a team testing a classic magic-link login:

  1. Enter email → press send code
  2. Open the “Here’s your one-time code” email
  3. Copy the code, paste it back into the page
  4. Celebrate as the user is logged in
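
In Playwright terms, that flow looks roughly like the sketch below; the URL and selectors are placeholders, and fetchOneTimeCode is a hypothetical helper standing in for whatever reads the test inbox:

// Sketch of the flow under test, with placeholder selectors.
import { test, expect } from '@playwright/test';

// hypothetical helper that polls a test inbox for the one-time code
declare function fetchOneTimeCode(email: string): Promise<string>;

test('magic-link login', async ({ page }) => {
  await page.goto('https://app.example.com/login');             // placeholder URL
  await page.getByLabel('Email').fill('qa@example.com');        // step 1
  await page.getByRole('button', { name: 'Send code' }).click();

  const code = await fetchOneTimeCode('qa@example.com');        // steps 2-3
  await page.getByLabel('One-time code').fill(code);
  await page.getByRole('button', { name: 'Log in' }).click();

  await expect(page.getByText('Welcome back')).toBeVisible();   // step 4
});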

Humans breeze through this. Our automation… not so much. Roughly 70% of runs failed with “invalid one-time code.” We replayed the run in our UI: right email, right timestamp, code typed perfectly. Nothing obvious.

It was time to open the Playwright Trace Viewer for more details. There, hidden in the network tab, we saw it: the POST request sent oneTimeCode: null even though the input showed the correct value. After a few experiments we found that if we waited ~3 seconds before filling the email field in step 1, the bug never appeared.
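If you hit something similar, it can pay to make the test watch the payload directly instead of only spotting it in the trace afterwards - for example, inside the test above (the /api/login path is an assumption):

// Capture the login POST and check its body; an assertion like this would have
// flagged the null code on the first red run. The endpoint path is made up.
const loginRequest = page.waitForRequest(
  (req) => req.method() === 'POST' && req.url().includes('/api/login'),
);
await page.getByRole('button', { name: 'Log in' }).click();
expect((await loginRequest).postDataJSON()).toMatchObject({
  oneTimeCode: expect.any(String),
});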

Classic timing issue. The fix belongs in the app, but the responsible dev team couldn’t reproduce it outside automation and, frankly, had other stuff to do. Meanwhile, QA needed a working login test today.

So we asked ourselves:

  • Is three seconds reliable? (Yes.)
  • Does it unblock the customer? (Absolutely.)
  • Does it risk future flakiness? (Maybe, but we’ll monitor.)

Result: we added a “wait for fixed time” option to the “wait for” step that pauses exactly as long as the user specifies. It’s a band-aid, sure, but it keeps their CI green while the real bug sits in the backlog - and that, for now, is the difference between testing and not testing at all.
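
Under the hood, that fixed pause is nothing fancier than Playwright’s waitForTimeout dropped in before step 1 (the three seconds are simply what worked here):

// The band-aid, kept deliberately explicit so it's easy to find - and delete -
// once the real fix ships.
await page.waitForTimeout(3_000);                      // the ~3 s that hides the bug
await page.getByLabel('Email').fill('qa@example.com'); // step 1 continues as before

Playwright’s own docs discourage waitForTimeout for exactly the reasons at the top of this post, which is why it sits behind an explicit, user-chosen step instead of being sprinkled in automatically.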

What we learned

Inside your own repo, you can stay pure: spot the timing issue, fix the code, push to prod, done. But once you’re shipping a testing platform for others, the equation changes. Testers aren’t always sitting next to the developers who own the bug - and even if they are, business priorities win over elegance. 

So we’ve learned to balance principle with pragmatism:

  • Idealism: Root-cause every failure and fix it at the source.
  • Reality: Octomind users sometimes lack code access, dev bandwidth, or both.
  • Middle ground: Offer a surgical wait that unblocks the workflow while the real fix makes its way through the backlog.

It’s not the romantic story we once told ourselves, but it keeps releases moving. And if that wait turns out to be unnecessary tomorrow - great, we delete it. Until then, it’s the tiny compromise that saves the day.

Kosta Welke
code monkey at Octomind