Your AI is a Genie, Not a Guru: How to Supervise Your New Assistant

Published on September 30, 2025

Let’s talk about our new AI assistants (soon to be overlords). After all, AI in software testing is all about help.

Everyone’s excited about the promise: they can boost our productivity, generate tests, and free us to do the real work of testing. They can even generate a test, run it, modify it if it fails, and run it again until it passes.

Here’s a newsflash: Nobody needs us anymore!

…Okay, not quite. We’re not there yet, thankfully.

With our AI assistants, our role changes. We’re no longer the ones hauling all the gear up the mountain; we become the guide. We have to lead the AI, point it toward the right path, and make sure it doesn’t wander off a cliff, taking our codebase with it. We want the productivity boost, but we can’t compromise on quality. Otherwise, we’re just producing more stuff of lower quality, and since we’re in the quality business, that’s a non-starter.

So, what do you need to watch out for?

1. The Firehose of Test Ideas

Your AI can spit out a hundred test case ideas before you’ve started your morning coffee. That’s amazing, but it’s also a trap. With so many ideas, you need to step in and take hold of the firehose. You have to review not just the validity of the cases, but whether they apply to where you are in the project right now. Are you just starting out? Do you already have automated tests for some of these ideas? Your job is to pick the ones that matter most in your context.

2. The Test Design Black Box

So you’ve got the ideas. Now, how do you perform the tests? What data do you need? Does it need to be repeatable (I’m guessing yes)? The genie is great at preparing what’s needed. But again, it’s efficient, not wise. Your role is to validate that what it generated is what you actually need. When it creates test data, you have to make sure you know how to use it. Some of it is perfect for the test, some is just a placeholder, and some is critical while the rest is just noise. You need to know which is which.
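To make that concrete, here’s a minimal sketch in Python with pytest. Everything in it (the order fields, the discount rule, the apply_discount function) is invented for illustration. The point is to separate the fields the test actually depends on from the placeholder filler, and to keep the data as fixed constants so the test stays repeatable.

```python
# Hypothetical generated test data: most fields are placeholders,
# only a couple actually drive the behavior under test.
GENERATED_ORDER = {
    "id": "ord-0001",         # placeholder: any unique string will do
    "customer": "Test User",  # placeholder: ignored by the discount logic
    "country": "DE",          # critical: discount rules differ per country
    "total": 120.00,          # critical: discount kicks in above 100.00
}

def apply_discount(order):
    # Stand-in for the real code under test, assumed for this sketch.
    if order["country"] == "DE" and order["total"] > 100.00:
        return round(order["total"] * 0.9, 2)
    return order["total"]

def test_discount_applies_above_threshold():
    # Re-state the critical fields in the test itself, so a reader sees
    # at a glance which parts of the generated data actually matter.
    order = {**GENERATED_ORDER, "country": "DE", "total": 120.00}
    assert apply_discount(order) == 108.00
```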

3. The Danger of Generated Code

This is the big one. The genie generating code is both wonderful and terrifying. Think about it. If you don’t understand what that code does, you might as well be merging dark matter into your codebase. You could be breaking existing code or adding new, unknown dependencies. When the whole team does it, the amount of un-reviewed code becomes a tsunami, and it won’t get the scrutiny it needs.

You have two defenses. First, have tests that you trust – whether you write them or generate them, you have to trust them. They are the only proof that the code still works. Run them constantly. Second, go slow. Make small changes. Review and approve tiny portions of code so you can control the direction and ensure it’s going where you want it to.
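As a sketch of the first defense (slugify and its behavior are hypothetical stand-ins for whatever the genie generated for you): pin down the behavior you verified by hand in a handful of trusted tests, and run them on every generation.

```python
def slugify(title: str) -> str:
    # Assume this body was AI-generated. The tests below are the part
    # a human chose and verified; they are what we actually trust.
    return "-".join(title.lower().split())

def test_lowercases_and_joins_words():
    assert slugify("Hello World") == "hello-world"

def test_collapses_repeated_whitespace():
    assert slugify("a  b") == "a-b"

def test_empty_title_stays_empty():
    assert slugify("") == ""
```

If a later regeneration breaks any of these, the change doesn’t go in. This is also what makes the second defense workable: each tiny diff gets re-checked against tests you already believe.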

4. The Refactoring Mirage

Refactoring means changing the structure without changing the functionality. So when your genie does some spring cleaning (at your command), you must understand the impact of the changes. Automated tests will tell you if the functionality is the same, sure. But what about everything else? The non-functional stuff? Did it slow things down? Are the timeouts different? Is it making extra network calls you don’t know about? We usually don’t have automated tests for these things, which means you might find out too late. Use the same principle: move in small, deliberate steps. Make sure you understand the changes and can verify there are no nasty side effects.
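One cheap way to catch that kind of side effect (a minimal sketch; get_user_profile and the counting stub are invented for illustration) is to put a test on the non-functional property itself, here the number of backend calls:

```python
class CountingTransport:
    # A test double that records how many network calls were made.
    def __init__(self, response):
        self.calls = 0
        self.response = response

    def get(self, url):
        self.calls += 1
        return self.response

def get_user_profile(transport, user_id):
    # Imagine the genie just refactored this function's internals.
    return transport.get(f"/users/{user_id}")

def test_profile_lookup_makes_exactly_one_call():
    transport = CountingTransport(response={"id": 42, "name": "Ada"})
    get_user_profile(transport, 42)
    # A sneaky retry loop or an extra fetch added during refactoring
    # would fail here, even though the returned data looks identical.
    assert transport.calls == 1
```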

Your AI assistant is a powerful genie, but it needs a master. It needs a chaperone. It’s tempting to let it loose, but that’s how you lose control and sacrifice quality. And we want none of that.

So don’t let the genie out of the bottle. Not without supervision.


Feeling like you’re wrangling a genie instead of working with an assistant? Start with my AI quality cheat sheet. Check it out.
