What Can’t GPT-5 Do?

Published on September 21, 2025
Off the script — What can’t GPT-5 do?

At the GPT-5 launch event, every demo followed the same format: someone casually asked the AI to build or tweak an app. While the AI worked, they filled the silence with small talk. A few minutes later, the AI produced the update — and then the demo shifted to the next step: people testing what the AI had built.

In fact, the entire production was a sequence of moments like this: ask the AI to make a change, wait, and then watch humans test the result. The pattern repeated again and again.

At one point, Greg Brockman, President of OpenAI, turned to Michael Truell, CEO of Cursor, and asked him directly: “What can’t GPT-5 do?”
Michael’s answer: he was “really excited about Computer Use capabilities, about those getting better,” pointing toward a future where the model doesn’t just write code but actually runs it, sees the output, and QAs itself. He expanded on this, saying he’d like the model to “run the code, see the output, actually, you know, kind of QA every little bit itself and then react to it.”

That short exchange captured the core of the coming wave: AI may be on its way to becoming not just the developer — but the tester too.

Between Product Managers and Testers

Traditionally, software development has been a three-part relay race:

  • Product Managers (PMs) write specifications.
  • Developers implement features.
  • Testers find and report bugs.

Developers lived in the middle. They translated PM vision into code, then handed the software off for QA.

But now, with AI coding agents in the mix, that division of labor is eroding. Developers no longer spend most of their time painstakingly writing code. Instead, they prompt the AI:

  • “Add this new feature.”
  • “Fix this bug.”

The AI builds the feature or fixes the bug. And the developer? They wait.

What Developers Actually Do Now

Here’s the workflow of a modern AI-powered developer:

  1. Describe the feature or bug fix to the AI coding agent.
  2. Wait minutes — or hours — for the AI to produce new code.
  3. Review and test what the AI generated.

That’s the squeeze. Developers are spending less time coding and more time doing the jobs that used to sit on either side of them:

  • Product management work: breaking down feature requests, refining requirements, deciding scope.
  • Testing work: poking at the app, reproducing bugs, validating behavior, reading and reviewing AI-written code.

In many small projects, developers effectively absorb most of the PM function themselves because iteration is now so fast. And the bulk of their day? Testing. Developers are becoming the QA department for their AI pair programmer.

Faster AI, Higher Expectations

As AI gets better, product managers expect features to land faster. The cycle time between “idea” and “running code” is shrinking. That means more versions, more releases, and more to test.

The paradox: AI accelerates feature delivery, but it also explodes the surface area of testing. Developers can’t just ship blindly. They’re still accountable for quality — and now, they’re drowning in verification.

This is why developers feel squeezed. They’re no longer shielded by distinct hand-offs. The AI builds, and they test. The AI builds again, and they test again.

The Coming Reality: Developers as Testers

If you extrapolate this trend, the role of “developer” continues to shift. Coding itself becomes a smaller and smaller fraction of the job. Developers will still be in the loop, but the majority of their time will go into:

  • PM-like work: defining what to build, clarifying requirements.
  • Testing work: verifying AI-generated code, validating user flows, regression checking.

Think of it this way: developers may end up doing 2× as much PM work — but 10× as much testing work.

And here’s the uncomfortable truth: developers didn’t sign up to be testers. The craft and identity of “building” are being eroded by delegation to machines.
“Developers don’t want to be testers!”

Who Will Test the AI?

This is why the GPT-5 event matters. When even the creators of these tools single out Computer Use and self-QA as the capabilities that most need to improve, we’re getting a glimpse of the next wave.

At Testers.AI, we believe testing AI-generated code is the new bottleneck. That’s why we’re building AI testing agents — tools designed to do most of this testing work for developers. Our agents talk directly to the AI coding systems, file the bugs that need fixing, and rerun checks automatically — so developers can either enjoy another coffee or focus on the parts of coding and bug fixing that AI still can’t handle.

These testing agents are constantly watching code changes from both the developer and the AI. They can live directly inside the developer’s IDE, always on standby to run end-to-end tests, or plug seamlessly into CI/CD pipelines to validate every new build. In short: they free developers to be developers again.
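To make the CI/CD idea concrete, here is a minimal sketch of what such a hook could look like. This is not Testers.AI’s actual API or integration; every name in it (run_e2e_suite, file_bug, BUILD_DIR, the bug-queue file) is a hypothetical stand-in for the pattern described above: test every new build, file whatever breaks so the coding agent can pick it up, and block the pipeline until the checks pass.

```python
"""
Hypothetical sketch of an AI testing agent hooked into a CI/CD step.
All names and paths here are illustrative assumptions, not a real SDK.
"""

import json
import subprocess
from pathlib import Path

BUILD_DIR = Path("./build")       # hypothetical: where CI drops each new build
BUG_QUEUE = Path("./bugs.jsonl")  # hypothetical: bugs handed back to the coding agent


def run_e2e_suite(build: Path) -> list[dict]:
    """Run the project's end-to-end tests against the build and collect failures."""
    result = subprocess.run(
        ["pytest", "tests/e2e"],  # stand-in for whatever test command the project uses
        cwd=build,
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return []
    # A real agent would parse and triage failures; here we just keep the tail of the log.
    return [{"build": str(build), "log": result.stdout[-2000:]}]


def file_bug(failure: dict) -> None:
    """Append the failure to a queue the AI coding agent can read and act on."""
    with BUG_QUEUE.open("a") as f:
        f.write(json.dumps(failure) + "\n")


def validate_build(build: Path) -> bool:
    """One CI step: test the build, file anything broken, report pass/fail."""
    failures = run_e2e_suite(build)
    for failure in failures:
        file_bug(failure)
    return not failures


if __name__ == "__main__":
    ok = validate_build(BUILD_DIR)
    raise SystemExit(0 if ok else 1)  # a non-zero exit blocks the pipeline
```

Run as a post-build step, a script like this fails the pipeline whenever the end-to-end checks fail, and the filed bugs give the coding agent something concrete to fix before the checks are rerun.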

And for teams that don’t want to set all this up themselves, there’s IcebergQA — where expert testers configure and run these AI testing agents on your behalf, providing managed AI-driven QA at scale.

Because the real future isn’t AI replacing developers.
It’s AI developers and AI testers working together — while human developers reclaim their role as builders.

Conclusion

The developer role is being squeezed:

  • Less coding.
  • More PM-style requirement work.
  • A whole lot more testing.

The AI developer writes the code. The human developer tests the AI. Until AI testers arrive in force, that squeeze will only get tighter.

The irony? The AI-assisted developer isn’t really a developer at all. It’s the human, sitting in the loop, who is being re-cast as the tester.

And unless we address this shift, every developer may wake up one day to find they’ve become exactly what they never wanted to be: QA.

— Jason Arbon, testers.ai