[D] Are there REAL success stories of autonomous AI dev agents working reliably in production?
I’m having a serious debate with a colleague, and I want to settle this with actual evidence instead of opinions.
The claim:
That it’s possible today to run orchestrated AI developer agents (multiple agents, coordinated workflows) that can autonomously build and maintain software, under the supervision of a senior AI engineer or developer, without running into unfixable errors or constant breakdowns.
I’m skeptical. He believes it’s already happening.
So I’m looking for real-world examples, not theory:
- Have you actually used autonomous dev agents in production?
- What was the setup? (tools, stack, orchestration method)
- What level of autonomy are we talking about?
- What still breaks?
- Did it scale beyond small experiments or toy projects?
Especially interested in:
- Multi-agent setups (not just Copilot-style assistance)
- Systems that run for extended periods (not one-off demos)
- Cases where human input is minimal but still controlled
If you’ve seen this work (or fail), I’d really appreciate detailed insights.
Trying to separate hype from reality here.