The Future of Software Engineering is Agent Orchestration

Software development used to be bounded by your typing speed and mental working memory. Now it’s bounded by how many agents you can coordinate without losing your mind.

That’s not hyperbole—it’s where we’ve arrived. While working on Apache Superset, I found myself juggling 5-10 parallel Claude sessions across as many tmux tabs and Docker environments. The agents were productive individually, but coordinating them was chaos. That pain led me to build Agor—but more importantly, it showed me developers everywhere were hitting the same wall.

The agentic coding tools got too good to use just one at a time, but we had no structure to orchestrate these agents.

The future of software engineering isn’t about writing more code or writing it faster. It’s about orchestrating agents that do the writing while you focus on the architecture, the vision, the coordination.

Let me trace how we got here—seven distinct phases that brought us from prompt engineering to full agent orchestration.

Phase 1: Prompt Engineering (2023–2024)

The early days. ChatGPT is magic, but the workflow is brutal:

Carefully craft prompts in a text window
Copy/paste code from your editor into the chat
Get a response back
Copy/paste the response into your editor
Repeat

The bottleneck wasn’t the AI—it was the context handoff. You spent more time explaining your codebase than coding. Every new conversation meant starting from scratch: “I have a React app using TypeScript and…”

The smart ones learned to build context libraries in text files they’d paste in. Prompt engineering was real work. You got good at being deliberate about what context you included, because you were paying for it in attention and copy/paste tax.

Phase 2: Early Agentic Coding (Early 2025)

Tools like Claude Code and Cursor changed the game by giving AI agents actual tools:

File read/write access
Bash execution
Web search
Codebase search

The magic moment: instead of you fetching context for the AI, the AI fetches its own context. You say “fix the auth bug” and it greps for auth code, reads the relevant files, understands the structure, and proposes a fix.

This felt like a step function. The agent could now close its own information gaps. The workflow compressed from hours to minutes.

But you were still in a single session. One chat window, one context window, one train of thought at a time. Around your first session compaction, you realized you could be running a second task in parallel, so you cloned the repo again to avoid conflicts, and soon after you probably discovered git worktree.

Phase 3: Context Engineering (Mid 2025)

As codebases grew and teams adopted these tools, a new problem emerged: agent context drift. The AI would hallucinate patterns that didn’t exist or miss critical constraints buried in old conversations.

The solution: treat context like code. Stop dumping everything into CLAUDE.md and start structuring it:

context/ folder with bite-sized markdown files
Cross-linked, versioned, reviewed in PRs
Agents fetch what they need per task

This is context engineering—the realization that good AI-assisted development requires good information architecture. Your codebase’s context graph became as important as its dependency graph.
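To make "agents fetch what they need per task" concrete, here is a minimal sketch that picks context files by keyword overlap with the task. Everything in it is an assumption for illustration: the `select_context` helper, the stopword list, and the scoring rule are made up, and a real setup might use an index file, embeddings, or simply let the agent grep.

```python
from pathlib import Path
import re

# Tiny stopword list so words like "the" don't count as matches (illustrative only).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}


def tokenize(text: str) -> set[str]:
    """Lowercased word set minus stopwords; deliberately naive."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def select_context(task: str, context_dir: Path, limit: int = 3) -> list[str]:
    """Return the names of the context files that best match the task.

    Score = number of task words found in the file name or body.
    Files that share no words with the task are skipped.
    """
    task_words = tokenize(task)
    scored = []
    for path in sorted(context_dir.glob("*.md")):
        words = tokenize(path.stem) | tokenize(path.read_text(encoding="utf-8"))
        score = len(task_words & words)
        if score > 0:
            scored.append((score, path.name))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:limit]]
```

The point is less the scoring than the shape: small files, a cheap selection step, and only the winners go into the context window.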

Anthropic formalized this in their September 2025 engineering blog: “Context engineering is the art and science of curating what will go into the limited context window from the constantly evolving universe of possible information.”

Many also realized that agents do much better when given only the context they need. Context is precious. People discovered the joys of putting together a solid PRD, of /clear and subagents, and started obsessing over context. Your AGENTS.md / CLAUDE.md grew out of proportion, you realized you should factor context out into a collection of smaller files elsewhere, and your AGENTS.md became more of a glossary pointing to those context files.

Phase 4: Parallelization Chaos (Fall 2025)

Then Claude Code got really good. Claude 4 launched in May, Sonnet 4.5 dropped in September—good enough that waiting for one agent to finish felt like watching paint dry.

People realized: I can just open another terminal.

As Simon Willison wrote in October 2025: “I’ve become one of those people who run multiple coding agents at once—firing up several Claude Code or Codex CLI instances at the same time, sometimes in the same repo, sometimes against multiple checkouts or git worktrees.”

Suddenly developers were running 2, 5, 10 parallel Claude Code sessions:

Multiple terminal windows
Tmux pane hell
Multiple IDE instances
Different branches in different directories
“Wait, which Claude is on which branch?”

The limiting factor became cognitive overhead. You were now a manager of agents, but with none of the tooling managers have. No dashboard, no status board, just your spatial memory of which terminal window had which session.

The workflow worked, but it didn’t scale. You’d lose track of which agent did what. Merge conflicts became more common. Prompts went into the wrong terminal. Dev environments collided on ports.
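One low-tech answer to "which Claude is on which branch?" is to ask git. `git worktree list --porcelain` prints one block of lines per worktree, and the sketch below turns that output into a path-to-branch map. The `parse_worktrees` helper and the "(no branch)" label are my own illustration, not part of any tool.

```python
def parse_worktrees(porcelain: str) -> dict[str, str]:
    """Map worktree path -> branch name from `git worktree list --porcelain`.

    Each worktree is a block of lines separated by a blank line, e.g.:
        worktree /repo
        HEAD <sha>
        branch refs/heads/main
    Worktrees without a `branch` line (detached or bare) get "(no branch)".
    """
    result: dict[str, str] = {}
    current = None
    for line in porcelain.splitlines():
        if line.startswith("worktree "):
            current = line[len("worktree "):]
            result[current] = "(no branch)"
        elif line.startswith("branch ") and current is not None:
            ref = line[len("branch "):]
            result[current] = ref.removeprefix("refs/heads/")
    return result
```

To feed it real data, pass in the stdout of `subprocess.run(["git", "worktree", "list", "--porcelain"], capture_output=True, text=True, check=True)`. It won't tell you which terminal has which agent, but it does settle the branch question.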

But something else was happening too: the quality kept improving. As Armin Ronacher observed in June: “Already today the code looks nothing like the terrible slop from a few months ago.” The agents were getting good enough to justify the coordination overhead.

Phase 5: Multi-Agent Tooling (Now)

This is where tools like Agor come in. The realization: if you’re managing multiple agents, you need multiplayer infrastructure.

I built Agor after hitting this wall on Apache Superset—first trying Claudette (project-specific tooling), then realizing this needed to be generalized. The key insights:

Git worktree management: One worktree per feature, isolated environments
Spatial boards: Inspired by Figma/Miro—leverage human spatial memory instead of terminal archaeology
Environment awareness: Connect projects to Docker, GitHub issues, dev servers with auto-assigned ports
Session trees: Fork to parallelize, spawn to delegate, maintain genealogy
Multiplayer: Share environments and agent context, collaborate in real-time with presence and cursors

Instead of tmux-hopping, you have a canvas. Drag a session into “Needs Tests” and trigger a test-writing agent. Fork a conversation to parallelize debugging. See the full genealogy of how a feature evolved.
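For a feel of what a session tree looks like as a data structure, here is a rough sketch. It is not Agor's actual data model: the `Session` class is hypothetical, and it assumes that a fork starts from a copy of the parent's conversation while a spawn starts fresh with only its delegated task.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical session node, for illustration only."""
    name: str
    kind: str = "root"  # "root", "fork", or "spawn"
    # Excluded from repr/compare to avoid parent <-> child recursion.
    parent: Session | None = field(default=None, repr=False, compare=False)
    children: list[Session] = field(default_factory=list)
    history: list[str] = field(default_factory=list)

    def fork(self, name: str) -> Session:
        """Fork: a parallel line of work that starts with a copy of our history."""
        child = Session(name, "fork", parent=self, history=list(self.history))
        self.children.append(child)
        return child

    def spawn(self, name: str, task: str) -> Session:
        """Spawn: delegate a sub-task to a child that starts with a fresh history."""
        child = Session(name, "spawn", parent=self, history=[task])
        self.children.append(child)
        return child

    def lineage(self) -> list[str]:
        """Session names from the root down to this session."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names[::-1]
```

Calling `lineage()` on any session answers "how did we get here?", which is the genealogy question a pile of terminals can't answer.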

Multi-agent work becomes structured, not chaotic.

Phase 6: Agent Orchestration (Emerging)

But visibility isn’t enough. What if agents could coordinate with each other?

This is starting to happen through Agor’s internal MCPs (Model Context Protocol servers). Agents in Agor can now:

Spawn new git worktrees and sessions programmatically
Check on each other’s status
Share context and artifacts
Cross-review each other’s work
Delegate sub-tasks automatically

Imagine an agent that:

Implements a feature
Spawns a test-writing agent
Spawns a review agent to check for edge cases
Spawns a docs agent to update documentation
Reports back with a complete PR

You prompted once. Four agents collaborated. You reviewed the output.
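The control flow of that pattern is easy to sketch. The code below is not Agor's API or its MCP tools: `run_agent` is a stand-in that just echoes its task, and running the three helper agents in parallel is my assumption about how you'd want to schedule them.

```python
import asyncio


async def run_agent(role: str, task: str) -> str:
    """Stand-in for a real agent session; it only echoes what it was asked."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"{role}: {task}"


async def implement_feature(feature: str) -> dict[str, str]:
    """Lead agent: implement first, then fan out to helper agents in parallel."""
    implementation = await run_agent("implementer", feature)
    helpers = {
        "tests": run_agent("tester", f"write tests for {feature}"),
        "review": run_agent("reviewer", f"check edge cases in {feature}"),
        "docs": run_agent("docs", f"document {feature}"),
    }
    # gather() returns results in the same order the awaitables were passed.
    results = await asyncio.gather(*helpers.values())
    report = {"implementation": implementation}
    report.update(zip(helpers.keys(), results))
    return report


if __name__ == "__main__":
    print(asyncio.run(implement_feature("dark mode")))
```

Swap `run_agent` for a call that actually starts a session and waits for it, and the structure stays the same: one entry point, a fan-out, one report back.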

This isn’t science fiction: Agor’s MCP integration enables exactly this pattern. Early on while building Agor, I realized that exposing an internal MCP service would let agents interact with the environment they’re operating in, and with each other. That insight accelerated development dramatically. I was able to start using Agor to build Agor, and that’s how it got so good so quickly.

The infrastructure is here: MCP (Model Context Protocol) launched in November 2024, and by 2025, the industry standardized around it. OpenAI adopted MCP in March 2025, Google DeepMind confirmed support in April. Thousands of MCP servers exist. The plumbing for agent-to-agent communication is production-ready.

Phase 7: Meta-Orchestration (The Future!?)

The next layer: autonomous orchestration with different agent roles—not just agents doing work, but supervisor agents managing other agents.

Think hypervisors for your development workflow.

Supervisors on a Schedule

Agor has a built-in scheduler. You can run supervisor agents on a schedule that:

Inspect the state of worktrees across zones
Make assertions about project health
Take automated actions to move work forward
Automate the job of the board operator

Imagine a supervisor prompt like this:

For all IDLE worktrees in the "In Development" zone:
1. Check that an environment is running
2. Run `pnpm check` and verify it passes
3. If it fails, prompt the agent to fix issues
4. If it passes, move the worktree to "Open a PR" zone
5. Use playwright MCP to verify the work
6. Trigger the PR creation workflow

Run that every hour. Your projects push themselves through your workflow.
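In Agor the supervisor is itself an agent following a prompt, but the decision logic reads about the same as plain code. Here is a sketch of one supervisor pass covering the first four steps (the Playwright and PR steps are left out). The `Worktree` class, the `supervise` function, and the two callbacks are hypothetical names for illustration, and prompting the agent to start a missing environment is my own assumption about what "check" should lead to.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Worktree:
    name: str
    zone: str
    status: str        # "IDLE" or "RUNNING"
    env_running: bool


def supervise(
    worktrees: list[Worktree],
    run_check: Callable[[Worktree], bool],
    prompt_agent: Callable[[Worktree, str], None],
) -> None:
    """One supervisor pass over idle worktrees in the "In Development" zone."""
    for wt in worktrees:
        if wt.zone != "In Development" or wt.status != "IDLE":
            continue  # not ours to touch on this pass
        if not wt.env_running:
            prompt_agent(wt, "Start the environment")
            continue
        if run_check(wt):  # e.g. run `pnpm check` inside the worktree
            wt.zone = "Open a PR"
        else:
            prompt_agent(wt, "`pnpm check` failed; fix the issues")
```

Passing `run_check` and `prompt_agent` in as callbacks keeps the policy separate from the plumbing, so the same pass can be dry-run against fake worktrees before it touches real ones.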

With the right setup, everything eventually lands in either “Ready for Human Review” or “Needs Human Support”—automatically. The supervisor automates the operator’s workflow as your process becomes more defined.

Where to invest in this type of automation? Easy: wherever you spend significant time and effort on work an agent could do.

Trigger Chains

The next evolution: on-completion triggers for zones.

A worktree completes in the “Development” zone
Automatically triggers a Codex review agent with a custom prompt
On review completion, move to the next zone and trigger the next prompt in the chain

Agent workflows become declarative. Define the chain once; let it run.
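"Declarative" here could be as simple as a lookup table. The sketch below is an assumption about what such a chain might look like, not Agor's configuration format: the zone names after "Development", the prompts, and the `on_complete` handler are all placeholders.

```python
from typing import Callable

# Declarative chain: zone -> (prompt to run on completion, zone to move to).
CHAIN: dict[str, tuple[str, str]] = {
    "Development": ("Review this change for bugs and style.", "Review"),
    "Review": ("Write or update tests for this change.", "Testing"),
    "Testing": ("Summarize the change and open a PR.", "Ready for Human Review"),
}


def on_complete(zone: str, worktree: str, run_prompt: Callable[[str, str], None]) -> str:
    """Handle a completion event and return the zone the worktree ends up in."""
    step = CHAIN.get(zone)
    if step is None:
        return zone  # end of the chain: nothing left to trigger
    prompt, next_zone = step
    run_prompt(worktree, prompt)
    return next_zone
```

Changing the workflow then means editing `CHAIN`, not the code that runs it.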

The Runaway Problem

But here’s the concern: agent chain reactions. Agents triggering agents triggering agents could create runaway scenarios. You need containment:

Token budgets: “This board locks at $100 in API costs”
Concurrency limits: “No more than 12 sessions active at once”
Resource caps: “Only 10 dev environments running in parallel”
Circuit breakers: “Pause all automation if error rate exceeds 30%”

Policy systems for agent orchestration—like Kubernetes resource limits, but for AI workflows.
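Here is a rough sketch of what such a policy check might look like, using the numbers from the list above. The `Guardrails` class is hypothetical, it skips the dev-environment cap for brevity, and the `min_runs_for_breaker` threshold is my own addition so that one early failure doesn't trip the breaker.

```python
from dataclasses import dataclass


@dataclass
class Guardrails:
    # Limits (values taken from the examples above)
    max_cost_usd: float = 100.0
    max_sessions: int = 12
    max_error_rate: float = 0.30
    min_runs_for_breaker: int = 10  # assumption: need some data before tripping

    # Running totals
    spent_usd: float = 0.0
    active_sessions: int = 0
    runs: int = 0
    errors: int = 0

    def may_start(self) -> tuple[bool, str]:
        """Decide whether automation may start one more session, and why not."""
        if self.spent_usd >= self.max_cost_usd:
            return False, "budget exhausted"
        if self.active_sessions >= self.max_sessions:
            return False, "concurrency limit reached"
        if (
            self.runs >= self.min_runs_for_breaker
            and self.errors / self.runs > self.max_error_rate
        ):
            return False, "circuit breaker open"
        return True, "ok"

    def start(self) -> None:
        """Record that a session has started."""
        self.active_sessions += 1

    def finish(self, cost_usd: float, failed: bool) -> None:
        """Record that a session has finished, with its cost and outcome."""
        self.active_sessions -= 1
        self.spent_usd += cost_usd
        self.runs += 1
        self.errors += int(failed)
```

Every trigger, scheduler tick, and agent-initiated spawn would call `may_start()` first. The check is cheap; the runaway it prevents is not.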

What This Enables

Picture a multi-agent event loop with supervisors:

Human: "Implement dark mode"
↓
Planner Agent: creates task breakdown, spawns implementers
↓
Feature Agents: work in parallel on different components
↓
Supervisor (hourly): checks all worktrees, runs tests
↓
Review Agent: triggered on test pass, checks for consistency
↓
Test Agent: spawned if coverage drops
↓
Docs Agent: triggered on review approval
↓
Integration Supervisor: runs full build checks
↓
PR Agent: generates summary, tags reviewers
↓
Human: reviews and merges (or rejects and triggers fixes)

You’d orchestrate the orchestra, not play every instrument. The supervisor keeps the rhythm.

As Addy Osmani described in July 2025: “Background agents turn coding into delegated background work: submit a task, let it run in the cloud, review a completed PR later—coding as queued, asynchronous workflow.”

This is software development reimagined as an async, delegated process—closer to how you’d manage a distributed team than how you’d code alone. Except your team runs 24/7, costs pennies, and never complains about meetings.

All the primitives for this exist in Agor today. The question is how much autonomy we’re ready to hand over.

The Multiplayer Question

Here’s where it gets interesting: software development has always been a team sport, but with strong boundaries around environments. Your machine, your branches, your terminal.

Agor (and tools like it) are breaking down those boundaries:

Shared dev environments
Shared agent sessions
Real-time collaboration on agent conversations
Forking sessions mid-conversation
Reviewing, commenting, and QA-ing live

Is this the future? Unclear.

Agent orchestration might be a solo sport—you, conducting your own symphony of agents. Or it might become genuinely multiplayer: teams sharing not just code but the entire AI-assisted development process.

Imagine:

“Hey, jump into this Claude Code session with me”
“I forked your session to try a different approach”
“The review agent found 3 issues—spawned fix agents”
Pair programming, but with 5 people and 10 agents

Call it group vibe coding. Call it multi-conductor orchestration. Whatever it is, Agor enables something completely new: synchronous, multiplayer, agent-assisted development with full visibility and structure.

Will people use it? I don’t know. But the option exists now in a way it didn’t before.

The Vibe Coding Divergence

There’s a parallel track here worth acknowledging: vibe coding—apps spawning into existence from prompts, landing pages built in minutes, MVPs shipped before lunch.

This branch of software development is real and getting more viable every day. The bar for what you can accomplish with “just vibes” keeps rising. But here’s the thing: as vibe coding gets augmented and accelerated, true engineered software will keep dominating the hard problems.

The commoditization bar keeps rising, forcing engineered software to innovate to stay relevant. What was “real engineering” five years ago is vibe-coded today. What requires engineering today will be commoditized tomorrow.

The two tracks will coexist, but they’re solving different problems at different scales.

What This Means

We went from “AI is a fancy autocomplete” to “AI is restructuring how we work” in under three years. Each phase unlocked new workflows but revealed new bottlenecks:

Prompt engineering taught us to be deliberate about context
Agentic coding taught us to give agents tools
Context engineering taught us to structure information for retrieval
Parallelization taught us that more agents = more chaos without tooling
Multi-agent tooling is teaching us to treat agent coordination as a first-class problem
Orchestration layers are teaching us… we’re still figuring this out

The next phase won’t be about better models (though those help). It’ll be about better coordination infrastructure. About treating agent orchestration like we treat distributed systems: with primitives for spawning, messaging, state management, and failure handling.

Software development is evolving from “one developer, one machine” to “one developer, many agents, shared environments.” The tools that win will be the ones that make that transition feel natural instead of overwhelming.

But where does this end?

Are we headed toward a software utopia? What does that even look like? The absence of software? The digital world catering to all your needs in real time, materializing solutions at the speed of thought? More audio and visual interfaces everywhere, software that understands context without being told?

I don’t have answers. But I know it’s coming fast.

Whatever you think the software utopia looks like—go orchestrate agents to build it. Quick, before it’s too late. The velocity we’re seeing now is just the beginning. The tools exist. The models are good enough. The only question is whether we can coordinate the work faster than the future arrives.

If you’re juggling agents across terminals and losing track, try Agor. If you’re building orchestration tools or thinking about this space, I’d love to hear what you’re seeing. The shape of this is still forming, and the best insights are coming from people in the thick of it.

References

Anthropic: Effective context engineering for AI agents (September 2025)
Simon Willison: Embracing the parallel coding agent lifestyle (October 2025)
Armin Ronacher: Agentic Coding Recommendations (June 2025)
Addy Osmani: Coding for the Future Agentic World (July 2025)

— The velocity is real. Now go orchestrate your vision.
