19  What Comes Next

Everything in this book will be partially obsolete within eighteen months. The models will be better, the tools will be different, and capabilities we marked “directional” will be shipping. That is not a flaw in the book. It is the central argument. The methodology survives tool change. The primitives survive model change. The discipline survives everything.

This chapter applies the three-tier honesty framework (available now, emerging, directional) to the trajectory of the field itself. Where the evidence is strong, the predictions are specific. Where it is not, they are marked accordingly. And where the author is guessing, that is said plainly.


19.1 Near-Term: What Changes in the Next Twelve Months

Agent tool use becomes standard, not experimental. The shift from text generation to agents that execute — file operations, terminal commands, API calls, test runs — is underway but uneven.¹ Within a year, tool-using agents will be the default interaction mode. This makes Safety Boundaries more critical, not less. A model that generates bad code wastes review time. A model that executes bad commands corrupts state. Guardrails that felt conservative in a text-generation world become essential in a tool-execution world.
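The difference between review-time and execution-time safety can be made concrete. Below is a minimal sketch of an execution guardrail — an allowlist checked before an agent's proposed command ever runs. The command sets and the `is_permitted` helper are illustrative assumptions, not any particular tool's API:

```python
import shlex

# Hypothetical allowlist guard -- a sketch of the Safety Boundaries idea:
# a proposed command is checked BEFORE execution, not reviewed after.
ALLOWED_COMMANDS = {"ls", "cat", "git", "pytest", "python"}
BLOCKED_SUBCOMMANDS = {("git", "push"), ("git", "reset")}

def is_permitted(command_line: str) -> bool:
    """Return True only if the proposed command passes the allowlist."""
    try:
        parts = shlex.split(command_line)
    except ValueError:
        return False  # malformed quoting: refuse rather than guess
    if not parts or parts[0] not in ALLOWED_COMMANDS:
        return False  # executable not on the allowlist
    if len(parts) > 1 and (parts[0], parts[1]) in BLOCKED_SUBCOMMANDS:
        return False  # allowed tool, but a state-mutating subcommand
    return True
```

The design choice matters: a denylist asks "what do I know is dangerous?"; an allowlist asks "what have I decided is safe?" — only the second fails closed when the agent proposes something unanticipated.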

Multi-agent orchestration moves from research to practice. Teams today primarily use single-agent interactions. Multi-agent patterns — planning agents dispatching specialists, review agents evaluating output, agents collaborating through shared artifacts — exist in research and early tooling.² Within a year, they will ship in mainstream platforms. The orchestration disciplines in Chapters 10–12 — task decomposition, wave-based execution, escalation protocols — become operational necessities rather than advanced practices.
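The wave-based execution discipline can be sketched in a few lines: a plan is a sequence of waves, each wave a set of independent tasks, and a wave starts only after every task in the previous wave is complete. The class names here are illustrative, not any framework's API:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of wave-based execution. Tasks within a wave can run
# in parallel; waves run strictly in order, giving the orchestrator a
# natural checkpoint between them.

@dataclass
class Task:
    name: str
    done: bool = False

@dataclass
class Plan:
    waves: list = field(default_factory=list)  # list of lists of Task

    def next_wave(self):
        """Return the first wave with unfinished work, or None if complete."""
        for wave in self.waves:
            if not all(t.done for t in wave):
                return wave
        return None

plan = Plan(waves=[
    [Task("write schema"), Task("write API contract")],  # wave 1: independent
    [Task("implement endpoints")],                       # wave 2: depends on 1
])
```

The checkpoint between waves is where the escalation protocol lives: a human (or a review agent) inspects wave 1's artifacts before wave 2 is dispatched.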

The agentic computing stack crystallizes through independent convergence. By mid-2025, at least three independent efforts arrived at the same layered architecture: manifest-based primitive distribution, framework-layer composition, and CI/CD-native execution. Anthropic’s plugin.json³, GitHub’s Agentic Workflows, and open-source frameworks like Squad⁴ and Spec-Kit⁵ didn’t coordinate — they converged because the layers reflect real boundaries in the problem. Open-source tools already provide manifest-based dependency resolution and security scanning at the primitive layer, the same architecture as npm or pip applied to agent configuration rather than runtime code. This is the pattern that produced HTTP → REST → Rails/Express → npm/pip → applications: each layer emerged when practitioners needed it, not when a standards body decreed it. Spec-Kit and Squad are to agentic development what Spring and React are to traditional computing — they make orchestration easier in one direction, constrain freedom in another, and consume primitives via package managers above the harness layer. The strategic signal: when independent implementations from different vendors converge on the same architecture, the architecture is real. Organizations investing in the primitive layer (Chapter 9) are building on the layer most likely to remain stable as the framework layer evolves.
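What "npm for agent configuration" means in practice is a manifest a resolver can validate and walk. The sketch below is an assumption for illustration — the field names are not the actual plugin.json schema — but the shape mirrors what package.json does for runtime code: declared identity, version, and versioned dependencies:

```python
# Illustrative manifest validation for the primitive layer. The schema here
# is hypothetical -- the point is the shape, not the field names.
REQUIRED_FIELDS = {"name", "version"}

def validate_manifest(manifest: dict) -> list:
    """Return a list of problems; an empty list means the manifest is usable."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS - manifest.keys()]
    for dep, constraint in manifest.get("dependencies", {}).items():
        if not constraint:
            problems.append(f"dependency without version constraint: {dep}")
    return problems

example = {
    "name": "org-standards",       # a bundle of instruction-file primitives
    "version": "1.2.0",
    "dependencies": {"security-rules": "^2.0"},
}
```

Once primitives carry manifests like this, dependency resolution and security scanning become mechanical — the same tooling leverage npm and pip brought to runtime code.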


19.2 Medium-Term: What Shifts Over One to Three Years

Agent governance becomes a first-class engineering discipline. Today, governance of agent output is handled through existing processes: pull requests, CI, manual approval. This works at current volumes. As output scales and multi-agent orchestration becomes common, dedicated governance infrastructure will emerge: audit trails for agent decisions, policy engines that enforce constraints at execution time rather than review time, cost controls that manage token spend across teams. The governance frameworks in Chapter 5 anticipate this, but the tooling barely exists. Within three years, agent governance platforms will be a category — the way CI/CD became a category over the past decade.
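The shift from review-time to execution-time enforcement is the key architectural move. A minimal sketch of a policy engine, with hypothetical rule names and action format, makes the difference tangible:

```python
# Hedged sketch of execution-time policy enforcement: instead of a reviewer
# catching a violation in a pull request, the policy runs when the agent
# acts. Rule names and the action dict's fields are illustrative.
POLICIES = [
    ("no prod writes", lambda a: not (a["env"] == "prod" and a["kind"] == "write")),
    ("token budget",   lambda a: a.get("tokens", 0) <= 50_000),
]

def enforce(action: dict) -> dict:
    """Raise on the first violated policy; return the action if all pass."""
    for name, rule in POLICIES:
        if not rule(action):
            raise PermissionError(f"policy violated: {name}")
    return action
```

An audit trail falls out of the same structure: every call to `enforce` is a loggable decision, which is exactly the record a governance platform needs.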

The boundary between “writing code” and “describing intent” blurs. As models improve at understanding architectural context and as context infrastructure matures, the human role shifts further toward specification and validation. The planning phase — defining what the system should do, what constraints it must respect, what trade-offs to accept — becomes proportionally more of the work. The execution phase becomes proportionally more automated. This does not eliminate engineering skill. It shifts where that skill applies: from implementation patterns to system design, constraint definition, and output evaluation. The practitioners who thrive will be the ones who treat specification as an engineering discipline, not a hand-wave before the “real work.”


19.3 Long-Term: Possibilities Over Three to Five Years

These predictions are directional. The author believes they describe where the field is heading. They are opinions, not forecasts.

Full lifecycle agent participation becomes achievable. The eight-phase lifecycle from Chapter 4 describes agent participation across requirements, design, code, test, review, deploy, operate, and iterate. Today, robust support exists primarily in code and review. Within five years, credible participation across all phases is plausible — not as autonomous replacements, but as capable participants handling routine work under human direction.

Context infrastructure becomes as foundational as CI/CD. Every serious engineering organization today has continuous integration and deployment. Context infrastructure (the context files, instruction hierarchies, and knowledge bases that make agents effective) will follow the same trajectory. Early movers treat it as competitive advantage. Eventually it becomes table stakes. Organizations without it will find agentic tools unreliable and conclude the technology “doesn’t work for us,” the same way organizations without CI concluded automated testing “doesn’t work at our scale.”

```mermaid
gantt
    title Three-Horizon Timeline
    dateFormat YYYY
    axisFormat %Y

    section Near-Term (0–12 mo)
    Tool-using agents standard              :active, n1, 2025, 2026
    Multi-agent orchestration ships         :active, n2, 2025, 2026

    section Medium-Term (1–3 yr)
    Agent governance as discipline          :m1, 2026, 2028
    Spec replaces implementation            :m2, 2026, 2028

    section Long-Term (3–5 yr)
    Full lifecycle agent participation      :l1, 2028, 2030
    Context infra foundational as CI/CD     :l2, 2028, 2030
```

Figure 19.1: Three-horizon technology timeline

19.4 What Will Not Change

These are the things the author is most confident about, precisely because they are structural rather than technological.

Context will remain finite and fragile. There will always be a limit to how much information an agent can effectively consider. The constraint that context must be structured, scoped, and curated is a property of the problem, not the current technology.

Output will remain probabilistic. Models will get better. They will not become deterministic. Reliability must be architected through constraints and validation, not assumed from model quality.

Explicit knowledge will remain more valuable than implicit knowledge. Agents will not read the minds of the team. Organizations that externalize their knowledge will outperform those that don’t.

Human judgment will remain the bottleneck and the differentiator. The scarce resource is the ability to define what should be built, evaluate whether it was built correctly, and decide what to do when it wasn’t.

Composition will remain necessary. No single agent will hold an entire large system in focus. The tools for composition will improve; the need for it will not diminish.

These five properties map directly to the PROSE constraints from Chapter 1. The constraints were not designed for today’s models; they were designed for the fundamental properties of human-AI collaboration.


19.5 Three-Tier Honesty Applied to This Chapter’s Own Claims

| Claim | Tier | Confidence |
|---|---|---|
| Tool-using agents become the default interaction mode | Available now | High — shipping in multiple platforms |
| Multi-agent orchestration enters mainstream tooling | Available now | High — shipping in multiple tools |
| Agent governance becomes a distinct discipline | Emerging | Medium — need is clear, tooling is not |
| Specification replaces implementation as the core skill | Emerging | Medium — direction clear, timeline uncertain |
| Full lifecycle agent coverage becomes operational | Directional | Low-to-medium — plausible, not inevitable |
| Context infrastructure becomes as foundational as CI/CD | Directional | Medium — trajectory clear, timeline 5+ years |
| Agentic computing stack layers consolidate | Emerging | Medium — convergence visible, standardization not |
| The five core constraints hold | Structural | High — properties of the problem |

The reader should calibrate accordingly. Invest confidently in the “available now” tier. Prepare for the “emerging” tier. Be aware of the “directional” tier without betting the organization on specific timelines.


19.6 When NOT to Use Agentic Workflows

Not every task benefits from agent orchestration. Applying the methodology where it does not fit wastes time and produces worse outcomes than working manually. Recognize these scenarios early:

The task requires fewer than 50 lines of change. If you can hold the full scope in your head, the overhead of persona design, wave planning, and checkpoint discipline is not worth it. Just write the code.

The domain knowledge is entirely implicit. If the conventions, constraints, and trade-offs cannot be externalized into instruction files – because they depend on political context, unwritten relationships, or organizational history that resists documentation – agents will produce plausible but wrong output. Instrument the codebase first (Chapter 9), then apply agents.

The cost of failure is low and iteration is cheap. For throwaway scripts, prototyping, and exploratory work, a single agent prompt with no orchestration is faster and sufficient. The methodology exists for production-grade work where reliability matters.

The work is inherently sequential and creative. Naming things, choosing abstractions, defining API contracts – these are judgment-dense tasks where agent suggestions help but orchestrated composition adds nothing. Use agents as sounding boards, not as orchestrated fleets.

The platform fights automation. The Growth Engine case study documents three attempts to automate Kit’s forms, each defeated by React’s virtual DOM. When a platform’s internals are undocumented and hostile to external manipulation, escalate to a human with a precise checklist rather than attempting a fourth approach.

Much of the methodology’s value lies in recognizing which category a task falls into before committing to an approach.


19.7 Your First Week: What to Do Starting Monday

For leaders who read Block 1 and practitioners who read Block 2, here is the concrete version. Not principles. Actions.

19.7.1 Day 1: Audit One Module

Pick the module your team changes most frequently. Not the biggest module, the most-changed one. Run the methodology from Chapter 9: identify implicit knowledge, undocumented conventions, architectural decisions that exist only in your team’s memory. Write down what you find. You are not fixing anything today. You are measuring the gap between what an agent can see and what your team knows.

Deliverable: A list of 5–10 implicit conventions that an agent would violate on its first task in this module.

19.7.2 Day 2: Write Your First Three Primitives

Take the top three conventions from yesterday’s audit. Write each as an instruction file — one organizational standard, one architectural constraint, one domain-specific rule. Follow the format from Chapter 9, under the constraints from Chapter 10: scoped, testable, specific. Do not try to document everything. Three primitives that cover the most common mistakes are worth more than thirty that cover edge cases.

Deliverable: Three instruction files, committed to your repository.

19.7.3 Day 3: Test Against a Real Task

Pick a task from your current sprint — something an agent would plausibly handle. Run it twice: once without your new context files, once with them. Compare the output. Did the context files prevent the mistakes you predicted? Did they cause new problems? Record the before-and-after. This is your first data point, not your conclusion.

Deliverable: A before-and-after comparison with specific examples of what changed.
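One lightweight way to make the Day 3 deliverable a committable artifact rather than an impression: record a labeled unified diff of the two runs. This helper is a hypothetical convenience, not part of any tool:

```python
import difflib

# Hypothetical helper for the Day 3 comparison: diff the agent's output
# without context files against its output with them.

def record_comparison(before: str, after: str, task: str) -> str:
    """Return a unified diff labeled with the task, suitable for committing."""
    diff = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"{task} (no context files)",
        tofile=f"{task} (with context files)",
    )
    return "".join(diff)
```

The diff format keeps the comparison honest: changes the context files caused are visible line by line, including any regressions they introduced.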

19.7.4 Day 4: Measure and Adjust

Review yesterday’s comparison honestly. Which files made a difference? Which were ignored or misinterpreted by the agent? Revise the ones that didn’t land. This is the calibration loop from Chapter 11: context files are not documentation, they are engineering artifacts that need testing and iteration like any other code.

Deliverable: Revised instruction files based on observed agent behavior.

19.7.5 Day 5: Share and Plan

Show your team the before-and-after. Not a presentation — a 15-minute demo at standup. Show the worst agent output without instrumentation and the improved output with it. Then plan: which modules get instrumented next? Who owns which instruction files? How do you keep them current as the code evolves?

Deliverable: A team agreement on next steps and ownership.

19.7.6 For Leaders, Additionally

If you lead the organization rather than the team, Day 1 is different. Start with the readiness assessment from Chapter 7. Identify one team with the right combination of codebase maturity, process discipline, and cultural openness. Fund a structured pilot — not “give everyone licenses and see what happens,” but the phased adoption from the transition plan. Protect the investment in context infrastructure. It has the highest long-term return and the lowest short-term visibility, which means it is the one most likely to be cut.


19.8 What the Author Probably Got Wrong

Intellectual honesty requires identifying where this book’s assumptions are most likely to age poorly.

The pace of capability improvement may outrun governance. This book assumes organizations will have time to build governance infrastructure before agent capabilities demand it. If capabilities improve faster than organizational maturity — the historical pattern for every technology shift — many organizations will face a period where agents can do more than the organization is prepared to govern.

The emphasis on human-in-the-loop may prove too conservative. For high-stakes production code, human review will hold. For internal tooling, prototyping, and throwaway infrastructure, fully autonomous workflows may become practical sooner than this book suggests. The “always review” stance is safer but may leave real efficiency on the table in contexts where the cost of failure is low.

The multi-agent orchestration model may evolve past human orchestrators. The patterns in this book assume a human planner dispatching specialist agents. Future orchestration may involve agents that plan their own decomposition, negotiate resources, and maintain persistent state across sessions. The compositional principles will likely still apply, but the human-as-orchestrator model this book centers may be a transitional pattern, not an enduring one.

The documentation burden may not pay for itself. This book asks teams to externalize knowledge that was previously implicit. That is real work with real ongoing maintenance cost. If the productivity gains from agentic development are modest — 15–20% rather than the 2–3x some claim — then the time spent creating and maintaining context infrastructure could consume most of the gains. The break-even calculation is less obviously favorable than the book implies, and the author has not seen enough longitudinal data to be certain it tips the right way.

And the uncomfortable one: the author may be overestimating the durability of human judgment as the differentiator. This book argues that human judgment is the bottleneck agents cannot replace — and builds its entire methodology around that assumption. But there is a motivated reasoning risk in any book that argues humans are indispensable, written by a human who wants that to be true. If models develop genuine architectural reasoning — not pattern matching on training data, but the ability to evaluate trade-offs, anticipate failure modes, and make design decisions that hold up under pressure — then the “human judgment” moat this book describes is not structural. It is temporal. The author believes it is structural. The author also acknowledges that this belief is load-bearing for the entire framework, which means it is exactly the kind of assumption that deserves the most scrutiny and the least certainty.


19.9 The Closing Argument

Accept that you are early. The field is moving faster than any book can capture. The specific tools will change. The formats will evolve. The capabilities will exceed what is described here.

Use the principles, not the specifics. Structure your context, scope your tasks, compose simple building blocks, enforce safety boundaries, and organize your knowledge hierarchically. These disciplines work regardless of which model runs underneath or which tool wraps around it.

REST did not make HTTP better. It gave engineers constraints to reason about distributed systems. Twenty-five years later, the constraints still hold, even though every specific technology from that era has been replaced. The aspiration for the architectural constraints in this book is the same: durable reasoning tools for a field that will not stop changing.

The methodology is the floor, not the ceiling. Build on it.


  1. GitHub Copilot’s “agent mode” (2025), Cursor’s agentic features, and similar integrations in VS Code, JetBrains, and other IDEs demonstrate this shift. The Model Context Protocol (MCP) standardizes how agents access external tools, accelerating adoption.

  2. OpenAI’s Swarm framework, Microsoft’s AutoGen, and LangGraph represent early multi-agent orchestration libraries. GitHub Copilot coding agent and similar CI-integrated agents mark the beginning of production multi-agent workflows.

  3. Anthropic, “Claude Code Plugins,” https://docs.anthropic.com/en/docs/claude-code/plugins

  4. Brady Gaster, “How Squad Runs Coordinated AI Agents Inside Your Repository,” GitHub Blog, March 2026. https://github.blog/ai-and-ml/github-copilot/how-squad-runs-coordinated-ai-agents-inside-your-repository/

  5. GitHub, “Spec Kit — Build High-Quality Software Faster,” https://github.com/github/spec-kit

© 2025-2026 Daniel Meppiel · CC BY-NC-ND 4.0

Free to read and share with attribution.