Stop Asking Your People to "Just Use AI"

Benjamin Hopwood

Operations Scaling | Agentic AI Orchestration

May 29, 2026
Most organizations have done the responsible things with AI. They have purchased licenses, issued AI-first mandates, established centers of excellence, and run pilots. In the majority of cases the returns have been modest: a measurable lift in a few tasks, little movement in aggregate, and nothing resembling the figures that dominate industry coverage.

The common conclusion is that the technology was oversold. The evidence points in a different direction. The shortfall is the predictable result of adding a capable tool to roles and structures that were never redesigned to use it. Most organizations have asked individuals to solve, from their own desks, a problem that is organizational in nature. They cannot do so, regardless of effort or ability.

The Ceiling on Bolt-On AI

A consistent ceiling appears whenever AI is layered onto unchanged roles. The most rigorous field study to date, conducted by Brynjolfsson, Li, and Raymond across roughly 5,000 customer-support agents, found "an average productivity improvement of about 14%, with negligible benefit for the most experienced workers." McKinsey's software research reaches a comparable conclusion, observing that providing developers with AI tools does not by itself change outcomes meaningfully, and that production-grade software cannot be reached through conversation alone. A controlled trial by METR found experienced developers working roughly 19% slower with AI on code they knew well, even as they reported feeling faster. MIT's 2025 analysis found that the large majority of enterprise generative-AI pilots produced no measurable effect on earnings.

The pattern holds across studies. Identical tools, applied to unchanged roles, yield gains in the single digits to the mid-teens, and occasionally a net slowdown. The larger returns described in the same body of research consistently appear on the far side of a structural redesign.

What the Research Establishes, and What It Omits

McKinsey's 2025 work is now the most-cited evidence in this area. Of 25 organizational factors tested, the fundamental redesign of workflows had the single largest effect on whether a company realized bottom-line impact from AI. BCG's framing has become the shorthand: "roughly 10% of AI's value derives from the algorithm, 20% from the technology and data, and 70% from changes to people and process."

Two implications of this finding are routinely overlooked.

The first is that a workflow redesign cannot be executed by the people operating inside the current structure. Each person is bounded by the scope of their own role and can see and improve only their own portion of the process. Asking a workforce organized for the existing model to invent the replacement, while still being measured against the existing model, is like asking water to redesign its own riverbed. That work can originate only from leadership.

The second is that workflow redesign and organizational redesign are a single act. The flow of work cannot be rebuilt without also rebuilding who owns it, how large the teams are, and where the handoffs occur. A production line's output is not raised by speeding up one station while the rest of the flow stays as it was. It is raised by reorganizing the line as a whole. Knowledge work behaves the same way.

The Coordination Burden

There is a specific and measurable reason that larger organizations realize the smallest returns. As a team grows, the connections among its members grow far faster than the headcount. Fred Brooks documented this half a century ago, and his law followed from it: adding people to a late project tends to make it later. Those connections do not appear on any job description. They surface as status meetings, alignment sessions, repeated context-setting, and approval chains. Research on knowledge workers finds "the average person spending close to 60% of the day on coordination of this kind, and only about a quarter on the skilled work for which they were hired."
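The growth Brooks described can be made concrete with a little arithmetic. As a minimal sketch, assuming every pair of team members is a potential line of communication (the usual reading of Brooks's argument), a team of n people has n(n-1)/2 such lines. The function name `pairwise_channels` and the team sizes are illustrative choices, not figures from the research cited above.

```python
def pairwise_channels(team_size: int) -> int:
    """Count the distinct pairs in a team of the given size: n * (n - 1) / 2."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


# Illustrative team sizes (assumed for this example only).
for size in (3, 6, 12, 50, 150):
    print(f"{size:>4} people -> {pairwise_channels(size):>6} possible pairwise connections")
```

Going from 6 people to 50 multiplies headcount by roughly eight and the number of possible connections by roughly eighty.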

AI does not raise the ceiling on a team's collective attention. It accelerates the work beneath that ceiling, after which coordination absorbs much of the gain before it reaches the income statement. This helps explain why identical tools produce a gain of roughly 14% in a large organization and far larger gains in a small one. The variable is the coordination burden each structure carries.
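One way to see how coordination can absorb the gain is an Amdahl's-law-style calculation. This is an illustrative model brought in for the example, not a method from the studies cited: it assumes a fixed share of the day is skilled work that AI accelerates, and that everything else, coordination included, takes as long as before. The 25% share echoes the "about a quarter" figure quoted earlier; the speedup factors are hypothetical.

```python
def overall_speedup(accelerated_share: float, task_speedup: float) -> float:
    """Amdahl-style bound: only `accelerated_share` of total time gets faster."""
    if not 0.0 <= accelerated_share <= 1.0:
        raise ValueError("accelerated_share must be between 0 and 1")
    if task_speedup <= 0:
        raise ValueError("task_speedup must be positive")
    # Time remaining after the speedup, as a fraction of the original time.
    remaining = (1.0 - accelerated_share) + accelerated_share / task_speedup
    return 1.0 / remaining


SKILLED_SHARE = 0.25  # assumed share of the day spent on skilled work

for factor in (2, 5, 10):  # hypothetical speedups on the skilled work itself
    gain = overall_speedup(SKILLED_SHARE, factor) - 1.0
    print(f"{factor}x faster skilled work -> {gain:.0%} more total throughput")
```

Under these assumptions, even an unlimited speedup on the skilled quarter of the day caps the overall gain at about 33%, because the other three quarters are untouched. Shrinking the coordination share moves that cap; faster tools alone do not.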

Organizations Have Discovered This Before

The structure that captures these returns is not new, and it did not originate with AI. Organizations across very different fields arrived at the same design independently, well before modern tools existed.

In 1943, Lockheed gave a young engineer named Kelly Johnson near-complete authority over a small team and a single objective: build a jet fighter. With roughly two dozen engineers and thirty mechanics, and freed from the company's normal approval chains, the team carried the aircraft from concept to a flying prototype in 143 days, ahead of schedule. The unit became known as Skunk Works, and Johnson's operating rules, still cited today, center on a small team holding complete control of its program and a short, direct path from decision to action.

Amazon reached a similar design from a different direction. Its two-pizza rule held that a team should be no larger than two pizzas could feed, and the principle later sharpened into the single-threaded owner: one team, accountable end to end for one outcome, with no competing obligations. Alexa, Prime, and AWS all emerged from that structure.

The pattern is not confined to technology. Buurtzorg, a Dutch home-care provider, organizes its roughly fifteen thousand nurses into neighborhood teams of no more than twelve, with no managers. Each team handles everything from intake and scheduling to the care itself, supported by a back office of about fifty people for the entire organization. Independent assessments credit the model "with overhead near 8% against an industry norm of roughly 25%, fewer care hours per patient, and consistently high patient satisfaction." When a team grows past twelve, it splits in two, which is how the organization expands without rebuilding the coordination layers it removed.

These organizations differ in era, industry, and technology. What they share is a single design: a small team given genuine permission to identify the problem, decide on the response, and carry it all the way through to delivered and validated value, without waiting on the approval layers that normally sit in between.

Permission, the Full Loop, and Speed

This is the point most AI initiatives miss. A small team produces its advantage when it is permitted to own the entire loop: to determine what is actually needed, to build and deliver a response quickly, and to confirm against reality whether it worked. Narrowing the team to a single task, or handing it a problem already defined from above, forfeits most of that value. A team that owns the full loop reacts in a fraction of the time a conventional structure requires, because the distance between recognizing a need and delivering something real is short and uninterrupted.

AI is what makes these proven models both faster to value and more scalable than their originators could achieve. It compresses each step of the loop, shortening the time from identifying a need to delivering and validating a response. It also changes the arithmetic that once forced teams to grow: delivering more previously required adding people, which reintroduced the coordination burden that slowed everything down. The same structures, equipped with AI, draw that additional capacity from the tool rather than from headcount, so a team stays small while its output rises and the model scales by multiplying small teams.
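The difference between enlarging one team and multiplying small ones can be sketched numerically. The example below assumes that coordination cost tracks the number of possible pairs within a team and ignores links between teams, which is a deliberate simplification. The headcount of 48, the cap of 6, and both function names are hypothetical.

```python
def pairwise_channels(team_size: int) -> int:
    """Distinct pairs within one team: n * (n - 1) / 2."""
    return team_size * (team_size - 1) // 2


def split_into_teams(headcount: int, max_team_size: int) -> list[int]:
    """Split headcount into the fewest teams of at most max_team_size, as evenly as possible."""
    if headcount < 0 or max_team_size < 1:
        raise ValueError("headcount must be >= 0 and max_team_size >= 1")
    if headcount == 0:
        return []
    team_count = -(-headcount // max_team_size)  # ceiling division
    base, extra = divmod(headcount, team_count)
    # `extra` teams carry one additional person so sizes differ by at most one.
    return [base + 1] * extra + [base] * (team_count - extra)


HEADCOUNT = 48      # hypothetical
MAX_TEAM_SIZE = 6   # hypothetical cap, within the three-to-six range discussed here

teams = split_into_teams(HEADCOUNT, MAX_TEAM_SIZE)
one_large_team = pairwise_channels(HEADCOUNT)
many_small_teams = sum(pairwise_channels(size) for size in teams)

print(f"team sizes: {teams}")
print(f"one team of {HEADCOUNT}: {one_large_team} internal connections")
print(f"{len(teams)} small teams: {many_small_teams} internal connections")
```

The sketch leaves out whatever coordination the small teams need with one another, so it shows the direction of the effect rather than its size.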

The economics are favorable, which matters for any leader weighing the risk. Because the unit of change is a single small team rather than the enterprise, the capital and operating exposure are modest. An organization can establish a team of three to six people, give it a real outcome and the authority to pursue it end to end, and observe the result within a short cycle. If the approach works, it has found a pattern to replicate. If it does not, very little has been spent, and the lesson arrives quickly.

Disruption and Its Management

A low financial cost does not imply low difficulty. This change collapses layers, redraws responsibilities, and removes coordination roles that exist precisely because the prior structure required them. People experience this directly, and managing it candidly is part of the work.

Klarna offers the relevant caution. The company reduced its workforce substantially on the expectation that AI would compensate, later acknowledged publicly that it had gone too far as quality declined, and resumed hiring. Its error was procedural. The company treated the change as a cost-reduction exercise rather than a genuine redesign, and proceeded without sufficient attention to what the work required. The belief that AI could enable more output with fewer people was not itself the mistake.

The appropriate approach is precise and limited. Select one high-value outcome, give a small team the authority to own it end to end, and measure quality alongside speed. Plan to redeploy the coordination capacity the redesign frees rather than removing it, since those individuals hold detailed knowledge of how the work actually moves, which the redesign depends on. Replicate success by creating more small teams rather than enlarging the ones that already work.

The Determining Factor: Organizational Will

Beneath all of the above lies a more fundamental question, and it is not a technical one. It is whether the organization possesses the will to change at the necessary depth, and whether it has people capable of leading that change.

The clearest precedent is from manufacturing. For decades Toyota was unusually open about its production system, inviting competitors to tour its plants. In 1984 it went further, partnering with General Motors to operate the NUMMI plant in Fremont, California, which gave GM direct access to the Toyota Production System using a largely American workforce. The methods were demonstrated, documented, and placed in competitors' hands. They were not secret.

American manufacturers nonetheless took the better part of a generation to adopt practices of comparable depth. The binding limitation was organizational will rather than the availability of information. Replication required rebuilding roles, authority, and the flow of work, and most firms were unwilling to go that far.

The parallel to AI is direct. The tools are available and the emerging playbook is increasingly public. What separates the organizations capturing large returns from those stalled at modest ones is the will to reorganize at the required depth, together with leaders who can carry that change through. Recognizing which category your own organization belongs to is the first and most consequential assessment a leader can make, because no amount of tooling will compensate for the absence of will.

Recommended Actions for Leadership

  1. Assess will before committing further investment. Determine honestly whether leadership and staff are prepared to change roles, authority, and workflow rather than tools alone.
  2. Choose one high-value outcome that a small team could own from end to end, rather than spreading AI thinly across many existing processes.
  3. Establish a small team of three to six people and grant it real permission: to identify what the problem and the need actually are, to build and deliver a response, and to validate it against reality, without routing each step through the usual approval layers.
  4. Optimize for the speed of that full loop and for validated learning. The measure that matters is how quickly the team moves from recognizing a need to delivering and confirming value.
  5. Scale on traction. When a team's approach works, replicate it quickly by creating additional small teams rather than enlarging the original.
  6. Measure quality and value contribution alongside speed. Establish a baseline beforehand and monitor for the quality decline that accompanies cost-led automation.
  7. Redeploy the coordination capacity the redesign frees rather than eliminating it, and retain ownership of the change. Removing obstacles across functions is leadership work and cannot be delegated to the technology organization.
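Actions 4 and 6 above call for tracking loop speed and quality against a baseline. One possible way to record that is sketched below. Every name, field, and number in it is hypothetical and chosen only for illustration: it assumes each pass through the loop is logged with the date a need was recognized, the date value was confirmed, and a quality score on a 0-to-1 scale.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median


@dataclass(frozen=True)
class LoopRecord:
    """One pass through the loop; all field names are hypothetical."""
    need_recognized: date
    value_confirmed: date
    quality_score: float  # assumed 0-1 scale, e.g. a review or customer rating

    @property
    def cycle_days(self) -> int:
        return (self.value_confirmed - self.need_recognized).days


def summarize(records: list[LoopRecord], baseline_quality: float) -> dict[str, float]:
    """Report median loop time and average quality relative to the baseline."""
    if not records:
        raise ValueError("at least one record is required")
    avg_quality = sum(r.quality_score for r in records) / len(records)
    return {
        "median_cycle_days": float(median(r.cycle_days for r in records)),
        "avg_quality": avg_quality,
        "quality_vs_baseline": avg_quality - baseline_quality,
    }


# Made-up sample data for illustration only.
records = [
    LoopRecord(date(2026, 1, 5), date(2026, 1, 19), 0.90),
    LoopRecord(date(2026, 1, 12), date(2026, 1, 22), 0.80),
    LoopRecord(date(2026, 2, 2), date(2026, 2, 10), 0.85),
]
print(summarize(records, baseline_quality=0.88))
```

A negative `quality_vs_baseline` alongside a falling `median_cycle_days` would be the signal to look for: speed bought at the expense of quality.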

The modest returns now common across the economy come from optimizing within structures that were never changed. The larger returns exist, they are documented across decades and industries, and they are reachable at a cost most organizations can absorb. Whether a given organization reaches them is a question of will more than of capability.


Agentic Solutions helps organizations move past "just use AI" by redesigning how work flows and unleashing the small, accountable teams that convert AI capability into measurable value. [Start the conversation](/#chat) to see what is possible.