What it takes to turn AI into real firm value

Everyone's talking about AI, but not everyone is actually using it to win. The gap between curiosity and capability is widening fast, and the firms pulling ahead are doing more than experimenting — they're rethinking how work happens. From custom-built financial systems to the rise of agentic finance and increasing reliability risks, this month's insights highlight a clear trend: Competitive advantage will belong to firms that move decisively from pilots to production.


What's in focus

The SaaS extinction event

What's new:

Writer and technologist Craig Mod just built a fully custom, multi-currency accounting system in five days — no dev team, no vendor, no subscription. Using Claude Code, he replaced a decade-long patchwork of Quicken, Google Sheets, Google Scripts and Japanese accounting software with a single Python application that handles cross-border income, automatic expense categorization, FX reconciliation, 1099 and K1 ingestion, and country-specific tax formatting. He calls it TaxBot2000 and says it's the best accounting software he's ever used.

How it works:

TaxBot2000 ingests CSV files from any bank or institution, pulls historical FX rates, categorizes transactions using learned patterns and reconciles international wire transfers — accounting for timing delays and rate discrepancies between originating and receiving banks. Mod feeds it prior tax returns as training data, drops in 1099s, K1s and PDFs, and the system categorizes and packages everything for his accountants in both US and Japanese formats. When anomalies surface, he talks directly to the AI to brainstorm batch fixes, often producing new features in the process.
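
The wire-reconciliation step described above can be sketched in a few lines of Python. This is a minimal, hypothetical illustration (Mod has not published his code); the field names, the five-day settlement window and the 2% rate tolerance are all assumptions chosen to show the idea of matching an outgoing transfer to its foreign-currency arrival despite timing delays and rate discrepancies.

```python
from datetime import date

def match_wire(outgoing, incoming_candidates, max_days=5, tolerance=0.02):
    """Pair an outgoing wire with its foreign-currency arrival.

    Hypothetical sketch: 'outgoing' and each candidate are dicts with
    'date' and 'amount'; candidates also carry 'fx_rate', the historical
    rate on the receiving day. We allow a settlement delay of up to
    max_days and a small rate discrepancy between the originating and
    receiving banks, the two frictions the article mentions.
    """
    for inc in incoming_candidates:
        lag = (inc["date"] - outgoing["date"]).days
        if not 0 <= lag <= max_days:
            continue  # arrived too early or too late to be this wire
        expected = outgoing["amount"] * inc["fx_rate"]
        if abs(inc["amount"] - expected) / expected <= tolerance:
            return inc  # within tolerance: treat as the matching leg
    return None  # unmatched: surface as an anomaly for review

out = {"date": date(2024, 3, 1), "amount": 10_000.0}           # USD leg
candidates = [{"date": date(2024, 3, 4), "amount": 1_492_000.0,
               "fx_rate": 150.0}]                               # JPY leg
print(match_wire(out, candidates))  # matches: 3-day lag, ~0.5% rate gap
```

A real system would of course pull the FX rates from a data source and learn the tolerance from past reconciliations rather than hard-coding it.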

Behind the news:

This isn’t a tech demo — it’s a working system replacing commercial software for a real taxpayer with genuinely complex financials: multi-country freelance income, publishing royalties, membership revenue, Shopify e-commerce, public and private investments, and expenses split across wire transfers and credit cards in multiple currencies. The pattern Mod describes, where off-the-shelf tools force painful workarounds that eventually collapse under complexity, is one every multi-entity practice or international client engagement will recognize. What’s new is that the build-it-yourself alternative just went from “hire a developer for six months” to “one motivated person, five days.”

Why it matters:

For accounting and finance professionals, this is a signal flare. The clients who today tolerate clunky integrations and manual reconciliation workarounds are about to discover they can build tools shaped precisely to their situation, handling the edge cases commercial software ignores. That's a threat to firms selling standardized workflows, and an opportunity for advisors who can help clients architect these bespoke systems or build them in-house. Mod's TaxBot2000 already does what many firms charge advisory fees to manage: holistic cross-border financial visibility with automated categorization and tax-ready packaging. The five-day build timeline compresses a value proposition that used to justify annual engagements.

We're thinking:

The profession’s competitive moat has never been data entry or software operation; it’s judgment, interpretation and the trust clients place in that expertise. But a significant layer of billable work sits between those two poles: the wrangling, reformatting, reconciling and categorizing that consumes hours before judgment even begins. That middle layer just got automated by the client themselves. The firms that thrive won’t be the ones competing with TaxBot2000 on data processing speed. They’ll be the ones helping clients navigate an increasingly fragmented tool landscape — guiding how these systems are structured, validated and connected — and then providing the strategic advisory layer on top. The question for every practice leader isn’t whether clients will start experimenting with custom financial tools, it’s whether you’ll be the one they trust to make sense of them.

The orchestration gap

What's new:

Deloitte surveyed 9,000 business and HR leaders across 89 countries and found the thing most organizations know they need to do and almost none are actually doing. 88% of the leaders surveyed say the ability to dynamically orchestrate work — fluidly moving people, skills and resources toward problems as they emerge — is extremely or very important, yet only 7% say they're making meaningful progress. The 81-point gap is one of the largest in the survey. For accounting firm leaders, that number isn't a workforce management statistic; it's a description of the competitive environment your firm is operating in right now.

How it works:

The report's central argument is that competitive advantage is shifting from what you own to how fast you can move. Scale used to be the answer — more partners, more staff, more service lines. Two-thirds (67%) of leaders in the survey now say their primary competitive advantage over the next three years will come from speed and nimbleness, while only 28% still believe scale is the differentiator. AI is accelerating that shift by making previously scarce capabilities widely available. What it can't replicate is the organizational capacity to redirect expertise quickly, build the right team for the right problem and deliver before the window closes.

Behind the news:

The reason most firms aren't making progress is structural, not motivational. HR, finance, IT, legal — the functions that run professional services firms — were built for efficiency within silos, not speed across them. 67% of C-suite leaders say those functions need to fundamentally change to support this shift. The practice structure that optimized compliance delivery — service lines, fixed staffing models, partner-led relationship ownership — was designed for a predictable environment where the work came in a defined shape, but that environment is gone. Firms still organized around it aren't just moving slowly; they're moving in the wrong direction.

Why it matters:

The firms separating themselves in Deloitte's data aren't running transformation programs. They're embedding adaptability into how work actually gets done: real-time feedback, skills moving toward the work rather than work being assigned to fixed roles, learning that's continuous rather than episodic. For accounting firms, the translation is specific: the production of the deliverable is no longer the differentiator. AI is compressing that. What clients will pay for — and what firms will compete on — is the speed and quality of the judgment behind the deliverable, and the organizational agility to bring the right expertise to bear before the moment passes.

We're thinking:

The 81-point gap is a structural problem, not a strategy problem. Almost every firm leader we talk to understands that orchestration matters. What they're running into is that the billing model, the partner compensation model, the service line structure and the client relationship model all create friction against exactly this kind of fluid redeployment. You can't orchestrate your way to agility while the incentive system rewards the opposite. The firms that close the gap first won't do it by adding an AI task force or approving a new technology budget. They'll do it by asking the question most firms are still avoiding — what has to change about how we're organized, how we measure performance and how we price our work — before the firm down the street makes the answer obvious for us.

The architecture of agentic finance

What's new:

A UCL working paper published this month does something the AI-in-finance conversation mostly avoids: it builds a framework for what it means when AI doesn't just assist a financial workflow but runs it. The paper calls this agentic finance, settings where AI systems move through the full chain of information gathering, reasoning, decision-making and execution in ways that are economically consequential. The framing matters because it forces a harder question than "is AI useful." It asks what happens to markets, firms and professions when the workflow itself becomes automated.

How it works:

The paper's central architecture breaks financial AI agents into four layers: data perception, reasoning, strategy generation and execution with control. That last layer is where the profession-level implications live. Most deployed systems today operate under what the paper calls bounded autonomy (the agent handles the workflow, human oversight handles accountability). What varies across firms is where exactly that boundary sits, how visible the agent's reasoning is to supervisors and how much of the infrastructure is shared with everyone else running similar systems.
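The four layers and the bounded-autonomy boundary can be made concrete with a short sketch. This is not code from the paper; the class, method names and the order-size threshold are illustrative assumptions, chosen to show where the execution-with-control layer hands a decision back to a human and how a step log gives supervisors visibility into the agent's reasoning.

```python
from dataclasses import dataclass, field

@dataclass
class AgentRun:
    """Illustrative four-layer agent loop with bounded autonomy.

    Hypothetical names throughout. Actions above an authorization
    threshold are escalated to a human instead of executed, and every
    step is logged so supervisors can trace the chain of reasoning.
    """
    autonomy_limit: float              # max order size the agent may act on alone
    log: list = field(default_factory=list)

    def perceive(self, feed):          # layer 1: data perception
        self.log.append(("perceive", feed))
        return feed

    def reason(self, data):            # layer 2: reasoning
        signal = "buy" if data["momentum"] > 0 else "hold"
        self.log.append(("reason", signal))
        return signal

    def strategize(self, signal):      # layer 3: strategy generation
        order = {"action": signal, "size": 50_000}
        self.log.append(("strategy", order))
        return order

    def execute(self, order):          # layer 4: execution with control
        if order["size"] > self.autonomy_limit:
            self.log.append(("escalate", order))   # the autonomy boundary
            return "escalated_to_human"
        self.log.append(("execute", order))
        return "executed"

run = AgentRun(autonomy_limit=25_000)
result = run.execute(run.strategize(run.reason(run.perceive({"momentum": 1.2}))))
print(result)  # escalated: the 50,000 order exceeds the 25,000 limit
```

Where a firm sets `autonomy_limit`, and how much of `log` its supervisors actually review, is exactly the variation across firms the paper describes.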

Behind the news:

The most important finding isn't about individual AI capability; it's about population behavior. When many institutions run similar models on similar data through shared infrastructure, their decisions correlate in ways no single firm intended. The systemic risk isn't the smart agent. It's 1,000 agents making the same call at the same time. The paper tests this directly with an event study: around Anthropic's Feb. 2026 disclosures on legacy code modernization, firms whose revenues depended on legacy maintenance saw cumulative abnormal returns of -6.4% over three days. Markets priced workflow substitution before deployment was even measurable.
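
For readers unfamiliar with event-study arithmetic, a cumulative abnormal return like the -6.4% figure is just the sum, over the event window, of the gap between each firm's realized daily return and what a market model would have predicted absent the event. The numbers below are illustrative, not the paper's data.

```python
def cumulative_abnormal_return(actual, expected):
    """CAR = sum of (realized daily return - model-expected daily return).

    'actual' holds a firm's realized returns over the event window;
    'expected' holds the returns a market model predicts for those
    same days had the event not occurred.
    """
    return sum(a - e for a, e in zip(actual, expected))

actual = [-0.030, -0.022, -0.015]    # three trading days after a disclosure
expected = [0.001, -0.001, -0.001]   # model-implied "normal" returns
car = cumulative_abnormal_return(actual, expected)
print(f"{car:.3f}")                  # -0.066, roughly -6.6% over the window
```

The market model behind `expected` (for example, a regression of the firm's return on the market's) is where real event studies do their heavy lifting; the CAR itself is simple addition.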

Why it matters:

Wherever revenue is tied to the labor-intensity of a deliverable, the substitution logic holds. That is not a trading-desk problem. Audit fieldwork, tax preparation, compliance documentation, due diligence — the same four-layer architecture applies. The governance principles the paper derives (traceability, bounded autonomy, diversity of tools, embedded controls and supervisory visibility) map directly to the questions accounting firms will be required to answer. The firms working through those answers now will have documentation, while the rest will be improvising under pressure.

We're thinking:

According to the paper, bounded autonomy is not a gap to close — it's the correct architecture for a profession built on accountability and professional standards. The AI runs the workflow; the CPA owns the outcome. What the paper makes clear is that the distinction has to be deliberately designed: assume it and you'll get it wrong. The firms that define where the human sits, what the agent is authorized to do and how both can be audited aren't just building better workflows. They're building the answer to the question their clients and regulators are already starting to ask.

The fact-check tax

What's new:

Anthropic interviewed 80,508 Claude users across 159 countries and 70 languages – what it claims is the largest qualitative study ever conducted – to understand what people actually want and fear from AI. The headline number for finance professionals isn’t the scale, but this: people in high-stakes professions – law, finance, government, and healthcare – cite AI unreliability at nearly twice the average rate. Your peers are using AI heavily and getting burned at a disproportionate clip.

How it works:

Anthropic used its AI Interviewer – a version of Claude prompted to conduct conversational interviews – to ask each participant about their hopes and concerns, then used Claude-powered classifiers to categorize every response. The result bridges the traditional research trade-off between depth and volume: open-ended qualitative insight at survey-level scale. Concerns were multi-label, meaning a single interview could receive multiple codes, since respondents tended to articulate several distinct worries rather than one, yielding an average of 2.3 distinct concerns per person.
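
The multi-label coding step can be illustrated with a toy version. Anthropic's actual classifiers are Claude-powered; the keyword matching, taxonomy and sample responses below are assumptions of ours, meant only to show the shape of multi-label coding and how an average like 2.3 concerns per person falls out of it.

```python
from collections import Counter

def code_concerns(interviews, taxonomy):
    """Toy multi-label coder: one interview can receive several codes.

    'interviews' are free-text responses; 'taxonomy' maps a concern
    label to trigger keywords. Keyword matching stands in for the
    Claude-powered classifiers the study actually used.
    """
    counts, total_labels = Counter(), 0
    for text in interviews:
        labels = {name for name, kws in taxonomy.items()
                  if any(kw in text.lower() for kw in kws)}
        counts.update(labels)           # tally each label once per interview
        total_labels += len(labels)
    return counts, total_labels / len(interviews)  # avg concerns per person

taxonomy = {
    "unreliability": ["wrong", "hallucinat"],
    "atrophy": ["dependen", "skills"],
    "privacy": ["data", "privacy"],
}
interviews = [
    "It is often wrong and I worry about my data.",
    "Hallucinations make me double-check everything; my skills may fade.",
]
counts, avg = code_concerns(interviews, taxonomy)
print(counts["unreliability"], avg)  # 2 2.0 - both respondents flag unreliability
```

Because each interview can carry multiple codes, the per-label percentages the study reports sum to well over 100%, which is why a 27% unreliability figure can coexist with an average of 2.3 concerns per person.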

Behind the news:

The study reveals something more structurally interesting than a simple “people like AI” finding. Unreliability was the most common concern overall at 27%, and it was the only tension in which the negative overshadowed the positive, with 37% lamenting AI’s unreliability versus 22% citing decision-making benefits. Crucially, both sides were deeply rooted in experience: 88% of those describing decision-making benefits and 79% of those describing harms had witnessed it directly (think of it as actual scar tissue). One respondent described it precisely: a “permanent fact-check tax” – an assistant that sounds sure but is often wrong, forcing you to treat everything as suspect, so that instead of freeing attention, AI creates more of it.

Why it matters:

Finance and accounting sit exactly at the intersection of the two highest-stakes AI failure modes this study surfaces: unreliability in high-consequence decisions and cognitive atrophy from over-reliance. Nearly half of all lawyers in the study – the closest professional analog to CPAs in terms of judgment-dependent, liability-carrying work – reported encountering AI unreliability firsthand, yet they also report the highest rates of realized decision-making benefits. The profession is both the biggest winner and the most exposed. For practitioners in audit, tax advisory and financial reporting, this is precisely the risk profile to manage: AI accelerates your work until the moment it confidently gives you a wrong answer that compounds through a client deliverable. One researcher described being caught in “a large, slow hallucination — answers that were internally consistent, confident and wrong in subtle but compounding ways.” That sentence should be printed and pinned above every AI-assisted workflow in your firm.

We're thinking:

The study highlights a professional identity under pressure. The concern about cognitive atrophy was mentioned by 16% of respondents, and educators reported witnessing it in students at 2.5 to 3 times the baseline rate. The implication: atrophy is real, measurable and already showing up in people who use AI as a shortcut rather than a tool. For accountants and CPAs, the professional and legal stakes of that distinction couldn't be higher. The question is no longer whether to use AI, but whether your firm is building AI workflows that preserve the judgment that makes your work defensible, or quietly outsourcing it.

Subscribe to the AI in Focus newsletter