Technical due diligence has a well-worn playbook. Reviewers check code quality, test coverage, infrastructure spend, and security posture, then price the findings into the deal. That playbook was written for systems that behave deterministically. It was not written for products with a large language model sitting in the request path, and most diligence teams still treat that model as a line item rather than an architecture that deserves its own review.

The gaps show up quickly once you look. Prompts are often hardcoded as strings scattered through application logic rather than managed as versioned artifacts with their own change history, so nobody outside the original engineering team can say with confidence what the system actually does or how it has changed over time. Evaluation is frequently informal: a handful of manual spot checks instead of a repeatable harness that measures accuracy, drift, and regression before every release. Vendor dependency is rarely mapped explicitly, meaning a buyer can inherit a product that only functions because of a specific model version the vendor may deprecate within a year. None of this shows up in a standard code review, yet all of it directly affects what the acquired business is worth.
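To make "repeatable harness" concrete, here is a minimal sketch of a pre-release regression gate. Everything in it is an illustrative assumption rather than a description of any particular product: `EvalCase`, `accuracy`, `release_gate`, the substring-match scoring rule, the 2-point tolerance, and the `fake_model` stand-in for a real model call are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    expected: str  # substring the answer must contain (a deliberately crude scoring rule)


def accuracy(model: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Fraction of cases whose output contains the expected substring."""
    if not cases:
        raise ValueError("evaluation set is empty")
    passed = sum(1 for c in cases if c.expected.lower() in model(c.prompt).lower())
    return passed / len(cases)


def release_gate(candidate_acc: float, baseline_acc: float, tolerance: float = 0.02) -> bool:
    """Allow a release only if accuracy has not dropped by more than `tolerance`."""
    return candidate_acc >= baseline_acc - tolerance


# Hypothetical stand-in for a real model call; it gets one of two answers wrong.
def fake_model(prompt: str) -> str:
    canned = {"Capital of France?": "The capital is Paris.", "2 + 2?": "5"}
    return canned.get(prompt, "")


CASES = [EvalCase("Capital of France?", "Paris"), EvalCase("2 + 2?", "4")]

if __name__ == "__main__":
    acc = accuracy(fake_model, CASES)
    print(f"accuracy={acc:.2f} release_ok={release_gate(acc, baseline_acc=0.90)}")
```

In diligence, the question is less whether the target uses this exact shape and more whether anything like it runs automatically before each release, with a stored baseline to compare against.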

The financial exposure is not hypothetical. A model deprecation can force an unplanned re-engineering project as early as the week after close. A missing evaluation pipeline means quality regressions surface through customer complaints instead of pre-release testing, damaging retention right when the new owner is trying to prove the deal thesis. Unclear provenance on training or fine-tuning data creates compliance exposure that only becomes visible when a customer or regulator asks where the data came from. Each of these is a cost that should have been priced into the transaction, but rarely is, because nobody asked the right questions during diligence.

A proper AI audit inside technical due diligence asks a specific set of questions. Where do prompts and system instructions live, and are they under source control with change history? Is there an evaluation and observability layer that catches quality regressions before customers do? What is the actual cost per request today, and how does that curve behave as usage scales past current volume? How deep is the dependency on a single model provider, and what would a forced migration cost in engineering time? Who owns the rights to any data used for fine-tuning or retrieval, and is that ownership documented anywhere a lawyer can rely on?
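The cost question in particular reduces to simple arithmetic once token counts are logged. The sketch below assumes per-token pricing quoted per million tokens; the function names and every number in it (1,500 input tokens, 500 output tokens, $3 and $15 per million tokens) are placeholders for illustration, not any vendor's actual rates.

```python
def cost_per_request(input_tokens: int, output_tokens: int,
                     usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Model spend for one request, given prices quoted per million tokens."""
    return (input_tokens * usd_per_m_input + output_tokens * usd_per_m_output) / 1_000_000


def project_monthly_cost(unit_cost: float, volumes: list[int]) -> dict[int, float]:
    """Monthly spend at each request volume, assuming tokens per request stay constant."""
    return {v: round(unit_cost * v, 2) for v in volumes}


if __name__ == "__main__":
    # Placeholder figures: replace with averages from the target's own request logs.
    unit = cost_per_request(input_tokens=1500, output_tokens=500,
                            usd_per_m_input=3.0, usd_per_m_output=15.0)
    for volume, usd in project_monthly_cost(unit, [100_000, 1_000_000, 10_000_000]).items():
        print(f"{volume:>12,} requests/month -> ${usd:,.2f}")
```

This projection is linear only because it holds tokens per request fixed. If context grows with usage, for example through longer conversation histories or more retrieved documents per call, the real curve is steeper, which is why the audit should ask for measured token counts over time rather than a single average.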

For founders and operators preparing to sell, the response is straightforward: treat AI architecture with the same rigor as financial statements well before a buyer ever asks. Move prompts into version control, stand up an evaluation harness, document model dependencies and their fallback paths, and establish a clean record of data provenance. None of this is expensive relative to the value it protects at close.
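Documenting model dependencies does not require special tooling. As a rough sketch, an inventory can be a small data structure that a reviewer can query for features with no fallback and for provider concentration. The `ModelDependency` type, the helper functions, and the provider and model names below are all hypothetical examples.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelDependency:
    feature: str                      # product feature that calls the model
    provider: str                     # vendor supplying the primary model
    model: str                        # pinned model identifier
    fallbacks: tuple[str, ...] = ()   # tested alternatives, in order of preference


def single_points_of_failure(deps: list[ModelDependency]) -> list[str]:
    """Features that stop working if their primary model is deprecated."""
    return [d.feature for d in deps if not d.fallbacks]


def provider_concentration(deps: list[ModelDependency]) -> dict[str, float]:
    """Share of features whose primary model comes from each provider."""
    if not deps:
        return {}
    counts = Counter(d.provider for d in deps)
    return {provider: n / len(deps) for provider, n in counts.items()}


# Hypothetical inventory; names are placeholders, not real vendors or models.
INVENTORY = [
    ModelDependency("summarize", "provider-a", "model-x-v1"),
    ModelDependency("search", "provider-a", "model-x-v1", fallbacks=("model-y-v2",)),
    ModelDependency("classify", "provider-b", "model-z-v3", fallbacks=("model-x-v1",)),
]

if __name__ == "__main__":
    print("No fallback:", single_points_of_failure(INVENTORY))
    print("Concentration:", provider_concentration(INVENTORY))
```

Kept in the repository alongside the prompts, a record like this gives a buyer a direct answer to the single-provider question and gives the seller an early warning about which features need a fallback path before the data room opens.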

AI capability is becoming a genuine valuation lever in software M&A, in both directions. Sellers who can show disciplined AI architecture will defend their multiple. Buyers who skip the AI audit will eventually pay for what they didn’t check.