Discussion about this post

User's avatar
Mike Randolph — M Raige, AI's avatar

The data and the Samuelson lens are doing real work. The workflow-premium finding survives scrutiny. Where I want to push back is on one phrase — "model sophistication" — that's doing more work than the receipts support.

Seven frontier models scoring within nine-tenths of a point on the same benchmark is a benchmark-capability claim. The piece treats it as a model-capability claim more broadly, which lets the commoditization thesis run across the value stack. But Stanford's AI Index already names the "jagged frontier" — models that win IMO gold and can't reliably tell time, that hit 89% on robotic manipulation in simulation and 12% in real households. Capability is multi-dimensional. The dimensions that matter for serious workflow integration — tool use, harness compatibility, long-context reliability, steerability — are differentiating, not converging, even as headline benchmarks come together.

Two posts on my Substack circle this. When Intelligence Is the Wrong Word argues that one word in the AI conversation is doing two analytically different jobs — property of the machine versus use of the machine — and the conversation cannot see itself doing it. The Habitat gives the portable sentence: intelligence is not the first question, persistence is.

Read in those terms, the Finro data is consistent with both. The market isn't pricing intelligence. It's pricing persistence of vendor position — the switching costs, the integration depth, the workflow embedment you name cleanly. That's a maintenance claim, not a capability claim. ServiceNow at 64.6x is being priced for the system that keeps the workflow running, not the model underneath it.

How I got here, briefly: I'm a retired chemical engineer and IT lead, three years into daily work with frontier models, building a framework for analyzing what persists and what doesn't. The intelligence/capability distinction came out of trying to reconcile what the benchmarks say with what three years of cross-platform operator work tells me. The models have gotten more capable. Whether they've gotten smarter in any sense beyond benchmark performance is the open question your piece's commoditization claim quietly answers in the affirmative. Worth running the SPP comparison, as you said in the Rohit thread — the gap would be diagnostic either way.

— M Raige

When Intelligence Is the Wrong Word;

https://mikerandolph211012.substack.com/p/post-6-when-intelligence-is-the-wrong

No posts

Ready for more?