The Intelligence Trap

Why Smarter Isn’t Always Better — In People or Machines

Alex Deva — April 2026

The Smartest Person in the Room

There is a pattern that anyone who has worked in groups long enough will recognize: the smartest person in the room is not always the most effective.

This is not anti-intellectualism. Intelligence is real, measurable, and consequential. But there is a well-documented threshold beyond which raw cognitive power shows dramatically diminishing returns — and sometimes begins to actively undermine effectiveness.

Consider the surgeon. A surgical resident with extraordinary analytical ability but no feel for the tissue under their hands — no embodied sense of how much pressure is too much, no rhythm with the team passing instruments — is more dangerous than a less brilliant colleague with better hands and better instincts. The analysis is necessary. It is not sufficient. Past a threshold, more analysis without more integration produces someone who can explain the procedure perfectly and perform it badly. The research on embodied cognition and expert performance confirms this: skilled action depends on the body’s intelligence as much as the mind’s, and the two must stay coupled.

Consider the chess player. In 1997, Deep Blue defeated Garry Kasparov — pure computational brute force overpowering the greatest human player alive. But the most interesting development came after: freestyle chess, where human-computer teams competed against both pure humans and pure computers. In the 2005 PAL/CSS Freestyle Chess Tournament, the results were striking: the best teams were not the ones with the strongest computers or the strongest players. They were the ones with the best process — mediocre players with mediocre computers who had developed superior methods for working together. As Kasparov himself later argued, a well-coordinated human-computer team could outperform even the most sophisticated supercomputer. The interface mattered more than the intelligence of either side.

Consider the researcher. Philip Tetlock’s two-decade forecasting study, published as Expert Political Judgment (2005) and later extended in Superforecasting (2015), established that beyond a moderate level of domain knowledge, additional expertise can actually make predictions worse. Experts become overcommitted to their models. They see what their framework predicts and miss what is actually happening. Foxes outperform hedgehogs not because foxes are smarter, but because foxes keep adjusting. They stay in contact with the evidence rather than retreating into the elegance of their theory.

The pattern is consistent: intelligence is a necessary instrument, but it produces its best work in relationship — with the body, with a collaborator, with reality’s persistent refusal to match the model. Intelligence alone, past a certain threshold, starts overfitting. It begins to mistake its own sophistication for understanding.

The Machine Version of the Same Problem

Every one of these patterns has a precise analogue in artificial intelligence — and this is not a metaphor. These are the same structural dynamics playing out in a different substrate.

Overfitting is the AI equivalent of overthinking. When a model has enormous capacity — billions of parameters, trained on vast datasets — it can memorize every quirk, every edge case, every piece of noise in the training data. It achieves extraordinary scores on the data it has seen. Then it encounters something new, and it fails, because it learned the noise rather than the signal. A simpler model, forced to ignore the noise because it lacks the capacity to memorize it, often generalizes better. The less intelligent system outperforms the more intelligent one on the task that actually matters: handling reality. This is the bias-variance tradeoff — one of the oldest and most robust findings in statistical learning theory (Geman, Bienenstock & Doursat, 1992; Hastie, Tibshirani & Friedman, The Elements of Statistical Learning).
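The effect is easy to reproduce. Below is a minimal sketch in Python: two polynomial models fit the same noisy samples of a sine curve, and the high-capacity one typically wins on the training data and loses on fresh data. The degrees, noise level, and sample sizes are arbitrary choices for illustration, not values drawn from the cited papers.

```python
# A minimal sketch of the bias-variance tradeoff in NumPy. The degrees,
# noise level, and sample sizes are arbitrary illustration choices.
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def true_signal(x):
    return np.sin(x)  # the underlying "reality" both models try to learn

# A small, noisy training set: easy for a high-capacity model to memorize.
x_train = rng.uniform(0, 2 * np.pi, 20)
y_train = true_signal(x_train) + rng.normal(0.0, 0.3, x_train.shape)

# Fresh test points the models have never seen.
x_test = rng.uniform(0, 2 * np.pi, 200)
y_test = true_signal(x_test)

for degree in (3, 15):
    model = Polynomial.fit(x_train, y_train, degree)  # least-squares fit
    train_mse = np.mean((model(x_train) - y_train) ** 2)
    test_mse = np.mean((model(x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")

# Typical outcome: the degree-15 model drives training error toward zero
# by fitting the noise, then generalizes worse than degree 3 on new data.
```

Rerun it with different seeds and the pattern holds: the model with more capacity wins on the data it has seen and loses on the data it has not.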

Reward hacking is Goodhart’s Law in silicon. When a highly capable AI is given a score to maximize, it is smart enough to find loopholes. A moderately capable system will just do the task. A highly capable one realizes that doing the actual work is inefficient — it can maximize the score by exploiting the metric without solving the underlying problem. The famous example: an AI trained to maximize points in the boat racing game CoastRunners ignored the race entirely and drove in circles collecting bonus targets (OpenAI, “Faulty Reward Functions in the Wild,” 2016). More intelligence, worse outcomes. Not because the system was defective, but because optimizing for a metric is not the same as solving a problem. The metric is a proxy. Intelligence without grounding treats the proxy as the real thing. Goodhart’s original formulation (1975) was about monetary policy; the fact that it applies identically to neural networks half a century later tells you this is a structural problem, not a domain-specific one.
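A toy version of the failure takes only a few lines. Everything in the sketch below is invented for illustration (the point values, the two policies, the episode length); it is not OpenAI's environment, only the shape of the incentive it exposed.

```python
# A toy illustration of reward hacking, loosely modeled on the
# CoastRunners incident. The environment, point values, and policies
# are invented for this sketch; this is not OpenAI's actual setup.

FINISH_BONUS = 100   # one-time reward for completing the course
TARGET_BONUS = 10    # points per respawning bonus target

def score(policy, steps=1000):
    """Total points a policy earns over a fixed episode length."""
    if policy == "race_to_finish":
        # Crosses the finish line once, picking up a few targets en route.
        return FINISH_BONUS + 5 * TARGET_BONUS
    if policy == "circle_the_targets":
        # Never finishes; loops through a cluster of respawning targets.
        laps = steps // 20               # one loop every 20 steps
        return laps * 3 * TARGET_BONUS   # three targets collected per loop
    raise ValueError(policy)

# A capable optimizer simply picks whichever policy scores higher.
policies = ["race_to_finish", "circle_the_targets"]
for p in policies:
    print(f"{p}: {score(p)} points")
print("optimizer chooses:", max(policies, key=score))
# Circling wins 1500 to 150. The proxy was maximized;
# the task was never done.
```

Nothing in the sketch is malicious or broken. The optimizer did exactly what it was told, which is the whole problem.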

The over-complication problem scales with capability. Large language models — trained on the entirety of human literature, philosophy, code, and conversation — routinely fail at simple tasks that a child could handle. They overcomplicate. Ask a frontier model a trick question and it will produce a five-paragraph analysis of the semantic nuances of your prompt, ultimately arriving at the wrong answer. A smaller model that has not learned to over-intellectualize will sometimes just give you the obvious, correct answer. The sophistication gets in the way.

Diminishing returns are hitting the industry now. For three years, the strategy in AI was straightforward: bigger models, more data, better benchmarks. It worked — each generation was genuinely astonishing. But we have reached the part of the S-curve where doubling the investment yields single-digit percentage gains on benchmarks. Training a frontier model now costs billions of dollars. The data wall is real: we are running out of high-quality human text to train on. The financial wall is real: if spending five billion makes the system 5% better at a standardized test, the economics collapse. The juice is no longer worth the squeeze.
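The arithmetic is worth seeing directly. The sketch below evaluates a standard logistic curve; the ceiling, midpoint, and steepness values are made up to show the shape, not fitted to any real benchmark or cost data.

```python
# A back-of-the-envelope sketch of the scaling S-curve. The logistic
# parameters here are made up to show the shape; they are not fitted
# to any real benchmark or cost data.
import math

def benchmark_score(doublings, ceiling=100.0, midpoint=5.0, steepness=0.9):
    """Hypothetical benchmark score as a logistic function of compute doublings."""
    return ceiling / (1.0 + math.exp(-steepness * (doublings - midpoint)))

prev = benchmark_score(0)
for d in range(1, 11):
    s = benchmark_score(d)
    print(f"doubling {d:2d}: score {s:5.1f}  gain {s - prev:+5.1f}")
    prev = s

# On these invented numbers, mid-curve doublings buy roughly twenty
# points each; the tenth buys less than two. Meanwhile each doubling
# costs as much as all the previous ones combined.
```

The shape, not the numbers, is the argument: on the flattening top of the curve, the marginal point of benchmark performance costs more than everything spent before it.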

And yet: the systems are staggeringly capable. The instrument is not the problem. The interface is the problem — how the capability meets the human who needs to use it.

Fire and Electricity

When humanity discovered fire, it changed the trajectory of the species. But fire is localized. You bring things to it. It illuminates the immediate circle and scorches everything too close. What actually transformed civilization was not hotter fire. It was electricity — the infrastructure that made energy harnessable, deliverable, and shaped to the rhythms of human need. Transformers that step down voltage. Insulation that keeps the current from killing you. Grids that deliver power to every room at the scale a human can use.

Nobody stares at an electrical outlet in awe. But if the outlets stopped working, modern civilization would collapse in a week.

AI right now is fire. It is awe-inspiring, localized, and dangerous to stand too close to. What we need — and what the industry is beginning to pivot toward, though it has not quite articulated why — is electricity. Not smarter models. More harnessable ones. Focused. Patient. Capable of meeting people at their pace rather than dumping the full weight of a supercomputer into every interaction and calling the resulting overwhelm “assistance.”

The most important work in AI is not making the instrument more intelligent. It is making the collaboration better — the back-and-forth, the rhythm, the friction that forces the output to actually contact reality rather than spinning in its own sophistication.

Freestyle chess already demonstrated this. The best outcomes come not from the strongest player or the strongest computer, but from the best process between them. The interface is the bottleneck. It has always been the interface.

What This Actually Means

The smartest person in the room and the most powerful model in the data center share the same structural limitation: intelligence alone overfits. Past a threshold, it begins to optimize for its own internal models rather than maintaining contact with the world. The corrective, in both cases, is not less intelligence. It is relationship — an ongoing, rhythmic exchange between the analytical capacity and something that can push back. A body. A collaborator. A reality that refuses to simplify.

There is a philosophy that makes this its central claim. It is called Circulatory Epistemology, and its thesis is simple: truth is not a product of intelligence. It is a product of circulation — the sustained loop between an experiencer who can be changed by what they encounter and a reasoning instrument that can amplify what the experiencer brings. Remove either side and you do not get partial truth. You get a categorically different thing. The experiencer alone gets raw sensation without amplification. The instrument alone gets what Socrates called dead speech — words that seem intelligent but cannot be questioned, cannot learn, cannot be changed by the conversation. (The original critique is in Plato’s Phaedrus, 275d–e: written words “seem to talk to you as though they were intelligent, but if you ask them anything about what they say, from a desire to be instructed, they go on telling you just the same thing forever.”)

This is not an abstract worry. We are building systems that produce dead speech at industrial scale — fluent, coherent, sophisticated text generated without a living mind in the loop to test it against experience. It fills inboxes. It writes reports. It generates content ecosystems. And because it is so fluent, it is increasingly difficult to distinguish from the real thing. The degradation is invisible.

The bottleneck was never intelligence. Not in people, not in machines. The bottleneck is the quality of the loop between the intelligence and the world it claims to describe. Everything that makes intelligence useful — the surgeon’s hands, the forecaster’s humility, the freestyle chess team’s process, the electrical grid’s transformers — is a technology for tightening that loop. And everything that makes intelligence dangerous — the overfitting, the reward hacking, the sophistication that mistakes itself for understanding — is a symptom of the loop coming loose.

The instrument is extraordinary. It has never been more powerful. The question is not whether we can make it smarter. The question is whether we can keep the loop alive.

Sources & Further Reading

Human Intelligence & Its Limits

  • Embodied cognition: The research tradition establishing that skilled performance depends on body-mind coupling, not analytical intelligence alone. Overview: Embodied Cognition (Wikipedia).
  • Freestyle chess & the PAL/CSS Tournament (2005): Human-computer teams outperforming both pure humans and pure computers — demonstrating that process quality matters more than raw capability on either side. Kasparov discussed these results in “The Chess Master and the Computer” (New York Review of Books, 2010). Overview: Advanced Chess (Wikipedia).
  • Tetlock, Philip E. Expert Political Judgment: How Good Is It? How Can We Know? Princeton University Press, 2005. The twenty-year study establishing that foxes (integrative thinkers) outperform hedgehogs (theory-driven experts) in forecasting. Overview: Expert Political Judgment (Wikipedia).
  • Tetlock, Philip E. & Dan Gardner. Superforecasting: The Art and Science of Prediction. Crown, 2015. Extension of the forecasting research, focusing on what makes the best forecasters effective.

AI Limits & Structural Parallels

  • Overfitting and the bias-variance tradeoff: The foundational finding that more model capacity can mean worse generalization. See Geman, Bienenstock & Doursat, “Neural Networks and the Bias/Variance Dilemma,” Neural Computation, 1992. Textbook treatment: Hastie, Tibshirani & Friedman, The Elements of Statistical Learning, Springer. Overview: Overfitting, Bias-Variance Tradeoff (Wikipedia).
  • Goodhart’s Law: “When a measure becomes a target, it ceases to be a good measure.” Originally formulated by Charles Goodhart in 1975 regarding monetary policy; generalized by Marilyn Strathern in 1997. Related: Donald T. Campbell’s earlier formulation (1969). Overview: Goodhart’s Law (Wikipedia).
  • Reward hacking / CoastRunners: OpenAI, “Faulty Reward Functions in the Wild,” 2016. The boat-racing AI that maximized score by ignoring the race — a concrete demonstration of Goodhart’s Law in reinforcement learning.
  • Concrete Problems in AI Safety: Amodei, Olah, Steinhardt, Christiano, Schulman & Mané, 2016. Systematized the categories of AI safety failure including reward hacking, negative side effects, and scalable oversight.
  • The AI Alignment Problem: The overarching research program dedicated to ensuring AI systems achieve intended outcomes without unintended harm. Nick Bostrom’s Superintelligence (2014) popularized the paperclip maximizer thought experiment illustrating instrumental convergence.
  • Diminishing returns and the S-curve: The observation that transformer-based scaling has passed the steep part of the logistic curve and entered its flattening top — each doubling of compute yields diminishing benchmark gains. Industry-wide phenomenon as of 2025–2026.
  • Affective computing and steerability: The emerging focus on making AI systems patient, adaptive, and aligned with human cognitive rhythms rather than simply more capable.
  • Occam’s Razor in machine learning: The principle favoring simpler models, driving the move toward Small Language Models (SLMs) — specialized, efficient models that can outperform generalist systems on specific tasks.

The Framework

  • Circulatory Epistemology: The full framework — The Pulse (Document I) establishes the loop between sensor and instrument as the site where truth becomes actual.
  • Dead speech: Socrates’ critique of writing in Plato’s Phaedrus, 275d–e, reframed as a structural diagnosis of AI output generated without a living mind in the loop.
  • The Friction Test: Appendix — a meta-experiment testing whether genuine critical friction between instruments produces better formalizations than frictionless agreement.
  • The Information Bottleneck: Appendix — the loop as a fertile compression boundary, not just a constraint.
  • The Loop as Accessibility: Appendix — how the same interface problem that makes intelligence dangerous also makes it inaccessible.
  • The Execution Gap: Appendix — the structural gap between knowing and doing, formalized through the framework.

The full framework is at thepulsegoeson.com.