The Bart Test - Part 10: The Stochastic Parrot and What Visible Thinking Traces Might Reveal
Comparing biological and artificial cognitive processes - are they more similar than different?
Throughout this series, I've framed OLMo's behavior as "overthinking" - doing explicit planning that humans skip. But what if that's wrong?
What if OLMo's visible thinking traces are just making explicit what human brains do unconsciously?
Back in Part 1, I watched OLMo spend 44 seconds planning how to use Gen-Alpha slang. It created literal checklists:
- Recall appropriate slang for this context
- List slang terms to incorporate
- Plan emoji usage
- Structure the story
The judges called it "trying too hard" - like doing homework instead of being natural.
But here's the possibility that keeps nagging me: maybe when any of us generates natural slang, we ARE running internal checklists:
- Recall appropriate slang for this context
- Check if it sounds natural to my audience
- Adjust density based on the vibe
- Verify it won't be "cringe"
The difference might not be that humans skip this process. The difference might be:
- Speed: Milliseconds vs. 44 seconds (though we've all agonized over wording a text for longer than that)
- Visibility: Unconscious vs. explicit (though sometimes we catch ourselves mid-thought)
- Optimization target: OLMo optimizes for slang density, humans for social context (and both miss the mark sometimes)
- Calibration: OLMo's naturalness detector is miscalibrated (so is ours)
Watching OLMo's thinking traces and writing the previous nine parts of this series made me think harder about what is happening in my own head. I can only speculate about what it's like inside other human heads: what that inner voice is telling them, and how much awareness they have of it.
This brings up the "stochastic parrots" debate. The term, coined by Emily Bender and colleagues, frames LLMs as systems that statistically mimic text without real understanding—pattern matching without meaning.
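To make "statistically mimicking text" concrete, here's a toy bigram model - a drastic simplification, nothing like how real LLMs work, but it shows the bare idea of generating text purely from statistics of what word tends to follow what. The tiny slang corpus here is invented for illustration:

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """Record, for each word, every word that followed it in the text."""
    follows = defaultdict(list)
    words = text.split()
    for a, b in zip(words, words[1:]):
        follows[a].append(b)
    return follows

def parrot(follows, start, length=10, seed=0):
    """Continue from `start` by repeatedly sampling a recorded follower.

    No meaning, no intent - just sampling from observed patterns.
    """
    random.seed(seed)
    word, out = start, [start]
    for _ in range(length):
        choices = follows.get(word)
        if not choices:
            break  # dead end: this word never had a follower
        word = random.choice(choices)
        out.append(word)
    return " ".join(out)

# Hypothetical training text, purely for illustration
corpus = "no cap that story was bussin no cap it was giving main character energy"
model = train_bigrams(corpus)
print(parrot(model, "no"))
```

The output can look superficially fluent while the program understands nothing - which is exactly the charge the "stochastic parrot" framing levels at far more capable systems.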
How often is that - being a stochastic parrot - what I'm actually doing? Am I, a human, fundamentally different from LLMs, or am I just running similar pattern-matching processes unconsciously much of the time?
Many people find the possibility that we're just "stochastic parrots" insulting. That we might be biological stochastic parrots - faster and better calibrated most of the time - challenges our sense of what makes us uniquely human. I find it oddly calming: it takes some pressure off to be original and unique. If this turns out to be the case, I'm OK with it.
But I understand why it's threatening. If pattern matching is all there is, for LLMs and for humans, then what separates us? What makes humans special? These aren't just technical questions about cognition. They're existential questions about human uniqueness.
It also reminds me of Richard Dawkins' The Selfish Gene, which argues that we're vehicles for replicating genes, not the autonomous agents we feel ourselves to be. For me, that hypothesis doesn't diminish the value of my human experience. It just shifts the lens.
Could the same shift apply here? If OLMo's visible thinking traces reveal that language generation involves checklists and optimization, whether conscious or unconscious, that doesn't make human connection less meaningful. It just makes the process more observable.
I don't have the answer. But the Bart Test interests me because it gave me a chance to see what language generation looks like when it's slowed down enough to observe. In many ways, what I enjoyed most about this experiment is that it led me to re-examine what it means to be human.
Part 10 of 10 in the Bart Test series.
bart-test - View on GitHub
All Parts
- Part 1 - The OLMo discovery
- Part 2 - Constraint experiments
- Part 3 - Teen insights
- Part 4 - First wall
- Part 5 - The pivot
- Part 6 - Test design philosophy
- Part 7 - Social cost and value questions
- Part 8 - Reflection
- Part 9 - The question I couldn't answer
- Part 10 - The deeper question
This post is part of my AI journey blog at Mosaic Mesh AI. Building in public, learning in public, sharing the messy middle of AI development.
