> The OthelloGPT world-model story faced a new complication when, in mid-2024, a group of student researchers released a blog post entitled “OthelloGPT Learned a Bag Of Heuristics.” The authors were part of a training program created by DeepMind’s Neel Nanda, and their project was to follow up on Nanda’s own work and do careful experiments to look more deeply into OthelloGPT’s internal representations. The students reported that, while OthelloGPT’s internal activations do indeed encode the board state, this encoding is not a coherent, easy-to-understand model like, say, an orrery, but rather a collection of “many independent decision rules that are localized to small parts of the board.” As one example, they found a particular neuron (i.e., neural network unit) at one layer whose activation represents a quite specific rule: “If the move A4 was just played AND B4 is occupied AND C4 is occupied, then update B4 and C4 and D4 to ‘yours’ [assuming the mine, yours, or empty classification labels]”. Another neuron’s activation represents the rule “if the token for B4 does not appear before A4 in the input string, then B4 is empty.”
>
> I’ve never understood why this is a problem. This is how all thinking actually works. When I’m solving a problem, I’m not inventing a solution de novo every time I do it. I’m using heuristics. This thing X generally leads to thing Y, and therefore I need to do A, B, and C to correct for it. That’s a heuristic. It’s also how we predict the weather. Low pressure meets high pressure means rain is likely, so wear a poncho or carry an umbrella. We’d call that thinking, but it’s basically using heuristics: X -> Y, requiring solution A. When I solve math equations, it’s nothing but procedures and heuristics. PEMDAS as order of operations, the math operations being basically procedures. And when using mathematics to solve a problem, you’re basically deciding which heuristics of mathematics to use. Even political predictions are based on heuristics gleaned from history. X, Y, and Z happen during the rise of revolutionary thinking; therefore, if you see this, predict revolution.
>
> To be frank, even a world model is basically a systematically constructed bag of heuristics. The religious world view: God exists, gave us rule book X, and those who follow rule book X get rewarded. Therefore do what rule book X says. The secular world view replaces rule book X with principles derived from science and neoliberalism, but the basic building blocks are the same. These heuristic principles lead to good outcomes, thus following them is a good idea. It covers more domains than the ad hoc heuristics AI is using today, but the difference isn’t the approach, it’s the scale. A person living by the Torah or the KJV Bible is still using heuristics to figure out how to live; the question is scale, as the sacred book in question covers everything, where AI tends to be bound to whatever applicable training sets it was given.
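To make the quoted finding concrete, here is a minimal Python sketch, with invented names and a board encoding of my own rather than anything from the blog post, of the difference being described: a localized patchwork rule versus one general rule that applies to every square and every direction.

```python
# Hypothetical illustration, not the actual OthelloGPT circuitry. The board is a
# dict mapping square names like "A4" to "mine", "yours", or "empty".

def localized_rule(board, last_move):
    """One patchwork rule of the kind the blog post describes: it only ever
    fires for move A4 and only ever touches B4, C4, and D4."""
    if last_move == "A4" and board["B4"] != "empty" and board["C4"] != "empty":
        for sq in ("B4", "C4", "D4"):
            board[sq] = "yours"

def general_flip_rule(board, last_move, player):
    """A board-wide rule: from the square just played, walk each of the eight
    directions and flip any run of opponent discs capped by one of ours."""
    opponent = "yours" if player == "mine" else "mine"
    col, row = ord(last_move[0]) - ord("A"), int(last_move[1:]) - 1
    for dc, dr in [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                   (0, 1), (1, -1), (1, 0), (1, 1)]:
        run, c, r = [], col + dc, row + dr
        while 0 <= c < 8 and 0 <= r < 8:
            sq = f"{chr(ord('A') + c)}{r + 1}"
            if board[sq] == opponent:
                run.append(sq)          # keep walking over opponent discs
            elif board[sq] == player:
                for flipped in run:     # same logic for any square, any direction
                    board[flipped] = player
                break
            else:
                break                   # empty square: nothing to flip this way
            c, r = c + dc, r + dr
```

The first function only ever “knows” about A4’s little neighborhood; the second encodes the game’s mechanics once. The blog post’s claim is that OthelloGPT’s internals look much more like a large pile of the former than a single instance of the latter.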
> This is how all thinking actually works. When I’m solving a problem, I’m not inventing a solution de novo every time I do it. I’m using heuristics.

It is very much absolutely positively 100% not. You are using LONG-TERM heuristics. Your world model is of the world, generalized across your lifelong experience. The ruleset for Tai Chi and the ruleset for ballroom dance overlap with the ruleset you learned skipping rope and the ruleset you learned playing hopscotch. The quote you listed above illustrates that there are no heuristics that even apply to the whole board of Othello. The LLM didn't even learn that all squares on the board are equal. That, again, is what a world model is.

> PEMDAS as order of operations, the math operations being basically procedures.

If you train an LLM on math, it will not come up with PEMDAS. It will come up with thousands of patchwork rules covering individual numbers, because there is no methodology to Markov chains that gives you an overall picture. THIS IS THE IMPORTANT BIT. It all goes back to autocomplete on your keyboard, which all goes back to Robert Mercer, which all goes back to Renaissance Technologies, which all goes back to Markov chains, which CAN. NOT. BE. complete sets. "Informally, this may be thought of as, 'What happens next depends only on the state of affairs now.'" Your heuristics are "here's how to play chess." The LLM's heuristics are "if this was the last move, here is the list of legal next moves," times literally every possible permutation of the board. Your heuristics of "here's how to play chess" can be extrapolated to "here's how you would probably play 3D chess," "here's what the rules for 'battle chess' might be," and "here are the similarities and differences between chess and checkers." The LLM's heuristics are "I have no training data for that," three ways.

> To be frank, even a world model is basically a systematically constructed bag of heuristics.

The key phrase there is "systematically." That makes it a set of heuristics. They are interrelated, interdependent, and extensible. They are portable from situation to situation and they can be generalized. The word "bag" is used instead of "set" because there is no system. There are no interrelations, there is no interdependency, and there is no extensibility. LLMs never learned that there are only five fingers on a hand; six-fingered men had to be hand-coded out of the model. LLMs never learned about perspective; perspective had to be (painstakingly) coded into the model. There is no adaptability to LLMs. To solve their blind alleys, you have to hand-code around them: you cannot teach an LLM "do not reproduce copyrighted material," you have to give it a laundry list of the parts of the training data it cannot reproduce within a certain percentage match. It cannot go "I must not draw Mario, therefore I must not draw Luigi."

> It covers more domains than the ad hoc heuristics AI is using today, but the difference isn’t the approach, it’s the scale.

It's not the scale, it's the approach. Any creature that thinks will create generalizations. That's how T-mazes work: will the rat associate the left turn with a reward, and carry that association to the next maze? LLMs do not create generalizations; they create ad hoc frameworks to report the stochastic mean of the problem in front of them right now. Here's Manhattan as mapped by an LLM: will it navigate? No. Will it give mostly-accurate turn-by-turn directions based on the inputs and outputs of the training data? Yes. But it will NEVER generalize to "street intersections are almost always ninety degrees."
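For readers who want to see what that quoted Markov property means mechanically, here is a toy Python sketch of my own, a bigram next-move model that is deliberately far simpler than a transformer: its entire "knowledge" is a table of which move followed which in the training games, so what happens next really does depend only on the state of affairs now.

```python
import random
from collections import defaultdict

def train_bigram(games):
    """Count which move followed which across the training games."""
    counts = defaultdict(lambda: defaultdict(int))
    for game in games:                       # each game is a list of moves like ["D3", "C5", ...]
        for prev, nxt in zip(game, game[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, current_move):
    """Sample a next move given only the current move; there is no wider board picture."""
    options = counts.get(current_move)
    if not options:
        return None                          # outside the training data: nothing to say
    moves, weights = zip(*options.items())
    return random.choices(moves, weights=weights)[0]

games = [["D3", "C5", "F6"], ["D3", "C3", "D6"], ["F5", "F6", "E6"]]
model = train_bigram(games)
print(predict_next(model, "D3"))   # "C5" or "C3": whatever followed D3 in training
print(predict_next(model, "E3"))   # None: no transition was ever observed from E3
```

Whether a transformer's learned heuristics are meaningfully more general than a table like this is exactly what the thread is arguing about; the sketch only shows the baseline the comment is invoking.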