what can language model embeddings tell us about whale speech, and about decoding ancient texts? (on The Platonic Representation Hypothesis and the idea of *universality* in AI models)
Grateful for this read, which introduced me to the convergent representation hypothesis. I'm a neuroscience instrumentation engineer, not a computer scientist, and I've been closely following developments in compressive sensing because I think these ideas may be important to brain recording. Anyway, I'm not sure I grasp your final point, or how these ideas relate: you suggest that we can hope to decode the few samples of Linear A we have by leveraging an otherwise complete corpus of language embeddings? At some point, the limited amount of Linear A we have still makes this a very hard inversion problem. (Luckily we can continue to record the whales...)
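To make my worry concrete: as I understand it, the usual proposal is to fit an alignment map between the two embedding spaces from a handful of paired anchors. A toy sketch (orthogonal Procrustes; every name and number here is hypothetical) of why having far fewer anchors than dimensions leaves that map badly underdetermined:

```python
import numpy as np

def fit_orthogonal_map(src_anchors, tgt_anchors):
    # Orthogonal Procrustes: the rotation W minimizing ||src_anchors @ W - tgt_anchors||_F,
    # where the rows of the two matrices are assumed to be paired (same item in both spaces).
    u, _, vt = np.linalg.svd(src_anchors.T @ tgt_anchors)
    return u @ vt

rng = np.random.default_rng(0)
d, k = 512, 30                                      # k << d: far fewer anchor pairs than dimensions
true_w, _ = np.linalg.qr(rng.normal(size=(d, d)))   # the "true" map we would like to recover
src = rng.normal(size=(k, d))                       # stand-in for embeddings of the scarce corpus
tgt = src @ true_w                                  # stand-in for the well-resourced embedding space

w_hat = fit_orthogonal_map(src, tgt)
print(np.linalg.norm(src @ w_hat - tgt))               # ~0: the anchors themselves fit perfectly
probe = rng.normal(size=(1, d))                        # a held-out direction
print(np.linalg.norm(probe @ w_hat - probe @ true_w))  # typically large: unconstrained off the anchors
```

The fit looks perfect on the anchors yet says almost nothing about the rest of the space, which is exactly the inversion problem I mean.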
Great article; it gave me inspiration for exploring the embedding space in a new way!
This is a very bold claim. If I take this idea to what I think is its logical conclusion, are you suggesting that representations in neural networks converge to representations inside our brains? In other words, are you claiming that the platonic representations are completely agnostic of the substrate that generates them?
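If that is the claim, it is at least testable in principle: show the same stimuli to both systems and compare their response geometries with a similarity index such as linear CKA, which does not care how many units or neurons either side has. A minimal sketch (numpy; the data below are just random placeholders):

```python
import numpy as np

def linear_cka(reps_a, reps_b):
    # Linear CKA between two response matrices of shape (n_stimuli, n_features).
    # The two systems may have different feature counts; only the stimuli are shared.
    a = reps_a - reps_a.mean(axis=0, keepdims=True)
    b = reps_b - reps_b.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(b.T @ a, ord="fro") ** 2
    return cross / (np.linalg.norm(a.T @ a, ord="fro") * np.linalg.norm(b.T @ b, ord="fro"))

# Hypothetical usage: one layer's activations vs. neural recordings for the same 1000 stimuli.
model_acts = np.random.randn(1000, 768)    # network units
brain_recs = np.random.randn(1000, 200)    # recorded channels / voxels
print(linear_cka(model_acts, brain_recs))  # near 0 for unrelated random data
```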
Thank you. Some thoughts: first, multiple organisations are each ploughing bazillions into creating more or less the same thing, a somewhat redundant and inefficient duplication of effort, n'est-ce pas? Second, this is the platonic AI of 2025: the platonic AI of 1925 was very different (racist, misogynist, etc.), and so will the platonic AI of 2125 be, because we are using time-stamped, culturally curated data. Finally, whale song, bee dances, and ant smells will be super cool if we haven't bumped them off by 2050…
Fascinating read! Two questions that came up while reading:
(1) What if these models converge to the same but "wrong" abstraction in a domain? "Wrong" as in it doesn't reflect the actual truth of the world.
(2) How much does training data overlap contribute to true convergence? (One way to probe this is sketched below.)
Imo, the first question matters more because humans can create value if they find places where models' convergent representations veer away from reality.
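On (2), the cleanest probe I can imagine: take two models trained on disjoint corpora, embed the same evaluation items with each, and check how similarly the two spaces organise those items, for instance with a mutual nearest-neighbour overlap score. A simplified sketch (my own variant, not any paper's exact metric; names are hypothetical):

```python
import numpy as np

def mutual_knn_alignment(reps_a, reps_b, k=10):
    # Average fraction of shared k-nearest neighbours between two embedding spaces.
    # reps_a, reps_b: (n_items, dim_a) and (n_items, dim_b) embeddings of the SAME items.
    def knn_indices(reps):
        normed = reps / np.linalg.norm(reps, axis=1, keepdims=True)
        sims = normed @ normed.T               # cosine similarity between items
        np.fill_diagonal(sims, -np.inf)        # exclude each item from its own neighbour list
        return np.argsort(-sims, axis=1)[:, :k]

    nn_a, nn_b = knn_indices(reps_a), knn_indices(reps_b)
    overlaps = [len(set(a) & set(b)) / k for a, b in zip(nn_a, nn_b)]
    return float(np.mean(overlaps))

# Hypothetical experiment: if the score stays well above chance (roughly k / n_items)
# even when the two training corpora share nothing, the convergence is coming from the
# shared world rather than from shared data.
```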
1. Humans do this too; they're called optical illusions and cognitive biases.
Very nicely written, thank you very much for this blog. On a side note, I am a bit skeptical about decoding Linear A and whale language: doesn't the PRH rely on the fact that all the models are seeing the same world, whereas the worlds seen by the writers of Linear A or by whales are quite different from our own?
Isn't it the case that the world isn't different, but rather that different animals with different brains see a subset of the world? There's much more overlap of brains and of what's being seen in the case of the Ancient Greeks, of course.
But even for animals with less overlap, it's possible that they have a representation of the world not too dissimilar from our own.