I think they all would have performed significantly better with a degree of context.
Trying to use a large language model like a database is simply a misapplication of the technology.
The real question is: if you gave a human an entire library of history, could they identify the relevant paragraphs from memory alone, given only a semantically related prompt? Probably not. You'd expect them to go find the relevant passages first and then reason over them, and that's how we need to be using these models too.
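To make that concrete, here's a minimal sketch of the retrieve-then-ask pattern: instead of expecting a model to recall facts from its weights, you search a corpus for relevant passages and hand those to the model as context. The corpus, the bag-of-words vectors, and the cosine scoring here are toy stand-ins for real embeddings and a vector store.

```python
# Sketch of retrieval-based context (assumptions: the toy corpus and
# bag-of-words similarity stand in for real embeddings / a vector store).
import re
import math
from collections import Counter

def vectorize(text):
    # Crude token-count vector; a real system would use learned embeddings.
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=1):
    # Rank passages by similarity to the query and return the top k;
    # these would then be pasted into the model's prompt as context.
    qv = vectorize(query)
    ranked = sorted(corpus, key=lambda p: cosine(qv, vectorize(p)), reverse=True)
    return ranked[:k]

corpus = [
    "The treaty was signed in 1648, ending the Thirty Years' War.",
    "Steam engines transformed manufacturing in the eighteenth century.",
    "The printing press spread literacy across Europe.",
]
print(retrieve("what ended the Thirty Years' War", corpus))
```

The point isn't the scoring function; it's the shape of the pipeline: look up first, then let the model work with what you found rather than what it half-remembers.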
Unfortunately, companies like OpenAI really want this to be the next Google, because there's so much money to be made by selling it as a product to businesses that don't care to roll their own more efficient solutions.
I don't think it would have made much of a difference, because even the state-of-the-art models still aren't a database.
Maybe more recent models could store more information in a smaller number of parameters, but it’s probably going to come down to the size of the model.
The only exception is if there really is some learnable pattern in modern history that the model could pick up, but I doubt that.
What this article really brings to light is that people tend to use these models for things they're not good at, because the technology is marketed as something it isn't.