It’s actually a fairly involved process because the tokens representing 1 and 4 don’t have any mathematical correlation with the numbers 1 and 4 so you can’t math them directly to get to 5.
Apparently how they do it is by a series of approximations from big numbers to small numbers, not too dissimilar from the way a human would do it. The anthropic team published a paper about it recently, I can dig it up if you’re interested.
I’m surprised it managed to do addition correctly.
Additional is fairly trivial for a neural network to learn.
Weight 1 plus weight 2 equals output is literally the baseline model structure.
It’s actually a fairly involved process because the tokens representing 1 and 4 don’t have any mathematical correlation with the numbers 1 and 4 so you can’t math them directly to get to 5.
Apparently how they do it is by a series of approximations from big numbers to small numbers, not too dissimilar from the way a human would do it. The anthropic team published a paper about it recently, I can dig it up if you’re interested.