Z4rK@lemmy.world (OP) to Technology@lemmy.world • Andrej Karpathy endorses Apple Intelligence • English • 7 months ago
He sort of invented it, so you have to think he’s commenting on the concept here, not the implementation.
I have tried a lot of medium and small models, and there is just no good replacement for the larger ones for natural text output. And the larger ones won’t run on device.
Still, fine-tuning smaller models can do wonders, so my guess would be that Apple Intelligence is really 20+ small, fine-tuned models that kick in based on which action you take.
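To make that guess concrete, here’s a rough Python sketch of what I mean by small models that kick in per action. Everything in it is made up for illustration: the action names, the `Adapter` class, the `dispatch` function, and the toy functions standing in for fine-tuned models. It says nothing about how Apple actually built it.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Adapter:
    """A hypothetical small, task-specific model behind one user action."""
    name: str
    run: Callable[[str], str]


# Toy stand-ins for fine-tuned models (assumptions, not real models).
def summarize(text: str) -> str:
    # Pretend summary: keep only the first sentence.
    return text.split(".")[0].strip() + "."


def proofread(text: str) -> str:
    # Pretend proofreading: collapse repeated whitespace.
    return " ".join(text.split())


# One specialized model per action; the real thing would have many more.
ADAPTERS: Dict[str, Adapter] = {
    "summarize": Adapter("summarizer-small", summarize),
    "proofread": Adapter("proofreader-small", proofread),
}


def dispatch(action: str, text: str) -> str:
    """Pick the specialized model for the action the user took and run it."""
    adapter = ADAPTERS.get(action)
    if adapter is None:
        raise KeyError(f"no specialized model registered for action {action!r}")
    return adapter.run(text)
```

The point is just that the user’s action selects the model, so each one only has to be good at a single narrow job.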
It goes a tad bit beyond classical conditioning… LLMs provide a much better semantic experience than any previous technology, and are great for relating input to meaningful content. Think of it as an improved search engine that gives you more relevant info / actions / tool suggestions etc. based on where and how you are using it.
Here’s a great article that gives some insight into the knowledge features embedded into a larger model: https://transformer-circuits.pub/2024/scaling-monosemanticity/