Wondering about services to test on either a 16 GB RAM “AI Capable” arm64 board or a laptop with a modern RTX GPU. Only looking for open-source options, but curious to hear what people say. Cheers!
That koboldcpp is pretty interesting. Looks like I can load a draft model for speculative decoding, as well as a pile of other things.
What local models have you been using for coding? I’ve been disappointed with things like deepseek-coder and qwen-coder; they’re not even a patch on Claude, but that damn Anthropic cost has been killing me.
As much as I’d like to praise the open-weight models, nothing comes close to Claude Sonnet in my experience either. I use local models when the info is sensitive and Claude when the problem requires being somewhat competent.
What setup do you use for coding? I might have a tip for minimizing your Claude cost, depending on what your setup is.