• Schadrach@lemmy.sdf.org · 17 hours ago

      In parallel to what Hawk wrote, AI image generation is similar. The idea is that through training you essentially produce an equation (really a bunch of weighted nodes, but functionally they boil down to a complicated equation) that can recognize a thing (say, dogs) and measure the likelihood that any given image contains dogs.

      If you run this equation backwards, it can take any image and show you how to make it look more like a dog. Do this for other categories of things too. Now when you ask for a dog lying in front of a doghouse chewing on a bone, it generates some white noise (think “snow” on an old TV) and asks the math to make it look maximally like a dog, doghouse, bone, and chewing all at once, possibly repeating until another pass doesn’t make the result much more dog, doghouse, bone, or chewing. That’s your generated image.
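The loop described above, starting from noise and repeatedly nudging the image uphill on a score, can be sketched in miniature. Here the “dog-likeness” score is a toy stand-in (negative squared distance to a fixed random template), not a trained network; the image size, step size, and iteration count are arbitrary choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-in for what a trained model "thinks" a dog looks like.
template = rng.random((8, 8))

def dog_score(img):
    # Higher score = "more dog". A real score comes from a trained network.
    return -np.sum((img - template) ** 2)

def dog_score_grad(img):
    # Direction in pixel space that increases the score fastest.
    return -2.0 * (img - template)

img = rng.random((8, 8))  # white noise starting point
start = img.copy()
for _ in range(200):      # repeat until it doesn't get much more "dog"
    img += 0.1 * dog_score_grad(img)
```

After the loop, the image scores far higher on “dog-likeness” than the noise it started from, which is the whole trick.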

      The reason they have trouble with things like hands is that we have pictures of all kinds of hands, at all kinds of scales, in all kinds of positions, and the model doesn’t have actual hands to compare against, just thousands upon thousands of pictures that say they contain hands, from which it has to figure out what a hand even is by statistical analysis.

      LLMs do something similar, but with words. They have a huge number of examples of writing, many of them tagged with descriptors, and are essentially piecing together an equation for what language looks like from statistical analysis of those examples. The technique used for LLMs will never be anything more than a sufficiently advanced Chinese Room, not without serious alterations. That, however, doesn’t mean it can’t be useful.
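The “statistical analysis of examples” idea can be shown at toy scale with a character bigram table: count which character follows which in some text, then generate new text by sampling from those counts. A real LLM does conceptually similar next-token prediction, but over tokens and with a learned neural network instead of a count table; the corpus below is made up for the sketch.

```python
from collections import Counter, defaultdict
import random

corpus = "the cat sat on the mat the cat ate the rat"

# Count which character follows which: the "statistics" of the examples.
counts = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    counts[a][b] += 1

def generate(start="t", n=20, seed=0):
    random.seed(seed)
    out = start
    for _ in range(n):
        followers = counts[out[-1]]
        if not followers:
            break
        chars, weights = zip(*followers.items())
        out += random.choices(chars, weights=weights)[0]
    return out
```

The output is locally plausible gibberish, which is exactly what you’d expect from pure statistics with no understanding behind it.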

      For example, one could hypothetically amass a bunch of anonymized medical imaging with confirmed diagnoses, plus a bunch of healthy imaging, and train a machine learning model to identify signs of disease and to put priority flags and notes about detected potential diseases on the images, helping expedite treatment when needed. After it has seen a few thousand times as many images as a real medical professional will see in their entire career, it would likely even be more accurate than humans.
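As a very rough sketch of that hypothetical, here is a tiny classifier trained on synthetic “image features”: the data, labels, and plain logistic-regression model are all invented for illustration and bear no resemblance to a real medical imaging pipeline.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic "image features": diseased cases shifted upward on average.
healthy = rng.normal(0.0, 1.0, size=(200, 5))
diseased = rng.normal(1.5, 1.0, size=(200, 5))
X = np.vstack([healthy, diseased])
y = np.array([0] * 200 + [1] * 200)

# Plain logistic regression trained by gradient descent.
w, b = np.zeros(5), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y) / len(y))
    b -= 0.5 * np.mean(p - y)

# Flag cases the model thinks show disease, for priority review.
flags = (1 / (1 + np.exp(-(X @ w + b)))) > 0.5
accuracy = np.mean(flags == y)
```

With labeled examples the model learns the statistical pattern separating the two groups, the same idea as the hypothetical, just at toy scale.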

    • Dadifer@lemmy.world · 2 days ago

      The neural network is hundreds of billions of nodes connected to each other by connections of different strengths, or “weights”, just like our neurons. “Open weights” means they released the weights of the connections between the nodes, the blueprint of the neural network, if you will. It is not open source because they didn’t release the material it was trained on.
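A miniature version of what “released the weights” means: the shipped artifact is just arrays of numbers (connection strengths) plus the network’s shape, not the training code or the training data. The sizes and random values below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
# The "released" artifact: nothing but numbers and shapes.
W1, b1 = rng.normal(size=(4, 3)), np.zeros(3)  # each entry of W1 is one connection's weight
W2, b2 = rng.normal(size=(3, 2)), np.zeros(2)

def forward(x):
    h = np.maximum(0, x @ W1 + b1)  # weighted sums, then a simple activation
    return h @ W2 + b2

out = forward(np.ones(4))
```

Anyone with these arrays can run the network, but nothing in them tells you how they were produced.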

      • Lemminary@lemmy.world · 1 day ago

        It is not open source because they didn’t release the material that it was trained on.

        I’m not sure if I’m missing a definition here but open source usually means that anyone can use the source code under some or no conditions.

        • Johanno@feddit.org · 22 hours ago

          Open source means by definition that the code is open, the usage is open, and anybody can use it.

          This includes in theory the training material for the model.

          But in common language open source means: I can download it and it runs on my machine. Ignoring legal shit.

        • Dadifer@lemmy.world · 1 day ago

          You can’t use the source code, just the neural network the source code generated.

        • Spaceballstheusername@lemmy.world · 1 day ago

          I’m pretty sure open source means that the source code is open to see. I’m pretty sure there are open source things that you need to pay to use.

    • Hawk@lemmynsfw.com · 1 day ago

      An LLM is, fundamentally, an equation. Map words to numbers, run them through the equation, and map the numbers back to words: now you have an LLM. If you’re curious, write a name generator using torch with an RNN (plenty of tutorials online) and you’ll get a good idea of how it works.
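A minimal sketch of “map a word to a number, equation, map back to words”: the vocabulary and the single random matrix below are assumptions for illustration; a real LLM uses learned token embeddings and billions of parameters, not a hand-written table and one matrix.

```python
import numpy as np

vocab = ["hello", "world", "foo", "bar"]          # made-up vocabulary
word_to_id = {w: i for i, w in enumerate(vocab)}

rng = np.random.default_rng(0)
W = rng.normal(size=(len(vocab), len(vocab)))     # the "equation": one random matrix

def next_word(word):
    x = np.zeros(len(vocab))
    x[word_to_id[word]] = 1.0                     # word -> number (one-hot vector)
    scores = x @ W                                # apply the equation
    return vocab[int(np.argmax(scores))]          # numbers -> word
```

Training is the process of adjusting the entries of `W` (the weights) until the words that come out are statistically plausible.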

      The parameters of the equation are referred to as weights. They release the weights but may not have released:

      • source code for training
      • the source code for inference / validation
      • training data
      • cleaning scripts
      • logs, git history, development notes etc.

      Open source is typically more concerned with the open nature of the code base, to foster community engagement, and less with the price of the resulting software.

      Curiously, open-weight LLM development has somewhat flipped this on its head: the resulting software is freely accessible and distributed, but the source code and training material are less accessible.