

The crawlers for LLM are not themselves LLMs.
Your examples where an LLM is defending a position you chose for it while producing obviously conflicting arguments actually prove what the others have been telling you. This is meaningless slop. It clearly has no connection to any position an LLM might have appeared to have on a subject. If it did, you would not be able to make it defend the opposite side without objections.
The article kind of fumbles the wording and creates confusion. There are, however, some passages that indicate to me that the actual data was recovered. All of the following are talking about the NAND flash memory.
The engineers quickly found that all the data was there despite Tesla’s previous claims.
…
Now, the plaintiffs had access to everything.
…
Moore was astonished by all the data found through cloning the Autopilot ECU:
“For an engineer like me, the data out of those computers was a treasure‑trove of how this crash happened.”
…
On top of all the data being so much more helpful, Moore found unallocated space and metadata for ‘snapshot_collision_airbag‑deployment.tar’, including its SHA‑1 checksum and the exact server path.
It seems that maybe the .tar file itself was not recovered, but all the data about the crash was still there.
Forensic analysis managed to retrieve this data, so it must have been stored in non-volatile memory.
I see a few top-level comments agreeing with the sentiment that users are being entitled or abusive, but what are they actually referring to? The linked image certainly has no evidence of such behavior. Someone who claims to be the developer filed a deletion request for the duckstation-git package on the AUR, and they say:
Every time, it turns into abuse towards me, as you can also see in the comments for the package.
I read through a few pages of the comments here and they’re mostly people talking about fixing issues with the package, and what to do about the dev purposely breaking the build… I only found a single message that could be called abuse:
@eugene, not really but i suspect it’s an uphill battle, check the commit message: https://github.com/stenzek/duckstation/commit/30df16cc767297c544e1311a3de4d10da30fe00c
FWIW, I’m moving to pcsx-redux, I rather run a little bit less advanced PSX emulator than software by this upstream asshat. Regardless, much thanks for maintaining the AUR package so far.
And even this is not a good example of what stenzek is describing. For one, it’s obviously a reaction to stenzek’s hostile changes and not the sort of user coming for support and being abusive that stenzek is talking about. The user is also explicitly moving to a different emulator and not expecting any change from duckstation.
I remember the maintainer claiming they had permission from all contributors to change the license but I can’t find a link to it now.
This makes no sense. There might be various reasons a person might want/need to be on facebook. Does that mean they waive all right to privacy in every aspect of their life forever?
No, there’s no way to automatically make something become law. A successful petition just forces the European Commission to discuss it and potentially propose legislation. Even though it’s not forcing anything to happen, there is an incentive for the commission to seriously consider it, as there is probably a political cost to officially denying a motion that has proven that it concerns a large number of people.
Sign the petition even if it’s surpassed 1mil signatures by the time you read this! The signatures will be verified after the petition is complete. This could lead to removal of any number of them. We don’t want to barely make it. Let’s go as high as possible!
“Fair use” is the exact opposite of what you’re saying here. It says that you don’t need to ask for any permission. The judge ruled that obtaining illegitimate copies was unlawful but use without the creators’ consent is perfectly fine.
Of course they’re not “three laws safe”. They’re black boxes that spit out text. We don’t have enough understanding and control over how they work to force them to comply with the three laws of robotics, and the LLMs themselves do not have the reasoning capability or the consistency to enforce them even if we prompt them to.
Many times these keys are obtained illegitimately and they end up being refunded. In other cases the key is bought from another region so the devs do get some money, but far less than they would from a regular purchase.
I’m not sure exactly how the illegitimate keys are obtained, though. Maybe in trying to not pay the publisher you end up rewarding someone who steals people’s credit cards or something.
They work the exact same way we do.
Two things being difficult to understand does not mean that they are the exact same.
NVMe drives are claiming sequential write speeds of several GBps (capital B as in byte). The article talks about 10Gbps (lowercase b as in bits), which is 1.25GBps. Even with raw storage writes, the NVMe drive might not be the bottleneck in this scenario.
And then there’s the fact that disk writes are buffered in RAM. These motherboards are not available yet so we’re talking about future PC builds. It is safe to say that many of them will be used in systems with 32GB RAM. If you’re idling/doing light activity while waiting for a download to finish you’ll have most of your RAM free and you would be able to get 25-30GB before storage speed becomes a factor.
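To put rough numbers on the two points above, here is a quick back-of-the-envelope sketch. The 10Gbps link speed is from the article; the 3 GB/s NVMe write speed and the 28 GB of free RAM are my own assumptions, picked only to illustrate "several GBps" and "25-30GB". It also takes the worst case where nothing gets flushed to disk while the buffer fills.

```python
# Back-of-the-envelope check of the bits vs. bytes arithmetic.
# LINK_GBPS comes from the article; the ASSUMED_* values are illustrative guesses.
LINK_GBPS = 10.0                   # link speed in gigabits per second
BITS_PER_BYTE = 8
ASSUMED_NVME_WRITE_GB_PER_S = 3.0  # assumption: a drive claiming "several GBps"
ASSUMED_FREE_RAM_GB = 28.0         # assumption: mostly idle 32GB system


def link_gigabytes_per_second(gigabits_per_second: float) -> float:
    """Convert a link speed from gigabits/s to gigabytes/s."""
    return gigabits_per_second / BITS_PER_BYTE


def seconds_to_fill(buffer_gb: float, rate_gb_per_s: float) -> float:
    """Time to fill a buffer at a given rate, assuming nothing is flushed meanwhile."""
    return buffer_gb / rate_gb_per_s


link_rate = link_gigabytes_per_second(LINK_GBPS)
nvme_is_bottleneck = ASSUMED_NVME_WRITE_GB_PER_S < link_rate
fill_time = seconds_to_fill(ASSUMED_FREE_RAM_GB, link_rate)

print(f"Link rate: {link_rate} GB/s")
print(f"NVMe is the bottleneck: {nvme_is_bottleneck}")
print(f"Seconds of full-speed download the RAM buffer could absorb: {fill_time:.1f}")
```

With these assumed figures the drive is faster than the link, and even if it weren't, the buffer alone would cover roughly 20 seconds of downloading at full link speed.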
From the article:
Those joining from unsupported platforms will be automatically placed in audio-only mode to protect shared content.
and
“This feature will be available on Teams desktop applications (both Windows and Mac) and Teams mobile applications (both iOS and Android).”
So this is actually worse than just blocking screen capturing. This will break video calls for some setups for no reason at all since all it takes to break this is a phone camera - one of the most common things in the world.
The only thing I’ve been claiming is that AI training is not copyright violation
What’s the point? Are you talking specifically about some model that was trained and then put on the shelf to never be used again? Cause that’s not what people are talking about when they say that AI has a copyright issue. I’m not sure if you missed the point or this is a failed “well, actually” attempt.
It can’t be both. It’s not self-driving. That’s just what they call it to oversell it. I’m assuming they had to add the “Supervised” part for legal reasons.
Learning what a character looks like is not a copyright violation
And nobody claimed it was. But you’re claiming that this knowledge cannot possibly be used to make a work that infringes on the original. This analogy about whether brains are copyright violations makes no sense and is not equivalent to your initial claim.
Just find the case law where AI training has been ruled a copyright violation.
But that’s not what I claimed is happening. It’s also not the opposite of what you claimed. You claimed that AI training is not even in the domain of copyright, which is different from something that is possibly in that domain, but is ruled to not be infringing. Also, this all started by you responding to another user saying the copyright situation “should be fixed”. As in they (and I) don’t agree that the current situation is fair. A current court ruling cannot prove that things should change. That makes no sense.
Honestly, none of your responses have actually supported your initial position. You’re constantly moving to something else that sounds vaguely similar but is neither equivalent to what you said nor a direct response to my objections.
The NYT was just one example. The Mario examples didn’t require any such techniques. Not that it matters. Whether it’s easy or hard to reproduce such an example, it is definitive proof that the information can in fact be encoded in some way inside of the model, contradicting your claim that it is not.
If it was actually storing the images it was being trained on then it would be compressing them to under 1 byte of data.
Storing a copy of the entire dataset is not a prerequisite to reproducing copyright-protected elements of someone’s work. Mario’s likeness itself is a protected work of art even if you don’t exactly reproduce any (let alone every) image that contained him in the training data. The possibility of fitting the entirety of the dataset inside a model is completely irrelevant to the discussion.
This is simply incorrect.
Yet evidence supports it, while you have presented none to support your claims.
I agree with you that the one-liner isn’t a good example, but I do prefer the “left to right” syntax shown in the article. My brain just really likes getting the information in this order: “Iterate over Collection, and for each object do Operation(object)”.
The cost of writing member functions for each class is a valid concern. I’m really interested in the concept of uniform function call syntax for this reason, though I haven’t played around with a language that has it to get a feeling of what its downsides might be.
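As a rough illustration of the reading-order difference, here is a small sketch. Python has no uniform function call syntax, so the `Chain` wrapper below is a hypothetical helper I made up to imitate it; `keep_even` and `scale` are likewise made-up free functions. The point is only that plain functions can be chained left to right without adding a member function to any class.

```python
from typing import Any, Callable, List


class Chain:
    """Hypothetical helper: wraps a value so free functions can be applied left to right."""

    def __init__(self, value: Any) -> None:
        self.value = value

    def then(self, func: Callable[..., Any], *args: Any) -> "Chain":
        # Calls func(value, *args), similar to how UFCS would treat value.func(*args).
        return Chain(func(self.value, *args))


def keep_even(numbers: List[int]) -> List[int]:
    """Free function: keep only the even numbers."""
    return [n for n in numbers if n % 2 == 0]


def scale(numbers: List[int], factor: int) -> List[int]:
    """Free function: multiply every number by factor."""
    return [n * factor for n in numbers]


numbers = [1, 2, 3, 4]

# Inside-out: you have to read from the innermost call outwards.
nested = sum(scale(keep_even(numbers), 10))

# Left to right: take numbers, keep the even ones, scale them, sum them.
chained = Chain(numbers).then(keep_even).then(scale, 10).then(sum).value

print(nested, chained)
```

Both lines compute the same thing. In a language with real UFCS the wrapper would not be needed, which is exactly the part I haven't tried in practice.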