Does a license like this exist?

  • slazer2au@lemmy.world
    1 month ago

    A license only works as well as it can be enforced. How do you show that an LLM consumed your code as part of its training data?

    • lobut@lemmy.ca
      1 month ago

      Some authors typed the first few sentences of their book and the LLM spit out the rest.
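
      A rough sketch of how that kind of check could be scored, assuming you already have the model's continuation as a plain string (no particular model API is implied, and `longest_verbatim_run` is a made-up helper name). It only finds the longest word-for-word run shared with the original text:

```python
from difflib import SequenceMatcher


def longest_verbatim_run(model_output: str, original: str) -> str:
    """Return the longest run of consecutive words shared by both texts."""
    out_words = model_output.split()
    orig_words = original.split()
    # autojunk=False so frequent words are not silently ignored on long texts
    matcher = SequenceMatcher(None, out_words, orig_words, autojunk=False)
    match = matcher.find_longest_match(0, len(out_words), 0, len(orig_words))
    return " ".join(out_words[match.a:match.a + match.size])


# Hypothetical example strings, not real model output
original = "it was the best of times it was the worst of times"
continuation = "she said it was the worst of times indeed"
run = longest_verbatim_run(continuation, original)
print(len(run.split()), "words copied verbatim:", run)
```

      A long verbatim run is suggestive rather than conclusive: short or common phrases will match by chance, so the run length that counts as evidence is a judgment call.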

      • FaceDeer@fedia.io
        1 month ago

        That generally only happens in cases of overfitting, where the model was trained on a poorly de-duplicated dataset that contains many copies of that book (or excerpts, quotes, and so forth). AI trainers consider this a flaw, and a lot of work goes into sanitizing the training data to prevent it.
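
        To illustrate what de-duplication can look like, here is a minimal sketch using word n-gram shingles and Jaccard similarity. This is one common approach, not a description of any specific trainer's pipeline; the function names, the shingle size of 5, and the 0.8 threshold are all assumptions chosen for the example:

```python
import re


def shingles(text: str, n: int = 5) -> set:
    """Return the set of word n-grams (shingles) in text, case-insensitive."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < n:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def deduplicate(docs, threshold: float = 0.8, n: int = 5):
    """Keep each document only if it is not a near-duplicate of one already kept."""
    kept, kept_shingles = [], []
    for doc in docs:
        s = shingles(doc, n)
        if any(jaccard(s, k) >= threshold for k in kept_shingles):
            continue  # too similar to an earlier document, drop it
        kept.append(doc)
        kept_shingles.append(s)
    return kept


docs = [
    "The quick brown fox jumps over the lazy dog near the river bank today",
    "The quick brown fox jumps over the lazy dog near the river bank today!",
    "Completely different text about licensing and model training data sets here",
]
print(len(deduplicate(docs)))  # the second document is dropped as a near-duplicate
```

        This pairwise comparison is quadratic in the number of documents, so it would only be practical for small collections; large corpora would need an approximate method instead.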