I don't believe you need to do much to claim copyright over the output of an LLM.
The input prompt is under copyright - a simple modification to the source code will grant copyright to you.
In a world where artists’ livelihood depends on their output, yeah. But that's not a force of nature. Our society is making the choice of letting them starve for not producing enough art. We need to decouple this. UBI now.
You should realize that this is happening not only in the space of images (where conglomerates aren't a thing), but also in music.
Music conglomerates have money, and their lawsuits will probably settle the issue (unless they settle out of court). Whatever is decided will apply to all copyrighted works, regardless of the medium.
I believe going against the big guys is the reason why the big ones don't yet have music generation LLMs.
Most people can't even imagine the complexity it would require to actually build a system that correctly tracks down the sources for image generation.
Not to mention that literally every single training image contributes a very small percentage to each generated image.
It's not hard when someone inputs "create in style of Studio Ghibli" to say that Studio Ghibli should get a cut. It's very different when you don't specify the source of the style.
And if you tried to identify the source material owner, the percentage of the output image that their work contributed to would be extremely - if not infinitesimally - small. You'd get minuscule payouts.
Funny thing is that building an LLM isn't as complex as you might think.
But the problem of attribution is easily understandable to any human with a modicum of intelligence.
Imagine that you have a trillion input images, with every single one having a source associated.
During training they go through lots of processes, and every single image contributes to a varying degree to a subset of 8 billion parameters. That alone would produce a dataset of 1T * 8B entries just to say how much a particular image contributed to the output...
To mimic intelligence the output is also randomized - the association is not static, and every single pixel in the output has its own lineage.
So, as you can probably imagine, calculating the actual source weights on the output would require at least 8e+21 calculations per output pixel... in double-precision floating point.
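To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It only reproduces the illustrative figures above (a trillion images, 8 billion parameters, one calculation per image-parameter pair per pixel). The output resolution and the hardware throughput are hypothetical values added for scale, and this says nothing about how a real attribution method would work.

```python
# Back-of-the-envelope cost of naive per-pixel attribution.
# All figures are illustrative, not measurements of any real model.

N_IMAGES = 10**12        # "a trillion input images"
N_PARAMS = 8 * 10**9     # "8 billion parameters"
BYTES_PER_FLOAT64 = 8    # double precision

# Hypothetical assumptions, not taken from the comment:
OUTPUT_PIXELS = 1024 * 1024   # a 1024x1024 output image
OPS_PER_SECOND = 10**18       # an exascale machine, one op per table entry


def attribution_entries(n_images: int, n_params: int) -> int:
    """Number of (image, parameter) contribution entries."""
    return n_images * n_params


entries = attribution_entries(N_IMAGES, N_PARAMS)   # 1T * 8B
table_bytes = entries * BYTES_PER_FLOAT64           # storage as float64
ops_per_image = entries * OUTPUT_PIXELS             # one pass per pixel
seconds_per_pixel = entries // OPS_PER_SECOND       # at the assumed throughput

print(f"contribution entries:   {entries:.1e}")
print(f"table size (ZB):        {table_bytes // 10**21}")
print(f"ops per output image:   {ops_per_image:.1e}")
print(f"seconds per pixel:      {seconds_per_pixel}")
```

Even with those generous hardware assumptions, a single pixel takes hours and the contribution table alone runs to tens of zettabytes.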
We know how to do it. It's just ridiculously expensive.
(The above example is for demonstrative purposes only)
Insults aside, you chose a very expensive way to solve the attribution problem. But my rebuttal is simple: we are literally commenting in a thread about an AI image generator that paid people. It didn't work, but if a company I've never heard of can try an experiment like this, I'm sure our billion-dollar AI overlords could too if they wanted to.
As an artist your license didn't ban learning from your work. Unless your content was acquired without a license at all - you absolutely gave them permission to use it in training sets.
No I didn't. It's use in a software product without my permission. That's never been allowed.
Obfuscating what's happening by calling it "learning" and pretending your model is actually just looking at pictures the same way a human does doesn't make it true.
Unfortunately you did grant that permission. Once you granted the permission for someone to hold a copy, they have the permission to process it.
I can assure you that you didn't grant a license with an exclusive list of operations that can be performed on your work of art. At best you may have had something like a "no commercial use" clause and general broad terms.
> I'm fairly certain that "style" is not something protected by Copyright
To a degree it is protected, but not by copyright. Design patents are a thing, and companies have sued each other over them (Apple vs Samsung during the "smartphone wars" comes to mind).