IP law and AI & LLM. Mostly governed by copyright law.

@mistersql

---
(me: Who owns a sentient document?)
AI generated code... is it patentable? Is it copyrightable?

Patents on executing an algorithm are getting harder to defend.
---
Software code is a "Literary Work" for purposes of copyright.
---
Google v. Oracle (on APIs): expression is covered by copyright, but ideas are not.
---

Licensing might matter as much as copyright law
- proprietary, public domain, copyleft, or permissive open source
- Google's PageRank was patented, but the patent has expired (so it is now public domain)
Licenses are built on top of copyright law and are only enforceable because copyright law exists.

Copyleft
- weak - ... (undefined)...
- strong (GNU) - derivatives must use the same license (or a more restrictive one). Enforced by SFC against proprietary commercial companies. SFC has standing because they are the intended beneficiary. (i.e. companies violate GNU licenses because, hey, what is that nerd going to do?)

Authorship
- e.g. who is the author when a prompt creates a cool app?
- Author is AI! AI can't be a copyright holder. Monkey-selfie case law governs here (no kidding!). Only humans can be authors.

me: I'm still listing ChatGPT in AUTHORS.md

Training data is a black box and could be copyrighted content or under any license you can think of. There could be direct infringement (accidentally encoding a copy into the LLM's weights). Being fought in court right now.

Is the output a copyright infringement? Already decided that unrelated text is not infringement, but identical regurgitation is? Could be?

Is it fair use?
- Transformation. 2 court cases say that an LLM is transformative (the Google book scan case applies here). Google v. Oracle: Google reimplemented the API in a new environment, so it was okay as transformative fair use.
- Economic impact. Unclear that there isn't a negative impact on the creator. 25+ cases about this in court.

Advice
- use code you understand
- follow same advice as for using any 3rd party code (similar concerns)
- small pieces are safer (i.e. even if they are sort of a copy)
- larger pieces are safe if they aren't a copy of something

Use automated tools (to find accidental verbatim copies?)
Documentation - Show what is AI created.
Get AI/LLM company to indemnify you
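
On the "automated tools" point: a rough sketch of what a verbatim-copy check might look like, assuming you have the AI-generated snippet and some reference code to compare it against. The names `shingles` and `overlap_ratio` and the 8-token window are my own illustrative choices, not something from the talk or from any real tool, and a high score is only a prompt to look closer, not a legal test.

```python
import re


def shingles(text, n=8):
    """Return the set of n-token windows in text (whitespace is ignored)."""
    # Split into identifier/number tokens and single punctuation characters.
    tokens = re.findall(r"\w+|[^\w\s]", text)
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_ratio(candidate, reference, n=8):
    """Fraction of the candidate's n-token windows that also occur in reference."""
    cand = shingles(candidate, n)
    if not cand:
        # Candidate is shorter than one window; nothing to compare.
        return 0.0
    return len(cand & shingles(reference, n)) / len(cand)


if __name__ == "__main__":
    reference = "def add(a, b):\n    return a + b\n"
    generated = "def add(a,b): return a+b"
    # Hypothetical threshold: flag anything mostly made of reference windows.
    if overlap_ratio(generated, reference) > 0.5:
        print("possible verbatim copy, review before using")
```

Comparing token windows instead of raw text means reformatting does not hide a copy, while renamed variables still lower the score, which lines up with the "small pieces are safer" advice: short snippets rarely fill even one window.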