Tentative LLM contribution guidelines.

Discussions about the development and maturation of the platform code (UXP).
Warning: may contain highly-technical topics.

Moderators: trava90, athenian200

athenian200
Contributing developer
Posts: 1887
Joined: 2018-10-28, 19:56
Location: Georgia

Tentative LLM contribution guidelines.

Post by athenian200 » 2026-06-02, 00:42

I'd like to propose some basic guidelines for using LLMs to contribute to UXP. These aren't official/approved yet, just what I think would make sense.

The purpose of the guidelines is to ensure that LLM use helps move UXP forwards rather than increasing reviewer workload and causing the introduction of low-quality or untested code. LLMs are tools, but the responsibility for the code generated still lies with the human submitting the PR. Do not forget that, and what follows should make sense to you.

1. Don't use LLMs to generate a patch, assume it's right, and then not check if it compiles and does what you expect.

This is the #1 cause of the worst problems of vibecoding: untested code, or worse, code that was never even compiled. If you don't have an environment set up where you can compile and locally test the code the LLM generates to ensure correctness, then you shouldn't be submitting patches. LLMs can hallucinate, but with code the easiest way to catch that is usually careful testing, because hallucinated code often will not compile or will not do what is expected.

2. Don't let the LLM generate a whole bunch of stuff at once and then submit it all as one big patch that no one can understand.

This is another thing that makes LLM-generated patches unattractive. If you use an LLM to generate an entire feature and then can't split it into reasonably-sized commits a reviewer can parse and follow the logic behind, then don't expect anyone to merge it.
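One rough way to catch an oversized patch before a reviewer has to is to total the lines changed in each commit. The sketch below is only an illustration. It parses text in the format printed by `git diff --numstat` (added count, deleted count, and path, separated by tabs, with `-` in place of the counts for binary files). The function names, the sample paths, and the 400-line limit are all assumptions made up for the example, not a UXP rule.

```python
def parse_numstat(numstat_text):
    """Parse `git diff --numstat` output into (added, deleted, path) tuples."""
    entries = []
    for line in numstat_text.splitlines():
        if not line.strip():
            continue
        added, deleted, path = line.split("\t", 2)
        # Binary files are reported as "-" for both counts; count them as 0 lines.
        added_n = 0 if added == "-" else int(added)
        deleted_n = 0 if deleted == "-" else int(deleted)
        entries.append((added_n, deleted_n, path))
    return entries


def is_too_large(numstat_text, max_changed_lines=400):
    """Return True if added + deleted lines exceed the limit.

    The default limit is a hypothetical value chosen for illustration only.
    """
    total = sum(added + deleted for added, deleted, _ in parse_numstat(numstat_text))
    return total > max_changed_lines


# Hypothetical sample input, as it might be captured from `git diff --numstat`.
sample = "120\t30\tsrc/example.cpp\n-\t-\ticon.png\n5\t0\tREADME.md\n"
print(parse_numstat(sample))
print(is_too_large(sample))
```

A line count is a crude proxy. A small commit can still be hard to follow, and a large mechanical change can be easy to review, so treat a result like this as a prompt to consider splitting the commit rather than as a pass/fail gate.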

3. Do ask the LLM to break its process down into steps or milestones, and then do one small thing at a time, testing each step carefully as you go, committing once that step is squared away, rather than trying to test the whole feature at once near the end of the process.

LLMs have limited context windows, and reviewers have limited mental capacity to follow a whole bunch of unrelated changes, or things happening out of order or in a non-logical progression. Keep your changes sane and create a step-by-step plan so that neither the LLM's context window nor the reviewer's working memory runs out.

4. Make sure the LLM follows the UXP coding guidelines, possibly with an AGENTS.md file, or you may be asked to clean up the output.

If your commits are not named according to our guidelines, or you let the LLM code in such a way that the generated code doesn't match our preferred coding style, then you may have to fix this before your PR is accepted.

5. Make sure the LLM doesn't introduce pointless refactors of surrounding code or do a lot of stuff unrelated to the stated purpose of your PR.

We've always been against pointless refactors. If they are caught in review, you may be asked to either manually rewrite or at least regenerate the code so that it is acceptable.

6. Study the PR and try your best to understand what it does before submitting it, looking for anything you think a reviewer might ask questions about.

You can even ask the LLM why something was done a certain way if anything jumps out at you that you think might get flagged by a reviewer. It's not unreasonable to ask the AI to justify or explain its coding decisions until you understand them and either decide to change the code, or at least understand enough to justify the changes to any potential reviewer. If something looks wrong, or you strongly get the feeling it's hallucinating, it's a good idea to ask for the opinion of an LLM from a different architecture family, or even consult a search engine to sanity-check what you're being told.

7. If you have the means, have an LLM from a different family check the work of the LLM that generated the code as a final pass before submitting for review.

LLMs from different architecture families (like Claude from Anthropic and Codex from OpenAI) are better at catching one another's mistakes than models in the same family. That means, for example, that code generated by Claude should be checked by Codex, and code generated by Codex should be checked by Claude. Failing that, it's worth at least mentioning which model and version you used to generate the code.

8. Disclose the use of LLMs in the PR, clarifying exactly what model and version were used, and for which parts of the patch.

This information could be valuable to future maintainers who need to understand how the code was generated, so it's worth recording. If possible, also save the prompts used to generate the code locally.
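To make the disclosure consistent from one PR to the next, it could be generated from a few fields. The sketch below is just one possible shape for that note. The function name, the field layout, and the placeholder model name are assumptions for illustration, and nothing here is an agreed UXP format.

```python
def format_llm_disclosure(model, version, parts, prompts_saved):
    """Build a plain-text disclosure block to paste into a PR description.

    The layout is a hypothetical example, not an official template.
    """
    lines = [
        "LLM disclosure:",
        f"- Model: {model} ({version})",
        "- Generated parts:",
    ]
    # One indented entry per part of the patch the LLM produced.
    lines.extend(f"  - {part}" for part in parts)
    lines.append(f"- Prompts saved locally: {'yes' if prompts_saved else 'no'}")
    return "\n".join(lines)


# Placeholder values; substitute the model and version actually used.
print(format_llm_disclosure("ExampleModel", "1.0", ["parser changes", "unit tests"], True))
```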

9. You are responsible for ensuring the correctness of any code you submit, and also for maintaining it, within reason.

You cannot pass this on to the LLM. If you submit it, it's your code, and you are the one who will be responsible if anything goes wrong with it. That means you are responsible for fixing it, whether manually or otherwise. You should commit to following up and being available, for a reasonable amount of time, to fix anything that goes wrong as a result of your code contribution. Complaining that you lost access to the LLM for whatever reason (token limits, expired subscription, etc.) will be treated the same as saying, "I don't feel like fixing this, I have better things to do," and will impact your reputation as a contributor accordingly. To put it simply, loss of access to your chosen LLM for any reason will not be accepted as a legitimate excuse for submitting bad code and then failing to fix it in a timely manner.
"Linux makes everything difficult." -- Lyceus Anubite
"Linux is a cancer that attaches itself in an intellectual property sense to everything it touches. That's the way that the license works." -- Steve Ballmer
"We always overestimate the change that will occur in the next two years and underestimate the change that will occur in ten." -- Bill Gates

Mæstro
Board Warrior
Posts: 1205
Joined: 2019-08-13, 00:30
Location: Casumia

Re: Tentative LLM contribution guidelines.

Post by Mæstro » 2026-06-02, 02:05

Another difficulty with LLM use in open-source programming might have to do with licensing. If an employed LLM draws on source code licensed under the GPL, the derivative work would most likely need to follow that licence, regardless of the developer’s wishes. If it draws on both GPL code and code under incompatible licences which it finds in the wild, I understand the result might be illegal. Attribution would be a complicated matter in any case, thanks to the nature of LLMs. How do you plan to address this?
Life is a fever dream Mæstro would enjoy.
All posts 100% organic. Ash is the best letter.
What is being nice online?
Debian 10 ELTS / Official PM build

athenian200
Contributing developer
Posts: 1887
Joined: 2018-10-28, 19:56
Location: Georgia

Re: Tentative LLM contribution guidelines.

Post by athenian200 » 2026-06-02, 04:09

Mæstro wrote:
2026-06-02, 02:05
Another difficulty with LLM use in open-source programming might have to do with licensing. If an employed LLM draws on source code licensed under the GPL, the derivative work would most likely need to follow that licence, regardless of the developer’s wishes. If it draws on both GPL code and code under incompatible licences which it finds in the wild, I understand the result might be illegal. Attribution would be a complicated matter in any case, thanks to the nature of LLMs. How do you plan to address this?
Well, I believe that legally an LLM trained on code is considered to have read it rather than copied it verbatim, so the code it outputs is not licensed under the original license of whatever it was trained on. If that weren't the case, then basically no project could use LLMs for coding. And a lot of big names who definitely have legal teams are in fact using them, which suggests to me this isn't the concern you're implying.

From what I've gathered, GPL obligations only arise from reproducing GPL code verbatim, not from using a tool that was trained on it. Without verbatim reproduction, there’s no derivative work and no licensing issue. And modern LLMs are specifically trained to avoid verbatim reproduction, making the only scenario where GPL obligations could apply extremely unlikely.

Moonchild
Project founder
Posts: 39492
Joined: 2011-08-28, 17:27
Location: Sweden

Re: Tentative LLM contribution guidelines.

Post by Moonchild » 2026-06-02, 08:18

Great write-up! All makes good sense reading through it and although to me it's mostly just common sense, we do have to spell these things out for generations less gifted with that power.

The licensing issue raised shouldn't be a problem. By definition, LLM output is generated from mathematical probabilities and learned patterns, not directly from source material (unless the LLM is doing something really wrong, was trained on too little data to draw inferences, or similar). In, say, prose or poetry you can claim an idea or pattern, but in code that is much less the case. For example, even if you study GPL code, find the ideas interesting, and write very similar code from scratch yourself, that code can't be claimed under the GPL.
athenian200 wrote:
2026-06-02, 00:42
9. You are responsible for ensuring the correctness of any code you submit, and also for maintaining it, within reason.

You cannot pass this on to the LLM... if you submit it, it's your code and you are the one that will be responsible if anything goes wrong with it.
This is a very important point to stress (so it should probably be near the top of the document). If someone makes a PR based on LLM-generated code, they are putting their proverbial signature on it. Just as lawyers have got into trouble for submitting LLM-generated documents containing hallucinations to courts, anyone submitting code will be responsible for that code regardless of how it was generated.
"Praise from a narcissistic person is always a poison dart. They don't share the stage, so discernment matters." - Dr. Ramani
"Seek wisdom, not knowledge. Knowledge is of the past; wisdom is of the future." -- Native American proverb
"Linux makes everything difficult." -- Lyceus Anubite

BenFenner
Keeps coming back
Posts: 952
Joined: 2015-06-01, 12:52
Location: US Southeast

Re: Tentative LLM contribution guidelines.

Post by BenFenner » 2026-06-02, 21:35

Speaking of generative AI and copyright, in the US it is not a given that the product of generative AI can be copyrighted at all. If the prompts aren't skillful enough or iterated enough and the result isn't reviewed enough and/or altered enough after the fact (i.e. you are vibe coding/vibe music creating/vibe movie directing/etc.) then no copyright will apply to the resulting product*. I think this means that, in theory, that kind of output could not be copyrighted and so could not be licensed under the MPL or other licenses. That makes spelling out how a generative AI contribution to the project will be accepted important in another way: it could help convince those who would need convincing that enough work has been done to allow the product to be copyrighted.

Of course I could be wrong, and who knows what the IP landscape in this regard will be 5-10 years from now. And of course the US isn't the be-all and end-all for copyright...



*Not saying I agree with this. If the output of a tool like a digital camera has copyright assigned to the entity that actuated the shutter (for example) then surely the output of a tool like a generative AI should enjoy the same.
Last edited by BenFenner on 2026-06-02, 22:07, edited 1 time in total.

Moonchild
Project founder
Posts: 39492
Joined: 2011-08-28, 17:27
Location: Sweden

Re: Tentative LLM contribution guidelines.

Post by Moonchild » 2026-06-02, 21:46

I understand your concern, but if it's flat out un-copyrightable then it will effectively be license-free (i.e. public domain). That's the most permissive of any license and very obviously allows it to be included under a more restrictive license if the submitter so chooses. Copyright law doesn't apply to anything that doesn't include a creative element (there have been plenty of court cases decided on this finding), which is where the "altered significantly enough after generation" comes from if someone wants to argue they have exclusive rights to something. The exclusivity part is what matters there, not whether it can be used or included in a work.

Gemmaugr
Astronaut
Posts: 692
Joined: 2025-02-03, 07:55

Re: Tentative LLM contribution guidelines.

Post by Gemmaugr » 2026-06-05, 16:37

It seems the Ladybird devs' take on this is to close public pull requests: https://ladybird.org/posts/changing-how ... -ladybird/
||OS: Win 10 | CPU: i7 10700 | GPU: GeForce RTX 3070||
"Judge a person not by their superficial identity attributes, but by the content of their character."
"Organized Identity Politics are the bane of civilized society."

Moonchild
Project founder
Posts: 39492
Joined: 2011-08-28, 17:27
Location: Sweden

Re: Tentative LLM contribution guidelines.

Post by Moonchild » 2026-06-06, 06:39

Gemmaugr wrote:
2026-06-05, 16:37
It seems the Ladybird devs' take on this is to close public pull requests: https://ladybird.org/posts/changing-how ... -ladybird/
I get where they are coming from, but closing all existing PRs in the queue without even looking at them is just disrespectful to those who did actually write code for it, IMO.
I think the problem they have is similar to what we dealt with a few times before: people dropping pretty extensive code changes, getting them merged, and then vanishing without actually "owning" the code submitted or being available for follow-ups or tweaks. That isn't an LLM issue, though. It's a people issue... but I see how LLM submissions could amplify it.

Mæstro
Board Warrior
Posts: 1205
Joined: 2019-08-13, 00:30
Location: Casumia

Re: Tentative LLM contribution guidelines.

Post by Mæstro » 2026-06-06, 16:25

Moonchild wrote:
2026-06-02, 21:46
I understand your concern, but if it's flat out un-copyrightable…
It might be relevant to note that, in the United Kingdom and Hong Kong, LLM output can be copyrighted for half a century. I am thinking now of some of my own experiences with other kinds of works, where somebody based in the USA simply states that he waives the copyright or ‘dedicates such-and-such to the public domain’. Since this is impossible in Germany, and since he did not use CC-0 or another valid licence, I was in the unhappy position of needing to treat it as if it were still under copyright.