Wikipedia talk:Large language models: Difference between revisions
→Unblock requests: Agree. |
::+1 [[User:Snowmanonahoe|Snowmanonahoe]] ([[User talk:Snowmanonahoe|talk]] '''·''' [[Special:Contributions/Snowmanonahoe|contribs]] '''·''' [[User:Snowmanonahoe/Typos|typos]]) 18:59, 17 June 2023 (UTC)
::I have to agree that this is a persuasive argument and completely valid on all points. - [[User:Aoidh|Aoidh]] ([[User talk:Aoidh|talk]]) 02:35, 18 June 2023 (UTC)
::On procedural grounds, one could argue that using ChatGPT in such circumstances is a ''de facto'' demonstration of lack of [[WP:COMPETENCE]]. {The poster formerly known as 87.81.23.195} [[Special:Contributions/46.65.228.117|46.65.228.117]] ([[User talk:46.65.228.117|talk]]) 04:39, 18 June 2023 (UTC)
Revision as of 04:39, 18 June 2023
This is the talk page for discussing improvements to the Large language models page.
Archives: Index, 1, 2, 3, 4, 5, 6, 7
Auto-archiving period: 14 days
This project page has been mentioned by a media organization:
"Do not use LLMs to write your talk page or edit summary comments."
Do not use LLMs to write your talk page or edit summary comments.
What was the reason for adding that? I don't understand. Schierbecker (talk) 02:18, 18 May 2023 (UTC)
- My guess is that someone was concerned that some malicious editor could weaponize LLMs to generate walls of text to WP:BLUDGEON the process. Well-intentioned rule, I think, but I think prohibiting all LLM-generated content on talk pages goes too far. Schierbecker (talk) 02:27, 18 May 2023 (UTC)
- Also a guess, but allowing either might prevent inter-peer assessment of WP:COMPETENCE. Those items used to be considerably fewer paragraphs, and it was in the one that began "You are responsible for ensuring your use of LLMs does not disrupt Wikipedia," if I remember correctly. Sandizer (talk) 02:31, 18 May 2023 (UTC)
- I'm deleting that line until someone can explain their reasoning for why LLM content in edit summaries is worse than the same in article space. Schierbecker (talk) 22:34, 18 May 2023 (UTC)
- Because an automated sealioning machine that is able to spit out endless paragraphs of "civil" POV pushing is the last thing that we need? Heaven knows we have enough problems already with debates being "decided" or "resolved" because one faction was the last one standing. LLM-generated content on talk pages is discussion poison. XOR'easter (talk) 22:57, 18 May 2023 (UTC)
- And ultimately if someone cannot competently express their view without an AI to do it... they really shouldn't be on Wikipedia at all. Der Wohltemperierte Fuchs talk 23:31, 18 May 2023 (UTC)
- I could imagine someone who doesn't write in English very well using LLM to clean up their grammar. One hardly wants to say "Sorry, subject of this article, but if you want to tell us about the error in this article, you need to spend the next couple of years studying English first, because otherwise you're incompetent to express your view."
- There's an editor I'm kind of wishing would use something like LLM right now. So far, the editor keeps asking if I've ever been uploaded to a website. Since Mind uploading isn't a thing, the answer is no, but I believe that's not really what the editor wants to know. WhatamIdoing (talk) 02:29, 14 June 2023 (UTC)
- An automated system for condensing endless paragraphs into a short summary might be helpful. WhatamIdoing (talk) 02:30, 14 June 2023 (UTC)
- @Fuzheado was experimenting on something like this. Schierbecker (talk) 03:15, 14 June 2023 (UTC)
So, the admonition forbidding using LLMs to generate talk page comments has been removed, even though the obvious consensus above was against doing so. I'm restoring it. Sandizer (talk) 15:06, 31 May 2023 (UTC)
Verifying content
Has anyone else considered using LLMs as a way to speed up verification checks? I do a lot of COI cleanup and it is very common for references to be present which don't actually support the content. It would be helpful if statements could be checked more quickly and in a more automated fashion. I just did a test using ChatGPT to ask whether this was supported by the reference and it correctly identified that it was not. I suspect that there have probably been similar attempts to create a tool to do this, but I can't think of any specifics. SmartSE (talk) 09:05, 24 May 2023 (UTC)
- This is actually a good test of how ChatGPT responds to questions that are beyond its capabilities.
- User input:
Please tell me if the following sentence is supported by the reference provided: Prior, she was a consultant to the Arlington Hospital, Virginia (1995-1996) as well as a consultant to the VNA Lancaster County, Pennsylvania (1994-1995). She was also a bioethics consultant to the Virginia Bioethics Network at the University of Virginia (1993-1994).[1]
- ChatGPT output:
I apologize, but I cannot access the content of the reference provided as my training data only goes up until September 2021. Therefore, I cannot verify whether the sentence is supported by the given reference. However, based on the sentence you provided, it states that the person in question was a bioethics consultant to the Virginia Bioethics Network at the University of Virginia from 1993 to 1994, which is not directly related to the Arlington Hospital or VNA Lancaster County.
- The first two sentences are correct: ChatGPT cannot verify sources like this. The last sentence happens upon the correct answer by chance but uses completely faulty reasoning, since there is no reason that the two entities that she consulted for would need to be directly related. (If anyone else has a better way of writing the prompt that produces different results, please share.)
- This reminds me of the story about the Texas A&M instructor who used ChatGPT to accuse an entire class of using AI to write their papers. These models will readily respond to questions that they have no way of answering and absolutely should not be relied on for this sort of thing. This is an inherent part of how they work, and they only refuse to do so if the programmers have hard-coded a "guardrail" for that specific scenario. –dlthewave ☎ 13:14, 24 May 2023 (UTC)
- @Smartse: please see [1], in particular the second of two scripts there. I let that languish for too long and should have time to turn it into a proper tool soon. Sandizer (talk) 07:57, 1 June 2023 (UTC)
- I think such a tool shouldn't just give a yes/no answer; it should present the relevant sections of both texts side by side (our article, and the source). That's what Facebook's Side AI does (see demo). It's open source so you may find inspiration there. DFlhb (talk) 11:57, 3 June 2023 (UTC)
- That's a great idea, but the Facebook system doesn't have an article parser, just a giant dataset where someone (crowdworkers?) already decided which article text segment applies to any given citation,[2] which is proving to be a very difficult and crucial problem here. Sandizer (talk) 17:30, 3 June 2023 (UTC)
References
- ^ "The Rutgers Journal of Bioethics" (PDF). 2019.
Non-compliant LLM text: remove or tag?
This draft currently recommends tagging non-compliant contributions with {{AI generated}}. It should recommend removal instead.
The tagging recommendation is incoherent with the user warn templates like {{uw-ai1}}. If LLM text were worth keeping, then why would we warn people who add it? There's no point in trying to fix non-compliant LLM text, which will either have no sources or likely-made-up sources. It's better to remove. Do LLMs write more accurate text than the deleted WP:DCGAR? I doubt it.
Let's try to keep this discussion organized: should the draft recommend removal, or tagging? Note that we're only talking about deleting raw LLM outputs added to existing articles. For deleting fully LLM-generated articles through WP:CSD, there's a current discussion elsewhere.
DFlhb (talk) 11:50, 3 June 2023 (UTC)
- Friendly ping for editors who participated in a previous discussion on this: Novem Linguae, Barkeep49, Thryduulf. DFlhb (talk) 12:49, 3 June 2023 (UTC)
- My opinion has not changed since the previous discussions - whether text is AI-generated or not is irrelevant (and impossible to determine reliably even if it was relevant). In all cases you should do with AI-generated content exactly what you would do with identical content generated by a human - i.e. ideally fix it. If you can't fix it then tag it with what needs fixing. If it can't be fixed then nominate it for deletion using the same process you would use if it was human-generated. Thryduulf (talk) 13:07, 3 June 2023 (UTC)
- Thanks, this logic is compelling. If we can't tell them apart then our response must be the same. In that case, do we need {{AI generated}}, or would the existing templates suffice? DFlhb (talk) 13:21, 3 June 2023 (UTC)
- I don't see any need for that template - it's speculation that even if correct doesn't contain anything useful that existing templates do not (and in at least most cases they also do it better). Also, even if we could reliably tell human and AI-generated content apart, our response should be identical in both cases anyway because the goal is always to produce well-written, verifiable encyclopaedic content. Thryduulf (talk) 13:36, 3 June 2023 (UTC)
- I disagree that experienced editors can't figure out what is AI-generated and what is not. According to this template's transclusion count, it is used 108 times, which is good evidence that there are at least some folks who feel confident enough to spot AI-generated content. I definitely think that the wording of this template should recommend deletion rather than fixing. AI-generated content tends to be fluent-sounding but factually incorrect, sometimes complete with fake references. It reminds me a lot of a WP:CCI in terms of the level of effort to create the content (low) versus the level of effort to clean it up (high). Because of this ratio, I consider AI-generated content to be quite pernicious. –Novem Linguae (talk) 14:06, 3 June 2023 (UTC)
- In other words, it's a guess and the actual problem is not that it is written by an AI but that it tends to be factually incorrect. Why should AI-written content that is factually incorrect be treated differently to human-written content that is factually incorrect? Why does it matter which it is? Thryduulf (talk) 14:16, 3 June 2023 (UTC)
- Because given its structure it would take 5 times as much work to try to use the material (in the context of a Wikipedia article) as it would to delete & replace it. North8000 (talk) 14:15, 3 June 2023 (UTC)
- A good analogy is badly written, unstructured and undocumented software. It takes 10 times fewer hours to nuke and replace than to reverse engineer and rebuild a herd of cats. North8000 (talk) 14:21, 3 June 2023 (UTC)
- I'm not arguing that material in that state shouldn't be deleted, I'm arguing that whether it's in that state because it was written by AI or whether it is in that state because it was written by a human is irrelevant. Thryduulf (talk) 19:02, 3 June 2023 (UTC)
- That's true in theory, but a big difference is that human-written content is usually presumed to be salvageable as long as the topic is notable while AI should be treated more like something written by a known hoaxer. –dlthewave ☎ 20:15, 3 June 2023 (UTC)
- IMO the context of how it was generated is important in trying to figure out how to deal with it. For example, if you (Thryduulf) wrote "the sky is green", it would probably be worth the time to find out what you intended, e.g. maybe in certain contexts / times. Or (knowing that there must have been some reason) take the time to see if there are instances when the sky actually is green and build upon what they wrote. If "the sky is green" was built by a random word generator or typed by a chimpanzee, it would be silly to waste my time on such an effort. North8000 (talk) 21:56, 3 June 2023 (UTC)
- @Dlthewave and @North8000 These require you to know whether text was generated by a human or by an LLM. There is no reliable way to do this, so what you are doing is considering the totality of the text and making a judgement about whether it is worth your (or someone else's) effort to spend time on it. The process and outcome are the same regardless of whether the content is human-written or machine-written, so it's irrelevant which it is. Thryduulf (talk) 22:02, 3 June 2023 (UTC)
- We're really talking about two different things. I was answering: "presuming that one knew, should it be treated differently?" You are asserting one premise that it is impossible to know, and then based on that saying that if that premise is true, then the question that I answered is moot. Sincerely, North8000 (talk) 01:29, 4 June 2023 (UTC)
- I was explaining why the question is irrelevant - it isn't possible to know, so there is no point presuming. However, even if we were somehow able to know, there is no reason to treat it differently because what matters is the quality (including writing style, verifiability, etc) not who wrote it. Thryduulf (talk) 09:28, 4 June 2023 (UTC)
- Why wouldn't it be possible to know? I've seen several users make suspect contributions, those users were asked if they used an LLM, and they admitted they did; in cases where they admit it, we know for sure, we're not presuming.
- I'm not convinced that users can never tell it's an LLM. There were several cases at ANI of users successfully detecting it, including a hilariously obvious instance from AfD. What I said below seems to work: if we identify it, delete; if we can't identify, by default we do what we normally do. DFlhb (talk) 10:53, 4 June 2023 (UTC)
- Plus whenever the discussion gets more detailed I think it will almost inevitably come out. For example, let's say that there is a phrase in there that makes no sense and you ask the person who put it in "what did you mean by that?" Are they going to make up lies that they wrote something stupid in order to cover for the bot? Or blame the bot? .North8000 (talk) 13:43, 5 June 2023 (UTC)
- Note that every transclusion of that template is on drafts, not articles, and those drafts are just tagged so no one wastes time working on them (because presumably MfD/CSD would fail).
- I've just reviewed those drafts. They're all very blatantly promotional, and have all the hallmarks of LLM text: stilted writing, "In conclusion...". There's no identification problem there, and indeed we should delete that stuff, pointless to fix.
- When it's easy to identify, we should delete, since it's basically spam. When it's not easy to identify, people won't come to this draft/policy for advice anyway, and they'll just do what they normally do. So I guess it's fine for this draft to recommend deletion for "identif[ied] LLM-originated content that does not to comply with our core content policies". DFlhb (talk) 15:31, 3 June 2023 (UTC)
- My thought is that although AI-generated content should generally be removed without question, it's not always black-and-white enough to delete on sight. Just as we have Template:Copyright violation and the more extreme Template:Copyvio, there are cases where it's unclear whether or to what extent AI was used. Maybe the editor wants to wait for a second opinion, circle back to deal with it later or keep the article tagged while it's going through AfD/CSD. –dlthewave ☎ 18:22, 3 June 2023 (UTC)
- Good point. In essence, saying that LLM content should be removed, but that there should be normal careful processes when there is a question, which is/will be often. North8000 (talk) 15:07, 6 June 2023 (UTC)
Great work
Just wanted to drop a note here and say thanks to everyone who's been working on this - back when I looked at it in March/April I honestly found it hard to work out what it was saying, but now with the upfront 'basic guidance', it seems much clearer, and much more likely that people will be able to read it and take it on board. Look forward to seeing this in an RFC soon!
A couple of thoughts -
- For the nutshell, points #1 and #8 from the basic guidance (do not generate article content, do not generate talkpage comments) could reasonably be added to the nutshell in some way - I think these are the really key points to get across to a user who's casually wondering, hey, can I do this, more so than the point about disruptiveness.
- Disclosure - perhaps an example of an edit summary would be helpful here - it's a bit odd to say "must disclose" without saying how. (Something like "Section copyedited with ChatGPT"?).
Andrew Gray (talk) 22:56, 5 June 2023 (UTC)
Interweaving of things that should be policy and things that should be guidelines
This page for the most part would fall into being an editing guideline, maybe a content guideline. But these bits should probably be in a policy section(s):
- Required disclosure in edit summary
- Do not use to write talk page comments or edit summaries
- Perhaps the bit that articles written solely by an LLM with no useful content are candidates for deletion, but first we need to clarify what "candidate for deletion" means here.
Snowmanonahoe (talk · contribs · typos) 00:23, 13 June 2023 (UTC)
- I think you'll find that the difference between policies, guidelines and essays is more subtle and obscure than that.
- (Fun fact: The policy/guideline page that uses words like must and do not the most is a guideline.) WhatamIdoing (talk) 02:22, 14 June 2023 (UTC)
- That essay helpfully lists a bunch of things that are not the answer to the question in the title, without answering it. The interpretation of the difference I have reached, while probably imperfect, is "while both policies and guidelines are subject to common sense, exceptions to policies are far less likely to be necessary". There are many potential reasons why you may not want to follow the exact prescribed article structure of Wikipedia:Manual of Style/Video games. On the other hand, there are very few potential reasons you would not want to be civil. I think that applies here. We should make it clear that disclosure of LLM use in an edit summary, and using it only for content and not discussion, are both bright-line, absolute requirements. Snowmanonahoe (talk · contribs · typos) 22:25, 14 June 2023 (UTC)
- I'm concerned that both of those are unenforceable in practice.
- On edit summaries:
- How would you know if I were posting LLM-generated content? How could you prove it if I denied it? How would a new editor discover that this rule exists, so that they could choose to comply with it?
- What would you do with the information in the edit summary? How do you imagine people using that information in actual practice? What if I decided to engage in Malicious compliance and used a script to include "This edit may or may not have used LLM" in every edit summary? Or what if I claimed all my edits involved LLMs, since you propose no rule against lying about that? What if I used an LLM but then heavily modified the output, so it's mostly my own work? Should I claim the untruth and pretend it's all LLM-generated content, with ensuing copyright complications? Explain in detail?
- On talk pages:
- All of the above about how you could know, but this injunction seems to be driven by a fear of verbosity. Have you heard what Pascal wrote, "Je n’ai fait celle-ci plus longue que parce que je n’ai pas eu le loisir de la faire plus courte" ("I have made this longer than usual because I have not had time to make it shorter")? What if people used LLMs to make their points more clearly and concisely? I believe that a look through my own contributions would prove to any impartial observer that verbosity is not the exclusive domain of LLMs. What are you trying to achieve? WhatamIdoing (talk) 03:03, 15 June 2023 (UTC)
"Editors should have enough familiarity with the subject matter to recognize when an LLM is providing false information"
This is essentially an impossible bar. Already in baby versions of LLM, the vast majority of Wikipedia editors would not be able to recognize false information from an LLM in any subject,[a] and it might even be challenging for all but post-docs to spot the false information in the topic of their Ph.D. thesis. As the software gets better, even Ph.D.s may have to comb through sources to be certain about false information in the topic of their specialization. This sentence appears in § Specific competence is required, and I'm not sure what to do about it, but imho as written, it excludes all editors from using LLM in any subject, which maybe is okay, but in that case it should be stated categorically. Mathglot (talk) 14:11, 13 June 2023 (UTC)
- The idea is just that there is an increased likelihood of detection of false information when an editor is familiar with the subject matter. The intended meaning is not that "enough familiarity" guarantees that one will be able to recognize etc. It should be reworded.—Alalch E. 14:27, 13 June 2023 (UTC)
- To ensure verifiability, editors would need to comb through sources to confirm that every fact is supported by the cited source which would uncover any false information in the process. This is more dependent on Wikipedia experience than subject matter knowledge, since even a PhD likely wouldn't be able to tell you whether a specific fact appears in a specific source. –dlthewave ☎ 15:38, 13 June 2023 (UTC)
- I've removed the paragraph. But what about this alternative (didn't put much effort in it, but my thinking goes something like this):
If editors use an LLM to paraphrase source material or existing article content, they should have some familiarity with the topic to be able to identify whether the meaning has changed along with the wording. Noticing subtle (but possibly quite significant) unintended changes to sourced content could be beyond the ability of an editor, even an experienced one, with no deeper understanding of the topic, despite their best efforts to recheck the claims against the sources and see if any deviations have appeared.
—Alalch E. 15:59, 13 June 2023 (UTC)
- The required familiarity includes knowledge gained by reading the sources in the added text. isaacl (talk) 16:48, 13 June 2023 (UTC)
- Citations are important to editors, but verifiability isn't contingent on them. The verifiability of statements doesn't depend on whether the material is cited, or if it is, whether the cited source is a good one. Consider:
- Smoking cigarettes increases the risk of lung cancer.
- Smoking cigarettes increases the risk of lung cancer.<fake ref>
- Smoking cigarettes increases the risk of lung cancer.<weak ref>
- Smoking cigarettes increases the risk of lung cancer.<high-quality ref>
- The claim is verifiable every single time, because I am able to verify that this claim appears in at least one reliable source (e.g., by spending a minute with a web search engine).
- When you cite a claim, you're making it easier for other editors to figure out that the claim is able to be verified, but it is still verifiable whether you make it easy for them or not. (Making it relatively easy is often required by policy.) When another editor determines that the material in the Wikipedia article matches the material in the cited source, that makes it "verified", not "verifiable". WhatamIdoing (talk) 02:10, 14 June 2023 (UTC)
- When editors add text without the assistance of a program, they must understand the topic well enough to know that what they are writing is accurate and to be able to provide appropriate citations. The same remains true when a program is used. isaacl (talk) 16:48, 13 June 2023 (UTC)
- I agree, but there are layers of complexity here. I can tell that some things are unverifiable at a glance: HIV really has been scientifically proven to cause AIDS, measles vaccines really do not cause autism, horse de-wormer really has not been scientifically proven to help COVID-19 patients (unless they have intestinal parasites as well, I guess), etc., so claims to the contrary are not verifiable. I don't have to do a detailed review of sources to figure out whether smoking cigarettes increases the risk of lung cancer.
- There are other things that I don't know off hand, but that I would expect to be able to find a source for, and there are claims that could be true but different sources have different information. Last I checked, Cancer gave two different numbers for the percentage of cancer deaths caused by tobacco. They can't both be right, but they are both verifiable, they are both cited, and they are both directly supported by high-quality reliable sources.
- I think it's helpful to know the subject area, but it's also helpful to understand your own limits. WhatamIdoing (talk) 02:20, 14 June 2023 (UTC)
- The point is there isn't a special exception because you used a program to help you write the text. For any submission you make, you're responsible for ensuring the content is verifiable. If you include a citation to a source, you have to read it and understand it sufficiently to know that your content is adequately backed up by the source. isaacl (talk) 02:30, 14 June 2023 (UTC)
- I would agree with this. Mathglot (talk) 02:08, 15 June 2023 (UTC)
- Yes, I agree with this, too. In Germany, many stores post signs that say "Eltern haften für ihre Kinder" ("Parents are liable for their children"). I think the fundamental rule of a wiki is "Editors are liable for their edits". Whatever method you use to make the edit, you have to stand behind it. There is no get-out-of-responsibility-free card for any tool – script, bot, LLM, or anything else. WhatamIdoing (talk) 03:08, 15 June 2023 (UTC)
"There isn't a special exception because you used a program to help you write the text"
I think this is the best way to get the point across. Maybe I'm reading too far into it but the "familiarity with the subject matter" idea seems like it might lead to editors claiming that otherwise-competent editors can't use LLMs because they don't meet some arbitrary knowledge threshold or, conversely, that subject matter experts have free rein to do so. The lead already includes "As with all their edits, an editor is fully responsible for their LLM-assisted edits" which is in keeping with our current standards that folks are already familiar with and encompasses the necessary level of familiarity without coming out and saying it outright. –dlthewave ☎ 03:58, 15 June 2023 (UTC)
As noted above, the paragraph in question has been removed by Alalch E.. Looking at the nutshell at the top, it appears to me to retain a trace of the removed text in summary form; unless I'm reading it wrong or it was intended to summarize some other part of the page, a portion of the nutshell should be removed as well, or reworded to more clearly represent what it's trying to convey. Mathglot (talk) 02:04, 15 June 2023 (UTC)
- There was indeed a trace of that sort in the lead, and I've removed it, but I'm not seeing it in the nutshell. That part of the nutshell has always referred only to the first paragraph of "Specific competence is required".—Alalch E. 15:55, 15 June 2023 (UTC)
notes
- ^ Anticipating huffy responses on the order of, "Nonsense; I've found dozens of examples, it's easy!" I would just say, "Sure, me too", but also that anyone editing this page is a very highly self-selected group, and a tiny, tiny minority of the 116,969 active users. And also that LLM is in its infancy, and our hubris will soon be challenged by future versions. Remember 1997.
Unblock requests
CAT:RFU (unblock request) patrollers such as myself are more frequently seeing unblock requests generated by large language models. These are uniformly terrible, basically content-free and failing to address the reason for the block. The basic guidance described on the draft would already help (assuming people followed it... ahem), but it might be helpful to expand on point 8 ("Do not use LLMs to write your talk page or edit summary comments."), perhaps to "Do not use LLMs to write your talk page or edit summary comments or make unblock requests." This is a polite suggestion and I will not be at all put out if you think this isn't a great idea. --Yamla (talk) 19:00, 16 June 2023 (UTC)
- I've seen a fair share of LLM-generated unblock requests as well, and I think making the unblock part explicit is a good idea so that there is no confusion on that point when people read the page. That way it can be pointed to, making clear that it's not merely the admin's opinion that the unblock request shouldn't be generated in that way, but the community consensus that you should not use LLMs to generate an unblock request. - Aoidh (talk) 19:06, 16 June 2023 (UTC)
- If the requests are poor quality, it doesn't matter whether or not a tool was used to help with their creation. I don't think it's a good idea to try to list all the ways poor quality posts can be used: the list is neverending, and it may give the impression that the list is all-inclusive. isaacl (talk) 20:19, 16 June 2023 (UTC)
- While it doesn't need to be an exhaustive list and it's not an attempt to include all scenarios, it should include the more common scenarios. Using LLMs to write unblock requests is becoming very common. In terms of non-article LLM usage it is by far the most common usage I've encountered. - Aoidh (talk) 20:50, 16 June 2023 (UTC)
- People have always been writing content-free appeals. We should focus on rejecting them quickly, regardless of how they were written. We shouldn't have to create meta-instructions saying please follow the instructions for X, Y, and Z. We already have guidance on how to write appropriate appeals, talk page comments, edit summaries, and so forth. isaacl (talk) 20:58, 16 June 2023 (UTC)
- As these requests are made on Talk pages this seems to already fall under the current language in this draft: "Do not use LLMs to write your talk page or edit summary comments."
- With that said, I think it might be helpful to more explicitly state that the guidance is about communicating with other editors. For example, we could write: "Do not use LLMs to communicate with other editors e.g., write edit summaries, make requests or suggestions on Talk pages." ElKevbo (talk) 21:17, 16 June 2023 (UTC)
- People who make LLM-generated unblock requests won't read this page anyway. This might merit inclusion on Wikipedia:Guide to appealing blocks though. Snowmanonahoe (talk · contribs · typos) 22:10, 16 June 2023 (UTC)
- WP:BEANS applies. Schierbecker (talk) 00:10, 17 June 2023 (UTC)
- It's only fair to let editors know they need to acknowledge why they were blocked and commit to change in their own words, so they don't just keep plugging new prompts into ChatGPT until it comes up with something convincing.
- But there are also bigger-picture considerations: If they're using an LLM to write unblock requests, there's a high likelihood that they're also using it to edit articles. This would be a good opportunity to inform them of our policy and also check their edits for factual accuracy. –dlthewave ☎ 02:59, 17 June 2023 (UTC)
- Hate to sound like a dick, but I want the kind of people who would use LLMs for unblock requests to actually do so, and to get rejected for it. It's a great way to detect if a person has poor judgment and shouldn't be trusted to edit articles. Aoidh's argument is better, but I think it's better to keep this unwritten; the current wording is general enough to give legitimacy to these admin actions, without needing to spell it out. WP:BEANS applies, because if we spell it out, those people will still use LLMs, and just paraphrase it to make it less recognizable. We would lose a valuable way to detect WP:CIR and nip it in the bud. Unwritten rules are rarely appropriate, but I just can't imagine any valuable editors doing this, so we might as well preserve this "tell". DFlhb (talk) 12:07, 17 June 2023 (UTC)
- I have to say, this is a pretty compelling argument. Also, it made me laugh. Thanks, DFlhb. :) --Yamla (talk) 18:12, 17 June 2023 (UTC)
- +1 Snowmanonahoe (talk · contribs · typos) 18:59, 17 June 2023 (UTC)
- I have to agree that this is a persuasive argument and completely valid on all points. - Aoidh (talk) 02:35, 18 June 2023 (UTC)
- On procedural grounds, one could argue that using ChatGPT in such circumstances is a de facto demonstration of lack of WP:COMPETENCE. {The poster formerly known as 87.81.23.195} 46.65.228.117 (talk) 04:39, 18 June 2023 (UTC)