Wikipedia:Large language models: Difference between revisions
m →Copyright violations: semantic markup
rm leftover tied to previous removal (see talk /Special:Diff/1160209195/ -- the nutshell however does not have traces of this element)
Revision as of 15:46, 15 June 2023
The following is a draft working towards a proposal for adoption as a Wikipedia policy. It does not represent consensus: it is still in development and under discussion, and has not yet reached the process of gathering consensus for adoption. References or links to this page should therefore not describe it as a policy or guideline, nor even as a proposal.
This page in a nutshell: Use of large language models (LLMs) must be rigorously scrutinized, and only editors with substantial prior experience in the intended task are trusted to use them constructively. Repeated LLM misuse is a form of disruptive editing. |
“Large language models have limited reliability, limited understanding, limited range, and hence need human supervision.”
— Michael Osborne, Professor of Machine Learning in the Dept. of Engineering Science, University of Oxford, January 25, 2023[1] |
Large language models (LLMs) are natural language processing computer programs that use artificial neural networks to generate text. Some notable ones are GPT-3, GPT-4, LaMDA (Bard), BLOOM, and LLaMA. LLMs power many applications, such as AI chatbots and AI search engines. They are used for a growing number of features in common applications, such as word processors, spreadsheets, etc. In this policy, the terms "LLM" and "LLM output" refer to all such programs and applications and their outputs.
LLM-generated content is often an outright fabrication, complete with fictitious references, which are emblematic of hoaxes. It is routinely non-verifiable, comprising the machine-generated equivalent of original research. It may also be biased, may libel living people, and may violate copyrights. Editors who are not fully aware of said risks must not edit with the assistance of these tools. LLMs must not be used for tasks with which the editor does not have substantial familiarity. Their outputs must be rigorously scrutinized for compliance with all applicable policies. As with all their edits, an editor is fully responsible for their LLM-assisted edits. Furthermore, LLM use to generate or modify text must be declared in the edit summary.
Basic guidance
- Do not publish content on Wikipedia obtained by asking LLMs to write original content or generate references. Even if such content has been heavily edited, seek alternatives that do not rely on machine-generated content.
- You may use LLMs as a writing advisor, i.e. asking for outlines, asking how to improve paragraphs, asking for criticism of text, etc. However, be aware that the information they give you can be unreliable and flat-out wrong. Use due diligence and common sense when choosing whether to incorporate the LLM's suggestions.
- You may use LLMs for copyediting, summarization, and paraphrasing, but note that they may not properly detect grammatical errors or keep key information intact. Use due diligence and heavily edit the response. You may also ask the LLM to correct deficiencies in its output, such as missing information in a summary or an unencyclopedic (e.g. promotional) tone.
- You are responsible for making sure that using an LLM will not be disruptive to Wikipedia.
- You must declare in the edit summary that an LLM was used.
- LLM-created works are not reliable sources. Unless their outputs were published by reliable outlets with rigorous oversight, they should not be cited in our articles.
- Wikipedia is not a testing ground for LLMs. The use of Wikipedia for experiments or trials is forbidden.
- Do not use LLMs to write your talk page or edit summary comments.
Risks and relevant policies
Copyright violations
- Relevant policy: Wikipedia:Copyrights
- Tip: If you want to import text that you have found elsewhere or that you have co-authored with others (including LLMs), you can only do so if it is available under terms that are compatible with the CC BY-SA license.
- Further: Wikipedia:Large language models and copyright. See also: m:Wikilegal/Copyright Analysis of ChatGPT
An LLM can generate copyright-violating material.[a] Generated text may include verbatim non-free content or be a derivative work. In addition, using LLMs to summarize copyrighted content (like news articles) may produce excessively close paraphrases. The copyright status of LLMs trained on copyrighted material is not yet fully understood. Their output may not be compatible with the CC BY-SA license and the GNU Free Documentation License used for text published on Wikipedia.
Original research and "hallucinations"
- Relevant policy: Wikipedia:No original research
- Tip: Wikipedia articles must not contain original research – i.e. facts, allegations, and ideas for which no reliable, published sources exist. This includes any analysis or synthesis of published material that serves to reach or imply a conclusion not stated by the sources. To demonstrate that you are not adding original research, you must be able to cite reliable, published sources. They should be directly related to the topic of the article and directly support the material being presented.
While LLMs may give accurate answers in response to some questions, they may also generate responses that are biased or false, sometimes in subtle ways, sometimes not so subtle. For example, if asked to write an article on the benefits of eating crushed glass, they will sometimes do so. This can be dangerous, and therefore, editors using LLMs to assist with writing Wikipedia content must be especially vigilant in not adding instances of such LLM-generated original research to the encyclopedia.
LLMs are pattern-completion programs: they generate text by outputting the words most likely to come after the previous ones. They learn these patterns from their training data, which includes a wide variety of content from the Internet and elsewhere, including works of fiction, conspiracy theories, propaganda, and so on. Because of this, LLMs can make things up; such fabrications, in addition to constituting original research, are also called hallucinations.
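To illustrate the point (this sketch is not part of the policy, and the phrases and probabilities are invented rather than taken from any real model), the following toy Python program "completes" text from a hand-written probability table. It shows why a statistically likely continuation need not be a true one.

```python
import random

# Toy illustration of "pattern completion" (not a real LLM): each context maps
# to candidate continuations with made-up probabilities.
NEXT_PHRASE_PROBS = {
    "the moon is made of": [("rock and dust", 0.6), ("green cheese", 0.4)],
}

def complete(context: str) -> str:
    """Append the statistically likely continuation, whether or not it is true."""
    candidates = NEXT_PHRASE_PROBS.get(context)
    if not candidates:
        return context
    phrases, weights = zip(*candidates)
    return f"{context} {random.choices(phrases, weights=weights)[0]}"

print(complete("the moon is made of"))
# May print "the moon is made of green cheese": fluent and confident, but wrong,
# because the continuation is chosen for likelihood, not for accuracy.
```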
Asking LLMs about obscure subjects or complicated questions, or asking them to perform tasks for which they are not suited (e.g. tasks requiring extensive knowledge or analysis), makes these kinds of errors much more likely.
Because LLMs answer with an air of confidence, their mistakes are easily accepted as facts or credible opinions.
Unsourced or unverifiable content
- Relevant policy: Wikipedia:Verifiability
- Tip: Readers must be able to check that any of the information within Wikipedia articles is not just made up. This means all material must be attributable to reliable, published sources. Additionally, quotations and any material challenged or likely to be challenged must be supported by inline citations.
LLMs do not follow Wikipedia's policies on verifiability and reliable sourcing. LLMs sometimes exclude citations altogether or cite sources that don't meet Wikipedia's reliability standards (including citing Wikipedia as a source). In some cases, they hallucinate citations to non-existent references by making up titles, authors, and URLs.
LLM-hallucinated content, in addition to being original research as explained above, also breaks the verifiability policy: because it is made up, there are no references to find, and it cannot be verified.
Algorithmic bias and non-neutral point of view
- Relevant policy: Wikipedia:Neutral point of view
- Tip: Articles must not take sides, but should explain the sides, fairly and without editorial bias. This applies to both what you say and how you say it.
LLMs can produce content that is neutral-seeming in tone, but not necessarily in substance. This concern is especially strong for biographies of living persons.
Loss of volunteer effort
Wikipedia relies on volunteer effort to review new content. Keeping the amount of review required manageable depends on a commensurate investment of effort by a human editor in creating the material. Allowing the insertion of large volumes of AI-generated content would erode that balance and the willingness of volunteers to review it. Some AI-generated promotional articles have taken many hours of volunteer time to clean up; this can overwhelm and demotivate the volunteers.
Using LLMs
Specific competence is required
LLMs are assistive tools and cannot replace human judgment. Careful judgment is needed to determine whether such tools fit a given purpose. Editors using LLMs are expected to familiarize themselves with a given LLM's inherent limitations and must then overcome those limitations to ensure that their edits comply with relevant guidelines and policies. To this end, prior to using an LLM, editors should have gained substantial experience doing the same or a more advanced task without LLM assistance.[b]
Experience is required not only with Wikipedia practices but also with the proper use of LLMs themselves, for example knowing how to formulate good prompts.
Some editors are competent at making unassisted edits but repeatedly make inappropriate LLM-assisted edits despite a sincere effort to contribute. Such editors are assumed to lack competence in this specific sense. They may be unaware of the risks and inherent limitations, or they may be aware of them but unable to overcome them to ensure policy compliance. In such a case, an editor may be banned from aiding themselves with such tools (i.e., restricted to only making unassisted edits). This is a specific type of limited ban. Alternatively, or in addition, they may be partially blocked from a certain namespace or namespaces.
Disclosure
Every edit which incorporates LLM output must be marked as LLM-assisted in the edit summary. This applies to all namespaces.
Writing articles
Large language models can be used to copy edit or expand existing text and to generate ideas for new or existing articles. Every change to an article must comply with all applicable policies and guidelines. This means that you must become familiar with relevant sources for the content in question and then carefully evaluate the output text for verifiability, neutrality, and absence of original research, as well as for compliance with copyright and all other applicable policies and guidelines. Compliance with copyright includes respecting the copyright licensing policies of all sources. As part of providing a neutral point of view, you must not give undue prominence to irrelevant details or minority viewpoints. If citations are generated as part of the output, you must verify that the corresponding sources are non-fictitious, reliable, relevant, and suitable, and check for text–source integrity.
Equally, raw LLM outputs must not be pasted directly into drafts. Drafts are works in progress, and their initial versions often fall short of the standard required for articles. Even so, they should not have the serious problems outlined in the "Risks and relevant policies" section above, in particular copyright problems and original research. Enabling editors to develop article content by starting from an unaltered LLM-outputted initial version is not one of the purposes of draft space or user space.
Using sources with LLM-generated text
All sources used for writing an article must be reliable, as described at Wikipedia:Verifiability § Reliable sources. Many sources written by LLMs fail this requirement. Before using them, you must verify that the content was evaluated for accuracy.
Be constructive
Wikipedia relies on volunteer efforts to review new content for compliance with our core content policies. This is often time consuming. The informal social contract on Wikipedia is that editors will put significant effort into their contributions, so that other editors do not need to "clean up after them". Editors must ensure that their LLM-assisted edits are a net positive to the encyclopedia, and do not increase the maintenance burden on other volunteers.
You must not use LLMs for unapproved bot-like editing (WP:MEATBOT), or anything even approaching bot-like editing. Using LLMs to assist high-speed editing in article space is always taken to fail the standards of responsible use. For such edits, it is impossible to rigorously scrutinize content for compliance with all applicable policies.
Wikipedia is not a testing ground for LLM development, for example, by running experiments or trials on Wikipedia for this sole purpose. Edits to Wikipedia are made to advance the encyclopedia, not a technology. This is not meant to prohibit editors from responsibly experimenting with LLMs in their userspace for the purposes of improving Wikipedia.
You must not use LLMs to write your comments. Communication between editors is at the root of Wikipedia's decision-making process, and it is presumed that editors contributing to the English-language Wikipedia possess the ability to communicate effectively. Effective communication requires having one's own thoughts and finding an authentic way of expressing them. Machine-generated text fails this requirement: it is no substitute for putting in personal effort and engaging constructively.
Repeated misuse of LLMs forms a pattern of disruptive editing, and may lead to a block or ban.
Handling suspected LLM-generated content
Identification and tagging
Editors who identify LLM-originated content that does not comply with our core content policies should consider placing {{AI-generated|date=December 2024}} at the top of the affected article or draft, but only if they do not feel capable of quickly resolving the identified issues themselves. In biographies of living persons, such content should be removed immediately, without waiting for discussion or for someone else to resolve the tagged issue. The template {{AI generated notification}} may be added to the talk page of the article.
For editors violating this policy, the following can be used to warn them on their talk pages:
- {{uw-ai1}}
- {{uw-ai2}}
- {{uw-ai3}}
- followed by the generic {{uw-generic4}}
Removal and deletion
All suspected LLM output must be checked for accuracy and is assumed to be fabricated until proven otherwise. LLMs are known to fabricate sources such as books, journal articles, and web URLs, so be sure to first check that the referenced work actually exists. All factual claims must then be verified against the provided sources. LLM-originated content that is contentious or fails verification must be removed.
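As an optional aid, and never a substitute for reading the sources themselves, a minimal Python sketch of one way to pre-screen cited URLs is shown below. The function name and example citation are hypothetical, and a URL that resolves may still fail to support the claim attributed to it.

```python
import requests  # third-party package: pip install requests

def url_resolves(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with a non-error HTTP status.

    This only shows that something exists at the address; the cited work must
    still be read to confirm it supports the claim (text-source integrity).
    """
    try:
        response = requests.head(url, allow_redirects=True, timeout=timeout)
        if response.status_code == 405:  # some servers reject HEAD requests
            response = requests.get(url, stream=True, timeout=timeout)
        return response.status_code < 400
    except requests.RequestException:
        return False

# Hypothetical citation pulled from suspected LLM output:
print(url_resolves("https://example.org/fictitious-journal-article"))
```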
If removal as described above would result in deletion of the entire contents of the article or draft, it then becomes a candidate for deletion.[c] If the entire page appears to be factually incorrect or relies on fabricated sources, speedy deletion via WP:G3 (Pure vandalism and blatant hoaxes) may be appropriate.
See also
- Wikipedia:Artificial intelligence, an essay about the use of artificial intelligence on Wikipedia and Wikimedia projects
- Wikipedia:Computer-generated content, a draft of a proposed policy on using computer-generated content in general on Wikipedia
- Wikipedia:Using neural network language models on Wikipedia, an essay about large language models specifically
- Artwork title, a surviving article initially developed from raw LLM output (before this page had been developed)
Demonstrations
- User:JPxG/LLM demonstration (wikitext markup, table rotation, reference analysis, article improvement suggestions, plot summarization, reference- and infobox-based expansion, proseline repair, uncited text tagging, table formatting and color schemes)
- User:JPxG/LLM demonstration 2 (suggestions for article improvement, explanations of unclear maintenance templates based on article text)
- User:Fuzheado/ChatGPT (PyWikiBot code, writing from scratch, Wikidata parsing, CSV parsing)
- User:DraconicDark/ChatGPT (lead expansion)
- Wikipedia:Using neural network language models on Wikipedia/Transcripts (showcases several actual mainspace LLM-assisted copyedits)
Related articles
Notes
- ^ This also applies to cases in which the AI model is in a jurisdiction where works generated solely by AI are not copyrightable.
- ^ For example, someone skilled at dealing with vandalism but doing very little article work should probably not start creating articles using LLMs. Instead, they should first gather actual experience at article creation without the assistance of the LLM. The same logic applies to other areas, such as creating modules, templates, etc.
- ^ As long as the title indicates a topic that has some potential merit, it may be worth it to stubify and possibly draftify, or blank-and-redirect, articles. Likewise, drafts about viable new topics may be convertible to "skeleton drafts", i.e. near-blanked, by leaving only a brief definition of the subject. Creators of such pages should be suitably notified or warned. Whenever suspected LLM-generated content is concerned, editors are strongly discouraged from contesting instances of removal through reversal without discussing first. When an alternative to deletion is considered, editors should still be mindful of any outstanding copyright or similar critical issues which would necessitate deletion.
References
- ^ Smith, Adam (2023-01-25). "What is ChatGPT? And will it steal our jobs?". www.context.news. Thomson Reuters Foundation. Retrieved 2023-01-27.