User:Nescientist/AI draft

This article is a proposed guideline for Bulbapedia.

Please discuss the proposed guideline and suggest possible changes on the article's talk page.

While AI tools, including but not limited to large language models (LLMs) such as ChatGPT, can be useful in assisting editors who are trying to improve Bulbapedia, they also pose significant risks. This guideline seeks to explain those risks and to help editors identify cases where the use of AI tools may be appropriate.

Contributions are judged by what they add to Bulbapedia, not by the means through which they were produced. In cases where the risks of using AI tools cannot be managed, or are too great to accept, AI tools should not be employed. In cases where such risks are manageable, however, editors may employ AI tools in a careful and responsible way. Ultimately, both AI-assisted edits and edits made without any assistance can be helpful or detrimental.

Editors take full responsibility for any and all of their edits, regardless of whether AI tools were used in the process. For transparency, the use of AI tools should be disclosed, such as by noting which tool was used in the relevant edit summaries. If an AI-assisted edit violates any of Bulbapedia's policies (such as the speculation policy or the code of conduct), its author is liable just the same, and may face consequences up to and including blocks. If a user employs AI tools irresponsibly, staff may also prohibit them from using AI tools, in accordance with the banning policy.

Usage

Specific competence and care

AI tools are only meant to be assistive and cannot replace human judgment. Editors using AI tools should familiarize themselves not only with Bulbapedia's policies (including the views on AI tools presented here), but also with the specific AI tool they are using, and with how its methods and limitations may conflict with policy.

Whatever task an editor chooses to use AI tools to assist with, the editor should be able to assess the quality of the result. If the result is unsatisfactory, the editor needs to either refine it manually or not publish it at all. If the editor is unable to assess the quality of the result (for example, because they are inexperienced with that type of task), they should refrain from publishing it.

Examples

If an LLM such as ChatGPT is tasked with writing the plot summary of an anime episode, its output will almost certainly fail to meet Bulbapedia's quality standards, both by deviating from the established style and by being factually incomplete, inaccurate, or outright wrong. It might even contain outright plagiarism. (See the Risks section for more details.) Therefore, such output must not be used on Bulbapedia, and will most likely be useless even as the basis for an original episode summary.

Using machine translations for Japanese media (such as newly released TCG cards) may be possible, but the results are often inferior to translations made by proficient experts. For example, Google Translate will usually give valid translations, but may fail to capture specific Pokémon-related usage adequately; resident TCG experts, however, may know how specific terms or phrases are usually translated in a consistent manner (even without the use of a machine translation tool).

On the other hand, dedicated tools can be exceptionally useful for tasks such as cropping images, editing a set of articles in a predefined way, or double-checking large amounts of text to flag questionable grammar. These tools may pose only minimal risks and may require only minimal supervision and correction from experienced editors.

Disclosure

For transparency reasons, it is helpful to disclose the use of AI tools, such as by noting which tool was used in the relevant edit summaries. This allows other users to review the edit with the AI assistance in mind, potentially spotting the tool's limitations more easily. Failure to disclose the use of AI tools upon inquiry can lead to a block.

Risks

One common problem of LLMs such as ChatGPT is their tendency to hallucinate, i.e. to produce output that is false or misleading, and sometimes entirely fabricated. This can occur for any given task, and is especially likely for tasks that involve new information that was not yet available when the LLM was trained. The exact reason for these hallucinations (i.e. whether the misinformation ultimately stems from bad training data or from bad inferences the tool makes) is irrelevant; what matters is that these hallucinations are caught and do not propagate into Bulbapedia content. This is particularly problematic because LLMs are inherently designed to give plausible-sounding responses (even when the response is far from certain). Even if a tool's output is typically accurate, it cannot be assumed that it always will be.

Another common problem involves copyright violations. Content published to Bulbapedia must be licensed as detailed at Bulbapedia:Copyrights. However, LLMs can generate material that violates copyright, such as verbatim snippets of copyrighted works or other plagiarism. Any such content is incompatible with Bulbapedia and must not be published.