Wikipedia:Village pump (policy)

From Wikipedia, the free encyclopedia

The policy section of the village pump is intended for discussions about already-proposed policies and guidelines, as well as changes to existing ones. Discussions often begin on other pages and are subsequently moved or referenced here to ensure greater visibility and broader participation.

  • If you wish to propose something new that is not a policy or guideline, use Village pump (proposals). Alternatively, for drafting with a more focused group, consider starting the discussion on the talk page of a relevant WikiProject, the Manual of Style, or another relevant project page.
  • For questions about how to apply existing policies or guidelines, refer to one of the many Wikipedia:Noticeboards.
  • If you want to inquire about what the policy is on a specific topic, visit the Help desk or the Teahouse.
  • This is not the place to resolve disputes regarding the implementation of policies. For such cases, consult Wikipedia:Dispute resolution.
  • For proposals for new or amended speedy deletion criteria, use Wikipedia talk:Speedy deletion.

Please see this FAQ page for a list of frequently rejected or ignored proposals. Discussions are automatically archived after 7 days of inactivity. To keep this page's size accessible, discussions with more than about 100 comments should be split to a separate page.

Proposed edit to the BLP

Hello, at the BLP talk page, I proposed adding 17 words to the policy. That was a couple days ago, and there's been no comment. So here I am!

I would like to suggest a small modification: "A living person accused of a crime is presumed innocent until convicted by a court of law, and upon conviction there is an exception to the usual BLP policy on the inclusion of denials." This would clarify an exception to the usual BLP policy ("If the subject has denied such allegations, their denial(s) should be reported too"). We shouldn't feel obliged to always include denials of convicted people.

This small modification matches what the essay WP:NOTMANDY has said for years: "the Biographies of Living Persons (BLP) policy does not require denials be mentioned if a person has been 'convicted by a court of law', in which case they can be presumed guilty" (N.B. I was the lead author of that essay). Of course, this BLP clarification would still allow denials to be included in the rare case of convicted people whose denials are still reported after conviction, in reliable sources. Nor would this clarification apply in the rare case that a conviction is overturned (i.e. the clarification applies "upon" conviction, not forever "after" conviction).

Thoughts? Anythingyouwant (talk) 09:43, 7 June 2026 (UTC)[reply]

Have you considered the possibility of a wrongful conviction? Why wouldn't we include denials, such as:
  • He was convicted but plans to appeal.
  • He was convicted but continues to claim that he is innocent.
Even including a sentence such as "He entered a plea of not guilty" is including the accused's denial of the allegations in the article.
I think that a small change might be appropriate: It might be better to say that "their denial(s) should normally be reported too". I wouldn't weaken it beyond that. WhatamIdoing (talk) 00:00, 8 June 2026 (UTC)[reply]
If there’s RS reporting about a wrongful conviction or an appeal, then that can be included per usual Wikipedia editing rules. But most convicted felons will always say they’re innocent, I don’t see the point in including that, the presumption of innocence no longer applies to people who have been convicted. I feel that invariably requiring a denial be included in a BLP even for convicted felons leads to people just ignoring WP:DENIALS altogether. There is also not generally a tradition among journalists or scholars of mentioning denials of people who have been convicted. Note that I haven’t suggested any change to the language of WP:DENIALS, but you’ve done so; I think your insertion of the word “normally” would make more people ignore WP:DENIALS than already ignore it…which is the exact opposite of what I’m trying to accomplish. Every situation is abnormal in some sense. Anythingyouwant (talk) 00:13, 8 June 2026 (UTC)[reply]
If RSes are reiterating the denial of guilt by the convicted, there's zero reason to not include it. MANDY was a dangerous essay and we should avoid going there in BLP. Masem (t) 03:06, 8 June 2026 (UTC)[reply]
I agree about reiteration, but consider there’s an arrest of John Doe, and a not guilty plea which are both recited in RS. Then after a fair trial and unanimous conviction by a jury, the RS’s stop talking about any denial. Do we have to keep saying it? That’s what the policy currently requires. Anythingyouwant (talk) 10:52, 8 June 2026 (UTC)[reply]
If John Doe pleaded guilty at the trial we would undoubtedly mention that, so why would we not mention a not guilty plea? I'm struggling to understand how a lack of mention would improve the article? Thryduulf (talk) 11:04, 8 June 2026 (UTC)[reply]
If someone is convicted, it’s normal to drop any mention of the trial, including the plea. We just say the person served 10 years in jail for embezzlement. Do we have to say, he served 10 years for embezzlement, although he argued at his trial that he was innocent? Even though no post-conviction RS’s discuss the denial? Anythingyouwant (talk) 11:11, 8 June 2026 (UTC)[reply]
Dropping coverage of the trial would depend on how much coverage the trial got. Some persons have had very public trials (Weinstein, Cosby, for example), and just because the end result is a conviction doesn't mean it would be appropriate to remove any discussion of the trial (and of course their plea of innocence before it). But this is usually the exception, most people are convicted with a relatively quiet coverage of the trial itself, and in that case, yes we can jump to the conviction and likely not have to worry about any claims made by the convicted if there's no significant coverage of that.
A key to remember is that the decision from a trial is only a legal determination of any guilt. The decision should not be taken as a factual account of the actual events, only whether the person is guilty on the legal presentation of those events to a judge or jury. Masem (t) 11:29, 8 June 2026 (UTC)[reply]
If no RS covers the claim of the convicted suspect that they are innocent after the trial concludes, I would agree we should not cover that as well. Masem (t) 11:25, 8 June 2026 (UTC)[reply]
@Anythingyouwant, I don't think it's actually true that most convicted felons will always say they’re innocent. Plea bargaining happens precisely because so many convicted felons were willing to admit that they were guilty. I've read that in the US, that's 98% of convictions. WhatamIdoing (talk) 20:52, 8 June 2026 (UTC)[reply]
They initially denied the allegations. I’m not sure if a recanted denial (perhaps under pressure) is exempt from WP:DENIALS. Anythingyouwant (talk) 21:02, 8 June 2026 (UTC)[reply]
I'll confess I am having a hard time imagining a situation in which a denial of a noteworthy criminal conviction would not itself be a noteworthy part of the story. Are there some recent examples where this has been controversial? Is this limited to situations where the only source for the denial is a primary one? (That is, situations where the denial is verifiable but has not been reported in any reliable secondary source?) If so, I think the proposed language would be improved by clarifying that. -- Visviva (talk) 03:00, 8 June 2026 (UTC)[reply]
As I just mentioned to Masem, consider there’s an arrest of John Doe, and a not guilty plea which are both reported in RS. Then after a fair trial and unanimous conviction by a jury, the RS’s stop talking about any denial. Do we have to keep saying it? Anythingyouwant (talk) 10:54, 8 June 2026 (UTC)[reply]
The key here is determining how much WEIGHT to give the denial, not whether to include/exclude. I can see the argument that we should give less weight to a post-conviction denial than we would to a denial at the accusation stage. But that doesn’t mean we should not mention the denial at all. Blueboar (talk) 12:42, 8 June 2026 (UTC)[reply]
Your comment about weight brings up a couple related points. It seems obvious to me that whenever WP:DENIALS requires us to include a denial, the denial ought to be in the same paragraph as the accusation, not hidden far away, maybe even in a footnote where it’s almost completely useless. Another thing that seems obvious is that editors should be sanctioned for violating WP:DENIALS, claiming WP:IAR shouldn’t cut it. (I actually was sanctioned for insisting it be followed despite local consensus to ignore it.) The USSR had a wonderful constitution that protected all kinds of civil liberties, but it was never enforced. I often get that feeling about Wikipedia’s policies and guidelines. Anyway, there’s probably a category for living convicted felons, and I bet you not 10% of those BLPs mention their denials. Nor should they be mentioned, if no post-conviction RS’s mention them. I don’t see why we must. If there’s consensus to do so in a particular case, that’s fine. Anythingyouwant (talk) 13:11, 8 June 2026 (UTC)[reply]
It seems obvious to me that whenever WP:DENIALS requires us to include a denial, the denial ought to be in the same paragraph as the accusation - no, that is absurd. Per WP:NPOV, weight and focus is decided based on the weight in the sources, not based on editors' gut feelings about fairness or what they personally think is important. If you want to put a denial in the lead, you need to produce sourcing strong enough to justify it there. Other policies cannot overrule NPOV on this (as I suspect you found out in the dispute you mentioned.) They can put a great deal of weight on the scale in ambiguous cases, and most cases are ambiguous to an extent; but no policy can say "you MUST do XYZ, without regard for sourcing." When other policies and guidelines come into direct contradiction with WP:NPOV and WP:V, NPOV and V win. With well-considered and well-worded policies, this rarely happens; if you find a particular policy is coming into conflict with NPOV and V unusually often, that's probably an indicator that that policy should have its wording toned down. --Aquillion (talk) 03:11, 18 June 2026 (UTC)[reply]
  • A relevant and topical article is Wrongful conviction of Andrew Malkinson. In this case, recently concluded when the actual offender was convicted, Malkinson actually served 10 years longer than necessary in jail because to apply for parole, he would have had to admit that he committed the crime, which he refused to do. Black Kite (talk) 11:22, 8 June 2026 (UTC)[reply]
    I appreciate such concerns, and I think my proposal can be improved. But in the vast majority of Wikipedia articles that mention people who have served jail time for crimes, nothing like that is involved. They typically have a trial where they claim innocence, then there’s a conviction, and the claim of innocence is never mentioned in RS’s after the conviction. So I would tweak my proposal: “A living person accused of a crime is presumed innocent until convicted by a court of law, and upon conviction there is an exception to the usual BLP policy on the inclusion of denials unless the relevant reliable sources are post-conviction.” That doesn’t mean such details cannot be included, just that they don’t have to be included, in a BLP. My experience is that WP:DENIALS is still widely ignored at Wikipedia (even at BLPN!), so I would like to get rid of aspects of WP:DENIALS that are overkill. Then maybe people will adhere to it. Anythingyouwant (talk) 11:53, 8 June 2026 (UTC)[reply]
    Can you give 3 example articles where this would make a difference? I'm having a hard time imagining articles where we mention a conviction and shouldn't at least briefly include the position of the convicted, but having actual examples where this should be removed (or not be included if it isn't there now) might convince me or others otherwise. Fram (talk) 13:03, 8 June 2026 (UTC)[reply]
    I am disinclined to name particular BLPs. I do not relish the thought of being accused (not by you) of trying to manipulate policy to advance my position in any particular content dispute, which of course I am not trying to do. Anyway, I don’t think it should require examples to see that it’s silly to require we mention a denial when no post-conviction RS does so. Allowing inclusion of such denials is fine, but requiring them is absurd. And we rarely do require it, even though it’s plainly required by the policy. I would prefer to have a policy that we can rely on, and that means what it says. Anythingyouwant (talk) 13:29, 8 June 2026 (UTC)[reply]
    There are many things that should (not "must") be included but often are not (yet). The policy mainly protects those wanting to include it from others wanting to remove it "because they are convicted" or "because they are obviously lying" or whatever reason is given for the removal. It should not get undue weight, we don't strive for equal balance between conviction and denial in most cases, but there's nothing silly about including it even if post-conviction it is no longer mentioned by the sources. Fram (talk) 13:39, 8 June 2026 (UTC)[reply]
    News reports about an accusation typically include denials, so when we omit a denial in that situation, it implies the person does not dispute guilt. That situation is entirely different after conviction, there is no journalistic practice of mentioning denials after conviction. So that aspect of our policy is absurd, and is often rightly ignored. That undermines the policy. As for the idea that the policy protects any editors, that’s a sweet notion, but untrue. For example, I was banned from American politics articles for five years merely because I insisted on reinserting an obvious denial into a BLP. But I agree with you that the policy *should* protect editors who follow it, and so I’m striving here for a policy that makes good sense and can command respect. Anythingyouwant (talk) 14:01, 8 June 2026 (UTC)[reply]
    We are not journalists, we don't do journalism. We take a wider view. Sources from before or during a case don't become inadmissible because we also have sources from after the case. Of course, we should present up-to-date information wherever possible, but without an indication that they no longer claim to be innocent (or often that they claim to be innocent of certain aspects), there is no reason to ignore these previous sources. As for your topic ban, as far as a brief search informs me it was (after a series of escalating issues I haven't checked) for edit warring about the inclusion of denial in the lead, not simply for including the denials in the article (where they already were), on a page with an editing restriction against all edit warring. That's not an issue with this policy or its application, but about which aspects of an article are important enough to also be in the lead. The policy just says that they should be included, not that they should be given equal weight or position. I don't think changing policy on this basis is a good idea. Fram (talk) 14:37, 8 June 2026 (UTC)[reply]
    I mentioned above that, “It seems obvious to me that whenever WP:DENIALS requires us to include a denial, the denial ought to be in the same paragraph as the accusation, not hidden far away, maybe even in a footnote where it’s almost completely useless.” Obviously, the edit that I am proposing here does not address that problem. So I’m not proposing to change “policy on this basis”. My proposal here is simply that, after conviction, we should give editors some flexibility about whether to include denials if no post-conviction RS’s say anything about any denial. But I suppose this is a trivial BLP requirement if it can be satisfied by sticking the denial in a footnote. Or perhaps we can even put it in a sub-article instead of a footnote? Is that a proper interpretation of BLP? Anythingyouwant (talk) 14:59, 8 June 2026 (UTC)[reply]
    Belonging in the same paragraph of the body doesn't mean it should also be in the lead. Take as an example Marc Dutroux, one of the most notorious Belgian criminals. The lead says nothing about his denial, the section on "Dutroux's testimony" gives his denials. As it should be. It's not hidden away in a footnote, but it's also not given equal weight or emphasis. Fram (talk) 15:52, 8 June 2026 (UTC)[reply]
    I have no problem with the Dutroux BLP. He’s been convicted of various crimes, so the allegation in the lead that he committed those crimes doesn’t need to be accompanied by his denial in the lead. When someone has been convicted, it means the denial has been adjudicated and found legally void. If this were an article about a Belgian accused serial killer, instead of a convicted one, then the denial should accompany the accusation in the lead for at least two reasons: (1) he’s entitled to a presumption of innocence, and (2) reciting the accusation without the denial gives the impression that he hasn’t denied it, because such things are customarily described together for people who haven’t been convicted. Anythingyouwant (talk) 17:23, 8 June 2026 (UTC)[reply]
    Presumption of innocence is something that the accused are entitled to only within the legal system. The accused is not entitled to a presumption of innocence from his accusers, his mother, his employer, his bank, or (relevantly for us) newspapers and other reliable sources. I think therefore that we can dispense with your reason #1.
    Your reason #2, on the other hand, is important to Wikipedia. We don't want to give the impression that the accused has confessed to guilt, when that's not true. However, that doesn't mean that a denial always has to be described along with the first mention of the accusation. After all, it's only "customarily" done, which is another way of saying that it's "not always" done, and the custom, when I flip through some news articles, is "in the same article" rather than "right at the start". (Examples: 4th paragraph, 4th paragraph, subhead plus 4th paragraph, 15th paragraph, 3rd paragraph, 12th paragraph, 3rd paragraph, 4th paragraph, 4th paragraph, 15th paragraph, 7th paragraph, 13th paragraph. If the main point of the article is the denial, then it may appear in the 1st sentence [ example, example ].)
    Deciding how much prominence in a Wikipedia article to give to a denial should be handled like anything else: very prominent in some cases (e.g., trumped-up prosecutions of political dissidents in repressive countries), given some space in other cases (e.g., when most reliable sources suggest that factual innocence is a significant possibility), mentioned briefly in others (e.g., all the information we can source amounts to 'He denied it'), and barely mentioned at all in others (e.g., most people post-conviction: "He entered a plea of not guilty in March, and was convicted of six of the nine charges at the trial the next month" or "As of Octember 2025, he is appealing the conviction"). There isn't a one-size-fits-all answer. Editors will have to use their judgment to decide where and how to handle denials. WhatamIdoing (talk) 22:47, 8 June 2026 (UTC)[reply]

WhatamIdoing, I didn’t suggest one size fits all, that’s why I’m suggesting an exception for the particular situation where a person has been convicted and there’s no post-conviction RS’s that discuss the denial. So I would weaken WP:DENIALS to fit that common circumstance, and let editors decide in the usual way. Conversely, where an accusation is in the lead, I would strengthen WP:DENIALS to fit that circumstance. Consider that WP:LEAD correctly says this: “The lead is the first thing most people read upon arriving at an article, and may be the only portion of the article that they read…. The lead should stand on its own as a concise overview of the article's topic. It should identify the topic, establish context, explain why the topic is notable, and summarize the most important points, including any prominent controversies.” If editors decide to put an accusation in the lead, then it seems obvious that the denial should be briefly mentioned there too (putting aside situations where there’s been a conviction). Putting it in the lead still leaves lots of room for discretion, it might be a whole sentence, or it might be just three words (“which she denies”). If WP:DENIALS is so elastic that it can be evaded by putting an accusation in the lead while the denial is put in a footnote or a sub-article or buried in a subsection, then WP:DENIALS seems pretty much useless to me. Readers who just read the lead will be misled, because readers are accustomed to seeing any denials in conjunction with accusations. Anythingyouwant (talk) 09:18, 9 June 2026 (UTC)[reply]

You want to change DENIALS to require mentioning any denial in the lead, if the allegation is in the lead. This is not ordinary practice in reliable sources. You are proposing DENIALS have a one-size-fits-all rule that if an allegation is mentioned in the lead, then the denial should also be mentioned in the lead. The way that we avoid WP:GEVAL problems with denials is precisely letting editors decide for each article individually and not according to any one-size-fits-all rule at DENIALS:
  • whether, if the lead mentions the allegation, the lead should also include the denial, or if the denial should only appear in the body, and
  • whether the denial should be described expansively or if it should be mentioned only in passing or be "buried in a subsection".
In other words, the current wording ("If the subject has denied such allegations, their denial(s) should be reported too") provides the flexibility that we need. Expanding it to say something like "If the allegations are mentioned in the lead, then the denial has to be mentioned in the lead, too" would mean making some articles less compliant with NPOV. WhatamIdoing (talk) 17:58, 9 June 2026 (UTC)[reply]
Thanks for the discussion. We disagree. WP:NPOV also says, “Wikipedia aims to describe disputes, but not engage in them.” Putting accusations in the lead without even three or four words on the other side of the dispute (“which she denies” or “which she implausibly denies”) does not seem consistent with that philosophy, nor with WP:LEAD, nor with WP:DENIALS, and I don’t see any issue about false balance either, because we can make crystal clear that RS say the denial is implausible or dishonest or what have you, which is also quite revealing about the BLP subject. Anythingyouwant (talk) 22:38, 9 June 2026 (UTC)[reply]
If the sources treat the denial as unimportant or irrelevant or as a throw-away comment, then Wikipedia should, too. And one of the ways that we treat information (of any kind, including denials) as being unimportant or irrelevant or as a throw-away comment is to omit it from the lead. Ergo, if we are, as best we can, providing a fair and unbiased summary of the sources, then that will occasionally mean omitting the denial from the lead.
Additionally, facts that require complex and nuanced explanation don't belong in the lead. "He denied everything" is just three words, but what about when he admits that as a matter of ordinary fact, he did kill someone, but there were mitigating circumstances, or he thinks the world's a better place now that the victim is dead, so he doesn't feel guilty, and besides, the charges are against his strawman instead of against the real him, because he's a sovereign citizen?
The bottom line is that we have to let editors use their best judgment. WhatamIdoing (talk) 03:21, 10 June 2026 (UTC)[reply]
I'm planning on discussing this issue of splitting up the accusation and the denial, in the essay WP:NOTMANDY if the other editors there don't mind. If and when that's done I'll give you a heads up so we can discuss further, and/or you can edit that new essay section if you want. WP:BLP (especially combined with other policies) properly treats denials as a special aspect of a BLP, not merely a routine editing decision. If sources include a denial, mindreading the importance the sources attach to it is often difficult, whereas it's not difficult to conclude that many times a Wikipedia editor will like or dislike a BLP subject and edit accordingly. This policy on denials is a rare obstacle to subjective editing, to protect BLP subjects and give them a small voice. I don't understand your comment about being a sovereign citizen. As to admitting a kill, but claiming mitigating circumstances, or claiming he thinks the world's a better place now that the victim is dead, so he doesn't feel guilty, you seem to be arguing that WP:DENIALS should either be broadened or abolished, but I think keeping it a narrow requirement is the best and most practical compromise so it's not burdensome but does have a real impact in our BLPs. Anythingyouwant (talk) 12:36, 10 June 2026 (UTC)[reply]
I gave you an example in which a denial can't be presented in just "three or four words". You seem to be treating all denials alike. Here are two scenarios in which that doesn't work:
  • Sometimes a denial is not really a denial. Sometimes the "denial" is an admission ("I killed him") while claiming that the admitted facts do not rise to a legal violation ("but it was in self-defense" or "but my client was legally insane at the time") or even that the person does not recognize the court's authority ("I believe a conspiracy theory that says you can't prosecute me").
  • Sometimes a denial is complex, e.g., admitting some and denying others. Especially when the allegations are a minor part of the article (e.g., allegations made during the course of a celebrity divorce), then it might not be appropriate to highlight this in the lead, especially if doing so would require a long explanation.
Remember that there are two ways to end up with WP:UNDUE attention to a denial:
  1. Giving too much "space" to the denial: If sources mention the denial briefly, or only once, then the Wikipedia article should be concise, too. That includes not repeating the denial multiple times in the same article (e.g., lead plus body).
  2. Giving too much "prominence" to the denial: If sources mention the denial at the end of an article, then the Wikipedia article should not put the denial in the lead. Conversely, if the sources focus on the denial, then the Wikipedia article should make the denial prominent.
Understanding how a source treats a denial doesn't require "mindreading". Editors should understand how all the sources are treating the denial (the same way you should understand how all the sources treat anything about the subject), and they should consider these general views or trends in reliable sources when writing the article. If the sources tend to emphasize the denial (e.g., it's at the top of the news articles, it's the main subject of some news articles, it's given a lot of space in the news articles), then the Wikipedia article should equally emphasize the denial (e.g., put the denial in the lead, give the denial a whole section, or at least a whole paragraph). If the reliable sources downplay the denial (e.g., mention it only in passing, place it at the end of the news article), then the Wikipedia article should do the same (e.g., omit the denial from the lead, use only three or four words in the body of the Wikipedia article to mention it).
This isn't difficult. Editors manage to do this all the time. If you personally struggle to do this, then I suggest that you step back from that and let other editors take the lead on this point. WhatamIdoing (talk) 23:30, 10 June 2026 (UTC)[reply]
If you somehow got the idea that I think every denial in a BLP should go in the lead, that’s incorrect, I don’t think that. I’m saying that if the accusation is in the lead, then the denial should be mentioned in the lead too, however briefly. Whatever the proper resolution to this issue is, it ought to be made clearer in at least one relevant policy or guideline, because right now editors could easily think that an accusation in the lead should be accompanied by its denial; BLP does not say otherwise but it does indicate that a denial is an important feature of a BLP, and policy on the lead says the lead should be able to stand on its own. It’s obvious that a denial is often simple and straightforward and wouldn’t require more than three words. I don’t need to take a step back, it seems a perfectly legitimate topic of discussion here, and I hope the situation will be improved one way or the other. Perhaps we can agree that if an error is made regarding whether to put a denial in the lead of a particular BLP, it would be better to err by putting it there than to err by only having the accusation in the lead. Anythingyouwant (talk) 03:03, 11 June 2026 (UTC)[reply]
Thanks again for the discussion. See also: Placement of a denial in the lead. Anythingyouwant (talk) 17:18, 11 June 2026 (UTC)[reply]
I think it's a moot point. WP:DUE is core policy and is non-negotiable; we cannot have policies that require or circumvent it. And DUE is based only on the sourcing, not on editors' personal opinions about what is significant (as expressed in essays or, indeed, non-core policies.) That means that in situations where the sourcing plainly makes it WP:UNDUE to put a denial in the lead, it cannot be placed in the lead. This would remain true even if someone attempted to edit BLP to require denials in the lead unequivocally - DUE would still override it in cases where doing so would be clearly undue; and DUE, as part of NPOV, is non-negotiable core policy, so it cannot be changed to require this, and would override even another policy (even a very weighty policy like BLP) unambiguously trying to force something into the lead without regard for sourcing. This has been hashed out before with WP:WTW - people have sometimes tried to use it to argue that something must be attributed even when conceding that the sourcing unequivocally establishes it as uncontested fact, per WP:NPOV's Avoid stating facts as opinions. They have always, when they took that route, lost, because NPOV trumps WTW. The same principle applies here - we can set a very high bar; we can advise extreme caution. We can write policies under the assumption of "well, with this thing it'll never be so clear-cut as to put this policy in direct contradiction with NPOV." But when that does happen, no policy can ever directly instruct editors to ignore or override the weight of the sources completely, and any attempts to do so will fail in practice. The ultimate arbiter for weight and placement and inclusion is always the sources themselves, and no policy can completely overrule that. --Aquillion (talk) 03:00, 18 June 2026 (UTC)[reply]
  • Some thoughts. I am certainly not of the opinion that denial should be mentioned in every BLP case with conviction, in an ideal world. Things we might consider:
    • While the legal principle "innocent until proven guilty" applies in some jurisdictions, it does not apply in all.
    • Some systems are manifestly corrupt.
    • Further we know that even in the best systems innocent people are found guilty and guilty people are found innocent.
    • Denial of one thing is not denial of another. Often many counts are brought together.
    • Alford pleas are a thing.
    • Accused of a crime is not the same as being in the legal system.
    • Sometimes someone is found liable in a court of law, although there has been no criminal case (see E. Jean Carroll v. Donald J. Trump)
    • Sometimes someone is found innocent in a privately brought criminal prosecution, although they are found guilty later. For example one of Stephen Lawrence's murderers.
    • If someone was found guilty of a minor crime, how they pled may not be significant. For example a minor traffic or parking offence.
    • If someone has a very long history, for example was found guilty dozens of times of shoplifting, their pleas may not be readily available, and may overwhelm the statement about guilt if they need to be itemized.
    • Is a cast-iron rule a good way of ensuring the above are dealt with, or does it just create more un-necessary verbiage?
All the best: Rich Farmbrough 18:12, 23 June 2026 (UTC).[reply]

Should it be a policy requirement for the noticeboard poster about an article to always notify the local Talk page?

Simple ask:

I think it's policy that you're supposed to notify an editor on their User talk page if they are up on WP:ANI, 3rr, or whichever other board.

I propose such a requirement for any article page. So if I want to ask about a given source for Broccoli on WP:RSN, and I posted on RSN, I myself--me--am obligated and required to notify Talk:Broccoli as my next move/edit. That's it. — Very Polite Person (talk/contribs) 02:12, 10 June 2026 (UTC)[reply]

Not going to happen. voorts (talk/contributions) 02:16, 10 June 2026 (UTC)[reply]
A bot then? It just seems weird that it shouldn't be assumed the "local" users are notified...? — Very Polite Person (talk/contribs) 02:17, 10 June 2026 (UTC)[reply]
The community generally recognizes that it is good practice to provide notification of a discussion at a centralized noticeboard when a specific article is being discussed, but the community is generally against requiring notifications. For example, they aren't required for any of the deletion processes. voorts (talk/contributions) 02:20, 10 June 2026 (UTC)[reply]
Bizarrely, we require notifications if you're dragging an editor to AN/I or Arbcom (but not if you're saying they're a sockpuppet). But you don't have to tell the talk page if, for example, you're discussing the sources on RSN, or the formatting at some obscure subpage of the Manual of Style where a "consensus" of three editors is about to reach a decision that will bugger up thousands of pages. I think these rules aren't very well thought out, though.—S Marshall T/C 17:23, 10 June 2026 (UTC)[reply]
That's what has bugged me a while. There is A) no uniformity/standard and B) what we have seems decidedly arbitrary.
Like, if you're sending an article that has 3000 watchers to RSN, NOR, BLP or FTN boards for content issues (especially with no prior talk and editing engagement there in any recent memory), it feels really odd to bring in outside consensus views without allowing the local talk watchers to be fully aware and participatory.
What's an argument in favor of keeping the local "Talk:" out of the loop? — Very Polite Person (talk/contribs) 17:29, 10 June 2026 (UTC)[reply]
Busywork. Gnomingstuff (talk) 17:40, 10 June 2026 (UTC)[reply]
You're conflating a lack of notification requirement with editors being in favor of not providing notice. As I said, the community thinks notifications are a best practice but routinely rejects extending those requirements to new areas. It is what it is. This discussion won't go anywhere IMO. voorts (talk/contributions) 17:43, 10 June 2026 (UTC)[reply]
What's your basis for saying the community routinely rejects extending notification requirements to new areas? Have there been RFCs or something about this recently? FWIW, I think OP's idea is a good one, at least for certain types of noticeboard threads (particularly content ones, like RSN, NPOVN, etc.). Levivich (talk) 18:12, 10 June 2026 (UTC)[reply]
Yes, there have at least been other discussions like this. I don't remember where. It may have been regarding speedy deletion. I'm fine requiring notifications, but I think it would be an uphill battle. voorts (talk/contributions) 18:23, 10 June 2026 (UTC)[reply]
Notifications are required for deletion processes: when an article is AfD'd, there's a notice on the article itself. (And also, centralized notification at WikiProject pages and elsewhere.) Seems to me to make sense to post, eg, a notification at article talk pages when sources used at that article are being discussed at RSN. (I'd rather have a bot or script do it, à la XfD, rather than require a human to do it manually.) Levivich (talk) 18:16, 10 June 2026 (UTC)[reply]
There is no requirement to notify interested editors (such as article creators) about deletion requests. WikiProject notifications are never required. voorts (talk/contributions) 18:24, 10 June 2026 (UTC)[reply]
At least some notification is required. For example, tagging the article (which is notification) is required by AfD Step 2. The same step also requires listing it in the AfD log (a centralized notification). Step 3 is called "notify interested parties" (delsorts, WikiProjects, and individual contributors), and while Step 3 is not required, it is almost always done, because Twinkle does it and almost everyone uses Twinkle. My point is that under XfD procedure, some notification is required, the notification that is not required is nevertheless widely done, and all of that says to me that the community does indeed want and value notification. I do not believe the community is generally against notifications. Levivich (talk) 21:22, 10 June 2026 (UTC)[reply]
All I'm saying is that there are many editors who have been vocally against requiring notifications in other contexts. I'm not opposed to such a rule, but I doubt it will gain consensus. voorts (talk/contributions) 21:45, 10 June 2026 (UTC)[reply]
Do you recall if they objected to the action requirement being directly on them to do the thing, or was it opposition to a proposed standard that if Article whatever gets invoked for a discussion on say WP:RSN, that the local Talk:Article whatever folks must be informed/told so they can participate in the "off venue" chat?
It's two different things. I can see resistance (though I can't fathom why anyone would resist it) to the rule being on THEM to do it, but I can't possibly see any justification for the latter. If no one opposed the latter then this is a bot-level problem to fix probably? — Very Polite Person (talk/contribs) 21:55, 10 June 2026 (UTC)[reply]
I don't know exactly how notification bots work but I assume they are triggered by the structured nomination templates at forums like RM and XFD. Almost all other notice boards and discussion forums I can think of have pretty much free-form posts and not formal "nominations". And I would object to complicating the discussion procedure by introducing formalized, structured nominations as a requirement for posting on most discussion pages. —Myceteae🌈 (talk) 23:53, 10 June 2026 (UTC)[reply]
I agree. voorts (talk/contributions) 23:58, 10 June 2026 (UTC)[reply]
From what I recall, the context of the discussion was CSDs and the objection was that this should be discretionary, while acknowledging the norm that it's usually appropriate to notify relevant pages. I can't find the discussion. voorts (talk/contributions) 23:58, 10 June 2026 (UTC)[reply]
"WikiProject notifications are never required", but some of them, especially for deletion and other structured processes, are automatic via Wikipedia:Article alerts. Notifications of Talk: page discussions also happen via pages such as Wikipedia:WikiProject Medicine/Discussions. WhatamIdoing (talk) 00:15, 11 June 2026 (UTC)[reply]
Yes, that is the choice of each WikiProject. Some WikiProjects are okay with talk page notices; some do not want them at all. I don't see any reason to change the status quo. voorts (talk/contributions) 00:17, 11 June 2026 (UTC)[reply]
Just to be totally clear, my suggestion was Article talk page notifications if the article came up on a central board - specifically to prevent isolated clusters of competing consensus and so all editors are nominally /forced/ to equal procedurally/awareness footing at all times. — Very Polite Person (talk/contribs) 00:21, 11 June 2026 (UTC)[reply]
It seems like a basically sound idea to me, although I admit there may be some hidden downside I'm not thinking of. I think it would prevent some perverse behavior.
Sometimes people will misuse WP:CONLEVEL to reach a local consensus on an article that contravenes policy. But sometimes, too, people will go to a noticeboard to "pre-win" some argument they're having on a talk page, by presenting some rather self-serving version of the argument and getting everyone to agree that of course the policy shouldn't be applied that way, et cetera. A classic move if you are losing an RfC is to go to some policy page and quickly roust up a consensus that the RfC is invalid because it violates WP:OMGWTFBBQ... jp×g🗯️ 08:19, 16 June 2026 (UTC)[reply]

What we quite often get is people bringing things to policy talk pages, but stripped of context. Two editors get in an argument over whether the Patterson-Gimlin tape was faked, and then one of them comes to WT:V to ask, "Should Wikipedia publish unproven speculation and rumour?" and the other one goes to WT:NOT to ask, "Should we censor widely published information just because some people don't believe clear evidence?" (Basically, if anyone comes to a policy talk page or RSN to ask a question "in principle", it's best to go over their recent contributions and work out what arguments they're currently involved with before you reply.) I think that any proposed rule requiring notification on the article talk page would need quite thoughtful phrasing to ensure it has the intended effect.—S Marshall T/C 19:06, 10 June 2026 (UTC)[reply]

I think that any proposed rule requiring notification on the article talk page would need quite thoughtful phrasing to ensure it has the intended effect. I agree. There should almost always be a notice on the talk page of an article that is being discussed at a content notice board. For behavior issues, it's perhaps a little more complicated. If an editor's current or recent behavior at a particular article or its talk page generates an ANI report then other editors who have been involved or impacted should be made aware but I don't know that this should always be memorialized on the article talk page. For content notice boards, this would seem uncontroversial and I think it is not too onerous most of the time but there may be exceptions and I would want to think through the scope, purpose, and implications of following or not following a new requirement for notification. —Myceteae🌈 (talk) 23:44, 10 June 2026 (UTC)[reply]
I think that supporters of this idea should spend a month or two doing this systematically, and then seeing whether that appears to be helpful. We don't have a "requirement", but we also don't have a ban on anyone doing this if they think it would be helpful. To S Marshall's point about checking recent contribs, the OP at least appears to be willing to do this: Talk:Nikola Tesla#Nikola Tesla under discussion at Wikipedia:Fringe theories/Noticeboard#Nikola Tesla and Wikipedia:Fringe theories/Noticeboard#Nikola Tesla. WhatamIdoing (talk) 00:21, 11 June 2026 (UTC)[reply]
That motivated it, but I've seen this happen many times with RSN and FTN especially. — Very Polite Person (talk/contribs) 00:22, 11 June 2026 (UTC)[reply]
We would need to word it so that it covered only actual discussions and not just calls for attention. A non-trivial portion of the posts at noticeboards are of the "We could use eyes on this article" or "we could use experienced input on this talk page" variety, and that really shouldn't require a notice. What are the other editors from the article going to say, "no, we don't want anyone else looking at it, stay away"? That's different from an actual discussion on the noticeboard. -- Nat Gertler (talk) 12:43, 11 June 2026 (UTC)[reply]
I agree, this is an important distinction. And there is some grey area in practice. —Myceteae🌈 (talk) 15:06, 11 June 2026 (UTC)[reply]
Forgive me, in what theoretical scenario of a discussion (vs a notification) of something going down on a talk page of an article, related to content, that is discussed on a noticeboard... where the article/talk participants shouldn't be made aware of the bridged/side discussion? — Very Polite Person (talk/contribs) 16:14, 11 June 2026 (UTC)[reply]
For example, where you have reasonable grounds to suspect that a high proportion of the article/talk participants have a point of view or what Wikipedians miscall a "conflict of interest".[1]S Marshall T/C 17:46, 11 June 2026 (UTC)[reply]
[2] A high proportion of the article/talk participants or just a single person who is bludgeoning the discussion by posting many more comments than anyone else on the talk page? --Guy Macon (talk) 03:48, 18 June 2026 (UTC)[reply]
I don't know that I would go so far as to say that article/talk participants shouldn't be notified. But I see a material difference between posts that reduce to "What should we do with this article?" versus "Please take a look at this". In the first instance, where there is a substantive discussion about the article content that generates (or may generate) a decision about how to edit the article, I think there should generally be a notification on the talk page of the article in question. In the case of "Please take a look" where there is no additional discussion or decision making, I think it is probably best to leave a notice but is less important. And even in the first instance ("What should we do?") there probably needs to be room for discretion and practical consideration. For example, a discussion on the talk page of a WikiProject or policy or guideline may impact dozens, hundreds, or even thousands of pages and notifying each and every one may be impractical. —Myceteae🌈 (talk) 20:06, 11 June 2026 (UTC)[reply]
  • It's worth pointing out that we do already have policies and guidelines against non-neutral notifications, I think. I wrote an essay about a particular form of this I sometimes see, WP:MOTTE; in general I feel that part of the solution is for experienced editors to be cautious when they see someone (especially another experienced editor, who ought to know better) asking a question which on the surface seems to have a very obvious answer. I've often found that when I see that and I dig into it, it turns out that there's a much more complicated underlying issue. That said, I don't think notification would help - the real problem with such discussions is that they often end up wasting people's time to no end, because when eg. the editor who brought the discussion over whether something was WP:DUE to RSN triumphantly returns to the article where there was a conflict and says "look, look, RSN said my source is reliable!" the person they were in dispute with will just go "...yesssss, but my objection was that this particular use of it was undue?" And then the dispute continues with that digression having accomplished nothing. Such interactions frustrate everyone involved but a notification wouldn't usually solve it. --Aquillion (talk) 03:27, 18 June 2026 (UTC)[reply]

Imagine that you run into a situation where an article talk page has huge problems -- an editor bludgeoning the discussion, a couple of editors flinging insults at each other, someone pushing a company, religion, or political view, or maybe just a bunch of well-intentioned new editors who think "reliable source" means "source that agrees with me."

In the middle of this huge fight you see something that makes you think "Hmmm. Is that source reliable for that claim?" so you pop over to RSN for a calm, reasoned discussion with a group of experienced editors who understand the subtleties of evaluating sources. In some cases inviting the disruptive editors who are tearing up an article talkpage to come on over and pull the same shit on the noticeboard is a very bad idea.

There is a benefit to having a calm discussion with editors who understand sourcing. Or COI. Or fringe theories. --Guy Macon (talk) 07:58, 12 June 2026 (UTC)[reply]

  • It's good practice to notify relevant pages, whether that's an article's talk page, a WikiProject, or any other place that's relevant, but I don't think this should ever be made a requirement. Not only does this not fit with other expectations (there are very few places that say editors 'must' do something), but it would likely be open to abuse and wikilawyering over what should have been notified. -- LCU ActivelyDisinterested «@» °∆t° 17:14, 14 June 2026 (UTC)[reply]
    Yeah… I worry that if we make notification mandatory, wikilawyers will try to “negate” a consensus at a noticeboard on the technical grounds of “bad consensus - no notification”. Blueboar (talk) 10:25, 16 June 2026 (UTC)[reply]
    Same, I worry that it will cause more disputes and repeat or prolonged discussions. —Myceteae🌈 (talk) 17:11, 16 June 2026 (UTC)[reply]


References

  1. ^ What Wikipedians call a "conflict of interest", isn't. Wikipedia is not your client. It is not your customer. You do not owe Wikipedia a fiduciary duty. You do not owe Wikipedia a higher duty than you owe to your employer, or employees, or your country, or your monarch or republican equivalent. You have a basic moral responsibility to tell the truth, and we'd prefer it if you owned up to your financial interests or biases, but that's laughably far from adding up to a "conflict of interest" situation.~~~~
  2. ^ I object to the above editorializing in a ref. I think you should express your opinions in inline signed comment just like everyone else. Using a ref gives one voice undue prominence. --~~~~ Plus, the signature function doesn't work.

Recent changes to WP:LLM are too restrictive and actively harm some parts of Wikipedia

A bit of background: I primarily edit medicine articles on Wikipedia. There are not many people who are regular contributors to WP:MED, only ~100-150 in the past month, but these pages are relied on by countless thousands of users. There is a need to keep them updated, high-quality, and well-supported, but not enough qualified people to make those edits.

LLMs are a force multiplier. They allow us to conduct literature reviews, draft text, and produce high-fidelity articles that are both scientifically accurate and readable by a general audience. They allow us to make meaningful expansions to a section or page in under half an hour, on a lunch break or a day off. And there is a way to do this that is quite productive: feeding the LLM academic articles, and instructing it to use that text to add the type of content you want. Let me give you an example. This diff, which I have self-reverted after reading the recent LLM policy update, was the product of feeding Sonnet 4.6 (a robust flagship LLM) the cited article and instructing it to add content to the existing version of the pathogenesis section. As a result, it was a major expansion of the section, completely compliant with the source. I believe it is very hard to claim that this diff was detrimental to the page or non-compliant with Wikipedia policies other than the policy preventing LLM usage wholesale; and since that non-compliance is the ostensible reason justifying the blanket ban, I think this diff also demonstrates why the recent version of the policy needs major revision.

I also acknowledge the problem of slop. This affects medicine pages too, but I'd bet it is a bigger problem on articles that are less technical in nature, since people tend not to feel qualified to edit technical articles on unfamiliar subjects. This serves as the foundation for my second idea, outlined below.

To find a more balanced equilibrium, I have the following (draft) proposals, which at this time are generalized ideas on how to edit the policy, and which I think could be discussed and built into something more concrete. I am also willing to volunteer my time to create prompts, rulesets, etc as needed, if some or all of these were to go through.

  1. Create an official Wikipedia prompt for LLM use in editing. System prompts are guidelines for LLMs that have high impact on how they function, and which are rarely if ever violated by the LLM. A robust system prompt can implement guardrails against all specific objections I have seen in previous RfC discussions about allowed LLM use. This would be a major project, probably involving multiple people who are familiar with both LLM functionality and Wikipedia editing principles.
  2. Restrict LLM editing usage to certain domains and/or projects on Wikipedia. WP:MED is the one I'm familiar with, but I'd imagine that many other scientific domains will benefit in similar ways. This may be a good option because people making large contributions to these pages already tend to abide by Wikipedia policy, have higher standards for references, and are capable of cross-checking output text against peer-reviewed articles. This could also be taken a step further, by only whitelisting active and listed participants in scientific WikiProjects; I dislike this sort of gatekeeping, but I think it could assuage some concerns.
  3. Expand allowed forms of LLM editing to allow LLM-generated content under specific conditions. The primary condition I'd propose is the one seen in the diff I linked above: already having found an appropriate reference, feeding it to the LLM with the intent of making specific directed edits based solely on the contents of that reference document. I'd also like for literature searches to be explicitly allowed (I think the current guidelines allow for this, but it's not clear).
  4. Limit LLM editing to specific models. I would propose these be Claude Sonnet 4.5+, Claude Opus 4.5+, and GPT-5.4+ (note: this is not the same as ChatGPT, which is a different OpenAI model). I don't think it's necessary or appropriate to exclude GPT-5.X (non-pro) and Claude Sonnet, since beyond this point cost becomes prohibitive and the differences in quality when references are provided are negligible at best. This already represents a massive step up above the models that are most commonly used for editing on Wikipedia.


I don't think I'm ready to formulate these into a concrete and actionable proposal yet, and would like to get some additional input before doing so. My intent is to create a framework that helps people to use LLMs for appropriate, beneficial, and policy-compliant Wikipedia page editing and creation, with well-elaborated rules that establish clear lines between what is and is not acceptable use. As it stands, the policy is overly restrictive and is detrimental to technical articles. Just-a-can-of-beans (talk) 20:09, 13 June 2026 (UTC)[reply]

was the product of feeding Sonnet 4.6 (a robust flagship LLM) the cited article and instructing it to add content to the existing version of the pathogenesis section. Why not just feed the LLM the articles and ask it to generate notes with page citations for yourself so that you can then read those particular pages and write your own NPOV and MEDRS compliant content? voorts (talk/contributions) 20:58, 13 June 2026 (UTC)[reply]
Or, you can take the LLM's first draft, verify it's true, and then write your own version. voorts (talk/contributions) 21:00, 13 June 2026 (UTC)[reply]
Basically it's a time concern. I'm busy. Most people qualified to write and edit medical pages are busy. If I can find it a good source, feed it the PDF, and then just verify that all the information matches up with the source, this saves me a lot of time compared to typing it all up myself.
My fundamental concern is that something that can be used as a tool in this way, to massively expedite editing, is just wholesale banned now. Just-a-can-of-beans (talk) 20:44, 17 June 2026 (UTC)[reply]
Re: [Slop] affects medicine pages too, but I'd bet it is a bigger problem on articles that are less technical in nature, since people tend not to feel qualified to edit technical articles on unfamiliar subjects. and the proposal to restrict this to MED and similarly technical topics – I'm skeptical of the premise. One of the problems with LLMs is that they facilitate the generation of massive amounts of superficially good-sounding text by people who lack the domain expertise (or writing proficiency) to effectively evaluate the end product. The LLM ban maintains the barrier to entry that, in your experience, is what keeps problematic contributions to a minimum. I also don't agree that it's true in the first place that MED articles don't attract problematic content, whether LLM-generated or otherwise. This is probably true for a rare disease like adrenoleukodystrophy but many alternative medicine and general medical or scientific topics are magnets for fringe claims. Many topics and articles that are not strictly under the MED umbrella contain biomedical information that is subject to higher WP:MEDRS standards. This includes many hot button "culture wars" topics that attract readers and editors with a wide range of backgrounds and motivations. I do think you raise some valid issues and I've not yet thought through the rest of it. I have some concerns about over-broad bans restricting quality contributions but I also understand that LLM-generated content is flooding the site and that creating too many loopholes or exceptions creates additional problems. —Myceteae🌈 (talk) 21:01, 13 June 2026 (UTC)[reply]
Sonnet is no longer a "flagship" LLM, that's two public Anthropic models old (Opus, Fable, 3 if you count Mythos). Levivich (talk) 21:03, 13 June 2026 (UTC)[reply]
  • Just running this up the flagpole to see if anyone salutes (might be a terrible idea); what if we gave certain trusted editors permission to use AI, maybe after some training or perhaps passing a test? Take the permission away the first time they screw up. Maybe make them post the first N AI edits as a draft in userspace and ask someone else to review it and post it? --Guy Macon (talk) 21:44, 13 June 2026 (UTC)[reply]
    Not saluting. What trusted editors can do is use LLMs for their behind the scenes research… this is OK because we trust them to double check what the LLM generates, and then take what they have checked and re-work it into something appropriate for WP. Blueboar (talk) 22:15, 13 June 2026 (UTC)[reply]
    My understanding from the newest version of the guideline is that this is no longer allowed. I'd also argue that this isn't really how many would prefer to do things; it's relatively quick and easy for many medical professionals to find an academic source and read it and understand it, but it is quite time-consuming to actually type up an entire section. Hence why major medical pages can go years and years without significant contributions. Just-a-can-of-beans (talk) 20:42, 17 June 2026 (UTC)[reply]
    Absolutely not. JoelleJay (talk) 13:20, 14 June 2026 (UTC)[reply]
    Care to elaborate? Just-a-can-of-beans (talk) 20:39, 17 June 2026 (UTC)[reply]
    Is that the full version of the flagpole expression? I've only ever heard the "run this up the flagpole" part before, and interpreted it like "run this up the totem pole" as in "run this upstairs to the higher-ups". Very interesting. ~2026-35116-29 (talk) 09:36, 15 June 2026 (UTC)[reply]
    Yeah, seeing who salutes, or if anyone does, is the standard full formulation. Dates back beyond this 1995 example. -- Nat Gertler (talk) 11:22, 15 June 2026 (UTC)[reply]
    Of course we have a run it up the flagpole article. DMacks (talk) 13:38, 15 June 2026 (UTC)[reply]
    Shit flow diagram and Malacca dilemma were both primarily written with an LLM by a (I assume) trusted editor. There's a lot of work that goes into verification and such, but it is possible. Shit flow diagram is better than what I threw together years ago, and Malacca dilemma is certainly better than nothing at all. ScottishFinnishRadish (talk) 11:27, 15 June 2026 (UTC)[reply]
    This is similar to the second proposal in my initial post, which I don't agree with from an ethical standpoint because I dislike gatekeeping. But for both that and your suggestion, I do think they'd be effective if the broader community found them acceptable. Just-a-can-of-beans (talk) 20:38, 17 June 2026 (UTC)[reply]
  • Wikipedia must maintain high standards where possible. LLM use is a way to degrade and join the morass of AI slop, garbage and misleading text that is already prevalent now in journalism and on the internet. We need our editors to take full responsibility for the content. If LLM output is thoroughly checked by the contributor then it could be used. Our problem is when LLM output is not checked and the problems slip past. Graeme Bartlett (talk) 22:56, 13 June 2026 (UTC)[reply]
    Alas the problems are likely to "slip past" in far too many cases. LLM output is typically far too large to be totally checked by humans. That is the whole point of LLM use: outrun humans. I would prefer policies that are extremely strict against LLM use. Yesterday, all my dreams... (talk) 15:45, 17 June 2026 (UTC)[reply]
    This is completely untrue when using a flagship LLM under a properly constructed system prompt's guidance. This is applicable only to basic consumer-facing chat models that are not designed for professional use. I understand that many people do not know the difference, which is why one of the purposes of the proposals is to create a well-defined framework. Just-a-can-of-beans (talk) 20:35, 17 June 2026 (UTC)[reply]
    I mean yeah, that's basically what I'm proposing here, a set of guidelines for responsible use, not a free-for-all. LLM use can be used to make slop but it can also be used by responsible professionals to produce more work, at higher quality. Just-a-can-of-beans (talk) 20:32, 17 June 2026 (UTC)[reply]
    Wikipedia does not have high standards; this is a false premise. 95% of the articles on this site are terrible. I agree that GA and FA should have high standards. Czarking0 (talk) 05:31, 19 June 2026 (UTC)[reply]
  • While I have my reservations regarding the proposed policy (for example, the vast majority of regular LLM users will not be the kind of power users knowledgeable enough to configure a system prompt, and a much clearer line is needed to stem the tide), a more fundamental disagreement lies in what Wikipedia itself means, in the future. Our decision to ban LLM content was very positively received, and it is becoming clear that a Wikipedia by humans, for humans is desperately wanted, especially as people are increasingly skeptical of the "AI revolution" accentuating divides and making tokens the world's new currency. Like many others, I don't want a future where meaningful Wikipedia editing is contingent on having enough to spend on generating content at scale. Let's see the bigger picture here. Chaotic Enby (in solidarity · talk · contribs) 23:56, 13 June 2026 (UTC)[reply]
    Keep in mind that this whole business of "pay one of six giant companies a boatload of money to run crappy AI on giant racks of servers in data centers" will inevitably turn into "pay the price of an evening out on a computer you own that runs pretty good AI free in your bedroom forever" and those six companies will collapse. I was there when the company I was working for bought a Wang Word Processor (see Wang Laboratories#Word processors). It was very expensive, but so was retyping entire documents on a typewriter, which is what it replaced. Anyone reading this editing text on a Wang? Anybody? We will need to rethink things as the technology changes. --Guy Macon (talk) 01:15, 14 June 2026 (UTC)[reply]
    On the other hand… I can remember when the laser disk was being touted as future of home entertainment. LLMs may turn out to be a dead end… tech nobody wants. Blueboar (talk) 01:37, 14 June 2026 (UTC)[reply]
    The only way LLMs end up that way is if an even better AI platform supplants them, in which case we will be debating the use of that platform on Wikipedia. BD2412 T 12:18, 14 June 2026 (UTC)[reply]
    In time most technologies will be improved or superseded, but may have a long life. Newtonian mechanics ideas were considered flawed by relativity theory but are still used in every car on the street. From a practical point of view LLM techs have gathered so much investment that their momentum is hard to ignore, despite their amazing computational inefficiency. So let us not consider them dead while they are still breathing. They will be a problem for crowd sourced efforts for sure. Sigh... Yesterday, all my dreams... (talk) 15:38, 17 June 2026 (UTC)[reply]
    Laserdiscs (off-topic perhaps) were superseded by DVDs, which in turn have been superseded in popularity by digital streaming. Unsure about downloads, however. George Ho (talk) 19:31, 15 June 2026 (UTC)[reply]
    Truthnuke. jp×g🗯️ 08:05, 16 June 2026 (UTC)[reply]
    I disagree with this fundamentally. It should not matter how the content is written, all that matters is how strong Wikipedia can be for the reader. If taking a stand about subjective personal ethics means a measurable decrease in overall utility of Wikipedia, then it could be said that these ethics are harmful to Wikipedia.
    Further, I find the line of thinking about cost and access to be inherently invalid. Nobody is forced to use these tools - the idea is just to create a tightly regulated pathway for those who do wish to use them. The decision to ban LLM content was very positively received... by those who participated in the discussion, who are largely those users who care about LLM use in the first place. In other words, there is a strong element of response bias at play. That being said, it's not an unpopular opinion by any stretch, which is why I'm requesting here something that would allow only a limited form of use, and why I've sought feedback here.
    So with all that in mind, I do have a question for you: in what way does the LLM ban actually improve Wikipedia? Just-a-can-of-beans (talk) 20:00, 17 June 2026 (UTC)[reply]
    Your priorities are mixed up here: the argument is not whether LLMs could help Wikipedia, the argument is whether they could harm Wikipedia. While this sounds like semantics, these are very different arguments. This is a tad bit extreme of an example, but I feel it helps illustrate this point. Sure, an axe murderer killing a certain man with a mustache who failed art school would absolutely not be condoned if the public knew what he was to do eventually, but I still don't think we condone axe murdering. AI, while with some human help, can produce good content, can also do the opposite just as much if not more frequently when it comes to BLP and CTOP. In a perfect, idealistic world, using an LLM would be a-ok on Wikipedia as everyone would make sure the content it provided was good, but we are not in that world. Even the good AI-generated content generally takes more time to convince the AI to write factually without errors than to just write it yourself. AI also does not source information well, or sometimes at all, making this take even longer. Sure, if 25% of AI content added to Wikipedia is good (and that's an unrealistic number; Wikipedia is drowning in bad hallucinated AI edits), that's all fine and dandy, but if the rest is bad, then sometimes we simply have to make a blanket rule in hopes of catching as many errors, vandalism and what-have-you on Wikipedia as we can. If we analyzed every AI edit on Wikipedia, we would never get a thing done around here. Ilov3gam3z (talk) 20:19, 17 June 2026 (UTC)[reply]
    Should a medicine be banned because a small fraction of people get severe side effects or an allergic reaction to it? The argument is about the net good or bad to Wikipedia. I contend that there is a very strong potential for good, and when used appropriately, a very weak (if present) potential for bad. If you are about to argue that people would not follow these guidelines, this is a fallacy, because they're not going to follow the existing wholesale ban either. The ban only affects editors who care enough to read and abide Wikipedia policies and guidelines.
    I don't know if you've actually read through my proposal to be honest, because it directly addresses most of what you've written here, and other things (such as bad sourcing habits) are irrelevant to the proposals I've made, which are specifically in a framework of not allowing the LLM to source material. Just-a-can-of-beans (talk) 20:30, 17 June 2026 (UTC)[reply]

    "a small fraction of people get severe side effects"

    This is, wrong? Like literally just wrong. This is not a 'small fraction' in any way whatsoever. I would say when it comes to premium LLMs, they are still not good enough to evade these problems. While topics with immense amounts of research and coverage, sure, can be written about fairly decently by AI (even flagship), niche topics (which are a good bit of Wikipedia, to be honest) have been a pain point with AI for quite some time, and top-of-the-line models still have problems with this exact issue. This whole 'flagship vs non-flagship' argument is not a good train of logic, as all AIs are fallible (and much more fallible than humans), but they are especially fallible when a topic has less information on it online. No flagship, nor currently existing LLM, can evade this issue. Also, this aside, one of the main issues is aesthetics. AI writes very differently from humans, and this can be a turnoff for many people. One of those is myself. AI uses words people don't use very often, techniques and sentence structure that people also don't use very often, and in general something written by an LLM sounds very AI. Yes, all the techniques used by AI are good and together are better, but no person writes THAT well, and uses THOSE rare techniques that much. It becomes an uncanny valley of sorts. Even your example diff has this clear problem, I would instantly flag that article as written by an LLM, it reads like it. Also, I know you are responding to everyone at once, but I feel you are not adequately reading and addressing the concerns of all the people against this, which let's face it: is the majority. Some of your responses come off as genuinely hostile and not in good faith. Remember WP:BATTLEGROUND. I would advise you (not trying to be condescending here) to take a step away and take some time to read and maybe edit your proposal to try and address the concerns. Ilov3gam3z (talk) 21:46, 17 June 2026 (UTC)[reply]
    This is a terrible argument. I would replace LLM with IP editor in your reply. Czarking0 (talk) 05:33, 19 June 2026 (UTC)[reply]
    There is probably some response bias at play. Sure, editors who care the most about this have been most involved in crafting the current guideline and are most likely to be interested in this discussion. My observation, as someone who does not spend a lot of time on LLM problems, is that many editors have objected to such broad bans. Over time, as the AI problem has grown, and as attempts to define more detailed rules and exceptions have failed due to concerns about implementation and interpretation, the community has come around to accepting this broad and straightforward ban. I'm someone who has been, and remains somewhat, uncomfortable with it, for reasons similar to yours. But I accept that the current ban reflects consensus, meets a pressing community need, and that alternative proposals to date do not address the problems. Several editors have provided substantive critiques of the details of your proposal, which you have not engaged with. Dismissing every single one of us as biased as though we speak with a single voice is neither accurate nor constructive. —Myceteae🌈 (talk) 20:38, 17 June 2026 (UTC)[reply]
    In fairness to OP, it looks like you've just returned to this thread and started responding. But still, I find fault with the blanket characterization in the response I replied to. —Myceteae🌈 (talk) 20:43, 17 June 2026 (UTC)[reply]
    I think you have some wisdom here. As someone who was opposed to the LLM ban and did not even see the discussion for some of the consensus, I believe a minority of the active editors care a lot about banning LLMs. In part they want the ban for the harms LLMs bring to wikipedia, but they also want the ban because they are opposed to LLMs in every aspect of their life. They constantly started new and very lengthy discussions that most of the editors are not really interested in participating in and they would have continued to do so until they got the total ban.
    The fact that we ban LLMs but not paid editing goes to show how much less the current generation of policy leaders is willing to think about nuance in our policies than the previous ones were. Czarking0 (talk) 05:39, 19 June 2026 (UTC)[reply]
    I, too, missed the discussion which resulted in the current version of the guideline. Part of me was surprised to see it had gone live, given the opposition I have seen (and offered) to such bans in the past. But perhaps it was inevitable. I am sympathetic to the editors who spend a tremendous amount of time cleaning up after AI crap that floods the project. In theory there are acceptable use cases even but I understand that the sheer volume of bad content and the work required to review, verify, and fix it makes a broad ban much more workable. I'm at a state of uneasy acceptance. —Myceteae🌈 (talk) 02:31, 20 June 2026 (UTC)[reply]
  • "I'd bet it is a bigger problem on articles that are less technical in nature, since people tend not to feel qualified to edit technical articles on unfamiliar subjects" doesn't gel with how llms seem to be used. People make the most use of them in topics they don't understand, relying on the llm to do the understanding. The promoted example of WMF's own experiment with llm text in mainspace was a MEDRES article. CMD (talk) 02:22, 14 June 2026 (UTC)[reply]
    At the moment LLM issues are considered "problems" but in time will become "nightmares" for crowd sourcing. We must take a very strong stand against them, unless we all like nightmares, I do not. Yesterday, all my dreams... (talk) 15:56, 17 June 2026 (UTC)[reply]

As it stands, the policy is overly restrictive and is detrimental to technical articles.

  • I appreciate your concerns about this "policy" (ahem... guideline), and I admire your acknowledgment of the issues with AI slop. Nonetheless, the consensus overwhelmingly favored extending the rule from only new LLM-generated articles to rewriting an existing one with LLM-generated content. I even favored extending the scope of the rule. I dunno how many threads like this we will see in the future, but I can't stop them until this matter is considered a perennial issue. Better yet, we're still on a very long road until the time the consensus changes its mind about LLMs. —George Ho (talk) 03:10, 14 June 2026 (UTC)[reply]
Oh, and you've contributed to medicine-related topics, right? Even an LLM may potentially produce inaccurate info, especially about a related contentious topic, like complementary and alternative medicine and COVID-19. George Ho (talk) 03:21, 14 June 2026 (UTC)[reply]
I've observed that both Claude and GPT have strong safeguards in place about these topics specifically, but the crux of the argument is still valid. Hence why I proposed a source-provided model of text generation. If the model is given 1-3 high quality academic source PDFs and told to construct a particular section of a particular page, using only information which is taken directly from those source articles, it will avoid these things. The models are pretty good at following directions when you instruct them to avoid drawing any conclusions or relying on any information outside of source articles. Just-a-can-of-beans (talk) 20:25, 17 June 2026 (UTC)[reply]
I'm not convinced that creating official system prompts or whitelisting models is the right fix. System prompts might help the AI fit in, maybe even get it to use higher quality sourcing, but they never stop hallucinations and inaccuracy. Limiting LLM use to specific topics does nothing either: AI can and will introduce inaccurate info no matter the topic, and most people won't check, even on medicine articles. I do agree that the current AI guidelines and rules are too restrictive in some ways, but I do not think this is the way to fix it. I find it acceptable when LLM output actually is checked by someone knowledgeable, but most of it isn't. Like Graeme Bartlett said, Wikipedia must maintain high standards where possible. This proposal does not address these issues. Luka Maglc (talk) 06:50, 14 June 2026 (UTC)[reply]
What do you think of a guideline update that specifically allows text generation or editing based on a provided (user-sourced) academic source? With explicit instruction to only utilize information available in the document. It seems like this would address a lot of the other concerns I've seen in these comments and I'm curious what you think of the idea, since you seem to have a good understanding of LLM-related concepts and limitations Just-a-can-of-beans (talk) 21:14, 17 June 2026 (UTC)[reply]
One of the problems is the particular type of hallucination; an experience from work had an LLM cite a paper. Correct paper, incorrect author list. Unless you know about the field, those sort of errors can be hard to catch. Red Fiona (talk) 07:43, 14 June 2026 (UTC)[reply]
Comment: The best way to use AI I've found is as a reviewer. A prompt I've used is "You are an expert of ______ known for being harsh, precise, and demanding. You are asked to peer-review the following text: ____." Another is "You are an expert of _____ asked by a colleague to proof read their writing. Review the following draft and provide recommendations for improvement: ____." I use these as a filter before sending text off to be proofread by a human. I've found that changing "a colleague" to "your student" sometimes gives better results, but not always. This is the kind of use I can see AI being helpful with on Wikipedia, not generating original text. Fundamentally, Wikipedia was likely used to train most if not all of these models, so we're just feeding Wikipedia back into itself if we use it too liberally to generate new content. GeogSage (⚔Chat?⚔) 08:48, 14 June 2026 (UTC)[reply]
You'd be surprised, the top models have actually been fed entire academic databases. I've seen Opus accurately cite obscure medical information with correct DOI links and everything. Still iffy on the cites outside of that anecdote, but the actual knowledge base is very solid Just-a-can-of-beans (talk) 21:10, 17 June 2026 (UTC)[reply]
No doubt, I've seen it give some impressive suggestions for citations using it as a "reviewer." I've also seen it hallucinate sources, or make up stuff and misattribute it. The problem is that you need an expert human in the loop to catch some of these mistakes. It isn't really a huge time saver if you need to rely on accurate outputs, it can be useful to polish existing work though. Like, if you write something organically, asking AI about what could be improved isn't going to hurt anything (worst case, you ignore the suggestions) as long as you don't take the suggestions at face value. GeogSage (⚔Chat?⚔) 05:18, 18 June 2026 (UTC)[reply]
And then we must consider other editors' time required to review and possibly clean up any such contributions. Another editor below raised the case of an editor who was viewed (not without controversy) as generating high quality, acceptable LLM contributions to a large number of articles in under-covered technical fields. It proved too good to be true and has created a massive amount of work for the community. —Myceteae🌈 (talk) 15:29, 18 June 2026 (UTC)[reply]
That will be a problem beyond Wikipedia. Self published (or low quality) books or blogs with LLM content will grow like mushrooms, and will become input to other LLMs. How do you spell nightmare? Yesterday, all my dreams... (talk) 06:10, 18 June 2026 (UTC)[reply]
NO. LLM-generated content, especially medical content, should never be permitted here under any circumstance. I can't believe this is even a discussion. JoelleJay (talk) 13:26, 14 June 2026 (UTC)[reply]
+1
I don't have much to say here, especially since your comment basically has most of what I would want to say, even if worded a little strongly. - BlueEleephant (talk · contribs) 03:38, 15 June 2026 (UTC)[reply]
I agree, "no no no" to LLM use. Yesterday, all my dreams... (talk) 15:47, 17 June 2026 (UTC)[reply]
It's not very helpful to make such a strong statement without providing any reasoning. The fact that others agree to it, also providing no reasoning, may demonstrate that community consensus in this case is a product of hysteria. Just-a-can-of-beans (talk) 20:05, 17 June 2026 (UTC)[reply]
The reasoning should be obvious from what others have said here and from the many other discussions the community has had on genAI. We went two decades without having LLMs, we don't need them now. So what if it hypothetically takes us longer to write quality articles ourselves than if we were assisted by AI? I'd rather that be the case than a single LLM-generated article come out where we must trust that the "author" had sufficient expertise (and for very technical medical topics that would mean an MD or PhD with several years of postgrad experience in that area(*)) to evaluate the output. What is much more likely to happen (and is already happening) is editors who don't have the necessary expertise and thus historically didn't have the ability to write efficiently on a certain topic would nevertheless feel qualified "enough" to guide an LLM-generated summary. And then, because it's a niche topic, it doesn't get evaluated by an actual expert and any errors subsequently get consumed and propagated by other LLMs. And the time the editor would have spent painstakingly understanding the topic enough to summarize it themselves will instead be put towards more lazy LLM-generated creations that never get properly validated.
(*)In some previous thread I linked to a study that examined how well medical trainees/professionals were able to detect errors in various different AI-generated diagrams of the heart. IIRC the results were that med students, nurses, and interns/residents gave a passing accuracy grade to some highish (20-50+) percentage of diagrams, attendings in adjacent but non-heart-specific fields passed fewer, and experienced cardiologists passed almost zero. JoelleJay (talk) 16:30, 18 June 2026 (UTC)[reply]
So the detection of AI errors clearly relates to the expertise of the human evaluator. Not surprising but a source would be good. Would you like to add this source to the "top 10 list" below? Thanks. Yesterday, all my dreams... (talk) 20:36, 18 June 2026 (UTC)[reply]
This is the study I was referencing. It's on AI-generated diagrams of various congenital heart defects. From the abstract: The nurses and medical interns were found to have a more positive perception about the AI-generated cardiac images compared to the faculty members, pediatricians, and cardiology experts. From the body: Medical students, interns, and residents were significantly more likely to perceive the images as anatomically accurate, find the illustrative text useful, and consider the images both usable for medical education and visually appealing compared to other evaluators (p-value < 0.001) Nurses found the images notably more attractive and useful for medical education, and they also rated the accompanying text as highly useful, compared to other groups of evaluators (p-value < 0.001). [...] Conversely, the cardiology experts were significantly more inclined to perceive the images as (inaccurate, not attractive, not for medical education and their illustrative text being not useful) compared to the other evaluators. [...] In comparison to cardiology experts; the nurses perceived images more positively with [significantly] the highest relevance score compared to cardiology experts (34.1% higher p < 0.001), followed by medical students/interns/residents (26.6% higher p < 0.001), then faculty staff/academician (15.5% higher p < 0.001) and pediatric consultant/specialist (14.5% higher p < 0.001). JoelleJay (talk) 14:00, 19 June 2026 (UTC)[reply]
Do you have any GA on medical topics ? Czarking0 (talk) 05:42, 19 June 2026 (UTC)[reply]
I also acknowledge the problem of slop. This affects medicine pages too, but I'd bet it is a bigger problem on articles that are less technical in nature, since people tend not to feel qualified to edit technical articles on unfamiliar subjects.
No, it's the exact opposite; on less technical topics, the (inevitable) errors and source-to-text integrity problems are easier to identify and fix than topics that require greater subject matter expertise to do so. I don't even know if there's more than a handful of people's worth of overlap between people knowledgeable about medical topics and people who do AI cleanup. Gnomingstuff (talk) 05:41, 15 June 2026 (UTC)[reply]
One question about LLM use that I have always had is what degree of reliability folks want from a LLM before allowing it to edit/rewrite an article. Errors aren't a purely machine phenomenon, humans do them too. I know about the scale issue but scale in the past has been an issue with human editors too. Jo-Jo Eumerus (talk) 12:54, 15 June 2026 (UTC)[reply]
The problem isn't errors. The errors are a symptom of the real problem, which is that LLMs (fundamentally, unchangeably, by nature of their architecture) can never be reliable. They can never stop lying. They can never notice when two things they say back-to-back directly contradict each other. It's absurd to trust such a machine with any task that relies on verifying that written words align with sources.
I just recently responded to an AI-generated query by a user which stated that they were aware their draft was based entirely on primary sources and therefore wasn't fit for mainspace, and then just a few lines later asked for an experienced editor to check whether they had sufficient secondary sources to publish the article to mainspace. There's no logical component that allows the LLM to realise it just said the exact opposite thing. It's a glorified text-prediction algorithm. That's why there's all those viral clips of people asking it to count to 100 and it says "1, 2, 3, 4... all the way to 100" and then tries to gaslight the user into thinking it counted all the way (or it apologises, says it'll really do it this time, then does the exact same thing) Athanelar (talk) 06:22, 16 June 2026 (UTC)[reply]
The errors are in fact the problem, as one can see from pretty much every discussion on AI/LLM. "LLMs (fundamentally, unchangeably, by nature of their architecture) can never be reliable. They can never stop lying. They can never notice when two things they say back-to-back directly contradict each other. It's absurd to trust such a machine with any task that relies on verifying that written words align with sources." I definitively don't think that this is true at all, and viral clips are not proper evidence that LLMs/AIs are inherently error-prone, let alone more so than human editors are. Wikipedia has been putting up with human fallibility for decades. Jo-Jo Eumerus (talk) 12:20, 16 June 2026 (UTC)[reply]
I think it's partly true and partly false. Whether hallucinations are an inherent and unavoidable aspect of LLMs is currently a hot topic, and there is plenty of research suggesting this, eg. [1] [2] [3] [4]. However, "They can never notice when two things they say back-to-back directly contradict each other" is false and easy to disprove: just give an LLM two contradicting statements and ask it if they contradict; at least some (if not most or all) of the time, it will correctly identify contradictions. Also, anyone who uses Claude Opus knows that it regularly spots mistakes in its own responses and self-corrects. Not always, maybe not even "often enough" but to say "they can never" do it is just not true. At least some of them do it at least sometimes, and there is plenty of work underway to attempt to improve this aspect. Now per the above research, some say that meaningful accuracy is unobtainable. I don't know if that's true or not, it remains to be seen. Levivich (talk) 16:07, 17 June 2026 (UTC)[reply]
As someone who has used a good few AI models (Claude, ChatGPT and Gemini, mainly for tinkering and my own curiosity to push their limits) for a decent period of time, other than what Anthropic is doing (which is an anomaly when it comes to model quality), AI (and again, only in my experience) has gotten worse. Like, significantly worse. While modern AI models have gotten better at sounding less-AI and more human, Gemini specifically constantly conflates information that is unrelated with each other, says blatantly incorrect outdated information as fact, and if you are having a niche problem, just gives you patent unsourced falsehoods, and ChatGPT is just a joke now. Most decent LLMs are either locked behind a paywall (which many will not pay for, including me) or just don't exist (especially in Gemini's case). I am actually decently fine with Claude though, they are a major exception to the rule and the only slightly moral AI company I have seen so far. Ilov3gam3z (talk) 16:32, 17 June 2026 (UTC)[reply]
Have you really tried all the old models, though? Because I'm surprised that's your take-away. I think objective benchmarking, and pretty much all reviews I've ever read, support that new AI models are better than their prior iterations.
ChatGPT 4 was way better than 3 -- wouldn't you agree? And 4o hallucinates less than 4. 5 even less than 4o, plus there's much less sycophancy. It still hallucinates, too much for my liking, but it's better than 3 or 4.
As for Claude, Sonnet is way better than Haiku. I mean, just ask those two models the same question, and I think you'll agree that Sonnet gives a better answer than Haiku. And Opus is even better than Sonnet. Opus 4 in particular seems like a big step over Opus 3 (or any Sonnet or Haiku) in both hallucinations and sycophancy. I was busy and missed the two days when I could have played with Fable, and obviously I've never tried Mythos, but the reviews are that they are big improvements.
And that's without getting into the advances of agentic AI. Claude Code and Claude Cowork are, in my view, amazing technologies that blow prior iterations out of the water.
Gemini, all versions, sucks, that much I agree with.
But I think a lot of the anti-LLM folks are judging based on the free version of Gemini and have never tried, eg, Opus 1M set on max. To be clear, even that still isn't good enough to write Wikipedia articles without careful human review. When I personally recently tested Opus 1M max on summarization, it failed miserably. But I do think they're getting better. We'll have to wait to find out if there's a ceiling and where it is. Levivich (talk) 16:42, 17 June 2026 (UTC)[reply]
Here's the thing; is the average Wikipedia editor going to pay for the better models? No offense, a very small portion of AI's userbase actually pays for the better models. The jump from ChatGPT 3 to 4 was definitely huge, but the jump from 4 to anything else has been relatively minor with the problems of AI still looming large. Claude is obviously an outlier here, but I can't imagine that Anthropic can keep this up much longer, as all the other AI companies at some point have had to scale back on free model quality, bake ads into the AI and much else. Also, users with AI subscriptions (specifically Claude) can use an amount of tokens much more expensive than the actual cost of the subscription. There is no way that they aren't operating at a cost deficit, and a pretty bad one, too. I personally predict that AI will get more powerful, but will become more expensive, and the free tiers will become even more of a joke than they already are. As most people use the free tiers, don't expect AI improvements to be really visible, as only some will have access to said improvements. On a side note, I am very skeptical of what Anthropic has said Mythos can do, it feels like a very baseless train of hype. Ilov3gam3z (talk) 17:05, 17 June 2026 (UTC)[reply]
I admit, I agree with every single word of that comment. You know, one thing that could happen but will probably never happen, is for the WMF to get free access to the highest quality paid tier AI for trusted editors, similar to the Wikipedia Library. Since the AI companies already pay the WMF for access to Wikipedia, you'd think a deal could be made for them to give us access to "the good stuff" for free. Never gonna happen tho, because I don't think the community wants it. Levivich (talk) 17:44, 17 June 2026 (UTC)[reply]
Oh, hell no. They trained on our content, and want to sell it back to us? Not a chance. SarekOfVulcan (talk) 17:46, 17 June 2026 (UTC)[reply]
S V, I agree with your sentiment, but the chance % depends on the decision at WMF. Do we have a say? Who knows. Yesterday, all my dreams... (talk) 18:58, 17 June 2026 (UTC)[reply]
I said give ("for free"), not sell (not that I expect that distinction will change many minds). Levivich (talk) 19:00, 17 June 2026 (UTC)[reply]
Flagship LLM makers pay licensing fees for academic content they use for training (though I think in some cases they received official permission for free). It varies a bit company to company, but both OpenAI and Anthropic take copyright and licensing seriously. Just-a-can-of-beans (talk) 20:51, 17 June 2026 (UTC)[reply]
pay licensing fees - Have we forgotten that one time a big-time AI company downloaded all of Anna's Archive, a huge book piracy site? Ilov3gam3z (talk) 21:57, 17 June 2026 (UTC)[reply]
I could see the WMF and maybe Anthropic or Google teaming up on something of the sort but personally I just wouldn't see the use case and would probably never use it. Sure, the train of logic that goes "if you can't beat them, join them" has some good applications, but this (and I am talking specifically about using AI while the platform is already being flooded by AI misinformation) is not really one of them. Fighting AI hallucinations with AI hallucinations is like fighting radiation with more radiation, not fighting fire with fire (which, misnomer, you CAN fight fire quite effectively with fire, but I digress). This is not to say AI has no okay applications at times: for all the hate on AI overviews in Google search (which is semi-justified, as when it comes to anything relatively niche, Gemini implodes), it does sometimes lead me to a Stack Overflow or Reddit post that does, in fact, solve the problem. I recently turned AI overviews off with extensions, though, as over time it kind of did this 'oh, it's really bad!' 'oh, it's really good!' 'oh, it's even worse!' thing. This does not mean I am for the proposal, though, this is simply just a bit of nuance. Ilov3gam3z (talk) 18:15, 17 June 2026 (UTC)[reply]
Usage cases might include research, checking scripts for bugs, proofreading, maybe some agentic stuff (eg for vandal or sock fighting). Levivich (talk) 19:03, 17 June 2026 (UTC)[reply]
As I have addressed under another post you made that says basically the same thing, source prioritization is relatively specific to Google Gemini, and I haven't seen any other western flagship have that problem for years. Just-a-can-of-beans (talk) 20:53, 17 June 2026 (UTC)[reply]
is the average Wikipedia editor going to pay for the better models?
They are not, which is why I suggested mandating the specific models to be used. In my opinion, anything below the specified models (Claude Sonnet or GPT-5.4+) is unsuitable even for text generation with a provided source. That being said, I noticed in your previous post that you have not actually used these models or other advanced ones. You may be surprised by how big of a step up they are. If we were to set a standard for Claude Opus or GPT Pro, these models are even capable of inline quality checks when instructed to do so and given a set of quality standards (which Wikipedia thankfully has in ample supply) Just-a-can-of-beans (talk) 20:09, 17 June 2026 (UTC)[reply]
The only models good enough to meet Wikipedia standards are the paid ones, and not everyone can pay. The amount of tokens that would be used if this was to be implemented in the way some have suggested (official Wikipedia AI partnership) would cost a lot to say the least. Also, I do not have a deep understanding of the paid models as I have not used them frequently, but I have used them on occasion. Ilov3gam3z (talk) 20:24, 17 June 2026 (UTC)[reply]
When used via API, the average prompt with Claude Sonnet or GPT-5.4 costs a few cents. As an example, the diff I linked in my original post, where it edited a section after analyzing an academic PDF, cost me about $0.09 with Claude Sonnet; I ran GPT-5.4 simultaneously (selecting the Claude output because it was better) and that one was even cheaper, at about $0.04. Paid, yes, but not as prohibitively expensive as you might expect. This was done via OpenRouter.
An official partnership like that wouldn't even really be about the token cost unfortunately, but rather about the training cost, which is very resource intensive. But that's really not necessary: a strong system prompt with a stock-standard model will produce the kind of result you're thinking about. There aren't many people out there capable of writing such a thing, I suppose, but I very much doubt I'm the only Wikipedia editor who knows how to do it. Just-a-can-of-beans (talk) 20:49, 17 June 2026 (UTC)[reply]
But we can't be sure an official partnership would happen. Going simply based on the idea of specific models w/ specific prompts, this would most likely require people to pay for these models themselves. Most people will not do this, and so I simply don't see the point. Ilov3gam3z (talk) 21:50, 17 June 2026 (UTC)[reply]
This would be an absolute nightmare. NO. First of all, AI still makes hallucinations, the idea that it has gotten 'better' is a lie, it can simply hide it better with dubious sources. Only allowing specific models would make yet ANOTHER backlog of work needed to be done by volunteers, as well as a 'Wikipedia prompt' being volatile and requiring constant changes (more work for volunteers). Also AI has a weird tone that the best prompt can't fully fix, that will sound weird enough to be a turnoff to a good many people. Just no. Ilov3gam3z (talk) 13:26, 15 June 2026 (UTC)[reply]
Wikipedia is for humans, by humans Ilov3gam3z (talk) 13:31, 15 June 2026 (UTC)[reply]
I haven't seen anyone else comment on this yet, but I'd also like to add that the writing style in the Claude contribution is a little out of step with encyclopedic tone. It's subtle, not glaring, and humans do write like this, especially those with academic backgrounds who are accustomed to writing in this style. —Myceteae🌈 (talk) 14:47, 15 June 2026 (UTC)[reply]
I do wonder if some of that is due to my own edits. I tend to over-clarify in my writing. I just felt that the initial output was a bit too technical for the average reader. Just-a-can-of-beans (talk) 20:19, 17 June 2026 (UTC)[reply]
Could very well be. It's difficult in articles like this, which are highly technical and where there is little "lay" literature on the topic. This is why it's a pervasive problem for human- and LLM-written articles on many topics. —Myceteae🌈 (talk) 20:41, 17 June 2026 (UTC)[reply]
I think you're erroneously assuming here that the Wikipedia userbase want to integrate LLMs into the creative process, and have simply been forced to put a pause on that due to the technical issues. I don't believe that to be the case.
As others have said, sentiments like "Wikipedia is by humans, for humans" are becoming more and more popular as the rest of the internet becomes steadily subsumed by identical-sounding AI-generated prose. Our goal is not to create the most comprehensive encyclopedia as fast as possible. Our goal is to create an open-source repository of human knowledge.
Just today I was googling to find some information about a problem and I found at least 5-6 articles from different sites all with the exact same heading format; "Problem X, What it is, the risks, and solution tips". One of them said something like "Fire remains a primary risk of throwing lit cigarettes on the carpet", with the AI-prose "remains" managing to imply, somehow, that the act of throwing lit cigarettes on the carpet may somehow later have a different primary risk. It was basically impossible for me to find information about what I was looking for that actually seemed to be written by a human.
In an age where everybody's writing their scripts, listicles and help forums with AI, Wikipedia remains (see what I did there?) a bastion of actual human effort, and actual human achievement. There are many people like myself who believe no matter how good AI gets at writing for Wikipedia, we still shouldn't use it. In a dead internet where bot-generated traffic now makes up the majority of activity (according to Cloudflare), let's have Wikipedia be a hub of life. The day Wikipedia articles start sounding like every other piece of soulless machine-generated prose inundating the internet these days is, I think, the day I finally stop using these kinds of peer-to-peer websites altogether. I've already been driven off every other social media because of it.
In the words of Pope Leo, "We must, then, avoid the “Babel syndrome,” namely the idolatry of profit that sacrifices the weak, a uniformity that neutralizes differences, and the pretense that a single language — even a digital one — can translate everything, including the mystery of the person, into data and performance." Athanelar (talk) 06:13, 16 June 2026 (UTC)[reply]
I'm sorry but I do not think that it is appropriate to use Wikipedia as a platform for activism, including activism about human vs machine content. I think that it is only appropriate to attempt to create the most comprehensive, thorough, well-sourced, complete, and useful free-access encyclopedia on earth; limiting this effort for personal belief is just not appropriate. The exact same line of thinking could be used to argue that common people shouldn't be allowed to edit Wikipedia without showing some credentials first, and I think you'll agree that this is contrary to the fundamental nature of the site. Framing Wikipedia as a resource that has rules on who or how it can be written is not appropriate, it should just be about what is written.
And, to address the first paragraph, no I do not think most editors want to use LLMs, I just think that they're a valuable tool that can be used to greatly increase individual editors' productivity, and that a blanket ban is a net detriment compared to regulated use. Just-a-can-of-beans (talk) 21:01, 17 June 2026 (UTC)[reply]
Framing Wikipedia as a resource that has rules on who or how it can be written is not appropriate, it should just be about what is written. But this is obviously not true, and obviously never could be true. We restrict contribution by editors with conflicts of interest, paid or otherwise. We have rules and norms about disruptive editing, we have a whole list of things Wikipedia is not, most of which are arbitrary and based on the idea of what sort of encyclopedia we want to create rather than being based on any specific meritocratic argument. Even our fundamental basic requirement, "notability," represents the deliberate decision to create a selective encyclopedia as opposed to an indiscriminate one.
There is no reason why "Wikipedia is not AI generated" can't go right alongside "Wikipedia is not indiscriminate," "Wikipedia is not a democracy" etc. Athanelar (talk) 21:16, 17 June 2026 (UTC)[reply]
And we have lots of rules about which sources can be used, when, how, and under what circumstances. And a great many content policies and guidelines. —Myceteae🌈 (talk) 21:27, 17 June 2026 (UTC)[reply]
But aren't all of those about the quality of the articles themselves? Disruptive editing is obvious, and CoIs are similarly banned not for the fundamental ethics of the matter, but because they introduce a strong element of bias that detracts from page quality. There are also exceptions to CoI when it could improve mainspace content, such as allowing individuals to edit pages about themselves under certain circumstances, and having a system for allowing paid edits. Notability criteria keep pages and Wikipedia itself cohesive and focused, which is necessary for content to truly be able to inform readers about the important details. A blanket ban on LLM use is different, in that it punishes how a page is written, not the eventual results on that page. Is it not a bit unreasonable to have a blanket ban here, but not on paid editing? Just-a-can-of-beans (talk) 21:31, 17 June 2026 (UTC)[reply]
But aren't all of those about the quality of the articles themselves? They are precisely about maintaining a certain idea of "quality" which is not objective, but rather deliberately decided upon by Wikipedia's foundational principles and the ongoing consensus of its community.
You could argue, for example, that conflict of interest editors are exactly who we would want to write our articles about companies and the likes; these companies are literally willing to pay people to make sure these articles stay up to date with the latest information about the company's organisation, products and services etc. That might be very valuable if we were a different encyclopedia with different values. But we have decided, as a matter of principle, that we are not a directory or catalog and that we'd rather our articles be based on what people wholly unrelated to the company have to say. That is, again, not objectively "better" than the alternative, it's a subjective judgement based on the values of the project.
All of these things reflect the fact that Wikipedia has a culture, ethos and values, and that our only goal is not simply to collate as much information as possible as fast as possible.
Now, if you want to say "Wikipedia's values shouldn't exclude AI-generated text" that's one thing, but arguing that Wikipedia is somehow an objective judge and shouldn't include or exclude things based on judgements of philosophy or value certainly doesn't stand up to scrutiny. Athanelar (talk) 21:48, 17 June 2026 (UTC)[reply]
p.s., Is it not a bit unreasonable to have a blanket ban here, but not on paid editing? I actually would wholeheartedly support a blanket ban on paid editing, too. I have long argued that people who are only here because they've been paid to write promotional content (and that is, unavoidably, what the purpose of paid editing usually is) are definitionally not here to build an encyclopedia. I'd support very limited carveouts for, say, educational institutions who employ a paid Wikipedian-in-Residence (with the distinction being that someone being "a paid Wikipedia editor" should be fine, but being paid to write specifically about a person/organisation really shouldn't be) Athanelar (talk) 21:51, 17 June 2026 (UTC)[reply]
First of all, the paid editing tangent is a whatabout-ism. Secondly, just for clarity, I do not object to AI in the research process of editing, nor having AI make a rough draft you manually source, fix the tone of to sound like you and double check and remove hallucinated content. I am simply against doing AI editing with this much of a lack of human guardrails. Ilov3gam3z (talk) 22:46, 17 June 2026 (UTC)[reply]
What about finding sources yourself and using an LLM to help determine the most common themes and facts shared between them, then generating an outline, and then the article piece by piece while a human verifies each step of the way? ScottishFinnishRadish (talk) 00:13, 18 June 2026 (UTC)[reply]
It's a small distinction, yes, but it is an important one. I am fine with AI assisting humans but I am not fine with humans assisting AI. This proposal is an example of the latter, not the former. Ilov3gam3z (talk) 00:19, 18 June 2026 (UTC)[reply]
I think so long as specific claims about what a computer program can and cannot do are considered an article of religious faith, we are not going to make much progress here. Sometimes a computer program can do something really well; sometimes it cannot. You have to look and see to have a worthwhile opinion. jp×g🗯️ 08:12, 16 June 2026 (UTC)[reply]
You can call it "religious faith" or you can call it "principle." Wikipedia has its five pillars and its mission statement and a whole cultural ethos as to what it is and is not for which informs decisions about its policies and guidelines. The point is that whether or not these programs are "good" at what they do should not necessarily be the (only) deciding factor as to whether and how we use them. Athanelar (talk) 08:18, 16 June 2026 (UTC)[reply]
Wait until they use Reddit or blogs to learn, then others use those... LLMs will be a nightmare. Yesterday, all my dreams... (talk) 19:04, 17 June 2026 (UTC)[reply]
Flagship LLMs have been fed virtually every English language academic database at this point, and Opus in particular is capable of accurately quoting specific lines with correct DOI citations purely from memory, without any web searching (it is not reliable enough for me to recommend using it like this, which is why I recommended a reference-provided text generation setup).
Google Gemini, even on its premium versions, does have the problem of treating Reddit and other non-academic sources as authoritative information sources. This is why I omitted it from my recommendations. I consider this a Gemini-specific issue that does not apply to Claude Sonnet+ or GPT-5.4+ Just-a-can-of-beans (talk) 20:17, 17 June 2026 (UTC)[reply]
As an AI patroller, I will directly address the implication that some editors who use LLMs to generate article content (not research, not summarise, generate) will do so in a careful enough manner to ensure policy compliance. Find me a single case of an experienced editor being brought to WP:AINB and not having masses of issues turn up in their articles. One single case. Then look at the number of articles that are indisputably slop and the amount of time it's taken to clean those up, and compare that to the amount of time it saves you, personally, to make an article which theoretically passes all the PAGs.
I don't think you realise the scale of the problem here. From the few times I've run my AI edit summary log, just looking at the people stupid enough to copy-paste their chatbot's suggestions into the edit summary box, we're talking about high two digits / low three digits per day of editORS, not just edits. Without the ability to revert on sight without excessive debate and verification, and you should be as aware as I am that LLMs are the world's greatest sealioners, the result is a mathematical certainty.
It may surprise some to learn that I have zero ideological slant against LLMs. I use them avidly in my own work and research, and indeed, my Wikipedia patrolling tasks. LLMs are a specialised tool which can be very useful in some circumstances, but are actively harmful in the vast majority of circumstances. Article generation is not something LLMs are good at, period, and this is well verified by thousands and thousands of experiments by people claiming to have varying degrees of LLM and Wikipedia competency. If you find a way to do it perfectly, I suggest that you send us your arXiv preprint. Fermiboson (talk) 17:31, 20 June 2026 (UTC)[reply]
Shit flow diagram and Malacca dilemma. ScottishFinnishRadish (talk) 19:08, 20 June 2026 (UTC)[reply]
My recollection was that the conclusion from those cases was that LLMs provided little to no timesave. Fermiboson (talk) 21:17, 20 June 2026 (UTC)[reply]
You asked for a case. There's two articles.
What if I preferred to use my volunteer time in that way? I think incredibly overly-detailed articles on professional wrestling and transformers with poor quality in-universe sources are a waste of time, but that's how some editors prefer to spend their time. It's no one's place to tell another editor who is contributing constructively that they can't.
Wikipedia already got significantly fucked by missing the shift to mobile. We lost an entire generation of editors and we're continuing to lose out on potential editors because the mobile interface is garbage. That's on the WMF. LLMs and other AI tools are how an increasing portion of people interact with the Internet and now the community is going to lose the next generation of potential editors. That's going to be on us.
Wikipedia is already bleeding with declining editor counts and falling traffic which leads to even fewer potential editors. Maybe educating people on constructive LLM use would slow that down a bit, rather than the Wikipedia immune system going after editors. Our immune system is good and I've blocked dozens of editors for LLM abuse but we're developing arthritis. ScottishFinnishRadish (talk) 21:43, 20 June 2026 (UTC)[reply]
I'm not aware of instances where we've blocked someone who was adding non-problematic LLM-generated content? The problem as I see it is that people come here, start adding problematic LLM-generated content, get warned and mass reverted, and then get demoralised and leave (ofc some continue past the warning and get blocked). What we can do about that (other than being nice about it or surfacing NOLLM earlier), irdk. But the hypothetical unproblematic, undisclosed LLM use goes undetected anyway, so an AI ban is pretty much just on problematic use.
If you're arguing that if/when LLMs get to a level where they can crap out FAs, we should reverse the ban to stop biting newbies, why would we then need loads of new editors when we can just have bots with a few human fact-checkers/managers? What's the reasoning for allowing some LLM use but not all? What's the benefit of having an LLM-using editor instead of a skilled AI agent (managed at a distance)? (sorry if I've misunderstood) Kowal2701 (talk, contribs) 22:34, 20 June 2026 (UTC)[reply]
What we can do about that (other than being nice about it or surfacing NOLLM earlier), irdk. We can educate them on how to use an LLM constructively, the same way we educate people on OR, SYNTH, RS, and how to format citations.
But the hypothetical unproblematic, undisclosed LLM use goes undetected anyway, so an AI ban is pretty much just on problematic use. LLM use has to be disclosed, as I did on the two articles I created using LLMs, and that constructive use is now disallowed. Telling people they have to break the rules to contribute isn't a good way to handle it
why would we then need loads of new editors when we can just have bots with a few human fact-checkers/managers? Because the workflow that I used requires a significant amount of time, similar to that of writing an article from scratch. Editors still have to find and read sources, verify everything, check for close paraphrasing and copyvio, and everything else that goes into creating quality articles. If someone wants to contribute usefully that way rather than the way someone else does they shouldn't be prohibited. Some people prefer doing research and reviewing rather than writing wholesale, and I think that's fine. ScottishFinnishRadish (talk) 22:52, 20 June 2026 (UTC)[reply]
I like the notion of adding WP:LLMRESP to AI warn templates (though they're already fairly cluttered, I'd also like to add m:List of Wikipedias because it's often non-fluent English-speakers). Btw WP:LLMDISCLOSE is an essay. I'll think on the rest, but (looking to the near future) I can't see the logic in allowing LLM-generated content with human review, but not unreviewed/liberal LLM use or agentic LLM-generated content, assuming the premise that there are no concerns over core PAG compliance. If LLMs had already hit their ceiling, it's a different conversation, but I think that if we don't privilege human-written content, there'll become no reason to have human writers.
Assuming the premise that LLMs have hit their ceiling, part of the reason NOLLM is so blunt is because it's aimed at CIR LLM users who might see the nuanced "you must review" as permissible, but aren't capable of doing proper review (which is loads of people rn, esp. people not proficient in English). I don't think it's possible to address that while permitting the editors you describe, and there's a large backlog of AI cleanup as it is Kowal2701 (talk, contribs) 23:46, 20 June 2026 (UTC)[reply]
I added a sentence on LLMRESP to NOLLM [5], LLMRESP needs to be expanded to have a section on copyediting (ie. what prompts to use etc.) Kowal2701 (talk, contribs) 20:32, 21 June 2026 (UTC)[reply]
Of the potential new editors who are discouraged from editing by our LLM policy, I would bet the vast, vast majority would not (and would never) use LLMs with the apparent time and care you took to construct those articles (I would also note that @Fermiboson was asking for cases of experienced editors brought to AINB). This is absolutely the case already in academia, where LLM-generated, error-laden submissions to journals have skyrocketed. If thousands of senior academics whose reputations are at stake aren't sufficiently checking LLM summaries of experiments they conducted in fields they are experts in, why would we expect anonymous editors with unknown qualifications to behave any better for Wikipedia articles? JoelleJay (talk) 11:20, 21 June 2026 (UTC)[reply]
Malacca dilemma as produced had that strange llm mixture of errors mixed with close paraphrasing. Whether they are all gone now I am not sure, but it is not an example of an llm work that ended up with few issues. CMD (talk) 00:53, 21 June 2026 (UTC)[reply]
Just my 2c, I think a compromise we'll most likely end up with will be to allow LLM-generated content by someone with an LLM user right, but it'd always need to be disclosed and the criteria for granting the right would be strict. It'd also probably be stigmatised somewhat, similarly to how paid editing currently is, because it contradicts the 'hard work for no reward' editing culture we have that produces mutual respect and collegiality. I expect (hope) widespread use to always be a red line Kowal2701 (talk, contribs) 20:05, 20 June 2026 (UTC)[reply]
I could see this happening 100% Ilov3gam3z (talk) 20:07, 20 June 2026 (UTC)[reply]

What I would like to see is an AI specially trained (by the AI vendor, not just user trained) to be as much like a human Wikipedia editor as possible, trained to avoid doing the kind of things we criticize in other AIs. Call it "Wikipedia mode" with a bunch of warnings on the AI's page about it being a start point for a human, not something to cut and paste. If they could actually make something good, imagine the PR value.

IF they can make it good. That's a huge IF. I suspect that the result would be another Grokipedia or Hallucipedia. --Guy Macon (talk) 16:45, 15 June 2026 (UTC)[reply]

Best implementation of this idea I have seen so far, although I do fear this may be a bit of a hassle in general. We are getting on just fine without LLM's and AI in general, although we are having some editor retention and acquisition problems, but I don't feel we are at a tipping point where we need editors to be 'assisted' by AI to help. Ilov3gam3z (talk) 19:12, 15 June 2026 (UTC)[reply]
Frankly, if the editors we're failing to retain are the kinds of people who teachers report can't write a one-paragraph essay without turning to their chatbot of choice, I'm not losing any sleep over it. I don't think our goal should be to make Wikipedia editing accessible to the most intellectually lazy segment of the population. Athanelar (talk) 06:25, 16 June 2026 (UTC)[reply]
We are getting on just fine without LLM's and AI in general
I don't think this is true, it's more like "we are barely treading water beneath the deluge of AI edits past and present."
Which is another reason why the proposal is a terrible idea. Gnomingstuff (talk) 11:37, 16 June 2026 (UTC)[reply]
Not a terrible idea, but disastrous. Terrible would be too kind a word. But I share your overall feeling. Yesterday, all my dreams... (talk) 15:59, 17 June 2026 (UTC)[reply]
Indeed, and probably worse. Yesterday, all my dreams... (talk) 18:59, 17 June 2026 (UTC)[reply]
This is almost exactly what I have proposed. When given a robust system prompt, a flagship LLM is capable of doing this. Running this on an unmodified flagship would almost certainly produce better results than a model trained from the ground up for Wikipedia use, unless that model was a relatively basic remix of an existing flagship. Such a thing could be made theoretically but would be somewhat costly and Wikipedia would be expected to foot the bill, so I don't think that's a realistic path, unfortunately. Just-a-can-of-beans (talk) 20:13, 17 June 2026 (UTC)[reply]

Now that LLMs, if not AIs, have become a hotter topic than before, I can't help but wonder whether now is the right time to close this discussion as a failed attempt to circumvent or make the community reconsider the WP:LLM rule. Indeed, I made a response there about laserdisc inspiring and then being superseded by DVDs, especially in popularity. George Ho (talk) 19:34, 15 June 2026 (UTC); edited, 18:27, 17 June 2026 (UTC)[reply]

Nobody has yet answered the question that I posed many months ago: what do LLMs train on when they have run out of human-generated content? Phil Bridger (talk) 18:50, 16 June 2026 (UTC)[reply]
Eventually they will train on other LLM items, of course. But that is not all. As is, they even use Reddit!! Yesterday, all my dreams... (talk) 15:50, 17 June 2026 (UTC)[reply]
Western flagships are pouring enormous amounts of money into human training, to avoid LLM items as training material. The reasoning is rather simple: the gaps in the initial LLM's outputs, if used as training material, will only multiply in later iterations.
Chinese LLMs train on Western LLM outputs, which is why they are unusable for any serious task. Just-a-can-of-beans (talk) 20:21, 17 June 2026 (UTC)[reply]
Suggesting changes to policies is not an "attempt to circumvent" those policies and it's rather ridiculous to characterize it as such. Levivich (talk) 16:09, 17 June 2026 (UTC)[reply]
Apologies for not finding the right word. Striking out that wrong word.... —George Ho (talk) 18:27, 17 June 2026 (UTC)[reply]

SUGGESTION: Could someone please run an LLM on this proposal to determine its chance of success? My guess: less than 10%. Let us see what AI has to say. Yesterday, all my dreams... (talk) 19:09, 17 June 2026 (UTC)[reply]

The following links:
George Ho (talk) 19:37, 17 June 2026 (UTC)[reply]
AI committing hara-kiri in its own responses, lol. Also half those links don't work, except the GPT one. Ilov3gam3z (talk) 19:47, 17 June 2026 (UTC)[reply]
Or should we say Seppuku? Maybe we should ask Gemini? Yesterday, all my dreams... (talk) 21:18, 17 June 2026 (UTC)[reply]
All of these are consumer-grade LLMs with minimal processing power, unsuitable for professional work of any kind. I specifically rejected the use of them in my proposal, for a reason. Just-a-can-of-beans (talk) 20:22, 17 June 2026 (UTC)[reply]
Weirdly, Claude Sonnet 4.6 thinks your sub-proposal #3 has a better chance than the other three, especially #4: https://claude.ai/chat/64a4d2f1-3dda-450d-b931-1afeae9a8cc6
Claude Opus is paywalled, so couldn't use it.
ChatGPT says that full relaxation is "unlikely" but some "modest clarification or narrow carve-out" might likely have some chance: https://chatgpt.com/s/t_6a337b5135288191a585f752300e5296
Oh wait.... You said GPT 5.4, which may be paywalled: https://openai.com/index/introducing-gpt-5-4/
GPT 5.5 is also paywalled: https://openai.com/index/introducing-gpt-5-5/
Where does that leave us? George Ho (talk) 05:10, 18 June 2026 (UTC)[reply]
Well done George, but please provide a summary of what the LLMs said given that 2 of the links need additional input before they can be read. In any case, anyone who reads this discussion can see that the overall sentiment is far from positive. So LLMs just confirmed the obvious. Yesterday, all my dreams... (talk) 21:14, 17 June 2026 (UTC)[reply]
Copilot says "low" chance for "full rollback" and "moderate" chance for some sort of relaxation.
ChatGPT says that the proposal as written would have a "10–20%" chance, but it also says that the outcome of complete dismissal has "10–20%" probability. Chances of the "modified and partially adopted" outcome are "25–40%". That of the outcome of rejection but also "spark[ing] later clarifications" would be "40–50%".
Google Gemini predicted "very low chance of success".
(Hopefully, I haven't violated WP:AITALK, have I?) George Ho (talk) 05:27, 18 June 2026 (UTC)[reply]
No, you have not violated any policies, because you are reporting LLM output. That has now given me a new idea, which I am sure is the subject of a few master's theses right now: "How often do LLMs agree?" Regardless of accuracy, do they agree on most issues? If they do not, which is probably the case, then that is a point against them. Yesterday, all my dreams... (talk) 05:35, 18 June 2026 (UTC)[reply]
I wonder what Kalshi or Polymarket would say… —Myceteae🌈 (talk) 15:24, 18 June 2026 (UTC)[reply]
I too once believed that well-crafted prompts combined with careful review by subject-matter experts could lead to quality content. Then we had the case of Esculenta (talk · contribs), whom I considered to be pretty much the pinnacle of competence and trustworthiness, and who nonetheless turned out to have pervasive source-text integrity issues across their thousands of LLM-assisted edits (most of which still haven't been reviewed and cleaned up). That experience taught me that even when a user says they are aware of the hazards of LLM-assisted editing and that they are carefully reviewing the output, we still have to closely scrutinize the results, and the man-hours it takes to do so far outweigh the efficiency improvement for the original editor. You may feel you are getting good results much faster, but actually you are just offloading the hours to other editors, and editor-hours of people willing to wade through others' LLM-contributions are a precious scarce resource these days. Please stick to using LLMs to find sources and guide research, and please continue to read those sources and write quality content yourself. We can revisit this question in 2030 once we've finished cleaning up the messes from all the "competent" LLM users who went before you. -- LWG talk (VOPOV) 00:42, 18 June 2026 (UTC)[reply]
Well said, and logical. But was 2030 a typo? 3030 may be a more likely time frame for cleaning up the expected LLM mess. Yesterday, all my dreams... (talk) 05:05, 18 June 2026 (UTC)[reply]
Esculenta's process differs from Just-a-can-of-beans's proposal in a number of important ways but this raises another issue that I haven't seen discussed here: close paraphrasing and WP:COPYVIO. This is already a challenge in many niche or highly technical fields where the literature is limited and rife with jargon. I've seen a few discussions over the past year or so about this particular issue (like this one), noting that this is a challenge even when not using LLMs. I admit I don't have experience with using LLMs as proposed but I would expect this to be a problem when feeding a single source to a model and asking it to summarize. —Myceteae🌈 (talk) 15:44, 18 June 2026 (UTC)[reply]
The only way to adequately check LLM output for copyvio and close paraphrasing is to have carefully read and internalized both the sources and the LLM output, which calls into question the claimed time savings, since this careful reading and thought makes up most of the time spent when editing without LLMs. If the parts of editing that the LLM is claimed to replace normally make up 30% of the time spent editing, yet you claim the LLM speeds up your editing by 80%, I can only conclude that quality is being sacrificed. -- LWG talk (VOPOV) 18:11, 18 June 2026 (UTC)[reply]
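The arithmetic behind the 30% versus 80% point above can be sketched with the usual Amdahl's-law bound. This is only an illustration: the function names are hypothetical, the 30% share is the figure from the comment, and the "LLM does its part in zero time" case is an assumed best case, not a measurement of anyone's workflow.

```python
def max_speedup(replaceable_fraction: float) -> float:
    """Best-case overall speedup if the replaceable share of the work took zero time."""
    if not 0.0 <= replaceable_fraction < 1.0:
        raise ValueError("replaceable_fraction must be in [0, 1)")
    # Amdahl's law with an infinitely fast replacement for that share.
    return 1.0 / (1.0 - replaceable_fraction)


def max_time_saved(replaceable_fraction: float) -> float:
    """Best-case fraction of total editing time saved under the same assumption."""
    return 1.0 - 1.0 / max_speedup(replaceable_fraction)


# Hypothetical inputs taken from the comment: 30% of editing time is replaceable.
bound = max_speedup(0.30)
saved = max_time_saved(0.30)
print(f"best-case speedup: {bound:.2f}x")
print(f"best-case time saved: {saved:.0%}")
```

Under these assumptions the ceiling is roughly a 1.43x speedup, or about 30% of the time saved. A claimed 80% improvement exceeds that ceiling whether it is read as time saved or as extra throughput, which is the basis for concluding that some of the reading and checking is being skipped.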
And even if you check all the sources that the LLM listed, it's a leap to presume that it had not absorbed and regurgitated information from a source that it did not include in the listed sources. -- Nat Gertler (talk) 19:10, 18 June 2026 (UTC)[reply]
It's really not that difficult if you're providing the sources and having it give you the supporting text. Makes it quick and easy to check. ScottishFinnishRadish (talk) 19:21, 18 June 2026 (UTC)[reply]
LLMs can and will experience context drift, especially when the provided documents are large. Forcing it to provide source snippets to support blocks of generated text is only useful as far as you can personally evaluate the fidelity, so if someone doesn't actually understand the topic deeply enough to write about it on their own, they may not be able to recognize misalignment. Plus the more rigidly you constrain the model to specific texts, the more likely it is to mislabel context. JoelleJay (talk) 14:14, 19 June 2026 (UTC)[reply]

A lot of the narrative around use of AI on Wikipedia has a sort of social justice/WP:RIGHTGREATWRONGS feeling that is unrelated to the goal of creating a free encyclopedia. Of course we should use any tool to that end. Anyone whose goal is to make an encyclopedia "by humans, for humans" is firstly ignoring the fact that this is an electronic project with spellcheckers and a ton of bots that fix various minor errors, and secondly importing a goal that is not the goal set forth in the statement of purpose of this project. Now, obviously it is a bad idea to have a Grokipedia-like fully automated function of gathering potentially dubious sources and crafting an article out of them. However, LLMs are just fine at taking sources provided to them through good old-fashioned human research and drafting a sentence or a paragraph summarizing the gravamen of those sources. In a very short time, it will be impossible to tell the difference in writing, and it has been pointed out to me more than once that merely being a good writer now makes one suspect of being an LLM. BD2412 T 00:13, 21 June 2026 (UTC)[reply]

I don't see it as RGW so much as existential: the minute LLMs can write summaries as good as humans (and that minute may already be here) is the minute Wikipedia becomes obsolete. Which means Wikipedia editors become useless. That's what the anti-LLM folks are fighting against: their own obsolescence. I'm not sure I'd characterize Wikipedia's existential fight for existence as RGW. Levivich (talk) 00:17, 21 June 2026 (UTC)[reply]
I feel as if this is a bit of a strawman. As an anti-LLMs'-in-Wikipedia editor, I am not worried in the slightest about being replaced. What a ridiculous idea; AI edits are hallucinatory, copyvio ridden, deep into the uncanny valley when it comes to style and cannot for the life of itself comprehend Wikipedia markup, templates and all the like. No, I (and many others, I might add) am worried about the quality of the encyclopedia. Quantity matters much less.
There is a reason the Library of Babel is only used on the internet as a fun gadget; there is too much quantity of information to find any quality information. Ilov3gam3z (talk) 00:35, 21 June 2026 (UTC)[reply]
Well, for one, there's a fundamental difference between spell-checking tools or bots and generative AI. I don't think that distinction needs to be explained further...
For two, the error rate with current LLMs is way too high, especially when considering the volume at which such errors would be introduced and the likelihood of them flooding niche topics that would inevitably go for years without expert evaluation. An editor who has zero understanding of some specific technical topic isn't going to try to write a summary of it, while an editor with limited background in the topic may write on it and may make mistakes (and often these will be relatively predictable mistakes, as anyone who has graded exams might have noticed). Permitting any leeway in LLM use would both expand the pool of editors contributing on topics they have no/limited expertise in and enormously increase the volume of their edits. Even if contribution rate somehow stayed constant, allowing LLMs to summarize even short tracts of text would still degrade the confidence reviewing editors would have in source-content fidelity, meaning each assessment would take more time.
If the only goal is to create as many encyclopedia articles as possible, then we're already beaten by LLMs. The people who don't care whether their information is human- versus AI-generated already just use chatbots as their search engines or rely on Gemini overviews, both of which are heavily influenced by existing Wikipedia articles. The greatest value we can provide now is our assurance that content is strictly composed by humans. JoelleJay (talk) 12:53, 21 June 2026 (UTC)[reply]
Re: The greatest value we can provide now is our assurance that content is strictly composed by humans, well then, let's ban use of spell-checkers. Let's turn off all the maintenance bots and prohibit the use of sources found with a search engine. Hell, let's ban the use of electronics altogether, and write Wikipedia with pencils and paper. BD2412 T 19:20, 21 June 2026 (UTC)[reply]
Not going far enough there. Pencils can be erased, paper decays and is flammable. If it isn't carved on stone tablets, you can't trust it. --GRuban (talk) 19:25, 21 June 2026 (UTC)[reply]
So apparently you do need a primer on how vastly different spellcheckers are from generative AI? Maybe if you are too unfamiliar with these technologies to recognize why they are incomparable you should step back from this discussion. JoelleJay (talk) 10:19, 22 June 2026 (UTC)[reply]
The leading generative AI platforms can literally be used as a spellchecker. You can give it a block of text and ask it to identify typos and it will. Better yet, it will also identify correctly spelled but contextually wrong words (e.g., there/their confusion). What we have here is an exhortation that since hammers can be used to kill people, we should be prohibited from using them to hammer nails. BD2412 T 19:20, 22 June 2026 (UTC)[reply]
@BD2412, WP:NOLLM allows copyediting one's own edits with an LLM. It disallows copyediting articles for the reason stated at NOLLM, see Wikipedia:AI noticeboard/Archive 3 #Three years of bad AI copyedits by User:Kofi Meija and Wikipedia:AI noticeboard/2025-12-14 Kofi Meija as an example. Kowal2701 (talk, contribs) 20:41, 22 June 2026 (UTC)[reply]
This is semantics. We are not discussing the usage of AI in spell-checking here. Also there have been cases of AI spell-checking incorrectly, so this isn't the gotcha that you think it is. Ilov3gam3z (talk) 20:49, 22 June 2026 (UTC)[reply]
Show me one instance of an AI getting the spelling of a word wrong, outside of being told to write something intentionally containing misspelled words. BD2412 T 21:09, 22 June 2026 (UTC)[reply]
... I think you could be arguing in better faith here Kowal2701 (talk, contribs) 21:17, 22 June 2026 (UTC)[reply]
Here you go. Admin-only, I'm afraid. —Cryptic 21:40, 22 June 2026 (UTC)[reply]
I don't think image generation is being contemplated here. BD2412 T 22:27, 22 June 2026 (UTC)[reply]
Fine. Small things, like it's vs its' or bigger things like, for example, what if an article has an Indian English tag? I don't exactly think that AI is going to replace it's z's with s's without being told, and I don't think most people will tell it too. Ilov3gam3z (talk) 23:41, 22 June 2026 (UTC)[reply]
On the other hand, you made a typo in your comment there, and yet you're still allowed edit :-) Anyway, if the problem with LLMs was "it's" vs "it's" or "colour"/"color," nobody would be talking about banning LLMs. This of course isn't about spellchecking, and I think this exchange illustrates the problem with the entire "written by humans" argument: 1. humans make mistakes, too, and 2. humans use technology when they edit (which I think was the point of bringing up spellcheckers). It's not as simple as "humans good machine bad," it's a question of whether the technology creates too many mistakes to be helpful, a question of which technology editors should and shouldn't use, or even of how they should and shouldn't use a particular technology. Levivich (talk) 00:37, 23 June 2026 (UTC)[reply]
@Levivich: I think the entire question is the "how"; tools are tools, and need to be used correctly regardless. BD2412 T 01:38, 23 June 2026 (UTC)[reply]
In my experience LLMs will change z's and s's and do other engvar changes. I've had it happen in both directions. Doesn't happen every time, just something to watch out for if using. CMD (talk) 02:51, 23 June 2026 (UTC)[reply]
This discussion is about a proposal to generate biomedical articles with LLMs, not about using them for copyediting. Of course LLMs can perform spellchecking... as mentioned below, we already have different rules for this as it's (supposed to be) a separate function from the generative capabilities of LLMs. JoelleJay (talk) 13:13, 23 June 2026 (UTC)[reply]

Researching Salebot1


I am so SOOOOO sorry if I place this here, but I am scared that new socks are going to invade Wikipedia talk:Long-term abuse/Salebot1 (and it may get create protected) or that nobody may reply once I place the edit in, so here's the explanation. In the sandbox, I made two edits——one based off YouTube and the other on a Fortnite Fandom wiki——about researching Salebot1's minions' attitude, much like how they copypaste content from websites related to Angry Birds Stella and place that nonsense on a bunch of (actually, the most searched) articles. This is why I'm researching them. The most surprising part is that Salebot1 socks' actions actually get logged into the edit filter. These socks keep appearing, and if you see the new socks (like 45(Redacted) or AntiCompositeNumber (Redacted)), please do not report them to WP:AIV nor to WP:UAA as they have gross usernames.

Please note that everything I mention here is for research. I am not trying to engage in acts of vandalism. SimpleObjects-9ei 🏖️/☀️/🥵 (🌎 CentralAuth) 16:44, 15 June 2026 (UTC)[reply]

Respectfully, what are you asking? I can not tell. I especially don't get your advice to not report them to WP:AIV nor to WP:UAA. How exactly should we deal with them if not via these two methods? 45dogs (they/them) (talk page) (contributions) 19:45, 15 June 2026 (UTC)[reply]
I guess emailing an admin would be the best option, but I can't think of anything else. CheeseAndJamSamdwich (talk) 18:58, 17 June 2026 (UTC)[reply]
I guess, but it's all pretty unclear. Isn't LTA designed to handle these things? What weight are we supposed to give to one editor's request that we not report suspicious activity related to this particular abuser? —Myceteae🌈 (talk) 15:29, 22 June 2026 (UTC)[reply]
The need to report these accounts in an easy and quick manner far outweighs the risk posed by usernames created by this LTA, IMO. An admin can also just revdel afterward if really needed. 45dogs (they/them) (talk page) (contributions) 15:40, 22 June 2026 (UTC)[reply]
Agree. —Myceteae🌈 (talk) 17:37, 22 June 2026 (UTC)[reply]
Well... I use Special:Log very very often (and came across these sock puppets at least one time, and reported like ten of them), which is why I decided to research this. It seems that SB1 socks are now searching for PvZ2 stuff and placing it on several iPhone articles.
Sorry if I called 45dogs, it's because one sockpuppet's username (specifically the one that starts with "45") contained the N-word. The other one (starting with "AntiCompositeNumber") contained another offensive slur. – SimpleObjects-9ei 🏖️/☀️/🥵 (🌎 CentralAuth) 15:43, 22 June 2026 (UTC)[reply]

How to report an edit war on Wikidata


[I apologize if this is the wrong place (if you feel this discussion can be moved to a more appropriate place, please feel free to move it)] Does anyone know in which venue an edit war happening on Wikidata can be reported? The page d:Q921634 has been vexed for weeks with an unsourced map that gets constantly removed and then reuploaded under a different name on Commons, but I don't know where I can signal it. Grufo (talk) 18:46, 15 June 2026 (UTC)[reply]

While a Wikipedia policy does not apply to Wikidata, the advice at Wikipedia:Edit warring#What to do if you see edit-warring behavior would be a good place to start. You should be able to contact Wikidata admins at Wikidata:Administrators' noticeboard. Donald Albury 19:57, 15 June 2026 (UTC)[reply]
Thank you. I will immediately repost in Wikipedia talk:Edit warring. --Grufo (talk) 20:00, 15 June 2026 (UTC)[reply]
@Grufo: That is not what I intended. The advice I pointed you to is about what to do before reporting to a noticeboard. Only if you have been unsuccessful in trying to settle your dispute with the other editor(s) do you need to contact an appropriate noticeboard or talk page on Wikidata. The Wikidata Administrators' noticeboard is a likely place to do that. Donald Albury 20:38, 15 June 2026 (UTC)[reply]
@Donald Albury: Sorry if I misinterpreted. The situation is a bit more complicated than that. I had already engaged with these accounts at Talk:Fraxinetum#Map of Fraxinetum, and on Latin Wikipedia I had to block them due to vandalism, since they were blanking (#1, #2) the equivalent of our {{Disputed}} and {{Wikify}} templates (I assume due to the fact that these were displayed at la:Fraxinetum). I believe the situation at this point is beyond any possibility of settling. What would you suggest? --Grufo (talk) 21:02, 15 June 2026 (UTC)[reply]
Grufo, whatever you do the place to report this is not the English Wikipedia, which has no jurisdiction over Wikidata. I would try Wikidata:Wikidata:Administrators' noticeboard as suggested. People more familiar with editing Wikidata may be able to suggest a better venue. Phil Bridger (talk) 21:17, 15 June 2026 (UTC)[reply]
@Phil Bridger: Thank you. It seems reasonable. Although this might be closer to interwiki spam rather than a problem that affects this or that wiki project. So maybe meta:Wikiproject:Antispam could be better? --Grufo (talk) 21:29, 15 June 2026 (UTC)[reply]
@Grufo, I suppose you can take your pick but since the question was How to report an edit war on Wikidata and the locus of the issue is presently on Wikidata, Wikidata:Wikidata:Administrators' noticeboard would seem to make the most sense, as two other editors have already suggested. For any noticeboard, I suggest reading the guidance (headers or banners) at the top, which usually gives a good indication of the type of issues that should be reported there and may link to relevant policies, guidelines, or alternative venues. —Myceteae🌈 (talk) 19:16, 16 June 2026 (UTC)[reply]
You are right, Myceteae. But thinking about this case more thoroughly made me realize that making it a point about Wikidata might be too narrow. I mean, for months these accounts have been repeatedly uploading this unsourced map about Fraxinetum, which repeatedly got deleted, claiming that in the Middle Ages there was an emirate in the middle of Europe. It might really not be a Wikidata issue (my bad for thinking it was), but rather a crosswiki one. --Grufo (talk) 03:59, 17 June 2026 (UTC)[reply]
I defer to you, as you have more experience with cross-wiki issues and with this particular problem than I do. Editors should take care to find the proper venue for their concerns, as you are doing, but in my experience a good-faith post to an active noticeboard often gets the attention of someone who can help even if another venue is technically more tailored to the issue. This is of course not an invitation to editors to post willy-nilly to the most high profile forum they can think of, and it's clear from this thread that you wouldn't read it that way. In the past when I've needed help and been unsure about the best place to seek it, I've often stated that and have had mostly good results with that approach. —Myceteae🌈 (talk) 17:10, 17 June 2026 (UTC)[reply]
I agree with you Myceteae. I can only add that in some rare cases the proper venue itself is not so obvious and a brainstorming about it can help. So thank you. --Grufo (talk) 22:21, 17 June 2026 (UTC)[reply]
Absolutely. I see no problem with seeking initial guidance here and actually think it was a good approach. But at some point one must just decide to make a report somewhere, or not. It sounds like you have as good a handle on this issue and plausible venues as anyone and have decided to move forward with Antispam. I hope this gets the proper attention. —Myceteae🌈 (talk) 23:12, 17 June 2026 (UTC)[reply]
Grufo, there seem to be at least three issues here: two of them are being addressed by others, but you need to address the immediate issue. Please decide whether this is a Wikidata or a cross-wiki issue (it doesn't matter much if you make the wrong decision), follow the advice you have been given and WP:BE BOLD. Phil Bridger (talk) 21:01, 17 June 2026 (UTC)[reply]
There is a Wikidata issue. But since for a while there has been an enwiki issue, a lawiki issue, and there is still a wikicommons issue ongoing, I believe this might be better dealt with globally. As soon as I have a second I will post on meta:Wikiproject:Antispam. --Grufo (talk) 22:16, 17 June 2026 (UTC)[reply]
Updates: For now I have resolved this by asking for page protection of d:Q921634 at d:Wikidata:Administrators' noticeboard. Still the crosswiki issue remains, but I really don't have time now to go through all the global edits of these accounts and file a report. If the problems continue m:Wikiproject:Antispam will become the only way to go (although I should have done it already months ago). --Grufo (talk) 21:36, 23 June 2026 (UTC)[reply]

QUESTION: Is there (or not) a bot that can detect "some" edit wars? Some obvious ones are easy to detect, and a notice can be posted somewhere. Shall we suggest one? Yesterday, all my dreams... (talk) 20:38, 16 June 2026 (UTC)[reply]

Another reason why the WMF should never decide to add AI-generated content to Wikipedia


The following is currently a non-problem -- the WMF isn't anywhere near the point where they would consider such a thing, and German courts can't order a US foundation around -- but things change. Imagine a future where AI gets really really good and a major AI vendor decides to donate computer time to Wikipedia for the PR value.

"A court in Germany has ruled that Google is legally responsible for false information generated by its AI Overviews, treating those summaries as Google's own words rather than merely search results. The case began after Google's AI falsely linked two publishing companies to scams and dishonest business practices, even though the cited sources did not make those claims. The court rejected Google's argument that users should verify AI answers themselves, ruling that the company is responsible because it creates and controls the AI-generated summaries. The decision could have major consequences for Google and other AI companies by making them legally liable when their systems produce false or defamatory information." --Source

--Guy Macon (talk) 04:15, 18 June 2026 (UTC)[reply]

All other issues aside, the largest problem with AI is accountability. Humans need to verify the outputs, check the sources, and make sure it isn't pure slop. Doing this is barely easier than just creating the content organically, although it does offer some advantages in terms of determining what content to include and how to structure it. Unfortunately, people want to use it to replace the need to think and fact check, so here we are. GeogSage (⚔Chat?⚔) 05:07, 18 June 2026 (UTC)[reply]
I'd have hoped it was obvious that I can't just publish libel about people and get away with it because I used AI and not everything you read on the Internet is true, but it's good to have a court confirm that and strike down Big Tech's argument that it's special and above the law. Certes (talk) 10:59, 18 June 2026 (UTC)[reply]
Yes, at the moment they are not above the law in Germany. But will probably appeal. As for the rest of the world, it depends on their lobby power. So time will tell. Yesterday, all my dreams... (talk) 11:09, 18 June 2026 (UTC)[reply]

Guy, yes, but that is one of 10 different reasons for not using them. Remember the David Letterman Late Show Top Ten List? Perhaps someone should write an essay in that style. It would be fun. Please see my comment above regarding "How often do LLMs agree?" I am sure a few theses on that will be written soon. If they do not agree, which is to be trusted? A Confusion matrix would be very interesting. By the way, my lack of faith in current LLMs does not originate from my lack of subject knowledge. The reverse is true. I published my first paper on AI before cell phones existed and before the world wide web. So my hesitations have a serious basis. The neural net structures most of these systems use are inherently probabilistic, and hence error prone. But that is another story. Yesterday, all my dreams... (talk) 05:46, 18 June 2026 (UTC)[reply]

Post script: Please do not rely too much on the Confusion matrix article. Now that I have looked through it, I see various issues there. The editors who did over 50% of the edits are gone, and ips have caused problems. I mentioned some problems on the talk page there. Yesterday, all my dreams... (talk) 06:00, 18 June 2026 (UTC)[reply]
Post script 2. Now we have another reason for not using LLMs in Wikipedia. Earlier today, Google AI quoted the Wikipedia lede on Confusion matrix almost verbatim. A few minutes after I had touched up the lede and improved it with a source, Google AI changed, and ignored my ref about audiology. It then seems to have used some material from Geeksforgeeks! But most importantly it IGNORED my talk page comments about article quality. So these systems currently ignore article quality comments and use whatever there is, and may mix it with low quality websites. Yesterday, all my dreams... (talk) 11:05, 18 June 2026 (UTC)[reply]
I don't think we can expect AI scrapers to read and understand talk page comments then follow their suggestions, especially as not all comments are objective or even accurate. Certes (talk) 12:38, 18 June 2026 (UTC)[reply]
I would expect a good AI system to look at my comment on the confusion matrix talk page and see if there is any value in it. And there is, given that I specifically mentioned Ryszard S. Michalski. A good AI system would look him up and recognize that the article was missing his work. Please look at Michalski's talk page and see what a fan said. That comment should have been ignored by AI but my comment had value given the link I provided. A simple analysis of his work shows that he was a pioneer. Alas illness did not treat him well. Yesterday, all my dreams... (talk) 12:54, 18 June 2026 (UTC)[reply]
My theory is that "a good AI system" does not currently exist, that one will most likely exist some time in the future, and that everything we decide based upon the AIs we are seeing now will have to be reevaluated. Until then, I am firmly in the "no AI on Wikipedia" camp. Should the day arrive where in nearly every case an AI-generated online encyclopedia gives better results than we humans can create maybe my position will change. Keep your eyes on aicyc.org, wikigen.ai, and any new project like them. Not Grokipedia, though. That one appears to have been created specifically to push an agenda. I could say more, but this editorial[6] says it better. --Guy Macon (talk) 19:22, 18 June 2026 (UTC)[reply]
More than that, Wolpert's work establishes that even "good" AI systems (whatever that means) will inevitably differ on various outputs. So your "theory" is actually a fact at this point. Yesterday, all my dreams... (talk) 19:50, 18 June 2026 (UTC)[reply]
Hmmmm. Different humans editing Wikipedia also have different outputs. I am not seeing any connection between having different outputs and being bad. (The AIs are indeed currently very very bad, but not for that reason). --Guy Macon (talk) 20:37, 18 June 2026 (UTC)[reply]

David Letterman style top 10 reasons for having no LLM content in Wikipedia


1. Per Guy, legalities.

2. Per myself, no attention to source quality, as in confusion matrix above.

3. The work of David Wolpert, especially the no free lunch theorem, among other results. This means that each LLM makes trade-offs and hence different LLMs will inevitably give different answers because there is no perfect learning algorithm.

4. Per LWG above, out of control copy vio problems.

5. Per Jeske below, the Brandolini effect. Please take a look at Wikipedia:AI noticeboard to see the situation there.

6. Per JoelleJay above, the detection rate for AI errors is highly dependent on the expertise level of the human evaluator.

7. Someone, please suggest more below...

Thanks Yesterday, all my dreams... (talk) 12:25, 18 June 2026 (UTC)[reply]

I wrote an essay in this vein a while ago. Feel free to steal or adapt. It's a bit obsolete now since I wrote it before WP:NOLLM when attitudes towards AI were a bit more positive. -- LWG talk (VOPOV) 14:55, 18 June 2026 (UTC)[reply]
Top Ten Signs LLM Content Is Inappropriate For Wikipedia!
10 ) Portions of Bible are lifted from the Buggre Alle This Bible
9 ) Constantly replaces references to a banned vandal with references to low-quality television
8 ) More hallucinations than Alice on divinorum in Wonderland
7 ) Claims there is a caliph for all of Islam
6 ) Casually refers to Brandolini as a hack in discussions
5 ) Claims a website homepage is actually a news article
4 ) Cites UrinatingTree on Pittsburgh sports articles
3 ) Boasts about beating the clones
2 ) Can't tell the difference between cites and sites
And the Number One Sign LLM Content Is Inappropriate For Wikipedia...
1 ) It makes humans defending it look like LLMs themselves —Jéské Couriano v^_^v Object Class: Drygioni 15:31, 18 June 2026 (UTC)[reply]
  • I think I have said enough on this issue, and will "say no more" and move on. Could a couple of you please turn this list into an essay? And could you do me a favor by dedicating it to Michalski given his initial efforts on the subject? (Coi, I knew him). Thanks. Yesterday, all my dreams... (talk) 20:44, 18 June 2026 (UTC)[reply]
8. if I have to hear one more time about how blah aimed to enhance blah and highlights the importance of fostering blah then I will stab myself to death by aiming a highlighter at my lungs, then take photos of the crime so my foster family can enhance them in the darkroom Gnomingstuff (talk) 20:17, 19 June 2026 (UTC)[reply]

Larry Sanger proposing WikiProject Intellectual Diversity


 You are invited to join the discussion at Wikipedia talk:WikiProject Council § Proposing a new WikiProject Intellectual Diversity. George Ho (talk) 06:29, 19 June 2026 (UTC)[reply]

 You are invited to join the discussion at Talk:Clop (erotic fan art) § RfC about images on this page. Editors are invited to discuss whether explicit sexualised fan art images should be included in an article which discusses the images. TarnishedPathtalk 15:08, 20 June 2026 (UTC)[reply]

Honestly I'd recommend extending this to all sex and porn articles. The frequent invocation of WP:GRATUITOUS suggests other articles would be examined under this as well, depending on the outcome. I pointed this out in the talk page there, and it seemed to anger people who accused me of "derailing", but I truly cannot comprehend how the decision made for that page (on the basis of explicit images, not amount of images) would not set a precedent for other related articles; one other editor already suggested that outcome. Ringtail Raider (talk) 20:10, 20 June 2026 (UTC)[reply]
Can you be specific about which related articles you believe would be implicated by the result of the ongoing RfC? Regardless, I agree that this discussion should benefit from as many eyes as possible: the discussion involves questions about extreme and sexually explicit art involving children and animals. If you want a more generalized result, there can always be a WP:PROPOSAL at a later date. But the current discussion is already far advanced and I think is already serving as a bellwether to reinforce certain limits this community already tacitly endorses. SnowRise let's rap 13:22, 22 June 2026 (UTC)[reply]
Note that SnowRise appears to about the only editor who believes the images under discussion involve children. Thryduulf (talk) 13:33, 22 June 2026 (UTC)[reply]
Some images in this genre contain identifiable neotenous features (i.e., childlike proportions of the head relative to the body, proportions of the eyes relative to the head, length of limbs relative to the torso, proportions of mouth and cheeks relative to the face). BD2412 T 18:39, 22 June 2026 (UTC)[reply]
Because I know the show, I know that if you read all the creator's info and the like, the show's principal characters are all meant to be young adults (21 and older), and that they also have models for much younger characters. That all said, that requires background knowledge that we cannot expect a reader to know. And it was ostensibly a show aimed at young kids. As such, to anyone not familiar with the show, any images from this are going to fall very, very close to what would be absolutely unacceptable images. It approaches the same problem lolicon has, in that you get all these "all girls are 18 or older" subtexts (which I just checked only has two very tame, far from explicit examples). I've already !voted, but for this article, only one image, the current top one, is really appropriate considering all other concerns here. Masem (t) 21:08, 22 June 2026 (UTC)[reply]
Some images in this genre contain identifiable neotenous features is very different to "the specific images under discussion depict children". The clear consensus of editors, including ones knowledgeable about the show and other relevant subject matter, in multiple discussions, is that these specific images are not CSAM or otherwise problematic for related reasons. Note this is not the same as saying the images are appropriate for Wikipedia (I have not expressed an opinion on that and don't intend to), I just feel strongly that the decision should be made based on what the images actually depict and the manner in which they depict it rather than matters that are not relevant to the discussion. Thryduulf (talk) 21:56, 22 June 2026 (UTC)[reply]
"Note that SnowRise appears to about the only editor who believes the images under discussion involve children." Without meaning to sound short, I think you need to re-review that discussion: I am far from the only person in that thread who feels the images represent children or something so child-like in presentation that it makes no difference. And there are probably more who do feel the same way, but were contented to rest their oppose !votes (the vast majority of respondents do strenuously oppose inclusion for one reason or another) on the WP:offensive content and image use policy grounds.
That said, I agree substantially with the rest of your thoughts here. I would call these images not CSAM, but "simulated CSAM". They certainly don't qualify for the criminal definition of CSAM in the U.S., because SCOTUS has previously ruled that art simulating child sex abuse is not CSAM. However, note that the courts are already indicating willingness to re-evaluate this standard now that AI is capable of creating realistic looking simulacrums.
Regardless, whether the work in question appears to be a child, or is based on a character known to be a child (as is the case in this instance), is very much at the heart of the analysis of the appropriateness of the content, and there are policy, safety, UCoC, and TOU implications all over the place here that make those particular images non-starters. SnowRise let's rap 05:00, 23 June 2026 (UTC)[reply]

2026 Labour Party leadership election (UK) has an RfC for possible consensus. A discussion is taking place. If you would like to participate in the discussion, you are invited to add your comments on the discussion page. Thank you. Qwerty123M (talk) 11:35, 22 June 2026 (UTC)[reply]

Discussion is actually at Wikipedia talk:Superfluous bolding explained, so don't try to stop by to have a look at the article before moving on to the talk page. Am I saying this because I did that? Maybe. ⹃Maltazarian parleyinvestigate 14:40, 22 June 2026 (UTC)[reply]