Wikipedia:Bots/Requests for approval


New to bots on Wikipedia? Read these primers!

To run a bot on the English Wikipedia, you must first get it approved. Follow the instructions below to add a request. If you are not familiar with programming, consider asking someone else to run a bot for you.

 Instructions for bot operators

Current requests for approval

Operator: Dw31415 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 11:56, Sunday, June 14, 2026 (UTC)

Function overview: Replace Archive Today links with the original source link when possible and not already hidden by CS1 templates.


Automatic, Supervised, or Manual: Automatic

Programming language(s): Python, Pywikibot

Source code available: https://gitlab.wikimedia.org/dw31415/cutlass-bot

Links to relevant discussions (where appropriate): Wikipedia talk:archive.today guidance#Wrap standalone, blacklisted link

Edit period(s): Continuous

Estimated number of pages affected: 88,385

Namespace(s): Mainspace/Articles

Exclusion compliant (Yes/No): Yes

Function details:

  1. Use quarry to identify external deprecated archive links in the 0 namespace
  2. Filter to paths containing a link http(s)://{hostname}. Extract the source url (note: no plans to test if the link is live, see discussion)
  3. Find the context of the link in the page
  4. Filter to links in [] (not in a template)
  5. Replace the link (replacement details are still under discussion; see Discussion below).
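
Steps 2 to 4 above can be sketched roughly as follows. This is only an illustration, not the bot's actual code: the host list, the regular expressions, and the helper names (extract_source_url, inside_template, find_candidates) are assumptions made for the example, and the template check is a crude brace count rather than a real wikitext parser. It stops before step 5, since the replacement format is still under discussion.

```python
import re

# Assumed host list for illustration; the real set would come from the Quarry query.
ARCHIVE_HOSTS = {"archive.today", "archive.ph", "archive.is"}

# Matches [url optional label] while skipping [[wikilinks]].
BRACKET_LINK = re.compile(r"(?<!\[)\[(https?://[^\s\]]+)(?:[ \t]+([^\]\n]*))?\]")


def extract_source_url(archive_url):
    """Return the original URL embedded in an archive link, or None."""
    m = re.match(r"https?://([^/\s]+)/(.*)", archive_url, re.IGNORECASE)
    if not m or m.group(1).lower() not in ARCHIVE_HOSTS:
        return None
    # Step 2: keep only paths that contain http(s)://{hostname}.
    embedded = re.search(r"https?://[^/\s]+.*", m.group(2), re.IGNORECASE)
    return embedded.group(0) if embedded else None


def inside_template(wikitext, pos):
    """Rough check: more '{{' than '}}' before pos means pos is inside a template."""
    return wikitext.count("{{", 0, pos) > wikitext.count("}}", 0, pos)


def find_candidates(wikitext):
    """Steps 3-4: list (archive_url, source_url, label) for bracketed links outside templates."""
    found = []
    for m in BRACKET_LINK.finditer(wikitext):
        source = extract_source_url(m.group(1))
        if source is None or inside_template(wikitext, m.start()):
            continue
        found.append((m.group(1), source, (m.group(2) or "").strip()))
    return found
```

Short-form archive links with no embedded URL, and links already inside a template, are skipped rather than guessed at, which matches the "deterministic replacements" scope of the requested trial.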

Note: See dry run at User:Dw31415/ArchiveEdits1

Discussion

Background and Proposal: In February, the WP:NOMOREARCHIVETODAY RfC reached a consensus to “remove” all links to archive today. Since then good efforts have been made to replace the links or hide them when contained in templates. However, more than 100,000 links remain visible.

This week, Wikipedia editors documented instances in which archive.today links redirected readers to the Tehran Times rather than the expected archived content[1]. The behavior was captured on video and discussed at Wikipedia talk:Archive.today guidance. This new behavior reduces the utility of the remaining links and demonstrates that readers cannot reliably predict where these links will lead.

I intend this bot to implement the existing community consensus by replacing archive.today links with their original source URLs when those URLs can be identified. I do not intend the bot to evaluate the continued availability of the original source material. Rather, it will restore the target selected by the original editor while removing links to a service that the community has already determined should no longer be presented to readers.

I currently operate DwAlphaBot, but propose this task should be conducted by a new bot to improve traceability and distinguish these edits from DwAlphaBot’s other approved tasks. The code is not yet complete, and discussion is ongoing regarding implementation details, including whether any hidden metadata should be preserved[2].

I am seeking early review and guidance from BAG and interested editors and request approval for an initial trial of up to 20 edits involving only deterministic replacements where the original URL is reviewed by me. Dw31415 (talk) 13:50, 14 June 2026 (UTC)[reply]

  • I oppose any bot replacing these links without checking for their availability at the original URL and without checking their availability at Wayback Machine/Ghostarchive/Megalodon. sapphaline (talk) 14:53, 14 June 2026 (UTC)[reply]
    Thank you for considering it. Do you oppose CS1 hiding the archive links? That’s by far the greater number of links (~600k). I’m trying to gain support for a similar approach. Are there any mitigations that would win your support? Dw31415 (talk) 15:17, 14 June 2026 (UTC)[reply]
    "Do you oppose CS1 hiding the archive links?" - no, because this creates a backlog of links that need replacement. Your approach essentially means offloading one big backlog (visible archive.today links) to a different big backlog (dead links), which is even bigger and has even less people interested in cleaning it up. sapphaline (talk) 16:21, 14 June 2026 (UTC)[reply]
    "Are there any mitigations that would win your support?" - checking the availability of the original URL and assigning appropriate |url-status= to the citation is a bare minimum; ideally the bot should also check the mentioned archives and add an archived copy in case it's not a redirect (3xx) or an error page (4xx/5xx). If this is implemented, then the bot should also add a hidden category for every affected page so that editors can check after the bot and replace inappropriate archives added by it. sapphaline (talk) 16:43, 14 June 2026 (UTC)[reply]
    Nice idea for the category. I’ve added one to the template (link at other discussion). I need to check how to make it hidden Dw31415 (talk) 17:20, 14 June 2026 (UTC)[reply]
    It would be easy to check the Wayback Machine API and mark whether an archived copy exists. It would be harder to do more than that. Dw31415 (talk) 17:21, 14 June 2026 (UTC)[reply]
  • (edit conflict) I oppose unless the content is available at the original URI (simply not being a 404 is not enough - check for usurpations, soft 404s, domain reselling pages, etc). In other situations it should be replaced with a working, non-deprecated archive or marked as a dead link with a comment noting that an archive.today link was removed with a link to why. If the checking cannot be done reliably by a bot then it is not a task suitable for a bot. Thryduulf (talk) 16:50, 14 June 2026 (UTC)[reply]
    Please say more about “marked as a dead link with a comment”. The Mauer PDF in the linked discussion is a good example. The source URL returns some minimal text (not a 404). I haven’t checked the Wayback Machine for it yet. Dw31415 (talk) 16:59, 14 June 2026 (UTC)[reply]
    By marking as a dead link with a note, I mean cases where, if the archive.today copy didn't exist, the link would be tagged using {{dead link}} (or similar), but leaving a hidden comment that the AT copy does exist if anyone wants to view it (doing so may enable them to find the information elsewhere, for example). Thryduulf (talk) 22:02, 14 June 2026 (UTC)[reply]
    It looks like the hidden comment part is there for all links - the template currently renders in wikicode like this according to its documentation (one space added to defeat edit filter), which includes the information about the archive.today link. Tazerdadog (talk) 22:49, 14 June 2026 (UTC)[reply]
    Thanks! I’ll fix tonight. Dw31415 (talk) 23:23, 14 June 2026 (UTC)[reply]
    @Thryduulf, have you experienced the redirect to Tehran Times? If not, I'd ask you to try it. I find it unsettling. You might try the "tally-ho" archive at Frank_Frazetta. I tried to reproduce it, but my home ISP actually blocks archive today. I get a connection refused error. Please try it and let us know if the Tehran Times redirect still reproduces. Dw31415 (talk) 12:45, 15 June 2026 (UTC)[reply]
    I have not personally experienced that behaviour, but I don't understand the relevance to my objection. My objection is to removing an AT link without one of (a) a replacement archive of the content, (b) a working link to the content, or (c) marking the link as dead with a note that the AT archive exists (but is not suitable for reasons explained at a linked page). {{Deprecated archive}} matches (c) only if no other archive or live copy of the content exists. Thryduulf (talk) 13:00, 15 June 2026 (UTC)[reply]
    @Thryduulf, thanks for calling me back to your objection. My concern is that the conditions you outline are difficult for a bot to evaluate reliably at scale.
    However, I do not think it is appropriate to hold this bot task to a different standard than the one already applied through the CS1 implementation. The community has already accepted an approach in which links are suppressed from reader view based solely on the presence of an archive.today URL. That implementation did not depend on establishing that the original URL was live, that another archive existed, or that the archive.today snapshot was not uniquely valuable.
    If the community’s position is that those evaluations are required before a hidden archive.today link may be removed from view, then it follows that the same evaluations should have been required before those links were hidden from readers in the first place. I do not believe BAG should create a higher threshold for these links than the threshold that was used to hide them with CS1.
    Am I missing something about how this proposal compares to the CS1 implementation or the consensus from the RfC? Dw31415 (talk) 19:59, 15 June 2026 (UTC)[reply]
    If the conditions cannot be reliably evaluated by a bot then this is not a task that is appropriate for a bot to perform. Thryduulf (talk) 21:06, 15 June 2026 (UTC)[reply]
    It would seem that you find the CS1 implementation to be objectionable as well, is this correct? fifteen thousand two hundred twenty four (talk) 21:12, 15 June 2026 (UTC)[reply]
    If that is also being applied without meeting the above necessary conditions then yes, but my understanding is that that is not changing the wikitext and so does not harm the encyclopaedia in the same way a bot will Thryduulf (talk) 07:32, 16 June 2026 (UTC)[reply]
    It was applied 22 February 2026 without meeting your desired criteria, but it did meet the RFC consensus that the links be removed as soon as practicable (though if we were to split hairs, hiding isn't removal). fifteen thousand two hundred twenty four (talk) 08:57, 16 June 2026 (UTC)[reply]
    Just because we have previously been reckless with the encyclopaedia is not an acceptable justification for doing something more reckless (at best) again. The RFC did not give editors carte blanche to harm the encyclopaedia in order to achieve a goal motivated by a moral panic rather than rational thought. Thryduulf (talk) 09:19, 16 June 2026 (UTC)[reply]
    I have no desire to relitigate the RFC, which it's appearing more and more like that's what this is. The consensus there found that directing readers to an archive that hijacks connections to perform attacks and modifies its contents to target certain persons was harmful, and that links to said archive should be removed asap. Any claim that this rational finding was motivated by moral panic is one I can't take seriously. I'll be focusing my attention elsewhere now. fifteen thousand two hundred twenty four (talk) 09:42, 16 June 2026 (UTC)[reply]
    CS1 does not change the wiki text. It removes the archive from displaying at all at render time. I’ll try to get a before and after. Dw31415 (talk) 14:32, 16 June 2026 (UTC)[reply]
{{Deprecated archive
 |sourceurl=https://example.com/source-page
 |title=Source page
 |archivehostpath=archive .ph/YYYYMMDD/https://example.com/source-page
}}
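
A template call in the shape shown above could be generated along these lines. This is a hedged sketch only: the function name build_deprecated_archive is hypothetical, the three parameter names are taken from the example above, and the handling of pipes in titles via the {{!}} magic word is an assumption about what the bot would need, not something settled in this request.

```python
import re


def build_deprecated_archive(archive_url, source_url, title):
    """Build a {{Deprecated archive}} call from an archive link's parts."""
    # archivehostpath is the archive URL without its scheme, as in the example above.
    hostpath = re.sub(r"^https?://", "", archive_url, flags=re.IGNORECASE)
    # A literal '|' would end the parameter, so swap it for the {{!}} magic word.
    safe_title = title.replace("|", "{{!}}")
    return (
        "{{Deprecated archive\n"
        f" |sourceurl={source_url}\n"
        f" |title={safe_title}\n"
        f" |archivehostpath={hostpath}\n"
        "}}"
    )
```

Because the archive host and path stay in the wikitext as a parameter, the display behaviour can later be changed in the template alone, without touching the pages again.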


Should I ping respondents to Wikipedia talk:Archive.today guidance#Wrap standalone, blacklisted link or should we keep support/oppose there? Dw31415 (talk) 15:12, 14 June 2026 (UTC)[reply]
  • We already had a full consensus discussion to do this - the RFC that had consensus to remove all archive.today links was exceptionally well attended. It closed with a consensus to go much further than this bot would, and remove every archive.today link, regardless of any hole that would be left. Since that discussion, 2 big things have happened, both of which indicate we should go ahead with this bot expeditiously. The first is the changes to the CS1 template, which hid the majority of the archive.today links in the same sense that this bot would. This proceeded with minimal controversy relative to the size of the change. The second change is the random linking to the Tehran Times when the referrer is Wikipedia. That degrades the utility of the archive, and makes it an unreliable link for our readers in the sense that it doesn't go where it promises it does. Requiring this bot to jump through excessive hoops to check for repairing the dead link is counterproductive when we need to action these removals in a timely manner. I can get on board with implementing whatever Dw31415 can implement quickly. If any of these checks on repairing the link are technically difficult or time consuming, we need to proceed without them, and invite the objectors to come in behind the bot and do them in a second pass. Tazerdadog (talk) 18:18, 14 June 2026 (UTC)[reply]
    Thanks. Just to underscore the flexibility of Template:Deprecated archive: it’s designed so that the community could decide to reverse the decision and display the deprecated links again just by updating the template (no need to touch the pages again). Dw31415 (talk) 18:51, 14 June 2026 (UTC)[reply]
    I spent about 90 minutes working out the plan for the work queue (https://gitlab.wikimedia.org/dw31415/cutlass-bot/-/blob/main/queue-implementation-plan.md?ref_type=heads). I hope to get some guidance soon from BAG on next steps. I'll be away for Monday & Tuesday. I'll be able to respond by phone but not able to work on the bot. Dw31415 (talk) 03:57, 15 June 2026 (UTC)[reply]
    This is my understanding of the RFC as well, WP:NOMOREATODAY closed asking that as soon as practicable we remove all links to it, which is a rather strong result when considering that none of the initially proposed options solely concerned removal (Option A was removal/hiding). I see no qualifiers there that links should be removed, but only after the original site is determined to be live, just that they should be removed as soon as it's feasible. With a bot it's feasible, and the proposed approach using {{deprecated archive}} is highly reasonable, essentially mirroring the CS1 hiding that is already widely deployed without issue. When it comes to actioning the existing consensus I see no reason to oppose the proposal here. fifteen thousand two hundred twenty four (talk) 20:12, 15 June 2026 (UTC)[reply]
    @Tazerdadog: I went through WP:NOMOREARCHIVETODAY and see nothing that supports mass removal by bot in the manner you suggest. There is strong consensus to deprecate. There is also strong consensus to get rid of these links as soon as practically feasible, in the sense that url removal must be minimally disruptive and preferably replaced by an alternative.
    If what is desired is the hiding of these links from readers, {{cite xxx}} templates can be updated to hide archive.today links and put them in a maintenance category. Same from {{webarchive}}. Once that's done, we can look at bots wrapping raw urls in a similar fashion. Headbomb {t · c · p · b} 16:46, 16 June 2026 (UTC)[reply]
    Pinging Voorts into this conversation - as the closer he's better positioned than I am to comment on the closure. Are there a significant number of archive.today links still visible to readers that are wrapped in a cite x or webarchive template rather than as bare links? If that's the case then I absolutely agree we should fix those and then circle back to this discussion afterwards. I thought the CS1 change had taken care of them, but I could easily be wrong. Tazerdadog (talk) 16:58, 16 June 2026 (UTC)[reply]
    Serves me right for trying to use the visual reply tool for anything - @Voorts: Tazerdadog (talk) 17:00, 16 June 2026 (UTC)[reply]
    What's the question? voorts (talk/contributions) 17:43, 16 June 2026 (UTC)[reply]
    Trying to phrase carefully so I don't put words in someone's mouth:
    Does your closure at the Archive Today RFC imply community consensus in favor of using a bot to address the archive.today links assuming that such a bot is the only known way to address the links in a timely manner?
    Are there any checks that the bot should perform while wrapping the link, such as checking for a live link, checking for an alternative archive, or marking links as dead that should be performed while the bot runs to remain consistent with the community's consensus? Are there any that it must perform, even if it slows the development of the bot and the eventual addressing of the links?
    Is the fact that we're implementing a half measure by hiding the archive link from readers while retaining it in plaintext a fatal issue with complying with the close, given that we have been unable to get a solution to fully remove the links moving? Tazerdadog (talk) 18:05, 16 June 2026 (UTC)[reply]
    The close deprecated archive.today and said all links should be removed, not just hidden. There was no consensus in the discussion to merely hide the links. The close did not speak to whether we should use a bot, but I don't see why that would be objectionable. voorts (talk/contributions) 18:20, 16 June 2026 (UTC)[reply]
    "removed as soon as practical". That last part is important. Headbomb {t · c · p · b} 18:39, 16 June 2026 (UTC)[reply]
    IIRC, the question of hiding vs. removing was addressed in the RfC. voorts (talk/contributions) 18:27, 16 June 2026 (UTC)[reply]
    Thank you for the quick answer Voorts. @Headbomb: - is this sufficient to demonstrate community consensus for the bot task, or should we start additional discussions to establish it? @Dw31415: - can the bot be modified so that the link is removed instead of simply placed in the Wikitext, ideally in a way that is reversible or that allows other editors/bots to follow behind yours and check whether a different archive matches the archive today citation that was removed? (this could be as simple as a table of citations and archive today links off in wikispace somewhere) Tazerdadog (talk) 18:41, 16 June 2026 (UTC)[reply]
    IMO that RFC has no consensus on any specific "what is the next step" item. That, to me, is something to be hashed out first before bots are coded. Headbomb {t · c · p · b} 18:44, 16 June 2026 (UTC)[reply]
    @Voorts: Yes, and while everyone agrees that removal is the ultimate goal, that doesn't mean it is the first step. Headbomb {t · c · p · b} 18:41, 16 June 2026 (UTC)[reply]
    The consensus was to deprecate and remove all the links as soon as possible voorts (talk/contributions) 18:43, 16 June 2026 (UTC)[reply]
    "As soon as practicable" is the wording of the close. That means not mass removed by bot, unless the community decides that it doesn't want to wait and does not want intermediate steps done (like a bot run to find alternative archives). Headbomb {t · c · p · b} 18:45, 16 June 2026 (UTC)[reply]
    That is some serious wikilawyering. voorts (talk/contributions) 18:46, 16 June 2026 (UTC)[reply]
    Bots require clear mandates. That RFC is not a clear mandate. "We ought to start running as soon as we're ready" does not mean "We ought to start running NOW", especially when we aren't ready, and that people haven't even decided what 'ready' looks like. Headbomb {t · c · p · b} 18:52, 16 June 2026 (UTC)[reply]
    The RfC said we should remove the links as soon as possible. A bot would allow us to do that. What is your objection to a bot doing it? voorts (talk/contributions) 18:59, 16 June 2026 (UTC)[reply]
"As soon as practicable" not "As soon as possible". Headbomb {t · c · p · b} 19:15, 16 June 2026 (UTC)[reply]
That was about adding to the spam blacklist. I had originally said links should be removed "forthwith", but changed that after editors pointed out implementation would take some time. Opposing a bot to implement the RfC result is contrary to the consensus. voorts (talk/contributions) 16:33, 19 June 2026 (UTC)[reply]
@Thryduulf, I'm not sure that I'm understanding your hopes here. Which of these two sentences is closest to your view?
  1. It is important that readers be able to click on links to this website, even though it is known to have behaved maliciously, to have redirected readers to a different website (which doesn't help anyone check whether the source supports the article content), and now appears to be threatening to delete all the pages that are linked in Wikipedia.
  2. Readers don't actually need to see these links, but we shouldn't remove the URL information from the wikitext, because that information might be helpful to editors when they manually review the links or are just editing the article in general (and, yes, it'd still be there in the page history, but realistically, nobody's going to find that).
WhatamIdoing (talk) 17:00, 19 June 2026 (UTC)[reply]
My view is closer to your 2, but it's not spot on:
  • If the AT link verifies the content then that link should remain unless and until it is replaced by either a working link that verifies the content or an alternative archive verifies the content.
  • If the AT link does not verify the content (including if the archive has been deleted) then it should be removed and replaced with, in preference order:
    • A working live link that verifies the content
    • An alternative archive
    • An explicit indication that the source is dead
  • If it is unknown whether the AT link verifies the content then it should remain until that is established. If a bot is unable to work within these parameters then it must not be approved.
IMO everything else is an affront to WP:V, which I shouldn't have to remind editors is a non-negotiable core policy. Thryduulf (talk) 17:07, 19 June 2026 (UTC)[reply]
I'm trying to understand what it means for the link to "remain". If I wrap the AT link inside a <!-- hidden HTML comment -->, does it still "remain"? I think so: The link is still right there in the wikitext. But do you agree with me? WhatamIdoing (talk) 17:27, 19 June 2026 (UTC)[reply]
It sort of remains. It would be best if the link were to remain visible until its status was resolved, but being present but hidden is better than being removed. Thryduulf (talk) 18:08, 19 June 2026 (UTC)[reply]
Can you live with being present in the wikitext and hidden from the reader? Would it make a big difference to you if it were instead present in the wikitext, visible to logged-in editors, and hidden from unsuspecting readers? WhatamIdoing (talk) 19:25, 19 June 2026 (UTC)[reply]
I thought I answered your first question in my last comment? It's not ideal but I can live with it. I don't understand how your second question differs from the first? Thryduulf (talk) 19:37, 19 June 2026 (UTC)[reply]
The keep/remove choice is a bit more of a spectrum than you might think, due to the magic of CSS. It goes something like this:
  1. Remove from wikitext; nobody sees the ATODAY link. (In all options, the title/author/date/original URL/rest of the citation would stay, of course.)
  2. Keep in wikitext, but <--hidden-->, so it can only be seen if you look in the wikitext.
  3. Keep in wikitext, but use CSS magic to <--hide it from readers--> while still keeping it visible to logged-in editors.
  4. Keep in wikitext; everybody sees it (=what we have today).
You've said that #2 is okay and that #4 is okay. I wonder if you think #3 would be a material improvement compared to #2. (Upside: easier for editors to see, still protects most unsuspecting readers; downside: on a long/complex page, the CSS might be a little slow to process, so the link might sometimes be visible for a second or two and then disappear). WhatamIdoing (talk) 20:26, 19 June 2026 (UTC)[reply]
Ah ok, I understand now. Basically the preference order is 4 (until assessed, i.e. not permanently), then 3 then 2. Thryduulf (talk) 22:32, 19 June 2026 (UTC)[reply]
Okay. If we change [http://www.example.com Title] to {{new template|link=http://www.example.com |label=Title}}, we should be able to set the template to work as #2, #3, or #4 at different times. It would also make it easier for people to find the ones that need to be reviewed, because maintenance templates can have maintenance categories attached. But we'd need the bot to install the template before any of that could be done. The point of this request is to install the template. Can you agree to have the template installed? WhatamIdoing (talk) 23:52, 19 June 2026 (UTC)[reply]
The question of whether it should be hidden or removed was posed in the RfC; my close found a consensus to remove rather than hide. voorts (talk/contributions) 20:32, 19 June 2026 (UTC)[reply]
Yes, but it's possible to do the "hide" step now by bot, while we continue the "remove" process. Also, the "hide" step can be done with a template that has an associated maintenance category, which makes it easier for any "removers" to find them. I don't think that "hide" should be seen as the opposite of "remove"; it is instead a step that can help us reach the "remove" goal. WhatamIdoing (talk) 22:04, 19 June 2026 (UTC)[reply]

Dry run added: User:Dw31415/ArchiveEdits1 Dw31415 (talk) 12:21, 15 June 2026 (UTC)[reply]

Note the first example there sets the title of the archived page to be "Archived" which is obviously incorrect. If the bot is going to add errors of this nature to the encyclopaedia then that is another reason to oppose. Thryduulf (talk) 07:36, 16 June 2026 (UTC)[reply]
@Thryduulf, I agree the “archived” case should be examined more carefully. There are two other options I considered:
  1. Changing the link to an interstitial webpage that gives a warning, a click-through, and information about whether a copy exists at the Wayback Machine.
  2. Changing the link to a WP page that explains the situation and how to find the original archive link
Do you find either of these less objectionable? Dw31415 (talk) 14:38, 16 June 2026 (UTC)[reply]
p.s. here is a mock up of an interstitial https://dw31415wp-glitch.github.io/archive-checker-bot/?url=https://archive-today/2025.01.01-120000/https://example.org/article Dw31415 (talk) 16:16, 16 June 2026 (UTC)[reply]
(edit conflict) If a webpage is linked, directly or indirectly, from an article in any form other than a bare url then any metadata about that page (including its title) should be recorded (and displayed if the link is displayed) correctly. This applies regardless of whether the link is to a live webpage or to an archive, and if the latter what archive that is.
Really every link processed by this bot should be left in one of four states:
  1. A link to a live copy of the content that supports the associated article text (with or without an archive, AT or otherwise)
  2. A link to an acceptable archive of the content that supports the associated article text
  3. An explicitly marked dead link with some sort of note that an archive exists at AT with some explanation why it isn't being linked (this can be inline, via a linked page or some combination)
  4. An explicitly marked permanently dead link (there is no benefit to even mentioning a broken AT archive)
Thryduulf (talk) 16:18, 16 June 2026 (UTC)[reply]
I would probably oppose those solutions - if we have the link, it should directly go to where we said it was going to go. If we're not willing to honor the link destination, which we are not for archive today, we should not have a clickable link. In any case, I'd like to see a high quality consensus discussion authorizing these before we seriously consider implementing them. Tazerdadog (talk) 18:26, 16 June 2026 (UTC)[reply]
Any thoughts about how to have that discussion? I’m leaning to modifying the proposal so the template starts with the status quo behavior. This would allow the community to act more easily through the template (just like CS1). That might allow the bot folks to stay out of the consensus building business. Dw31415 (talk) 20:21, 16 June 2026 (UTC)[reply]
If I was proposing a way forward from here, I'd first make sure that this comment from @Headbomb: didn't get lost in the shuffle:

If what is desired is the hiding of these links from readers, {{cite xxx}} templates can be updated to hide archive.today links and put them in a maintenance category. Same from {{webarchive}}. Once that's done, we can look at bots wrapping raw urls in a similar fashion.
— User:Headbomb

If we have any low hanging fruit contained in these templates, we should at least disable reader facing links while we have the followup conversation via a single edit to the templates.
Following that, I'd defer to the BAG. I think we've made the best case we currently have for an existing consensus with the closer of a very well attended RFC coming in to this BRFA discussion. If that's good enough, great. If it's not, I'd ask what we do need, then hold that discussion. I could see a case for needing a more recent consensus, for needing a consensus to do something specifically with a bot instead of just as soon as (possible/practicable), or a clarification on what the bot needs to do versus what can/should/technically must be left undone, or a clarification on the final desired state (removal with no easy way to undo it, removal with an undo button/database in the template, hiding it in the wikitext so a volunteer could repair but no layman would find it, hashing the citation so that if you don't know the trick we tell to citation repairers you can't find it, etc.) Tazerdadog (talk) 00:21, 17 June 2026 (UTC)[reply]
Looking a little deeper, it looks like Chaotic Enby pushed changes to the webarchive template shortly after this started, and the cite web talkpage redirects to CS1, so that might be already done? Tazerdadog (talk) 00:29, 17 June 2026 (UTC)[reply]
Yes, those changes to CS1 and webarchive have already hidden 600k-ish links. This bot proposal targets the remaining 100k-ish. Dw31415 (talk) 01:33, 17 June 2026 (UTC)[reply]
@Fifteen thousand two hundred twenty four, @Headbomb, @Sapphaline, @Thryduulf, @Voorts: Just FYI, these websites were added to the global spam blacklist yesterday. This affects all WMF wikis, not just the English Wikipedia. WhatamIdoing (talk) 02:17, 17 June 2026 (UTC)[reply]
The websites have been added to our local whitelist, and the current edit filter is keeping everything out with a more customized error message and fewer side effects than a spam blacklist. For most practical purposes this should take us back to the status quo of the last 3 months. Tazerdadog (talk) 03:06, 17 June 2026 (UTC)[reply]
@Headbomb, thank you for reviewing and for your work on BAG. To your comment:

If what is desired is the hiding of these links from readers...templates can be updated to hide...

CS1 and webarchive already changed[3] to hide the archive today links from readers without any significant objection from the community. That change hid the majority (~600k) of archive today links. I understand your position that the Wikipedia:NOMOREATODAY RfC does not provide sufficient consensus for a bot to remove links. Please note that this proposal doesn't actually remove the archive today links. It wraps standalone archive today links in a new template. This bot would empower the community to more effectively implement the RfC through modification of the template. Is there any role for a bot based on the decision of the RfC and the consensus in the CS1 and webarchive moves? Thanks! Dw31415 (talk) 04:15, 17 June 2026 (UTC)[reply]
  • Consensus already exists to do this. Do not allow this to get derailed. I consider it unethical to delay.—S Marshall T/C 16:03, 19 June 2026 (UTC)[reply]
    Consensus exists to remove the links "when practical", a practical solution requires not going at it like a bull in a china shop to the detriment of WP:V. Thryduulf (talk) 16:48, 19 June 2026 (UTC)[reply]
    That's absolutely incorrect. Consensus was to get rid of all archive.today links. voorts (talk/contributions) 17:47, 19 June 2026 (UTC)[reply]
    I don't think my close could have been any clearer that the community clearly doesn't want archive.today links around anymore. voorts (talk/contributions) 17:49, 19 June 2026 (UTC)[reply]
    I'm becoming increasingly concerned that your close was not actually neutral. You have certainly become a big advocate of a hardline interpretation of the result, which is not the hallmark of someone who has dispassionately evaluated the consensus, which was far more measured than "remove everything without any consideration for anything regardless of what anybody says". Thryduulf (talk) 18:05, 19 June 2026 (UTC)[reply]
    I'm not advocating for anything other than adherence to consensus, which was perhaps the clearest I've ever seen over the course of closing many discussions on Wikipedia. The fact that you disagree with it does not change the fact that an overwhelming consensus of editors wanted us to stop using archive.today and get rid of links to it. I frankly think trying to undermine consensus by making baseless procedural objections like this is disruptive. voorts (talk/contributions) 18:23, 19 June 2026 (UTC)[reply]
    I'm not trying to be disruptive, rather I'm trying to minimise the disruption to the encyclopaedia caused by an overzealous interpretation of an RFC where the consensus (when you read the actual reasoned arguments not just the hyperbole) was absolutely not in favour of disrupting the encyclopaedia to make a point. Thryduulf (talk) 18:27, 19 June 2026 (UTC)[reply]
    If you thought my interpretation of the RfC was incorrect or overzealous, you should have come to my talk page when I closed it and then opened a close review if I didn't agree with your objections, instead of trying to undermine it months later. My close says what it says, and it says there was consensus to "remove" (not hide; which was expressly an option in the RfC) links to archive.today. voorts (talk/contributions) 18:31, 19 June 2026 (UTC)[reply]
    At the time I thought your close was on the poor side but not by enough to challenge it, especially when emotions were running so hot. However it is your actions since the close that have gradually made me more and more uncomfortable, particularly the increasingly strident advocating for removal without regard for consequences which is at odds with the remove when practical wording from your own close. Thryduulf (talk) 18:36, 19 June 2026 (UTC)[reply]
    "Consensus was to get rid of all archive.today links" Yes, eventually. Not immediately, but rather "when practicable". Disruption is highly likely to occur and any bot that wants to plow through approval to remove all links as quickly as possible will be denied as ill thought out and running against consensus of the RFC. Headbomb {t · c · p · b} 18:41, 19 June 2026 (UTC)[reply]
    My actions since the close have been to discuss the close at this one thread when I was asked to weigh in. As the closer, it is not my place to consider the consequences of removal. The community weighed those consequences, and at the time of my close, I found a consensus, based on actual reasoned arguments not just the hyperbole to immediately deprecate archive.today, add it to the spam blacklist as soon as practicable, and to remove all links to archive.today. voorts (talk/contributions) 18:41, 19 June 2026 (UTC)[reply]
    Frankly, I think you are not being dispassionate here. You are so against the removal of archive.today links that you are misreading community consensus to stop it from occurring. voorts (talk/contributions) 18:43, 19 June 2026 (UTC)[reply]
    I'm not against the removal of archive.today links. I'm against their removal without regard for the consequences of doing so, as I have explained in great detail multiple times because some editors (apparently including you) are unable or unwilling to understand that wholesale removal without regard to the consequences to the encyclopaedia was not something the RFC supported - not that an RFC, however well attended, could find a consensus to contravene WP:V.
    If the advocates of removal spent a fraction of the amount of energy on actually assessing the links so that there are no adverse consequences from removal as they do complaining that they aren't being allowed to harm the encyclopaedia as fast as they want to then there wouldn't be a need for discussions like this one. Thryduulf (talk) 18:54, 19 June 2026 (UTC)[reply]
    The WP:V objection that you're making was addressed and rejected in the RfC itself. If you want to relitigate my close, open a close review. If not, what are we even doing here? voorts (talk/contributions) 18:56, 19 June 2026 (UTC)[reply]
    The WP:V objection was not addressed in the RFC. What we are doing here is trying to ensure that any bot acts only in accordance with the bot policy, which in relevant part requires consensus for the actions to be taken and consensus for them to be taken by a bot. I'm not seeing either of those at this time. Thryduulf (talk) 19:01, 19 June 2026 (UTC)[reply]
    "Verif*" appears 141 times on the RfC page and that argument was directly addressed in my close: Those in favor of maintaining the status quo rested their arguments primarily on the utility of archive.today for verifiability. However, an analysis of existing links has shown that most of its uses can be replaced. voorts (talk/contributions) 19:05, 19 June 2026 (UTC)[reply]
    Thryduulf is 100% correct here. You can either accept this and carry on, or I can close this bot request and deny it as premature, without prejudice against future bot requests once a clear consensus emerges. Headbomb {t · c · p · b} 19:06, 19 June 2026 (UTC)[reply]
    @Headbomb, while you're here, can you get your script to flag these? We need visibility, especially for medical articles. I've been surprised how often these links are associated with non-MEDRS sources (e.g., an "educational" page on a random dentist's website); in other cases (e.g., archives of the ICD codes), it's just unnecessary. WhatamIdoing (talk) 19:48, 19 June 2026 (UTC)[reply]
    I could. I just need to think of a scheme that flags them as problematic but not for the normal reasons. Maybe orange. Ping me again tomorrow if I haven't come up with something. Headbomb {t · c · p · b} 20:05, 19 June 2026 (UTC)[reply]
    This: If the advocates of removal spent a fraction of the amount of energy on actually assessing the links so that there are no adverse consequences from removal as they do complaining is wrong by at least three orders of magnitude. I've done some of that work. I have, in fact, spent more time on that work than I have spent in this discussion. But so far, I have solved ATODAY problems in fewer than 100 articles, and just for WPMED articles, I have more than 1,000 to go. WhatamIdoing (talk) 19:39, 19 June 2026 (UTC)[reply]
    Thryduulf is correct even as he's making verifiably untrue claims about the RfC discussion? voorts (talk/contributions) 19:06, 19 June 2026 (UTC)[reply]
    However, an analysis of existing links has shown that most of its uses can be replaced. is very much not the same thing as "remove all the links as soon as possible, without regard for whether they have been and/or can be, and while doing so make it harder for this to be done later". Thryduulf (talk) 19:10, 19 June 2026 (UTC)[reply]
    The first sentence of the close says to get rid of them. Those two sentences that I just quoted to you are addressing the arguments that were made in the RfC, arguments that you just asserted weren't actually made. I'm having a hard time AGF at this point if you're just going to cherry pick parts of the close and ignore parts of the discussion that are inconvenient to you. I've said what I have to say here at this point. voorts (talk/contributions) 19:13, 19 June 2026 (UTC)[reply]
    When you're reduced to claiming that shades of grey do not exist then it's very clear that no rational discussion is possible. Thryduulf (talk) 19:15, 19 June 2026 (UTC)[reply]
    I think you two should seriously consider disengaging from each other and from this discussion. I hope you will.—S Marshall T/C 20:44, 19 June 2026 (UTC)[reply]
    @Headbomb, a couple of questions:
    1. What practicalities are stopping us from removing (or hiding) the links?
    2. Is it reasonable to ask for a BAG second opinion?
    Dw31415 (talk) 01:12, 20 June 2026 (UTC)[reply]
    Regarding practicality, I understand that something being practical means that it is possible and we have the resources to do it. Changing the template code in Template:Cite web made it very practical to hide the links. Using a bot makes it more practical to hide the standalone links. In your interpretation, is finding a replacement source one of the practicalities? Dw31415 (talk) 01:19, 20 June 2026 (UTC)[reply]
    Regarding a second opinion, normally I'd say by all means let's have more discussion and consensus building. In this case, we had one of the best attended RfCs in recent memory. We have the closer here plainly saying that the community decided on removal as soon as possible (and that trying to read in additional requirements to "practical" is engaging in wiki-lawyering). I'm reluctant to open an additional RfC that will essentially be asking if "remove" means "remove". The many respondents who share the consensus view would understandably be upset at needing to duplicate the arguments they made in the February RfC. As a better process, may I suggest that you approve the requested 20 edit trial run and we use that as a stimulus for more discussion. As an additional option, maybe the editors here saying that the closer didn't mean what the closer is still saying should request a close review. Dw31415 (talk) 01:39, 20 June 2026 (UTC)[reply]
    Let me just add a word of appreciation for the spirited discussion. I trust that it comes from a place of dedication to improving the encyclopedia. My personal motivation matches the sentiments from WMF:

    For readers to remain relaxed and trusting while using Wikipedia, they should be able to reasonably expect that links on Wikipedia to potentially dangerous websites are rare, and that those that do exist are dealt with quickly once spotted[4]

    As for time, I've spent way too much time trying to find compromise solutions. That time would be better spent helping to develop tools to replace the references with better alternatives. Dw31415 (talk) 01:49, 20 June 2026 (UTC)[reply]
    Eric at WMF responded to a message I sent asking if they had capabilities or plans that would duplicate this effort. Here is his reply:

    We [at the WMF] don't have any plans or capabilities that this [bot] effort would conflict with or duplicate. Right now, we're focused on supporting volunteer-led actions. I haven't reviewed the details of your bot, and would defer to the community to do that. But if your initiative gets some steam, we'd love to stay tied in and see if there are ways we can help it from our end[5]

    Dw31415 (talk) 02:05, 20 June 2026 (UTC)[reply]
    @Dw31415:
    1. What practicalities are stopping us from removing (or hiding) the links?
    Right now the main blocker is that it is unclear what the community wants. Yes, it wants the removal of those links, but it does not want wanton reckless action. Since the RFC never bothered to ask what "practicable" means, we're left guessing.
    Right now, CS1 templates are not hiding links but are marking them as a deprecated archive service and putting them in a maintenance category. {{webarchive}} templates have recently been updated to hide links and put them in a maintenance category. At the very minimum, I can see support for taking bare links and marking them with a template, since that is what is currently being done elsewhere, but I don't know what the behaviour of that template should be concerning hiding or not. This is something for the community to decide, but it is not a decision we can make here. So as far as I'm concerned, the community needs to figure out which behaviour is currently desired: that of {{cite xxx}}, which keeps links and marks them as deprecated, or that of {{webarchive}}, which hides links and marks them as deprecated. Once that is clear, templates can be updated and this one deployed.
    2. Is it reasonable to ask for a BAG second opinion?
    Sure. Everyone's always free to ask for a second opinion. I've asked @Earwig: for his two cents earlier, but he may be busy, especially given the length of the discussion here.
    Headbomb {t · c · p · b} 02:16, 20 June 2026 (UTC)[reply]
    @Headbomb, Regarding CS1 links, I understand that the module is hiding them like at Jeremy Lin#cite ref-17 (note there is no "Archived" link in the citation to render the link to [...]today/20120630151832/http://www.denverpost.com/commented/ci_16724722). I think this was the leading cause of the reduction of 600k external links. Am I missing something? Dw31415 (talk) 02:43, 20 June 2026 (UTC)[reply]

My bad, I think I was looking at a hardcoded instance of an example somewhere, rather than a live version, and that gave me an inaccurate picture of the current situation. Since CS1 and webarchive templates both hide links, it seems reasonable to approve a task that would also hide the remaining bare links. I'd have to review reactions to template updates first though, if there are any.

I'd also have to take a look at the template proposed to hide these links. The cases in User:Dw31415/ArchiveEdits1 seem needlessly complicated. Compare say the current proposed

  • {{Deprecated archive|sourceurl=http://archives.dailynews.lk/2003/10/18/fea05.html|title=Establishing Pāli Text Society for Buddhist literature|archivehostpath=archive. today/20131217001046/http://archives.dailynews.lk/2003/10/18/fea05.html}}

with say {{DAL}} for deprecated archive link (or some short variant)

  • {{DAL|https:// archive. today/20131217001046/http://archives.dailynews.lk/2003/10/18/fea05.html|Establishing Pāli Text Society for Buddhist literature}}

with functionality that recognises the base url vs host path and archive date automatically. Headbomb {t · c · p · b} 03:13, 20 June 2026 (UTC)[reply]
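The automatic recognition suggested above could be sketched roughly as follows. This is only an illustration, not the template's or the bot's actual code: the function name `parse_archive_link`, the `ArchiveLink` type and the URL pattern are hypothetical, and the pattern assumes links of the form host/14-digit timestamp/original URL, as in the example above. Whitespace is stripped first because links on this page are written with spaces in the hostname.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Assumed link shape: optional scheme, archive host, 14-digit timestamp,
# then the original URL. Mirror hostnames would have to be added here.
ARCHIVE_RE = re.compile(
    r"^(?:https?://)?(?P<host>archive\.today)/"
    r"(?P<stamp>\d{14})/(?P<source>https?://.+)$"
)


class ArchiveLink(NamedTuple):
    """Hypothetical container for the parts of a deprecated archive link."""
    host: str
    archived_at: datetime
    source_url: str


def parse_archive_link(url: str) -> Optional[ArchiveLink]:
    """Split an archive link into host, archive date and source URL.

    Returns None when the link does not embed a timestamp and source URL
    (for example, short-code links), so callers can skip it.
    """
    compact = "".join(url.split())  # drop spaces used to dodge the blacklist
    match = ARCHIVE_RE.match(compact)
    if match is None:
        return None
    return ArchiveLink(
        host=match["host"],
        archived_at=datetime.strptime(match["stamp"], "%Y%m%d%H%M%S"),
        source_url=match["source"],
    )


link = parse_archive_link(
    "https:// archive. today/20131217001046/"
    "http://archives.dailynews.lk/2003/10/18/fea05.html"
)
```

With parsing like this, an editor would only need to supply the full archive link and a title; the source URL and archive date fall out of the link itself.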

I agree the current version is needlessly complicated. I had a false assumption that humans could not add the template unless the url was deconstructed. I’ll workshop some more in the discussion linked at top. I like your suggestion. Dw31415 (talk) 03:28, 20 June 2026 (UTC)[reply]
@Headbomb, I think performing a trial run of 20 edits would help generate additional input. Would you please approve a 20 edit trial? Dw31415 (talk) 14:02, 21 June 2026 (UTC)[reply]
For others following here, I created Wikipedia talk:Archive.today guidance#Feedback requested on Template:Deprecated archive to get additional input on the template. Dw31415 (talk) 14:04, 21 June 2026 (UTC)[reply]
@Headbomb, gentle nudge on my request to do a 20 edit trial. Please approve. Dw31415 (talk) 11:24, 23 June 2026 (UTC)[reply]

References

Operator: CocaPopsRather (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 12:19, Saturday, June 6, 2026 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s):

Source code available:

Function overview: Monitors changes for likely vandalism/spam, using algorithm first and GenAI review as a fallback. It reverts only obvious vandalism or spam, can post standard user warnings, and avoids content disputes, BLP/source nuance, edit wars, and other cases requiring human review

Links to relevant discussions (where appropriate):

Edit period(s): Continuous

Estimated number of pages affected: unlimited/all

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): No

Function details: Reliable vandalism detection. For cases where a normal algorithm would not revert vandalism, a GenAI model is used to compare the diffs and decide the action.

Discussion

  • Information Note: The user account this request is for is also listed as the Operator, but the account name does not clearly indicate that the account is a bot and the account has very few edits. Please note that WP:Bot policy states that a bot account's username should make it immediately clear that the account is in fact a bot, which is normally done by having the account name end with the word "Bot". Also note that a bot may not operate itself, so the Operator field should identify the account of the human running the bot. AnomieBOT 12:23, 6 June 2026 (UTC)[reply]
    I did make an account with the word "Bot" in it; however, it was soft-blocked, and I was advised to change the name. That in turn resulted in "AutoPatrollerTrials" instead of "AutoPatrollerBot". I will change the operator field now. AutoPatrollerTrials (talk) 12:26, 6 June 2026 (UTC)[reply]
  • I would like to provide an example test of when the bot would not normally revert for obvious vandalism, but would pass the request to GenAI to decide whether or not it should be reverted.

The bot will also use a confidence scale, if the confidence of the AI is not above 97% that an edit should be reverted it is ignored to prevent malfunction. CocaPopsRather 12:39, 6 June 2026 (UTC)[reply]
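As a minimal sketch of the confidence gate described above (not the bot's actual code; the names `REVERT_THRESHOLD` and `decide` and the label strings are hypothetical), anything that is not a "revert" verdict with confidence strictly above 97% is ignored:

```python
REVERT_THRESHOLD = 0.97  # assumed representation of the stated 97% cutoff


def decide(label: str, confidence: float) -> str:
    """Return "revert" only for a revert verdict above the threshold.

    Every other case, including a verdict at exactly the threshold,
    is ignored so that borderline edits are left for human review.
    """
    if label == "revert" and confidence > REVERT_THRESHOLD:
        return "revert"
    return "ignore"


print(decide("revert", 0.99))  # revert
print(decide("revert", 0.97))  # ignore
```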

Is there a reason why User:ClueBot NG is not sufficient for this task? Why do we need a second anti-vandal bot? Primefac (talk) 13:22, 6 June 2026 (UTC)[reply]
ClueBot NG is excellent for high-confidence, obvious vandalism, and this bot is not intended to duplicate or replace it. ClueBot uses ClueBot Core, which only reverts when its model reaches a very high threshold (e.g., really obvious vandalism). This bot is aimed at the edits that fall below that threshold or require more contextual judgement. It uses additional context such as the diff, page history, user history, warning history and policy-specific reasoning to decide whether an edit should be reverted.
The intent is not to create a second ClueBot, but to handle a different class of edits: cases that are suspicious or damaging but not obvious enough to be detected by ClueBot Core. CocaPopsRather 13:37, 6 June 2026 (UTC)[reply]

A few items of note:

  • The bot account should be used only for bot edits. You appear to be using the account for manual edits as well, such as filing this BRFA.
  • Considering that this is stated to use generative AI, has there been any discussion, such as at a Village pump or a talk page dedicated to anti-vandalism work, that indicates the community at large is willing to trust generative AI with this task?

Anomie 13:34, 6 June 2026 (UTC)[reply]

On the account-use point: I understand that the bot account should be used only for bot edits. Filing or discussing the BRFA from the bot account was a mistake, therefore I have moved the discussion to my main account.
On the gen AI point: I agree that community confidence is important, especially regarding gen AI. My thinking was that a tightly limited trial would provide concrete evidence about whether the system is accurate enough, rather than asking the community to evaluate it only in theory.
That said, if the committee feels wider discussion should happen before any trial, I am happy to start that discussion at Village pump or alternative.
The intended trial would be conservative: human-review first, logging decisions, and demonstrating performance before any fully autonomous use is in place. CocaPopsRather 13:41, 6 June 2026 (UTC)[reply]
Non-BAG member here, but I am a bot operator. I do have a few questions:
  • Which AI model will you be using to analyze the edits? Additionally, are you willing to provide the entire prompt text for transparency's sake?
  • Who is paying for the tokens consumed by this bot? English Wikipedia generates roughly 160,000 edits per day. Even if your bot is only sending about 3% of the edits to the AI, that's still nearly 5,000 edits per day being reviewed by an LLM. That would consume many tokens, and rack up a sizable bill. If the tokens are subsidized, what is the motivation of the organization providing such subsidy?
  • In your own words, you used AI to write the code for WP:VandalHandle. Did you use AI to write the code for this bot? If so, has said code been thoroughly reviewed by a human?
  • Where is the community consensus asking for such a bot? WP:BOTREQUIRE point 4 specifically requires that there be consensus established for a task.
Thanks! phuzion (talk) 16:45, 6 June 2026 (UTC)[reply]
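The volume estimate in the questions above can be checked with a short calculation. This is only a sketch: the function name is hypothetical, and the inputs are the 160,000 edits per day and 3% escalation rate quoted in the question, not measured figures.

```python
def escalated_edits_per_day(edits_per_day: int, escalation_percent: int) -> int:
    """Edits sent for model review per day, using integer arithmetic."""
    return edits_per_day * escalation_percent // 100


print(escalated_edits_per_day(160_000, 3))  # 4800
```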
Thanks for the questions, I'll answer.
On the AI model: the bot has been designed so that the model provider is configurable rather than hard-coded. It can use API-based models from providers (OpenAI, Google, or Anthropic), and it can also use local Ollama models. For any approved trial, I am willing to specify exactly which model is being used and provide the full prompt text for transparency.
On token costs: any API costs would be paid by me. There is no external organisation subsidising the bot, and therefore no outside motivation or influence. I also do not intend to send every edit to an LLM. The intended design is that normal algorithmic checks filter edits first, and only a much smaller number of suspicious cases would be escalated for model review.
On AI assistance in the code: the diff you linked refers to the WP:VandalHandle documentation rather than the VandalHandle code itself. However, to answer the wider question clearly: No, AI has not been used to "write the bot"; however, AI assistance has of course been used for identifying logic errors, suggesting fixes, and improving code. To clarify: No, the program has not simply been written by AI and deployed without human understanding. The code has been human reviewed by me, and the output/functionality has been tested.
On consensus: I understand that WP:BOTREQUIRE point 4 requires consensus for the task. My initial view was that anti-vandalism bot work is already a well-established task area on enwiki, but I accept that this proposal adds a new element because it involves GenAI-assisted review. Because of that, I agree that wider community confidence is important before any autonomous reverting or warning is considered.
In light of Anomie’s suggestion, I think the best next step is to run a logging-only test first, where the bot records what it would have reverted or warned for, without making edits. Those logs can then be inspected by BAG and the community, and used as evidence in a later discussion about whether this specific approach has consensus. CocaPopsRather 18:14, 6 June 2026 (UTC)[reply]
Note that, when it comes time for the community discussion, you'd probably do better to hold it at WP:Village pump (proposals) or the like than trying to do it here. Anomie 23:28, 6 June 2026 (UTC)[reply]
A logging-only trial can be conducted easily enough without approval by having the bot list actions it would take somewhere where the log can be inspected, for example a page in the bot's userspace or a page on an external site. Note, if writing to the bot's userspace, you'd likely want to batch updates rather than updating with each potential action. That log should be sufficient to hold a community discussion. Anomie 17:46, 6 June 2026 (UTC)[reply]
Thank you for that suggestion, I'll conduct that. CocaPopsRather 17:59, 6 June 2026 (UTC)[reply]
On hold. Feel free to disable this tag when the necessary information has been obtained. Primefac (talk) 10:59, 7 June 2026 (UTC)[reply]
Although at that point they may want to go back to using User:AutoPatrollerBot, and rename this page accordingly. Anomie 15:43, 7 June 2026 (UTC)[reply]

Operator: Sdkb (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 21:26, Saturday, February 7, 2026 (UTC)

Function overview: Removes erroneously italicized commas at the end of italicized terms.

Automatic, Supervised, or Manual: Automatic

Programming language(s): AutoWikiBrowser

Source code available: The bot will be operated by running through lists of pages from the RegEx search query insource:/''[A-Z a-z]+,'' / with a find and replace for ''([A-Z a-z]+),'' ''$1'', . It will use the edit summary Fix erroneously italicized comma and general fixes (task 5).

Links to relevant discussions (where appropriate): None. Although not explicitly specified in the Manual of Style, it is standard English to italicize only the term itself, not punctuation following it.

Edit period(s): Daily

Estimated number of pages affected: 82,000 per this search

Namespace(s): Mainspace (potentially expanding to other namespaces)

Exclusion compliant (Yes/No): Yes

Function details: Because italics markup looks similar to quotation marks and many editors are used to American-style quotation, many editors erroneously put commas following italicized terms within the italicized term, causing the comma to be erroneously italicized. This bot will fix many of these instances, using the AWB settings described above. I did 50 test edits for a version excluding italicized terms with spaces, manually reviewing each one, and the only instances that gave me any pause were ones within quotations, e.g. here (after "for" in the paragraph beginning "King asked a bookmobile driver"). These could be excluded if an issue, but, per the MOS, Insignificant spelling and typographic errors should simply be silently corrected (for example, correct basicly to basically), so I think it's fine to include them. I reviewed another 60 edits (including terms with spaces) via search and found no issues.
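For readers who want to try the find and replace outside AWB, here is a rough Python equivalent of the rule described above. It is a sketch, not the bot's configuration: AWB uses .NET regular expressions with `$1`, while Python uses `\1`, and the function name `fix_italic_commas` is hypothetical. The lookarounds that skip bold markup (three apostrophes) are an added assumption and are not part of the stated AWB rule.

```python
import re

# ''term,''  ->  ''term'',
# The (?<!') and (?!') guards are an assumption added for this sketch so
# that '''bold,''' markup is left alone.
ITALIC_COMMA_RE = re.compile(r"(?<!')''([A-Z a-z]+),''(?!')")


def fix_italic_commas(wikitext: str) -> str:
    """Move a trailing comma out of an italicized term."""
    return ITALIC_COMMA_RE.sub(r"''\1'',", wikitext)


print(fix_italic_commas("He read the ''Iliad,'' then slept."))
# He read the ''Iliad'', then slept.
```

Like the original query, this only matches terms made of ASCII letters and spaces, so italicized terms containing digits, diacritics or other punctuation are left untouched.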

Discussion

Should something similar be done with bold? (10,000 per this search) -- WOSlinker (talk) 21:46, 7 February 2026 (UTC)[reply]

Likely. It might also be worth requesting this be added to the genfixes for AWB so that when this run is over any new instances will be more likely to be picked up. Primefac (talk) 21:50, 7 February 2026 (UTC)[reply]
Yeah, I think it'd definitely be nice to do the same thing with erroneously bolded commas. I intentionally kept the query constrained to start off (ignoring any italicized terms with unusual characters, for instance), but it could be expanded after the initial run is over.
And yes, I agree it'd be nice to add this to the GENFIX set. Cheers, Sdkbtalk 22:54, 7 February 2026 (UTC)[reply]
Are you not wanting to do bold? Primefac (talk) 17:48, 15 February 2026 (UTC)[reply]
I looked through the first 100 search results for the bold query. I found one niche edge case: On this page, bolding is used to delineate which parts of two passages match. Because manual line breaks are used, some bolded strings end with a comma. You could argue that this is a downstream effect of the article using poor syntax with manual line breaks, or that a passage like that should have been surrounded with {{as written}}. But because bolding is sometimes used for niche purposes like this, I think it's the slightest bit riskier to try to fix it than italics.
I'll defer to whatever the consensus is here about whether, given this, it's worthwhile to include it or not. Sdkbtalk 17:44, 20 February 2026 (UTC)[reply]

This feels like something so minor that it would be best either ignored or done as part of AWB GENFIXES. I oppose this being done as the sole edit to a page. Thryduulf (talk) 14:28, 20 February 2026 (UTC)[reply]

It's certainly not the most earth-shattering change to a page, but it is an improvement, and it's clearly in compliance with WP:COSMETICBOT because it changes the output HTML of the page. It is something that I occasionally notice as a reader. Also, because it's an AWB bot, it can be run alongside GENFIXes, so often the comma fix will not be the only change the bot makes. Sdkbtalk 17:20, 20 February 2026 (UTC)[reply]
I think we'll have to agree to disagree on whether the change is an improvement or neutral, and I have no objection to the change being made alongside changes that are unambiguously improvements, but minor changes like this should never be the sole change made by a bot. Thryduulf (talk) 18:40, 20 February 2026 (UTC)[reply]
On hold. There is opposition to the task, and with only the implication of consensus to run the task based on existing guidelines I would prefer to see a stronger consensus to specifically target this as a bot run. I know AWB releases updates less frequently than most countries change leadership, but that would be another route to go down to start whittling away at the list. Primefac (talk) 20:17, 8 March 2026 (UTC)[reply]
@Primefac, where would be an appropriate venue to get additional input on whether there is consensus to run this as a bot task? Thryduulf's view seems to be that WP:COSMETICBOT should be made stricter, and while I know that's a view some editors hold, presumably it's a minority given that editors have not found consensus to change the language of the bot policy. Sdkbtalk 20:34, 8 March 2026 (UTC)[reply]
Either at the MOS talk or a Village Pump. I wouldn't necessarily say that it's a more strict ruling on COSMETICBOT given that it already says Minor edits are not usually considered cosmetic but still need consensus to be done by bots. Since this is a "barely visible" type of minor edit, I'd like to get at least some measure of support for making it; it's not like you're going to need an RFC, just enough to indicate that Thryduulf is in the minority when it comes to being concerned. Primefac (talk) 20:48, 8 March 2026 (UTC)[reply]

Bots in a trial period

Operator: Phuzion (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 00:02, Tuesday, June 16, 2026 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): AWB, C#

Source code available: AWB and Primefac's infobox code

Function overview: Deprecated parameter replacement on {{Infobox military installation}}

Links to relevant discussions (where appropriate): Discussion

Edit period(s): One time run

Estimated number of pages affected: ~9,900

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: This is a simple parameter replacement run, replacing the 33 deprecated parameters in {{Infobox military installation}}. A dry run in pre-parse mode found about ~9,900 pages to touch.

As a quick note, this task will technically need to be run under 2 separate AWB configurations. I will need to split the task for the usage of {{{image_map_caption}}} and {{{pushpin_map_caption}}}, as there is no logic to automatically determine which to replace {{{map_caption}}} with. If it would be preferable, I could do the first half of the edits on the trial for one parameter and the second half on the other.

Discussion

Just wanted to chime in my support for this cleanup, which is another in a long line of similar cleanups. As the one spearheading the work on the template ({{Infobox military installation}}) this bot run will be super helpful and is necessary to finish the work on the template. Happy to help review edits after a trial is granted and completed as well as help answer any questions. --Zackmann (Talk to me/What I been doing) 00:07, 17 June 2026 (UTC)[reply]

{{BAG assistance needed}} - This BRFA has been sitting idle for a week without any review from BAG. phuzion (talk) 00:28, 23 June 2026 (UTC)[reply]

Approved for trial (100 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Let's try to do 50 of each configuration. — The Earwig (talk) 02:39, 23 June 2026 (UTC)[reply]

I just ran 100 edits, but I think I may have mistakenly grabbed the wrong config file when starting the bot. I'm going to take a closer look tomorrow and would like to request another 100 edit trial run. phuzion (talk) 03:10, 23 June 2026 (UTC)[reply]
Scanned about 25 random edits of the trial and see no issues. Phuzion double check your config file, but I didn't see any issues at all. Zackmann (Talk to me/What I been doing) 04:08, 23 June 2026 (UTC)[reply]

Operator: Phuzion (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 00:02, Tuesday, June 16, 2026 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): AWB, C#

Source code available: AWB and Primefac's infobox code

Function overview: Deprecated parameter replacement on {{Infobox airport}}

Links to relevant discussions (where appropriate): Discussion

Edit period(s): One time run

Estimated number of pages affected: ~13,500

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: This is a simple parameter replacement run, replacing the 22 deprecated parameters in {{Infobox airport}}. A dry run in pre-parse mode found about ~13,500 pages to touch.

Discussion

Just wanted to chime in my support for this cleanup, which is another in a long line of similar cleanups. As the one spearheading the work on the template ({{Infobox airport}}) this bot run will be super helpful and is necessary to finish the work on the template. Happy to help review edits after a trial is granted and completed as well as help answer any questions. --Zackmann (Talk to me/What I been doing) 00:07, 17 June 2026 (UTC)[reply]

{{BAG assistance needed}} - This BRFA has been sitting idle for a week without any review from BAG. phuzion (talk) 00:29, 23 June 2026 (UTC)[reply]

Approved for trial (100 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. — The Earwig (talk) 02:40, 23 June 2026 (UTC)[reply]

100 edits as requested. phuzion (talk) 03:10, 23 June 2026 (UTC)[reply]
Scanned about 25 random edits of the trial and see no issues. Looks like it is doing exactly what it is designed to! Zackmann (Talk to me/What I been doing) 04:07, 23 June 2026 (UTC)[reply]

Operator: Thilio (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 14:00, Thursday, May 28, 2026 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): Python, Pywikibot

Source code available: https://gitlab.wikimedia.org/toolforge-repos/thilliobot

Function overview: Detects previously requested RM/TR discussions that reappear later as fresh TRs, as well as active Requested moves discussions, then posts an automated comment with a diff link.

Links to relevant discussions (where appropriate): https://en.wikipedia.org/wiki/Wikipedia_talk:Requested_moves#Bot/bot,_Feedback_Needed_RM/TR_maintenance_bot

Edit period(s): Continuous (runs every 10 minutes through Toolforge jobs framework)

Estimated number of pages affected: 1, mainly Wikipedia:Requested moves/Technical requests. The number of edits depends on request activity.

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): No

Function details: Bot monitors RM/TR and detects if the same move request was previously submitted and later removed. It also detects currently active RM discussions that have the same source and target titles and posts an automated comment that contains a diff (permalink) to the active or prior discussion. It also notifies the requester using ping when possible, including temporary accounts.
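A minimal sketch of the matching step, assuming each revision has already been parsed into (source, target) title pairs. The function names, the input shape, and the title normalisation rule are illustrative assumptions, not taken from the bot's source.

```python
def normalize(title):
    """Treat underscores as spaces, collapse whitespace, capitalise the first letter."""
    cleaned = " ".join(title.replace("_", " ").split())
    return cleaned[:1].upper() + cleaned[1:]


def index_requests(revisions):
    """Build {(source, target): revid} from an iterable of (revid, requests).

    The first revision encountered for a given pair wins, so pass the
    revisions in whichever order should take priority.
    """
    index = {}
    for revid, requests in revisions:
        for source, target in requests:
            index.setdefault((normalize(source), normalize(target)), revid)
    return index


def find_repeats(current_requests, index):
    """Return (source, target, revid) for each current request already in the index."""
    hits = []
    for source, target in current_requests:
        key = (normalize(source), normalize(target))
        if key in index:
            hits.append((source, target, index[key]))
    return hits
```

Building the index once and looking each request up in it avoids rescanning the page history for every entry.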

Discussion

  • find_previous_request() is called once per request, but it's always scanning the same RM/TR revision history. It may be worth loading and indexing those revisions once up front instead of re-fetching them for every entry. – DreamRimmer 05:11, 2 June 2026 (UTC)[reply]
    @DreamRimmer I made a few changes: find_previous_request() now uses preloaded data instead of reloading revisions for every request (commits). I also added a new function, load_revisions_data(page), which fetches and stores (revid, page_text) for each revision once at the start of the run. I hope that is what you were suggesting; in the dry-run test the reduction in API calls and the speed-up were noticeable. CONFUSED SPIRIT(Thilio).Talk 08:51, 2 June 2026 (UTC)[reply]
    I have changed my original revision limit of 150: even with the preload logic, the bot still could not detect older requests, for example one that falls in revisions 151 to 200. @Tenshi gave me the idea, via the WCGTD Discord, of storing the data in files, so I implemented it with an SQL database. The bot now runs once and loads the data into the database for later use. It scans the page's revisions, checking each timestamp and stopping at the first revision older than 90 days. For each revision within the last 90 days, it fetches the full page text and inserts a row into the database. When it is done, it returns a list of all cached revisions.
    For example, if it runs on June 1, it inserts 5000 rows covering revisions from March 1 to June 1, i.e. the last 90 days; the window could later be extended to 180 days or so. The update logic works like this: the bot queries the database for the newest timestamp already stored (the last timestamp), then loops through the revisions again but stops as soon as it reaches a revision whose timestamp is less than or equal to the last timestamp, since that revision is already cached. It fetches and inserts only revisions newer than the last cached one, and it can also delete any rows older than 90 days (for example, a revision from 91 days ago that is now outside the window). On a second run, if the newest cached revision is from June 1, it fetches only the revisions made on June 2 (say 15 or 30 new revisions) and inserts those rows, so the cache then covers March 2 to June 2: still about 5000 rows, but only 15 or 30 newly fetched. The third run works the same way, and so on. commit CONFUSED SPIRIT(Thilio).Talk 12:11, 3 June 2026 (UTC)[reply]
  • Approved for trial (30 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. You could just parse the page and get the request headings instead of storing the entire page text, but whatever works best for you. – DreamRimmer 09:22, 4 June 2026 (UTC)[reply]
  • Can the comments by the bot be placed at the end of a request rather than immediately afterwards which may come before prior comments by others? Tenshi! (Talk page) 00:42, 23 June 2026 (UTC)[reply]
     Done test diff CONFUSED SPIRIT(Thilio).Talk 08:24, 23 June 2026 (UTC)[reply]
    Special:Diff/1360780898 had a space in between the request, causing two bullet points to be rendered, this should be above the space. Tenshi! (Talk page) 15:23, 23 June 2026 (UTC)[reply]
    What if I remove the bullet point *, leave only the : and clean up the spacing too? CONFUSED SPIRIT(Thilio).Talk 15:55, 23 June 2026 (UTC)[reply]
    That would violate MOS:INDENTMIX, specifically the 4th example. Tenshi! (Talk page) 16:06, 23 June 2026 (UTC)[reply]
    Special:Diff/1360788742 space removed, did backward traversal algorithm. CONFUSED SPIRIT(Thilio).Talk 16:31, 23 June 2026 (UTC)[reply]
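For readers following the caching discussion earlier in this thread, the 90-day incremental cache could be sketched roughly as follows. This is an assumption-laden illustration, not the bot's code: the table layout, the function names, and the input shape (an iterable of (revid, timestamp, text) tuples, newest first, with timezone-aware UTC timestamps) are all hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=90)


def open_cache(path=":memory:"):
    """Open (or create) the cache database; defaults to an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS revisions ("
        "revid INTEGER PRIMARY KEY, ts TEXT NOT NULL, text TEXT NOT NULL)"
    )
    return conn


def update_cache(conn, revisions_newest_first, now):
    """Insert revisions newer than the newest cached one; return how many were added.

    Assumes every timestamp is a timezone-aware UTC datetime, so that the
    ISO 8601 strings stored in the table sort chronologically.
    """
    cutoff = (now - WINDOW).isoformat()
    last_ts = conn.execute("SELECT MAX(ts) FROM revisions").fetchone()[0]
    added = 0
    for revid, ts, text in revisions_newest_first:
        stamp = ts.isoformat()
        # Stop at the first revision that is outside the window or already cached.
        if stamp < cutoff or (last_ts is not None and stamp <= last_ts):
            break
        conn.execute(
            "INSERT OR IGNORE INTO revisions VALUES (?, ?, ?)", (revid, stamp, text)
        )
        added += 1
    # Drop rows that have slid out of the 90-day window.
    conn.execute("DELETE FROM revisions WHERE ts < ?", (cutoff,))
    conn.commit()
    return added
```

On the first run the loop walks back until it leaves the window; on later runs it stops at the first revision that is already stored, so only new revisions are fetched.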

Operator: SnowyRiver28 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 02:07, Saturday, April 25, 2026 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): Python

Source code available: Here

Function overview: Standardises citations of Australian legal cases with the {{cite AustLII}} template

Links to relevant discussions (where appropriate):

Edit period(s): Once a month

Estimated number of pages affected: ~800 currently

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): No

Function details: This task will use regex to identify cite web templates linking to AustLII, a legal database used in Australia. The template {{cite AustLII}} standardises the referencing of legal cases, but there are around 800 references to AustLII (query here) that use the cite web template instead. After the initial run it could be scheduled to run on a regular basis to keep on top of new edits (wouldn't need to be very often). If this is successful a similar task could be applied in future to change from cite web to {{cite Legislation AU}}.
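The detection step might look something like the sketch below, which only finds cite web templates whose url parameter points at an austlii.edu.au host. The pattern and function name are illustrative assumptions, not the bot's actual regex, and the conversion to {{cite AustLII}} is deliberately left out because its parameters are not described here.

```python
import re

# Matches {{cite web ...}} with no nested templates and a url on austlii.edu.au.
AUSTLII_CITE = re.compile(
    r"\{\{\s*cite web\b[^{}]*?\|\s*url\s*=\s*"
    r"(https?://[^\s|}]*austlii\.edu\.au[^\s|}]*)"
    r"[^{}]*\}\}",
    re.IGNORECASE,
)


def find_austlii_cites(wikitext):
    """Return the AustLII URLs found inside cite web templates.

    Simplification: templates containing nested templates are skipped,
    because the pattern refuses to cross a brace.
    """
    return [match.group(1) for match in AUSTLII_CITE.finditer(wikitext)]
```

Skipping citations with nested templates is a conservative choice for a sketch: it misses some candidates rather than risking a rewrite of a citation that was only partially matched.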

Discussion

Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Primefac (talk) 22:54, 3 May 2026 (UTC)[reply]

Thanks, trial log is at User:SnowyBot/Trials/Task 2 SnowyRiver28 (talk) 01:38, 4 May 2026 (UTC)[reply]

Bots that have completed the trial period

Approved requests

Bots that have been approved for operations after a successful BRFA will be listed here for informational purposes. No other approval action is required for these bots. Recently approved requests can be found here (edit), while old requests can be found in the archives.


Denied requests

Bots that have been denied for operations will be listed here for informational purposes for at least 7 days before being archived. No other action is required for these bots. Older requests can be found in the Archive.

Expired/withdrawn requests

These requests have either expired, as information required by the operator was not provided, or been withdrawn. These tasks are not authorized to run, but such lack of authorization does not necessarily follow from a finding as to merit. A bot that, having been approved for testing, was not tested by an editor, or one for which the results of testing were not posted, for example, would appear here. Bot requests should not be placed here if there is an active discussion ongoing above. Operators whose requests have expired may reactivate their requests at any time. The following list shows recent requests (if any) that have expired, listed here for informational purposes for at least 7 days before being archived. Older requests can be found in the respective archives: Expired, Withdrawn.