Wikipedia:Bots/Noticeboard

Bots noticeboard

Here we coordinate and discuss Wikipedia issues related to bots and other programs interacting with the MediaWiki software. Bot operators are the main users of this noticeboard, but even if you are not one, your comments are welcome. Just make sure you are aware of our bot policy and know where to post your issue.

Do not post here if you came to

discuss non-urgent bot issues, bugs and suggestions for improvement. Do that at the bot operator's talk page
discuss urgent/major bot issues. Do that according to instructions at WP:BOTISSUE
discuss general questions about the MediaWiki software and syntax. We have the village pump's technical section for that
request approval for your new bot. Here is where you should do it
request new functionality for bots. Share your ideas at the dedicated page

Bot causing multi colon escape lint error

There are now 8,218 lint errors of type Multi colon escape, and all but 7 of these are caused by WP 1.0 bot. This bug was reported at Wikipedia talk:Version 1.0 Editorial Team/Index#Bot adding double colons 9 October 2017. Perhaps some bot experts who don't typically wander in those parts can apply their skills to the problem. Please continue the discussion there, not here. —Anomalocaris (talk) 06:14, 28 November 2017 (UTC)[reply]

Didn't Nihlus already deal with all of these? Primefac (talk) 13:08, 28 November 2017 (UTC)[reply]

NihlusBOT 5 is a monthly task to fix the problems with the 1.0 bot until such time as the 1.0 bot is fixed. --Izno (talk) 14:16, 28 November 2017 (UTC)[reply]

Correct. I've been traveling lately, so I wasn't able to run it. I am running it now and will let you know when it is done. Nihlus 14:33, 28 November 2017 (UTC)[reply]

Please see User talk:Nihlus#1.0 log. The problem is that fixing this properly afterward will be more difficult unless the 1.0 bot, or another task, retroactively rewrites the log page (not only updating new entries but also correcting old ones). —Paleo Neonate – 15:50, 28 November 2017 (UTC)[reply]

Basically, I don't have the time, but I would need to fix the log properly myself today. It's simpler to just revert and fix it properly when I can. —Paleo Neonate – 15:51, 28 November 2017 (UTC)[reply]

@PaleoNeonate: Why are you having this discussion in two separate places? I addressed the issue on my talk page. Nihlus 15:55, 28 November 2017 (UTC)[reply]

I thought that this may be a more appropriate place considering that it's about 1.0 bot and Nihlusbot, so will resume it here. Your answer did not address the problem. Do you understand that:

Before October 5, 2017, category links were fine, but then later were broken, resulting in the same kind of bogus double-colon links as for drafts (these were not mainspace links, but Category: space links)
It's possible that draft links were always broken, resulting in the same kind of broken double-colon links
Nihlusbot causes both broken category and draft space links to become mainspace links (not Draft: or Category: ones as it should)
As a result, the "fix" does not improve the situation, the links are still broken (mainspace red links instead of category and draft links).
If keeping these changes and wanting to fix them later, it's more difficult to detect what links were not supposed to be to main space. In any case, to fix it properly, a more fancy script is needed which checks the class of the page...

Thanks, —Paleo Neonate – 23:31, 28 November 2017 (UTC)[reply]

Do I understand? Yes, yes, this time it did due to a small extra bit in the code, disagree as stated already, this is something I am working on. Thanks! Nihlus 00:27, 29 November 2017 (UTC)[reply]

So, there are issues with almost every single namespace outside of articlespace, so WP 1.0 bot is making a lot of errors and should probably be prevented from continuing. However, until that time, I am limiting the corrections I am making to those that are explicitly assessed as Category/Template/Book/Draft/File-class. If they are classed incorrectly, then they will not get fixed. Nihlus 01:52, 29 November 2017 (UTC)[reply]
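For illustration only, the class-limited correction described above could be sketched in Python roughly as follows. The broken link shape `[[::Title]]`, the class-to-namespace mapping, and the name `repair_link` are assumptions made for this sketch; none of it is the actual NihlusBOT or WP 1.0 bot code.

```python
import re

# Assumed mapping from an explicit assessment class to a namespace prefix
# (hypothetical; based only on the classes named in the comment above).
CLASS_TO_NAMESPACE = {
    "Category": "Category",
    "Template": "Template",
    "Book": "Book",
    "Draft": "Draft",
    "File": "File",
}

# Assumed shape of a broken log link: two leading colons and no namespace.
DOUBLE_COLON_LINK = re.compile(r"\[\[::([^\]|]+)\]\]")


def repair_link(wikitext, assessed_class):
    """Rewrite [[::Title]] links using the namespace implied by the class.

    When the class does not identify a namespace, the text is returned
    unchanged, so the entry is not turned into a mainspace link.
    """
    namespace = CLASS_TO_NAMESPACE.get(assessed_class)
    if namespace is None:
        return wikitext
    return DOUBLE_COLON_LINK.sub(
        lambda m: "[[:%s:%s]]" % (namespace, m.group(1).strip()), wikitext
    )
```

Under these assumptions, an entry whose class is not one of the five listed is left with its double colon intact, which keeps it recognizable as a non-mainspace page.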

A few hours ago, there were just 6 Multi colon escape lint errors. Now we have 125, all but 4 caused by WP 1.0 bot. This may be known to those working on the problem. —Anomalocaris (talk) 06:02, 29 November 2017 (UTC)[reply]

@Nihlus: thanks for improving the situation. I see that Category links have been fixed (at least the ones I noticed). Unfortunately, links to drafts still point to mainspace. —Paleo Neonate – 19:54, 29 November 2017 (UTC)[reply]

@PaleoNeonate: As stated above: I am limiting the corrections I am making to those that are explicitly assessed as Category/Template/Book/Draft/File-class. If they are classed incorrectly, then they will not get fixed. Nihlus 19:55, 29 November 2017 (UTC)[reply]

Yes, I have read it, but unfortunately I contest the value of such hackish edits in 1.0 logs. Perhaps at least don't just convert those to non-working mainspace links when the class is unavailable, and instead mark them so they are known not to be in mainspace (those double-colon items never were in mainspace)? A marker, or even a non-linked title, would be a good choice to keep the distinction... —Paleo Neonate – 20:48, 29 November 2017 (UTC)[reply]

Again, I repeat: I am limiting the corrections I am making to those that are explicitly assessed as Category/Template/Book/Draft/File-class. If they are classed incorrectly, then they will not get fixed. That means those are the only fixes I am making with the bot going forward as I have no intention of supervising each edit made to discern whether something is a draft/project page or not. Nihlus 20:56, 29 November 2017 (UTC)[reply]

"I am limiting the corrections I am making to those that are explicitly assessed as Category/Template/Book/Draft/File-class. If they are classed incorrectly, then they will not get fixed." We appear to be talking past each other. That is not what technically happened. This diff (which you reverted) was made because links to mainspace were introduced for pages not in mainspace. If your script doesn't touch such links in the future when it cannot determine their class, that's an improvement. You say that you don't correct them, but so far they were still "fixed" (converted to erroneous mainspace links). The "loss of information" in my first complaint was that those bogus links were previously unambiguously recognizable as non-mainspace (the ones that are now confusing, broken mainspace links when the class is not in the text). —Paleo Neonate – 05:27, 1 December 2017 (UTC)[reply]

Has the bot operator been contacted or responded to this issue?—CYBERPOWER (Merry Christmas) 02:18, 1 December 2017 (UTC)[reply]
@Cyberpower678: From what I understand, there have been multiple attempts at making contact with them. To be thorough, I have emailed Kelson, Theopolisme, and Wolfgang42 in an attempt to get a response and solution. Nihlus 04:36, 1 December 2017 (UTC)[reply]
I have blocked the bot until these issues are resolved.—CYBERPOWER (Merry Christmas) 05:09, 1 December 2017 (UTC)[reply]
Thanks. I will hold off on any task 5 edits with my bot. Nihlus 05:39, 1 December 2017 (UTC)[reply]
You should note that per this post Wolfgang42 cannot do anything for us and I recall that Theopolisme is currently unable to devote time to this, so Kelson seems to be the only one who can assist right now. As I've mentioned a number of times in the last 2 years we really need to find someone with the skills to maintain this. ww2censor (talk) 12:14, 3 December 2017 (UTC)[reply]
Is there any progress on this? The blocking of the bot has killed the assessment logs (ex. Wikipedia:Version 1.0 Editorial Team/Military history articles by quality log) stone dead. - The Bushranger _{One ping only} 23:58, 29 December 2017 (UTC)[reply]
@Kelson: is working this week on updating the code to fix bugs, which arose due to server problems as discussed here. How would he go about getting the bot unblocked for testing? (He mainly works on FR:WP, but helps us out when we get stuck.) Thanks, Walkerma (talk) 18:08, 9 January 2018 (UTC)[reply]
I'm active and watching this. He should just ask me to unblock it.—CYBERPOWER (Chat) 20:00, 9 January 2018 (UTC)[reply]
Update: Kelson has fixed a lot of bugs, there has been some testing, and discussion is ongoing on the WP1.0 tech talk page. Walkerma (talk) 18:04, 17 January 2018 (UTC)[reply]

Commons Deletion Notification Bot

I am developing a bot which notifies Wikipedia articles when images associated with them in Wikimedia Commons are

nominated for deletion
deleted
nominated not to be deleted

How can I detect that an image has been deleted from Commons after nomination? Are there any APIs available for that? Harideepan (talk) 13:07, 2 January 2018 (UTC)[reply]

@Harideepan: You should probably go talk to the Community Tech team, as they just had this topic placed in their top ten wishes for 2017. See meta:Community Tech/Commons deletion notification bot. --Izno (talk) 14:13, 2 January 2018 (UTC)[reply]

Thank you for your response. I am new here. Harideepan (talk) 14:34, 2 January 2018 (UTC)[reply]
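As a rough pointer for the API question above: the MediaWiki Action API exposes the deletion log through `list=logevents` with `letype=delete`, and this can be queried on Commons for a given file title. The sketch below only builds the request URL and interprets a JSON response. The helper names `deletion_log_url` and `was_deleted` are made up for this example, and it assumes the default ordering in which the newest log entry comes first.

```python
import json
from urllib.parse import urlencode

COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def deletion_log_url(file_title):
    """Build an Action API URL asking for the deletion log of one file."""
    params = {
        "action": "query",
        "list": "logevents",
        "letype": "delete",      # deletion log (covers deletes and restores)
        "letitle": file_title,   # e.g. "File:Example.jpg"
        "lelimit": "10",
        "format": "json",
    }
    return COMMONS_API + "?" + urlencode(params)


def was_deleted(api_response_text):
    """Return True if the most recent deletion-log entry is a deletion.

    Assumes entries are listed newest first, so the first entry reflects
    the current state (a later "restore" would come before the "delete").
    """
    data = json.loads(api_response_text)
    events = data.get("query", {}).get("logevents", [])
    return bool(events) and events[0].get("action") == "delete"
```

Fetching the URL (for example with `urllib.request.urlopen`) and passing the response body to `was_deleted` would complete the check. Detecting the nomination itself is a separate problem that this sketch does not cover.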

CommonsDelinker and Filedelinkerbot

Filedelinkerbot was created to supplement CommonsDelinker, which was performing inadequately, with a lot of unaddressed bugs (including, off the top of my head, breaking templates and galleries) and limited maintenance. Is there any continued need for CommonsDelinker that cannot be met by Filedelinkerbot? There are some issues which I'd like to raise, such as the removal of images from discussion archives (which should really be left as red links), and having a single location to discuss such issues would really be preferable. --Paul_012 (talk) 03:46, 27 January 2018 (UTC)[reply]

Slow-burn bot wars

Moved from WP:ANI#Slow-burn bot wars Primefac (talk) 15:44, 27 January 2018 (UTC)

Does anyone know why two bots edit war over which links to use for archived web refs? By way of example, the edit history of Diamonds Are Forever (novel) shows InternetArchiveBot and GreenC bot duking it out since September 2017. I've seen it on a couple of other articles too, but I can't be that bothered to dig them out. Although no real harm is done, it's mildly annoying when they keep cluttering up my watchlist. Cheers - SchroCat (talk) 14:17, 27 January 2018 (UTC)[reply]

That would have to be resolved by the bot owners, probably at Wikipedia:Bots/Noticeboard. NinjaRobotPirate (talk) 14:35, 27 January 2018 (UTC)[reply]

Another in my series of shameless plugs: Museum_of_Computer_Porn. E Eng 15:10, 27 January 2018 (UTC)[reply]

See WP:LAME#Bot vs bot :-) I've notified the bot operators. Nyttend (talk) 15:26, 27 January 2018 (UTC)[reply]

I added a {{cbignore}} (respected by both bots) until we figure it out. Notify us on our talk page or WP:BO is easiest. -- GreenC 15:40, 27 January 2018 (UTC)[reply]

This appears to be an issue with GreenC bot. IABot is repairing the archive link and the URL fragment, and GreenC bot is removing it for some reason.—CYBERPOWER (Chat) 16:55, 27 January 2018 (UTC)[reply]

GreenC bot gets the URL from the WebCite API as data authority - this is what WebCite says the archive is saved under. -- GreenC 17:35, 27 January 2018 (UTC)[reply]

GreenC bot could use the |url= as data authority, but most of the time it is the other way around where the data in |url= is truncated and the data from WebCite is more complete. Example, example. So I went with WebCite as being more authoritative since that is how it's saved on their system. -- GreenC 17:47, 27 January 2018 (UTC)[reply]

That's not the problem though. It's removing the fragment from the URL. It shouldn't be doing that.—CYBERPOWER (Chat) 18:15, 27 January 2018 (UTC)[reply]

It's not removing the fragment. It's synchronizing the URL with how it was saved on WebCite. If the fragment is not there, it's because it was never there when captured at WebCite, or WebCite removed it during the capture. The data authority is WebCite. This turns out to be a good method as seen in the examples because often the URL in |url= field is missing information. -- GreenC 20:20, 27 January 2018 (UTC)[reply]

I'm sorry, but that makes no sense. Why would WebCite, or any archiving service, save the fragment into the captured URL? The fragment is merely a pointer for the browser to go to a specific page anchor. IABot doesn't capture the fragments when reading URLs, but carries them through to archive URLs when adding them.—CYBERPOWER (Chat) 20:27, 27 January 2018 (UTC)[reply]

Why is IABot carrying the fragment through into the archive URL? It's not used by the archive (except archive.is in certain cases where the '#' is a '%23'). -- GreenC 21:26, 27 January 2018 (UTC)[reply]

Do you understand what the fragment is for? It's nothing a server ever needs to worry about, so it's just stripped on their end. It is a browser pointer. If the original URL had a fragment, attaching the same fragment to the archive URL makes sense so the browser goes straight to the relevant section of the page as it did in the original URL.—CYBERPOWER (Chat) 21:39, 27 January 2018 (UTC)[reply]

Yeah, I know what a fragment does (though I was temporarily confused and forgot they worked at other services). But fragments don't work with WebCite URLs. We tack the "?url=.." on for RFC long-URL reasons but it is dropped when doing a replay (example). So there is no inherent reason to retain fragments at WebCite. However, I can see the logic of keeping them for some future purpose we can't guess at, and it has already been done, by and large. So I will see about modifying GreenC bot to retain the fragment for WebCite (it already does for other services).

There is the other problem as noted: IABot -> GreenCbot - any idea what might have caused it? -- GreenC 22:17, 27 January 2018 (UTC)[reply]

Well even if it is dropped, which it should do, it still doesn't change the fact the page anchors exist. I'll give you an example of what I mean.—CYBERPOWER (Chat) 22:23, 27 January 2018 (UTC)[reply]

The fragment is not the part after the ?, that is the query string. The fragment is the part after the #. --Redrose64 🌹 (talk) 22:24, 27 January 2018 (UTC)[reply]

@GreenC: here is what I'm trying to explain. Suppose you have the live URL with a fragment (https://en.wikipedia.org/wiki/Wikipedia:Bots/Noticeboard#Bot_causing_multi_colon_escape_lint_error), which in this case goes to a section of the page above us. Suppose said original URL dies and IABot adds an archive URL. It will add the archive, and carry over the fragment, https://web.archive.org/web/20180115043757/https://en.wikipedia.org/wiki/Wikipedia:Bots/Noticeboard#Bot_causing_multi_colon_escape_lint_error, so that when a user clicks it, they are still taken to the relevant section of the page. If you dropped the fragment, either in the original or the archive, you will still get the same page, but the browser won't take the user straight to the relevant content that was originally being cited.—CYBERPOWER (Chat) 22:31, 27 January 2018 (UTC)[reply]
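The carry-over described above can be sketched in a few lines of Python using the standard `urllib.parse` module. The function name `carry_fragment` is hypothetical, and this is not IABot's actual code, only an illustration of the idea.

```python
from urllib.parse import urldefrag


def carry_fragment(original_url, archive_url):
    """Copy the #fragment of the original URL onto the archive URL.

    The fragment is never sent to the server; it only tells the browser
    which anchor to scroll to, so it can be re-attached to any snapshot.
    """
    fragment = urldefrag(original_url).fragment
    if not fragment:
        return archive_url  # nothing to carry over
    # Drop any fragment already on the archive URL, then attach the original one.
    return urldefrag(archive_url).url + "#" + fragment
```

Called with the live URL above and the bare archive URL `https://web.archive.org/web/20180115043757/https://en.wikipedia.org/wiki/Wikipedia:Bots/Noticeboard`, it returns the archive URL ending in `#Bot_causing_multi_colon_escape_lint_error`. The same module's `urlsplit` also makes the distinction between the query string (after `?`) and the fragment (after `#`) explicit.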

Yes, I understand, but it's different with WebCite URLs: fragments don't work, for the reasons noted above. Try it: https://www.webcitation.org/5utpzxf0T?url=http://www.ymm.co.jp/p/detail.php?code=GTP01085336#song . Also, on a different matter, what about this edit sequence? IABot -> GreenCbot -- GreenC 23:36, 27 January 2018 (UTC)[reply]

GreenC bot is now carrying through the fragment in-line with IABot per above. -- GreenC 00:47, 28 January 2018 (UTC)[reply]

Oh I see what you mean. The anchors don't actually work there, despite the fragment. In any event, IABot doesn't selectively remove them from WebCite URLs, as the fragment handling process happens during the archive adding process during the final stages of page analysis, when new strings are being generated to replace the old ones. I personally don't see the need to bloat the code to "fix" that, but then there's the question, what's causing the edit war?—CYBERPOWER (Chat) 00:52, 28 January 2018 (UTC)[reply]

GreenC bot is fixed so it won't strip the fragment; there shouldn't be any more edit wars over it, but there are probably other edit wars over other things we don't know about. Not sure how to find edit wars. -- GreenC 04:31, 28 January 2018 (UTC)[reply]

"Not sure how to find edit wars." Perhaps your bots could look at the previous edit to a page, and if it was made by its counterpart, log the edit somewhere for later analysis. It won't catch everything, and it might turn up false positives, but it's something. —DoRD (talk) 14:42, 28 January 2018 (UTC)[reply]

GreenC bot targets pages previously edited by IABot, so there is always overlap. -- GreenC 15:01, 28 January 2018 (UTC)[reply]

Maybe a pattern of the two previous edits being GreenC and IAbot? Galobtter (pingó mió) 15:09, 28 January 2018 (UTC)[reply]

And/or the edit byte sizes being the same, but it would take a program to trawl through tens of thousands of articles and hundreds of thousands of diffs; it wouldn't be trivial to create. But a general bot-war detector would be useful to have for the community. -- GreenC 15:18, 28 January 2018 (UTC)[reply]
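Combining the heuristics suggested in this thread (alternating edits by the two bots, with matching byte sizes), a very small detector could look like the Python sketch below. It assumes the page history has already been fetched as a list of `(user, byte_delta)` pairs, oldest first. The function name, the input format, and the `min_reverts` threshold are all assumptions, and as noted above it can miss cases or produce false positives.

```python
def looks_like_bot_war(revisions, bot_a, bot_b, min_reverts=2):
    """Flag a history where two bots repeatedly undo each other's edits.

    revisions: list of (user, byte_delta) tuples, oldest first.
    A "revert" here means consecutive edits by the two different bots
    whose size changes cancel out exactly (for example +5 then -5).
    """
    reverts = 0
    for (prev_user, prev_delta), (user, delta) in zip(revisions, revisions[1:]):
        opposing_bots = {prev_user, user} == {bot_a, bot_b}
        if opposing_bots and delta != 0 and delta == -prev_delta:
            reverts += 1
    return reverts >= min_reverts


# Usage with a made-up history for one page.
history = [
    ("InternetArchiveBot", 5),
    ("GreenC bot", -5),
    ("InternetArchiveBot", 5),
    ("GreenC bot", -5),
]
flagged = looks_like_bot_war(history, "InternetArchiveBot", "GreenC bot")
```

Because GreenC bot deliberately follows IABot, a single pair of consecutive edits is not treated as evidence here; only repeated cancelling pairs count toward the threshold.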

Many thanks to all. I never knew this board existed (thus the original opening at ANI), but thanks to all for sorting this out. Cheers - SchroCat (talk) 09:53, 28 January 2018 (UTC)[reply]

Need someone with a mass rollback script now.

Would someone who has a mass rollback script handy please revert InternetArchiveBot's edits going all the way back to the timestamp in this diff? Kind of urgent. IABot destroyed roughly a thousand articles, due to some communication failure with Wikipedia.—CYBERPOWER (Chat) 22:29, 29 January 2018 (UTC)[reply]

Done Nihlus 23:07, 29 January 2018 (UTC)[reply]

@Cyberpower678: When you say "destroyed", this means...? --Redrose64 🌹 (talk) 23:38, 29 January 2018 (UTC)[reply]

It deleted chunks of articles or stuffed chunks of it into the references section, by making massive references out of them.—CYBERPOWER (Chat) 23:40, 29 January 2018 (UTC)[reply]
