Wikipedia talk:AutoWikiBrowser
This is the discussion page for the AutoWikiBrowser (AWB) project. It is also the place to discuss using the AWB program (for help, questions, or general inquiries about AWB). Specific guidelines on where to make particular reports or requests are provided in the § Before you post section below. Before asking a question, please read the § Frequently asked questions below.
Before you post
| Do you want to ... | Please use |
|---|---|
| Report a bug or request a feature in AWB? | Check reported bugs on Phabricator before filing a new bug report. You do not need to create another account there; just log in with your global Wikimedia account. See this MediaWiki wiki page on how to report bugs and request features on Phabricator. |
| Report an incorrectly fixed typo? | Wikipedia talk:AutoWikiBrowser/Typos |
| Request approval to use AWB? | Wikipedia:Requests for permissions/AutoWikiBrowser |
| Ask a question about AWB or ask for help? | This page |
Frequently asked questions
Frequently asked questions
|
|---|
//Detect IE5.5+
if (navigator.appVersion.indexOf("MSIE")==-1)
{
// Previous contents go here
....
}
|
Discussion
Odd behaviour in WikiFunctions
Behaviour the first
I'm doing a run right now where there are some parameters being removed. In Special:Diff/1348415854 something odd happened:
| − | | | + | | p = 506">{{harvp|"Lives"|1965|p= 506}}.</ref> |
I'm using WikiFunctions.Tools.RemoveTemplateParameter to remove that parameter, so what I'm thinking is that the code just doesn't expect to find three pipes inside of a named reference? Is this something that should be looked at further or is it soooo far down the GIGO rabbit hole that it's probably easier to just fix on the article if/when it occurs? Primefac (talk) 15:32, 12 April 2026 (UTC)
- Tracing that code, it seems that the replacement is innocent of the "..." construct. But when I try to repro it, I get |Lives|1965|p= 506">(etc) as I would expect. So it's sort-of explicable, but I don't get the same as you do. Finding matching pairs of anything in a regular expression is notoriously tricky. I suppose we could put that one into the bug backlog though. David Brooks (talk) 03:41, 13 April 2026 (UTC)
- Did you call some form of RemoveDuplicateTemplateParameters() after this call? I haven't tested, but I think it also removes named parameters with blank value. David Brooks (talk) 16:09, 13 April 2026 (UTC)
- I take that back. It doesn't recognize a parameter with no = sign. Perhaps it should. David Brooks (talk) 21:52, 13 April 2026 (UTC)
- No, I'm using a slightly modified version of GetTemplateParameterValues. Primefac (talk) 21:38, 18 April 2026 (UTC)
- @Primefac OK, this one is a bug in Tools too. A simple fix works in this case, but I'd have to think about malevolent cases. To fix it in your module you would have to put the fix in PipeCleanTemplate, write a version of RemoveTemplateParameter that calls PipeCleanTemplate, and use it in ParamKiller. The addition to PipeCleanTemplate is something like this as the last replacement:
if (restoftemplate.Contains("\"")) restoftemplate = ReplaceWith(restoftemplate, QuotePair, RepWith);
where QuotePair is the Regex ("\".*?\""). — Preceding unsigned comment added by DavidBrooks (talk • contribs) 17:10, 27 May 2026 (UTC)
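The idea behind that one-line addition, hiding quoted text before the template is split on pipes, can be sketched in Python as follows. This is an illustration only, not AWB's C# code: `mask_quoted` and `split_params` are hypothetical names, and replacing each quoted span with same-length filler is an assumption about how such masking might be done. Unbalanced quotes (the "malevolent cases" mentioned above) would pair up wrongly in this sketch.

```python
import re

# Same pattern as the QuotePair regex quoted in the thread: a lazy "..." span.
QUOTE_PAIR = re.compile(r'".*?"')


def mask_quoted(text, filler="#"):
    """Replace every character of each "..." span with filler, keeping the length."""
    return QUOTE_PAIR.sub(lambda m: filler * len(m.group(0)), text)


def split_params(template_body):
    """Split on pipes outside quotes, using the masked copy only to find positions."""
    masked = mask_quoted(template_body)
    parts, start = [], 0
    for i, ch in enumerate(masked):
        if ch == "|":
            parts.append(template_body[start:i])
            start = i + 1
    parts.append(template_body[start:])
    return parts
```

With this sketch, `split_params('cite|ref="a|b|c"|p=506')` returns `['cite', 'ref="a|b|c"', 'p=506']`, so the pipes inside the quotes are no longer treated as parameter separators.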
Behaviour the second
I just got reminded of another issue. In Special:Diff/1347282277 it simply removes the entire |jr/sr=United States Senator value -- it should not have done this (it wasn't on the list to remove). Is it because of the / in the parameter name? Primefac (talk) 15:34, 12 April 2026 (UTC)
- That I cannot repro. What is the SVN version of your AWB app? (in the Help/About dialog). But I doubt the code has changed much in a good while. David Brooks (talk) 03:46, 13 April 2026 (UTC)
- SVN 13002 (2025-08-09 18:02:47). Version 6.4.0.1. Primefac (talk) 21:10, 18 April 2026 (UTC)
- Hi again - for both your problems it might be useful to have the entire context of your script, and the GetTemplateParameterValues() change, if you are comfortable with sharing (else you can use wikimail). David Brooks (talk) 15:58, 29 April 2026 (UTC)
- Apologies, thought I had already linked to it. /code dump has the full thing minus context-specific parameter values. Primefac (talk) 08:45, 30 April 2026 (UTC)
- @Primefac: Sorry, I have been preoccupied, and just got back to this. I managed to run tests under the debugger when it dawned on me. Your definition of Regex param = new Regex(@"\|\s*([\w0-9_ -]+?)\s*=([^|}]*)"); needs the "/" character added to the character group. Plus anything else that might be reasonably expected to appear in a parameter name. And by the way \w already includes digits and underscore. Now back to the earlier issue! David Brooks (talk) 21:53, 26 May 2026 (UTC)
- Interesting. In that case, so does the main repo. Primefac (talk) 09:24, 27 May 2026 (UTC)
- Good point, although as I said above I couldn't repro using the standard Tools. But it is still wrong in principle so I (or you?) will enter a ticket. Do we know the formal list of characters that can appear in a parameter name?
- BTW you are looking at reedy's November snapshot, but it is still true in the latest tree. Current code is here.
- As to "...", I'm pretty sure it also needs a fix. I'll try to get a workaround for your module code. It's not obvious. Maybe later today. David Brooks (talk) 14:51, 27 May 2026 (UTC)
- And, although I can't find it documented in any official sources, it seems that the only characters not allowed as part of parameter names are = | { and }. Gemini and ChatGPT both provide incorrect claims. I even got parameter names like "≠⅔⁝", "αβγ" and "abc def" to work.
- I don't know if it's worth while to rewrite that test or to see if there are other forbidden characters I'm missing. David Brooks (talk) 17:26, 28 May 2026 (UTC)
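The effect of the narrow character group discussed above can be reproduced with a short Python sketch. It uses Python's `re` rather than .NET's `Regex`, so details may differ. `NARROW` is the pattern quoted in the thread; `PERMISSIVE` is a hypothetical variant that follows the observation that only `=`, `|`, `{` and `}` seem to be disallowed in parameter names. The sample template text is invented for illustration, and the sketch shows only which names are recognised, not how the module then came to drop the value.

```python
import re

# Pattern quoted in the thread: names may contain word characters, space and hyphen.
NARROW = re.compile(r"\|\s*([\w0-9_ -]+?)\s*=([^|}]*)")

# Hypothetical variant: a name is any run of characters other than = | { }.
PERMISSIVE = re.compile(r"\|\s*([^=|{}]+?)\s*=([^|}]*)")


def param_names(pattern, template):
    """Return the parameter names the pattern recognises, in order."""
    return [m.group(1) for m in pattern.finditer(template)]


# Invented sample text; only the jr/sr parameter comes from the report above.
sample = "{{Infobox officeholder|name=Jane Doe|jr/sr=United States Senator|state=Ohio}}"
narrow_names = param_names(NARROW, sample)
permissive_names = param_names(PERMISSIVE, sample)
```

Here `narrow_names` is `['name', 'state']` because the `/` stops the name group from matching, while `permissive_names` is `['name', 'jr/sr', 'state']`.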
Special syntax to make AWB ignore something?
I've seen editors manually clean up bot mistakes with special syntax like {{Cbignore}} to prevent the bot from "fixing" something that is, in fact, not broken. Is there an equivalent for AWB?
Not that it matters, but the specific issue I'm currently facing is AWB wanting to change "USD 203" to "US$203" at Piper High School (Kansas). In the context of that article, "USD" stands for "unified school district", so I'm looking for a way to make AWB ignore that snippet forever and for always. Thanks! — voidxor 19:12, 17 May 2026 (UTC)
- Try {{Not a typo}}. Using this would look like {{Not a typo|USD 203}}. AWB will skip it. the Stefen 𝕋ower 19:28, 17 May 2026 (UTC)
- The issue of 'School District' and 'USD' has been raised before and I don't think satisfactorily resolved. I think that 'editors should be aware' was more or less the result. There are a little over 2000 pages that use both these terms. Should we turn the specific currency rule off when the term "School District" is within the page, adjacent, or is that excessive? Neils51 (talk) 06:35, 27 May 2026 (UTC)
- Is there a way to test if something is within the page? I only know of looking at text directly around the would-be typo, using a lookahead or lookbehind. And in this specific case, the usual approach might only help us in two of the three instances (lookbehind for "school in" or "student population of"). I imagine we can trim down but not eliminate the false positives. the Stefen 𝕋ower 06:51, 27 May 2026 (UTC)
- Most likely just use a negative lookbehind. It is feasible to model a possible result. I would expect at least an 80-90% hit, possibly better. A search for School District+USD gives 2081 entries, subtract 'School' or 'District' and that goes to zero. I just gave Gemini some rather specific search criteria and the handful of exceptions it gave me used lower case 'school district' (I wasn't quite specific enough!), so I would expect few false positives. I'll have a look at it. Neils51 (talk) 07:47, 27 May 2026 (UTC)
- I assume you mean look near the text in question. Scanning for text that isn't immediate to the "USD nnn" sounds draggy on performance. the Stefen 𝕋ower 08:08, 27 May 2026 (UTC)
- Shouldn't be different to performance for any other string search. Neils51 (talk) 09:05, 27 May 2026 (UTC)
- But what I'm trying to say is... we don't do this. We don't look for out-of-context text in the rest of the article. That would be new, and not necessarily a workable approach. You can't guarantee that a mention of "school district" somewhere else in the article means that USD doesn't mean US dollars. We write regexes contextually. the Stefen 𝕋ower 09:13, 27 May 2026 (UTC)
- Thanks I'll have a look. Neils51 (talk) 07:52, 28 May 2026 (UTC)
- @StefenTower and Neils51: Thank you both for investigating the root of the issue. Since searches or look-behinds for "school district" would be expensive and unreliable, I have three alternate ideas:
- Is it possible for the rule to deactivate on articles in Category:Schools, or any subcategory thereof?
- Is it possible for the rule to deactivate on articles with {{Infobox school}}?
- Can we simply change the regular expression to not make changes on any string like "USD 123", where there is neither a dollar sign nor an intervening period? "USD$123" will never refer to a school district, and neither will "USD 123.45".
- — voidxor 14:13, 27 May 2026 (UTC)
- All good suggestions. I'll do some investigation. Neils51 (talk) 07:52, 28 May 2026 (UTC)
- On the first two, I've investigated this broadly before, and there is no way to deactivate a typo rule with AWB. On the last one, that is what I meant about looking at these contextually. There should be approaches to eliminate probably most of the false positives. In the final analysis, though, we rely on AWB users to use their judgment in approving changes to articles, and the tool makes it somewhat easy to skip suggested changes. the Stefen 𝕋ower 17:37, 28 May 2026 (UTC)
- While AWB's proposed changes can be skipped or undone in the diff, I think it's better not to have a regex "fixing" something that is correct half the time, like "USD 123". I've never personally seen "USD 123" (with no dollar sign nor comma nor decimal) refer to a dollar value, so I'm not convinced that the unified school district sense is a rare false positive. I was leaning toward my third suggestion anyway. I feel strongly that AWB should not be assuming that "USD 123" refers to the dollar value. Edge cases like this can be fixed manually. In a large AWB edit, it's going to be missed, so I feel it's safer to take no action than to assume. — voidxor 18:14, 28 May 2026 (UTC)
- Of course. That's why a typo rule can be modified to do contextual exclusions with regex lookbehinds and lookaheads and likely significantly reduce the false positives. Also, I have done extensive typo fixing with AWB, and way more often than not, this rule gets it right as it is. It can be improved, though. the Stefen 𝕋ower 19:07, 28 May 2026 (UTC)
- Perhaps I misread your previous comment. If you are able to tweak the regex to reduce the false positives that brought me here, that would be great! Thanks for all that you do for AWB. — voidxor 19:11, 28 May 2026 (UTC)
- I would normally work on something like this, but Neils51 said he's working on it, so, no need for two to work on it. :) the Stefen 𝕋ower 19:12, 28 May 2026 (UTC)
- Yes, just to confirm, working on this. FYI, "I've never personally seen "USD 123" (with no dollar sign nor comma nor decimal) refer to a dollar value", there are close to 500 and if you feel like fixing them you will find them here. Neils51 (talk) 08:12, 29 May 2026 (UTC)
- I assume this is a reply to voidxor as you are quoting him. At any rate, I don't do ad hoc typo fix runs these days, with all the other things I'm working on, on and off wiki. I only do those if I'm actively testing a new rule or change to an existing one. the Stefen 𝕋ower 19:26, 29 May 2026 (UTC)
- It was no doubt a reply to my comment. Regardless, I'm grateful to Neils51 for compiling a list, and I'm actively working on fixing those instances. — voidxor 19:34, 29 May 2026 (UTC)
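The contextual approach discussed in this thread (only rewriting "USD" when the amount carries a dollar sign, a thousands comma, or a decimal part) might look roughly like the sketch below. This is a hypothetical pattern written for Python's `re`, not the rule on the AWB typos page, and `USD_MONEY` and `fix_usd` are invented names. It deliberately leaves bare amounts such as "USD 203" alone, which also means genuine bare dollar amounts would be skipped.

```python
import re

# Hypothetical rule: the lookahead requires a "$", a decimal part, or a
# thousands separator before anything is rewritten.
USD_MONEY = re.compile(
    r"\bUSD\s?"
    r"(?=\$|\d+(?:,\d{3})*\.\d|\d{1,3},\d{3})"
    r"\$?\s?(\d+(?:,\d{3})*(?:\.\d+)?)"
)


def fix_usd(text):
    """Rewrite money-like "USD" amounts to "US$"; leave bare "USD nnn" untouched."""
    return USD_MONEY.sub(r"US$\1", text)
```

For instance, `fix_usd("USD$123 and USD 123.45")` gives `"US$123 and US$123.45"`, while `fix_usd("Piper High School is in USD 203.")` returns its input unchanged.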
AWB not applying general fixes to redirects
I'm encountering an issue where AWB is not applying general fixes, including replacing template redirects, to mainspace redirect pages. I don't believe this was always the case, and I'm not sure what the reasoning behind it would be. There are options to skip or follow redirects – I don't have those selected, so if I want to include redirects, it means I want to apply fixes to them. Is this intentional behaviour or a known issue? Is there an option to change this? Mclay1 (talk) 13:22, 18 May 2026 (UTC)
- Can you provide a specific example, including page, etc, and I could attempt to dupe using current and -2 versions. Neils51 (talk) 21:01, 19 May 2026 (UTC)
- @Neils51: As an example, "Wierd Al" Yankovic uses {{Rcat shell}}, which should be replaced by {{Redirect category shell}} per WP:AWB/TR, but it isn't. And if I add text such as "January 1 1959" and re-parse, it doesn't correct it to "January 1, 1959" like it does when doing exactly the same thing in an article, so the behaviour is clearly different. (Obviously a change like that isn't likely to be needed for a redirect, but I can't find a real example right now.) Mclay1 (talk) 06:35, 28 May 2026 (UTC)
Memory leak when preparsing large sparse list
The only preparsing criterion I'm using is whether or not human category changes were made, so only general fixes are being applied, and no other skip condition/regex rules/module/external processing. There are a few needles in a haystack of ~20,000 pages, and logging is off, yet the process continues to consume memory and eventually throws an out of memory error at ~1.5 GB, which usually results in lost work. I have plenty of free memory at the time of OOM. I've processed larger lists in the past with no problems, so I'm wondering if this was introduced in the last few versions (I'm using 6.5.0.0 SVN 13019), or if this is 'normal' for sparse lists. (for me, the most common culprit for being OOM is a regex rule with extensive unbounded nested quantifiers that runs forever, not a list being emptied) ~ Tom.Reding (talk ⋅dgaf) 16:14, 29 May 2026 (UTC)
- This seems strange, but first suggestion: did you try one of the 64-bit 6.4.0.1 versions I've built? Or I can produce a 64-bit 6.5.0.0 if you really need it. If it "only" takes 1.5GB that may not be the problem, but let's eliminate it first. A classic memory leak is unlikely in code of this age, but perhaps the app is keeping references to old copies of the list objects for some reason. David Brooks (talk) 18:23, 29 May 2026 (UTC)
- I have 3 slightly different 6.4.0.1 versions from when you were helping me with the GetHTML authentication issue, zips dated April 2, 4, and 6. I'll try to reproduce the OOM on the April 6 version, since I think that's the one I was using until 6.5.0.0. ~ Tom.Reding (talk ⋅dgaf) 19:18, 29 May 2026 (UTC)
- OOM with 6.4.0.1 at ~1.5 GB & ~10k pages. I'll try again with 6.3.1.1, which is the oldest of the enabled versions, from 2024-08-09. ~ Tom.Reding (talk ⋅dgaf) 21:09, 29 May 2026 (UTC)
- I lost track of all the versions I put out - trying to be too helpful! Anyway, can you confirm that you used the 64-bit 6.4.0.1? You can tell using Task Manager (on Win11 it's Details tab, Architecture column).
- By the way, because I'm not sure everyone realizes: every snapshot build after the 6.4.0.0 release was tagged 6.4.0.1. Updating the minor rev after each checkin gets too cumbersome and has rarely been done. David Brooks (talk) 22:22, 29 May 2026 (UTC)
- All of the AWB versions I'm using are running as 32-bit.
- There's no leak in 6.3.1.1 after 30k pages while the process stayed steady at ~100 MB. What's more is that pages processed more quickly & consistently (250~300 pages/min steady from start to finish) than the OOM versions. The OOM versions process pages inversely proportional to the total # of remaining pages: slower with more pages remaining (80~90 pages/min @ 20k), and faster with fewer pages remaining (~225 pages/min with < 1k pages), so I think the leak & the page processing rate are related.
- I'll try different versions to narrow down when the OOM first manifests. ~ Tom.Reding (talk ⋅dgaf) 23:19, 29 May 2026 (UTC)
- I can already tell that 6.4.0.0 is going to have problems, so the change happened between 6.3.1.1 (SVN 12633, 2024-08-09) & 6.4.0.0 (SVN 12927, 2025-02-19). ~ Tom.Reding (talk ⋅dgaf) 23:31, 29 May 2026 (UTC)
- The only change to large lists that I could find is 12716 "ListBox: removal of large number of articles: previous logic using SelectedIndex doesn't work when selecting a large list bottom to top rather than top to bottom. So use faster bulk methods when > 500 items rather than based on index."
- If there are >500 items it uses set operations to intern the article list, clears the listbox and reloads it. That does sound expensive but the lists should be garbage collected if there is memory pressure. David Brooks (talk) 03:54, 30 May 2026 (UTC) ETA: a thought - if you have "Remove Duplicates" selected in the List menu, and in fact you have no duplicates, unselect it. That forces a different (maybe slower) algorithm. David Brooks (talk) 04:28, 30 May 2026 (UTC)
- My test list is obtained by getting the entire contents of Category:Year of birth missing (20,060) (~20k) and never manipulating it, so I never pressed Remove Duplicates or any other list manipulation features. I'll make a steps to reproduce list & file a proper bug report while I'm at it soon. ~ Tom.Reding (talk ⋅dgaf) 10:12, 30 May 2026 (UTC)
- @Tom.Reding I meant the checkbox in the List menu. It determines the algorithm to use regardless of whether there are in fact duplicates. I'll look at the rest of the context later. David Brooks (talk) 18:11, 30 May 2026 (UTC)
- OK, assuming that change was responsible for the regression I can see some optimizations available. The easiest one would be if there is only one article deleted at a time, which is usually the case. There are some other unnecessary large allocations. But I would want to understand why the change was made in the first place, and why the garbage collector was unable to prevent the OOM - I don't see any persistent references to temporary objects. David Brooks (talk) 23:22, 30 May 2026 (UTC)
Continued at User talk:Tom.Reding § OOM errors
I've tracked down the cause and given Tom.Reding a patch to verify. @StefenTower: this may also fix your OOM without resorting to 64-bit, so worth a try. David Brooks (talk) 01:37, 13 June 2026 (UTC)
- OK, I will try it out. Thanks! the Stefen 𝕋ower 02:34, 13 June 2026 (UTC)
- Using the test AWB (32-bit), I was able to pre-parse a list of 38,000+ with no issue. So it appears the OOM issue is resolved. the Stefen 𝕋ower 15:35, 13 June 2026 (UTC)
- Leak is fixed! ~ Tom.Reding (talk ⋅dgaf) 14:31, 13 June 2026 (UTC)
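The trade-off described in this thread, a bulk set-based rebuild for long lists versus a cheap path when only one article is removed, can be sketched as follows. This is a hypothetical Python illustration, not AWB's C# ListBox code: `remove_articles`, `BULK_THRESHOLD` and the plain list of titles are assumptions, with only the 500-item cut-off taken from the rev 12716 commit message quoted above.

```python
BULK_THRESHOLD = 500  # cut-off quoted from the rev 12716 commit message


def remove_articles(articles, selected):
    """Remove the selected titles from the list in place; return which path ran."""
    if len(selected) == 1:
        # Single-item shortcut: delete just that entry, no rebuild.
        articles.remove(selected[0])
        return "single"
    if len(articles) > BULK_THRESHOLD:
        # Bulk path: one set lookup per entry, then reload the whole list.
        doomed = set(selected)
        articles[:] = [a for a in articles if a not in doomed]
        return "bulk"
    # Short lists: remove the selected titles one by one.
    for title in selected:
        articles.remove(title)
    return "loop"
```

In a pre-parse run the list usually loses one page at a time, so under these assumptions the shortcut avoids rebuilding a 20,000-entry list for every skipped page.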
Performance on large lists
This section heading was inserted because we aren't looking at OOM any more, but various performance issues on these large lists. David Brooks (talk) 18:27, 20 June 2026 (UTC)
- But...and this might just be icing on the cake...I was hoping that fixing the leak would also fix the slow list issue, but that's not the case. Here is a table of the memory & CPU usage of 6.3.1.1 & 6.5.0.1 both processing the same list of ~20,075 pages (Category:Year of birth missing (20,060)).
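| Version | Stage | Mem (PWS) (K) | Commit Size (K) | Pages/Min | Total CPU Time |
|---|---|---|---|---|---|
| 6.3.1.1 | Start | 26,860 | 39,792 | 217 | |
| 6.3.1.1 | Mid | 87,500 | 103,500 | 218 | |
| 6.3.1.1 | End | 106,400 | 122,900 | 168 | 31 minutes |
| 6.5.0.0 | Start | 29,104 | 42,012 | 71 | |
| 6.5.0.0 | Mid | 1,420,472 | 1,434,680 | 85 | 77 minutes |
| 6.5.0.0 | End | OOM | OOM | OOM | (est) 154 minutes |
| 6.5.0.1 | Start | 28,492 | 41,532 | 106 | |
| 6.5.0.1 | Mid | 106,500 | 122,200 | 145 | |
| 6.5.0.1 | End | 105,700 | 122,400 | 178 | 62 minutes |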
- 6.3 is considerably faster than 6.5.0.1. I wouldn't have been surprised if 6.3 was a little faster than 6.5.0.1, assuming there'd be additional GenFixes and/or checks added to 6.5.0.1 that would slow it down a bit, but I would not expect 6.3 to be 2x faster. I noticed that 6.3's list "blinks" much less often and much faster than 6.5.0.1's list does. By blink I mean the items in the list disappear briefly and are re-written to the screen after the top page's removal. The effect is only occasionally noticeable in 6.3, maybe 15% of the time, and 6.3's blink is very fast, but in 6.5.0.1 the list blinks on every page and lasts noticeably longer, like 6.5.0.1 is struggling to refresh it. The effect is even more pronounced on larger lists.
- I compared the CPU usage of both 6.3 & 6.5.0.1 during their runs, and 6.3 is always less than or equal to 6.5.0.1, and 6.3's CPU usage is 1/2 that of 6.5.0.1 most of the time. 6.5.0.1 never got above 10%, while 6.3 never got above 5%. 6.5.0.1's total CPU time on my laptop was 62 minutes, while 6.3's was exactly 1/2 that at 31 minutes.
- The leak was much more important to fix since it crashed AWB and potentially resulted in lost work. Thank you very much for that. ~ Tom.Reding (talk ⋅dgaf) 14:31, 13 June 2026 (UTC)
- @Tom.Reding: I'm grateful for confirmation the leak was fixed; this is enough for me to check it in. While I was building it I also added what I thought was an optimization for the case of deleting a single item from a list. It makes a clear difference on my machine, which is emulating (that word hides a lot) x86 on an ARM64 box. So clearly it needs more work on actual x86-64 hardware. That's next on my todo list. Also OOM needs to be handled intelligently.
- But can you compare that performance against 6.5.0.0 first? David Brooks (talk) 00:15, 14 June 2026 (UTC) Adding - a change in rev 12716 changed the way items are removed from a long list, and I believe it made single-item-remove slower. The change would have first appeared in 6.3.2.0. The version you have reverts the change for that single-item case. David Brooks (talk) 01:21, 14 June 2026 (UTC)
- I've updated the table to include 6.5.0.0 (and disambiguated "6.5" in that post). Since 6.5.0.0 crashed after 10,381 pages, I was only able to log start & mid. 6.5.0.1 is a significant speed improvement over 6.5.0.0; 50~70% faster in terms of pages/min, and ~40% faster in terms of total CPU time. ~ Tom.Reding (talk ⋅dgaf) 13:16, 14 June 2026 (UTC)
- To be clear: the version I sent you isn't any 6.5.0.1 in particular. It contains the memory leak fix but I also added a perf improvement to "remove list article" to make our tests faster. The improvement is considerable on my machine, and I'm sure it doesn't affect the validity of the tests. I'll deal with that separately, as it only affects long lists like yours.
- In any case, as the lambda function in question seems to rarely/never actually do anything, I may do a further optimization while I'm there. David Brooks (talk) 14:50, 14 June 2026 (UTC)
- I was wondering why the list processing seemed faster. I hope this change can be kept along with the OOM fix. the Stefen 𝕋ower 17:36, 14 June 2026 (UTC)
- I'll write up a bug on the performance improvements, but I need to do some more measurements in the multiple-selection cases. But, yes, I plan to check them both into the official source. I do seem to be the only one making changes right now, but rjwilmsi does make actual functional improvements from time to time.
- The builds will be snapshots for a while. I don't know if there is a good reason for releasing a 6.5.1.0 yet. David Brooks (talk) 02:04, 15 June 2026 (UTC)
- When a long list of regex rules (~382 in this case) is pasted into 6.3.1.1, the list blinks with every rule added for about the first 5 seconds. Then, the entire window (rules list on the left, and find/if tabs on the right, everything) turns white and unresponsive, and the window title becomes Replace Special (Not Responding). The paste process then proceeds much faster in the background, without the screen updating constantly to slow it down. The entire paste process takes ~17 seconds in 6.3.1.1.
- In 6.4.0.0 onward, each rule is pasted and the screen refreshed laboriously for the entire rules list, which takes about 2x longer than 6.3.1.1. The entire paste process takes 32~33 seconds in 6.4.0.0, 6.5.0.0, and 6.5.0.1.
- I suspect these issues are related to the page list blinking and generally slower list processing since 6.3.1.1, so I'm hoping this can add some insight into what's going on under the hood. ~ Tom.Reding (talk ⋅dgaf) 11:18, 16 June 2026 (UTC)
- I don't think the lists have anything to do with the issues in the Article list, and I am completely unfamiliar with that code (which has some interesting abstraction constructs). But I'll see if I can find a band-aid and/or look at what changed before 6.4.0.0.
- But - what's the scenario? When you open the window, when you load the settings file, when you add a rule? David Brooks (talk) 14:55, 16 June 2026 (UTC)
- The scenario above only happens when copying a large # of rules (in this case a single rule with a large # of sub-rules) from a different instance of AWB.
- Loading a large settings file while having the Replace Special window open exhibits different behavior than copying rules, most notably that the window never becomes unresponsive in 6.3.1.1, and visually loads rules one-by-one for the entire duration. The entire process takes 70~80 seconds for my largest settings file regardless of AWB version (6.3.1.1 is still the fastest, though only by ~12%). This implies that loading a settings file triggers different code than pasting in a large block of rules. ~ Tom.Reding (talk ⋅dgaf) 15:51, 16 June 2026 (UTC)
Continued at User talk:Tom.Reding § ReplaceSpecial slow load
Unable to login with two-factor authentication
As a Bureaucrat on several Wikimedia projects, I am now required to have two-factor authentication. However, enabling this has made me unable to log in to AWB. I understand that this can be worked around using a bot account, but I would prefer to be able to log in under my own name, particularly when making tweaks and non-botlike edits in the course of using the tool. Can AWB be fixed to enable logins from a 2FA account? BD2412 T 20:54, 1 June 2026 (UTC)
- As long as you've upgraded to the latest version 6.5.0.0 in the last two weeks (T421991: Don't request notifications if you don't have the right), you can set up a bot password for AWB (note this is not the same as a bot account) - Wikipedia:Using AWB with 2FA. Edits with the bot password will appear under your main username (as an admin with MFA, you can see my contributions in the last little while are mainly AWB edits). The bot password also allows you to only allow certain grants to the bot password (i.e. to restrict AWB to only general editor access and prevent bureaucrat/admin-specific access). Harryboyles 23:08, 1 June 2026 (UTC)
- @Harryboyles: Thanks, I will check out this option! BD2412 T 00:01, 2 June 2026 (UTC)
- Worked like a charm. The AWB machine is back in business! BD2412 T 00:09, 2 June 2026 (UTC)
- Awesome! Harryboyles 08:50, 2 June 2026 (UTC)
- Oh good. I myself was about to create a separate AWB account just to get around the API issues 6.4 was having... Thanks! Primefac (talk) 21:16, 4 June 2026 (UTC)
Weirdness with deprecated Parameters
I was working to clean up Category:Pages using infobox television with deprecated parameters, and I saw that, even without using Find and Replace, AWB was replacing each deprecated parameter with its non-deprecated equivalent. When I went to do the same thing with Infobox Noble, it didn't. Does that require an X replaced with Y in the definition of the infobox or did I somehow turn something off in AWB? Naraht (talk) 12:34, 8 June 2026 (UTC)
- @Naraht: If you have the general fixes turned on, then AWB will change parameters in {{Infobox television}} according to the instructions at Wikipedia:AutoWikiBrowser/Rename template parameters#Infobox television. -- John of Reading (talk) 13:01, 8 June 2026 (UTC)
- John of Reading: I have "Apply general fixes" turned on. Naraht (talk) 13:25, 8 June 2026 (UTC)
- @Naraht: So if you add a section for {{Infobox noble}} to Wikipedia:AutoWikiBrowser/Rename template parameters, your replacements will then happen by magic. The syntax on that page isn't too complicated. You'll need to restart AWB, or choose File>Refresh status, to get it to reload the configuration page. -- John of Reading (talk) 13:56, 8 June 2026 (UTC)
- John of Reading: Looks cool. I'll get to that when I can. Naraht (talk) 17:14, 8 June 2026 (UTC)
- For what it's worth I'm glad you had this issue/question, because this will make my life so much easier when I'm doing huge bot runs for parameter changes. Primefac (talk) 16:32, 14 June 2026 (UTC)
AWB Didn't Detect Curly Apostrophe
On the Mikoy Morales article, I tried using AWB after copyediting and expanding the lead section. I used AWB to clean up the final outcome, but it seems it didn't detect the curly apostrophe. The article had a curly apostrophe (”) instead of a straight apostrophe ('). Normally AWB detects and fixes those in other articles, but in this case it didn't catch it, even though it was in the lead section. I ended up replacing it manually. - Arcrev1 (talk) 19:42, 9 June 2026 (UTC)
Usage of JWB/AWB for reducing phonetics links with templates
I recently started a project to reduce (in my opinion) page bloat via text redundancy on language and phonology articles, replacing [[X consonant|X]] and [[Y vowel|Y]] with {{lcons|X}} and {{lvow|Y}}; these templates produce the same content on output rendering, and if substituted would likewise produce the same results as what they are replacing. In some cases this results in a notable page byte decrease, e.g. [[alveolo-palatal consonant|alveolo-palatal]] > {{lcons|alveolo-palatal}}, while in others it is more minor but still redundancy reductive, e.g. [[mid vowel|mid]] > {{lvow|mid}}.
I had been using JWB to assist with regex replacement of these links with their equivalent template calls. @Mathglot left a message on my talk page objecting both to the change and to my usage of JWB for doing so, citing wikitext readability for editors on the former and WP:AWBRULES #4 on the latter. They asked for me to seek consensus before going any further; I understand their concerns and agreed to seek such consensus. I also suggested that I would revert my replacements if consensus was not in my favor (or, as mentioned, these could be turned into subst templates, so that cleanup is taken care of by an actual bot).
There are therefore three questions that I ask out of this:
- Is this change viewed as:
A. a productive one? (go forward with the change, use wherever possible)
B. an insignificant one? (does not matter or no opinion, use at discretion)
C. or a harmful one? (roll back the change, use only for new content/do not alter existing content)
- Does using the templates over wikilinks result in any performance impact (technical burden) on page rendering?
- Is my usage of JWB for this a violation of semi-automated edit policies?
~ oklopfer (💬) 17:22, 10 June 2026 (UTC)
- That's a good and neutral synopsis of the issues. Details here. Thanks, Mathglot (talk) 17:29, 10 June 2026 (UTC)
- Addendum: re #2, it could conceivably cause a problem in page rendering by hitting a WP:PEIS threshold on a page where there was a large table where the template was used regularly in several table columns. Mathglot (talk) 17:36, 10 June 2026 (UTC)
- "Link" templates (things that just link) are routinely deleted in fact as they are insufficiently complex to merit being a template. VisualEditor gets sad when a template is inside another template. PEIS is also a consideration, which should answer #2. Izno (talk) 17:51, 10 June 2026 (UTC)
- It also requires two rows in the database (template link and the link itself), instead of one (just the link itself) for each page it is used on. So a bit of a waste. Whether that is a good use of resources depends on the functionality provided. —TheDJ (talk • contribs) 20:36, 10 June 2026 (UTC)
- To +1 even more to Izno’s point, these templates are not actually substitutable in the current form, as that would create a bunch of #if markup in the article, and they don’t support TemplateData, so editing those templates in the visual editor is pretty bad. These sorts of templates should exist as substitution-only. And they should definitely not be inserted into articles en masse (especially by AWB/JWB). stjn 22:19, 10 June 2026 (UTC)
- Others above have mentioned technical issues but my concern is more social. The example in the {{lvow}} documentation is "{{lvow|close}} is shorthand for [[close vowel|close]]". That makes sense for someone who has spent weeks examining vowels and their templates, but it is a mental hurdle for everyone else. The spelled-out wikilink is much better for editors. Johnuniq (talk) 00:22, 11 June 2026 (UTC)
- @Mathglot, Izno, TheDJ, Stjn, and Johnuniq: thanks for all of your replies. It is rather clear from them that the prevailing opinion is that the usefulness of these templates is outweighed by the technical burden and editor-friendly readability.
- Stjn correctly pointed out that the templates are actually currently not properly substitutable; I have made corrections for this at {{lcons/sandbox}} and {{lvow/sandbox}}, and confirmed that AnomieBOT can properly substitute them in my sandbox: Special:Diff/1358808862. Would implementing this change be the preferred path forward? It would both undo my replacements and future-proof usage of these templates. ~ oklopfer (💬) 03:39, 11 June 2026 (UTC)
- This hasn't been open for long, and I, for one, am willing to wait for more opinions if you wish to; these things can swing sometimes. Whatever path you choose, I'm not the best person to answer your question, so I'll let others weigh in on that. Mathglot (talk) 05:29, 11 June 2026 (UTC)
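For readers unfamiliar with the replacement discussed in this thread, the following is a minimal sketch of the kind of regex rule involved. It is written in Python rather than as a JWB rule; the pattern, the `TEMPLATES` mapping, and the function name `to_link_templates` are illustrative assumptions, not the rule that was actually run. It only handles links whose label exactly repeats the target minus the trailing word.

```python
import re

# Matches [[X consonant|X]] or [[Y vowel|Y]], where the label repeats the
# first part of the target exactly (backreference \1).
LINK_RE = re.compile(r"\[\[([^\[\]|]+?) (consonant|vowel)\|\1\]\]")

# Hypothetical mapping from the trailing word of the target to a template name.
TEMPLATES = {"consonant": "lcons", "vowel": "lvow"}


def to_link_templates(wikitext: str) -> str:
    """Replace matching piped links with the equivalent template call."""
    def repl(match: re.Match) -> str:
        label, kind = match.group(1), match.group(2)
        return "{{" + TEMPLATES[kind] + "|" + label + "}}"

    return LINK_RE.sub(repl, wikitext)


sample = "a [[alveolo-palatal consonant|alveolo-palatal]] and a [[mid vowel|mid]]"
print(to_link_templates(sample))
```

Links with a differently cased label, or with no pipe at all, are left untouched by this pattern.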
GenFixes failed to remove year of death missing category when a YYY BC deaths category was present
Phabricator ticket created. ~ Tom.Reding (talk ⋅dgaf) 13:53, 13 June 2026 (UTC)
Option for addition of "pipe trick"?
Is there an option that would make AWB substitute in Pipe Trick entries? Right now, apparently by default, it will take [[Sandal|Sandals]] and turn it into [[Sandal]]s. Is there an option currently that would cause [[Glas (book)|Glas]] to become [[Glas (book)|]]? Naraht (talk) 12:01, 18 June 2026 (UTC)
- There is an important distinction between those two examples. [[Sandal|Sandals]] and [[Sandal]]s are two alternative ways to produce the same result, and AWB is preferring the latter. [[Glas (book)|]] is not stored as wikitext: the standard page editors convert it to [[Glas (book)|Glas]] when the user clicks Publish. If you then edit the page again, you see [[Glas (book)|Glas]] in the wikitext. If you do manage to store [[Glas (book)|]] as wikitext, perhaps by wrapping it in a <ref> tag where the pipe trick isn't applied, it will literally display the text [[Glas (book)|]] rather than a link. In other words, the pipe trick is applied when saving a page, not when displaying it. Certes (talk) 13:21, 18 June 2026 (UTC)
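The distinction above can be sketched in Python. This is a simplified model, not MediaWiki's or AWB's actual code: the function names are hypothetical, the pipe-trick rules are reduced to the common cases (namespace prefix, trailing parenthetical, comma suffix), and the link simplification only handles a lowercase suffix after an identical target.

```python
import re

# [[Target|]] with an empty label: the form the pipe trick expands on save.
EMPTY_PIPE_RE = re.compile(r"\[\[([^\[\]|]+)\|\]\]")

# [[Target|Targetsuffix]]: the label is the target plus lowercase letters.
SUFFIX_LINK_RE = re.compile(r"\[\[([^\[\]|]+)\|\1([a-z]+)\]\]")


def pipe_trick_label(target: str) -> str:
    """Derive the displayed label from a link target (simplified rules)."""
    label = target.lstrip(":")
    # Drop a namespace or interwiki prefix such as "Wikipedia:".
    if ":" in label:
        label = label.split(":", 1)[1]
    # Drop a trailing parenthetical, e.g. " (book)".
    stripped = re.sub(r"\s*\([^()]*\)$", "", label)
    if stripped != label:
        return stripped
    # Otherwise drop everything from the first comma, e.g. ", Texas".
    return label.split(",", 1)[0]


def expand_pipe_trick(wikitext: str) -> str:
    """Model the save-time step: [[Glas (book)|]] -> [[Glas (book)|Glas]]."""
    return EMPTY_PIPE_RE.sub(
        lambda m: f"[[{m.group(1)}|{pipe_trick_label(m.group(1))}]]", wikitext
    )


def simplify_link(wikitext: str) -> str:
    """Model the equivalent-form rewrite: [[Sandal|Sandals]] -> [[Sandal]]s."""
    return SUFFIX_LINK_RE.sub(r"[[\1]]\2", wikitext)


print(expand_pipe_trick("[[Glas (book)|]]"))
print(simplify_link("[[Sandal|Sandals]]"))
```

In this model the first function changes what is stored, while the second swaps between two stored forms that render identically, which is why only the second is something a tool like AWB can prefer.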
Template Pahýl on cswiki
Hello, AWB automatically suggests (the Apply changes automatically option) to move the Template:Pahýl template on the Czech Wikipedia below the Template:Portály template, while we are used to placing it above them. Is there any way to change this?
I would also like to know if it is possible to avoid having to log in and set up a project on cswiki every time I open the program.
Best regards, Henryk Siuda (talk) 15:08, 23 June 2026 (UTC)