Wikipedia:Bot requests: Difference between revisions
ClueBot III (talk | contribs) m Archiving 1 discussion to Wikipedia:Bot requests/Archive 81. (BOT) |
Kwamikagami (talk | contribs) |
As for how the bot can ID the correct info box, it should follow the ISO redirect (e.g., the rd for ISO code [abc] is [[ISO 639:abc]]), and verify that the info box at the article it arrives at does indeed have the param ISO3 = abc or lc[n] = abc (where [n] is a digit). If there isn't a match (and it's been years since we've run a maintenance bot to verify them all), then that ELP code should instead be added to an error list. If there's already an ELP link in the box from a previous pass by the bot, then the bot should add the new one as ELP2 etc., and keep a tabbed list of those so we can code the template to support the article with the largest number of links. |
The bot should test that the URL lands on a page. For instance, [http://www.endangeredlanguages.com/lang/8117] is a language code in the list, but the URL says "Page not found :(". These should be collected into an error list for submission to the ELP.
Non-ASCII names have been scrambled, maybe a Mac > PC conversion error, and will need to be fixed by hand, so the bot should keep a list of all names with non-ASCII characters.
A couple weeks ago I mentioned I was planning this at [[WT:LANG]], and just recently I posted a notice on the talk page that I was making this request, and the responses have been positive.
Last time I did something like this I used PotatoBot, but Anypodetos tells me it's no longer working.
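The checks described in this request can be sketched roughly as follows. This is only an illustration, not an existing bot: it assumes the infobox has already been parsed into a plain dict of parameter names and values, and the function names (`infobox_has_iso`, `next_elp_param`, `plan_edit`, `needs_manual_fix`) are hypothetical. Fetching the redirect target and testing the ELP URL are left out.

```python
import re


def infobox_has_iso(params: dict, iso: str) -> bool:
    """Return True if the infobox has ISO3 = iso or lc<n> = iso."""
    for key, value in params.items():
        name = key.strip().lower()
        # Assumption: lc1, lc2, ... are the only numbered code parameters.
        if name == "iso3" or re.fullmatch(r"lc\d+", name):
            if value.strip() == iso:
                return True
    return False


def next_elp_param(params: dict) -> str:
    """Return the first free slot: ELP, then ELP2, ELP3, and so on."""
    if "ELP" not in params:
        return "ELP"
    n = 2
    while f"ELP{n}" in params:
        n += 1
    return f"ELP{n}"


def plan_edit(params: dict, iso: str, elp_id: str, errors: list):
    """Decide which parameter to add, or log a mismatch to the error list."""
    if not infobox_has_iso(params, iso):
        errors.append((iso, elp_id))
        return None
    return next_elp_param(params), elp_id


def needs_manual_fix(name: str) -> bool:
    """Flag names containing non-ASCII characters for hand checking."""
    return not name.isascii()
```

A name flagged by `needs_manual_fix` is not necessarily scrambled; the list only narrows down what has to be checked by hand.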
Revision as of 02:53, 18 January 2021
This page has a backlog that requires the attention of willing editors. Please remove this notice when the backlog is cleared. |
Commonly Requested Bots |
This is a page for requesting tasks to be done by bots per the bot policy. This is an appropriate place to put ideas for uncontroversial bot tasks, to get early feedback on ideas for bot tasks (controversial or not), and to seek bot operators for bot tasks. Consensus-building discussions requiring large community input (such as request for comments) should normally be held at WP:VPPROP or other relevant pages (such as a WikiProject's talk page).
You can check the "Commonly Requested Bots" box above to see if a suitable bot already exists for the task you have in mind. If you have a question about a particular bot, contact the bot operator directly via their talk page or the bot's talk page. If a bot is acting improperly, follow the guidance outlined in WP:BOTISSUE. For broader issues and general discussion about bots, see the bot noticeboard.
Before making a request, please see the list of frequently denied bots, either because they are too complicated to program, or do not have consensus from the Wikipedia community. If you are requesting that a template (such as a WikiProject banner) is added to all pages in a particular category, please be careful to check the category tree for any unwanted subcategories. It is best to give a complete list of categories that should be worked through individually, rather than one category to be analyzed recursively (see example difference).
- Alternatives to bot requests
- WP:AWBREQ, for simple tasks that involve a handful of articles and/or only needs to be done once (e.g. adding a category to a few articles).
- WP:URLREQ, for tasks involving changing or updating URLs to prevent link rot (specialized bots deal with this).
- WP:USURPREQ, for reporting a domain that has been usurped, e.g. |url-status=usurped
- WP:SQLREQ, for tasks which might be solved with an SQL query (e.g. compiling a list of articles according to certain criteria).
- WP:TEMPREQ, to request a new template written in wiki code or Lua.
- WP:SCRIPTREQ, to request a new user script. Many useful scripts already exist, see Wikipedia:User scripts/List.
- WP:CITEBOTREQ, to request a new feature for WP:Citation bot, a user-initiated bot that fixes citations.
Note to bot operators: The {{BOTREQ}} template can be used to give common responses, and make it easier to keep track of the task's current status. If you complete a request, note that you did with {{BOTREQ|done}}, and archive the request after a few days (WP:1CA is useful here).
WP:TALKORDER issues
I come across way too many article talks, like Talk:Jennifer Lawrence, where the {{Archives}} causes that ugly overlap. It happens whenever the template isn't at the bottom of the list of talk banners (view source to see what I mean). To fix, we'd need a continuous bot to make sure this template keeps getting moved to the bottom of talk page banners. I don't think a CSS fix is really possible for this, and a JS fix would not be preferable to just having a bot maintain talk pages. I've made a discussion on the talk last week, see Redrose's response there for useful info as well (perhaps a broader bot for that purpose should be considered). It reminds me that another issue we see is DS templates constantly in the wrong order, it's advised by the template itself, and WP:TALKORDER, to have them below the talk header. Yet they seem to be scattered randomly. We commonly have random whitespace in talk page banners, too, thus random newlines. Really a bot to clean all this up would be a good idea, and enforce order (except when opted-out, I suppose). ProcrastinatingReader (talk) 22:04, 2 September 2020 (UTC)
- You'll have to see Special:PermaLink/973120366 for the version PR refers to, since I went ahead and removed the {{archives}} (there's already a {{talk header}} so it makes it redundant). Primefac (talk) 22:19, 2 September 2020 (UTC)
- @ProcrastinatingReader: I was just thinking about this the other day. WP:Talk page layout gives a fairly consistent indication of what the order for talk page banners should be, but they regularly end up more random. It'd be very nice to have a bot fixing that, and over time as people get used to a certain order, it could make the maze of talk banners easier to navigate.
- Programming will be a fairly big task, though. You'd have to go through every talk page banner available and assign it an order. You'd also probably want to automate things like when to introduce collapsing of WikiProjects or {{banner holder}}. And we'd need to discuss what should happen when someone creates and adds a new banner that isn't part of the queue, or how to handle custom notice banners. I could also see complaints that if the bot operates too frequently, it's just making edits without a strong purpose. All those obstacles are possible to overcome, however, and I think if we did it'd make talk pages a lot nicer. {{u|Sdkb}} talk 20:00, 24 September 2020 (UTC)
- I do think this would make talk pages a fair bit more friendly, to be honest, just by improving the consistency. I think the first part of dealing with this may be to get a consensus and/or a more complete list on what talk order is preferred. Headbomb as someone who edited WP:Talk page layout, might you have any thoughts on this proposal? ProcrastinatingReader (talk) 21:13, 30 October 2020 (UTC)
- This is a pretty damn tricky task, because WP:TPL is descriptive (observations) more than prescriptive (thou shall do this or else the cable gremlins will make you regret it). There's certainly more than a few things in there that are tricky and iffy. The only thing, AFAICT, that I'd consider 'safe' to do by bot is to put the archives at the very bottom, put banners in a {{WPBS}}, and put whatever can be shoved in {{Article history}} in {{Article history}}. Headbomb {t · c · p · b} 23:03, 30 October 2020 (UTC)
- For what it's worth (having just done an unrelated-but-made-me-think-of-it bot run), AWB has a set "order" that defines the order of talk page banners. Given that one of our central programs already has a metric, I'm not really sure how "tricky" this would be. Of course, how necessary is another question entirely, since any changes my bot makes are largely incidental to the overall task that it's performing.
- Of course, the additional issue is that it will add yet another bot that will hide updates to a page due to a bot edit, but it's rather unlikely that bug will be fixed any time soon. Primefac (talk) 21:37, 1 November 2020 (UTC)
- One challenge that AWB has with ordering talk page banners is dealing with redirects. You may want to look at the AWB custom module User:Magioladitis/WikiProjects, which probably needs some updating. GoingBatty (talk) 05:40, 28 December 2020 (UTC)
- ProcrastinatingReader and Headbomb, the above prompted me to add the Template:Banner holder#Choosing banners to collapse documentation section. If you're interested, additional input/expansion would be welcome, and might eventually lead to enough standardization that a bot could take over the task. {{u|Sdkb}} talk 19:24, 8 November 2020 (UTC)
- Slightly separate but you also have pages using the graphs directly, eg Talk:Robert Hunter (lyricist), rather than via their talk page wrappers. Imagine new users stumbling across this - looks a mess. ProcrastinatingReader (talk) 03:25, 21 November 2020 (UTC)
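For illustration only, the narrow "safe" case mentioned in this thread (keeping {{Archives}} below the other talk banners) could be expressed as a stable sort over the banner names. The priority table below is a made-up example, not the order defined by WP:TALKORDER or by AWB, and `order_banners` is a hypothetical helper.

```python
# Hypothetical priorities: lower numbers sort first. Banners not listed
# here get DEFAULT_PRIORITY and keep their relative order.
BANNER_PRIORITY = {
    "talk header": 0,
    "article history": 2,
    "wikiproject banner shell": 3,
    "archives": 99,
}
DEFAULT_PRIORITY = 50


def normalise(name: str) -> str:
    """Compare template names case-insensitively, treating _ as a space."""
    return name.strip().lower().replace("_", " ")


def order_banners(banners: list) -> list:
    """Return the banners sorted by priority; sorted() is stable."""
    return sorted(
        banners,
        key=lambda b: BANNER_PRIORITY.get(normalise(b), DEFAULT_PRIORITY),
    )
```

Because the sort is stable, a custom or newly created banner that is missing from the table stays where editors put it relative to other unknown banners, which sidesteps part of the "new banner" concern raised above. Template redirects would still need resolving first.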
adding "nobots", and "category:wikipedians who opt out of message delivery" to indef blocked users
I have seen many users who have been blocked indefinitely for various reasons (socking, disruptive editing, CIR, and what not), but they receive many newsletters, and other notifications. Currently, there is User:Yapperbot/Pruner to remove inactive users from lists (WikiProject membership, FRS, etc), notifying the removed users appropriately.
I am not sure what is the extent of this task. Would it be feasible to spend resources on creating a bot task to add {{nobots}}, and "category:wikipedians who opt out of message delivery" on the talkpages of users who have been blocked indefinitely, and do not have {{unblock}} on talkpage for more than 30 days? That way, resources can be conserved by avoiding new bot messages being posted, and later being archived. In case the user returns after a while, or after standard offer, they can simply remove the "nobots", and the category. Opinions are welcome. Regards, —usernamekiran (talk) 13:22, 15 September 2020 (UTC)
- FWIW, I created a custom module, which did the edit(s) successfully: special:diff/978555212. I tested the module under different scenarios, and I also tested on a few (talk)pages from Category:Indefinitely blocked Wikipedians. I couldn't find any errors, as it is a fairly basic task. I didn't save these edits, just previewed. —usernamekiran (talk) 16:30, 15 September 2020 (UTC)
- bump. —usernamekiran (talk) 10:03, 8 November 2020 (UTC)
- Well, doing this to all indef blocked users is a bad idea, because we have 1 million indef blocked users. Doing it to all indef blocked users subscribed to a newsletter may be feasible, if all newsletter user lists are categorised in some way. Worth making a feature request to Naypta? ProcrastinatingReader (talk) 18:01, 27 November 2020 (UTC)
- yeah, that's what I meant, blocked users with subscriptions. Apologies for the vagueness. Maybe we can run the bot through mailing list, to look for the users fitting in the criteria of being indef, and no unblock request (instead of going through indef blocked category). —usernamekiran (talk) 18:39, 27 November 2020 (UTC)
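As a rough sketch of the selection rule discussed here (indefinitely blocked, subscribed to a mailing list, no recent unblock request), the test for one user might look like this. The inputs are assumed to have been gathered elsewhere, the function name is hypothetical, and reading "30 days" as "no unblock request in the last 30 days" is one interpretation of the request.

```python
from datetime import datetime, timedelta
from typing import Optional

QUIET_PERIOD = timedelta(days=30)


def should_opt_out(
    blocked_indef: bool,
    subscribed: bool,
    last_unblock_request: Optional[datetime],
    talk_text: str,
    now: datetime,
) -> bool:
    """Return True if {{nobots}} and the opt-out category should be added."""
    if not (blocked_indef and subscribed):
        return False
    # Skip talk pages that already carry the template.
    if "{{nobots}}" in talk_text.lower():
        return False
    # Leave users alone while an unblock request is recent.
    if last_unblock_request is not None:
        if now - last_unblock_request <= QUIET_PERIOD:
            return False
    return True
```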
Clearing the category "Wikipedia usernames with possible policy issues"
Have a bot remove a user from the category Category:Wikipedia usernames with possible policy issues when they have been inactive for over one year or have been blocked indefinitely. Heart (talk) 03:15, 9 October 2020 (UTC)
- I don't work with that category, but would it make sense for a bot to move those pages to a corresponding "inactive user" category so that the usernames could still be tracked? – Jonesey95 (talk) 05:47, 9 October 2020 (UTC)
- Jonesey95, well, the notice explicitly states that users that haven't been active in a week can be removed. I think that rule is ludicrous and have extended it to a year to give time to change names, or to come back from a wikibreak. So I would see no need for the category, but it is up to the user who creates the bot to decide this. Heart (talk) 06:21, 9 October 2020 (UTC)
- Doing... Good idea, I'll get working on this. BJackJS talk 18:15, 4 December 2020 (UTC)
- TheSandBot 6 is already approved for removing blocked users from that category. – SD0001 (talk) 19:48, 4 December 2020 (UTC)
- Damn. BJackJS talk 20:32, 4 December 2020 (UTC)
- Ping TheSandDoctor to see if he can do a run? ProcrastinatingReader (talk) 12:00, 5 December 2020 (UTC)
- @ProcrastinatingReader and BJackJS: I really need to make that a daily cron job...running inside 30min. Thanks for the ping, ProcrastinatingReader. --TheSandDoctor Talk 23:18, 5 December 2020 (UTC)
Redundant template pairs
The following pairs of cleanup templates:
- {{COI}} and {{Autobiography}} - on circa 200 articles
- {{More citations needed}} and {{One source}} - on over 3,200 articles
- {{More footnotes needed}} and {{More citations needed}} - on over 5,500 articles
- {{More footnotes needed}} and {{One source}} - on over 1,000 articles
should not be used on the same article; but often are.
We need a bot, please, to remove the first template in each of the pairs named above.
The bot should not do this when the templates are section-specific (e.g. {{One source|section|date=October 2020}}).
The bot should remove {{Multiple issues}}, where appropriate.
The bot needs to take into account common redirects (for example, {{More citations needed}} is often used via {{Refimprove}}; {{More footnotes needed}} as {{More footnotes}}, etc.).
This can be done as a one-off and then either run occasionally, or added to one of the regular clean-up tasks.
Other such pairs might be identified in future.
Prior discussion is here . Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:51, 21 October 2020 (UTC)
- There's often overlap between the templates in each pair, but they serve different purposes. For example, {{More footnotes needed}} indicates a need for more in-line referencing based on the article's existing sources, which may be alright in number and quality, whereas {{More citations needed}} points out that the article needs more, or better references, and not necessarily in-line ones. – Uanfala (talk) 19:59, 29 October 2020 (UTC)
- But {{More footnotes needed}} is a subcategory of {{More citations needed}} (in the non-technical sense of the term); if an article "needs more, or better references, and not necessarily in-line ones", then to say that there is "a need for more in-line referencing based on the article's existing sources" is superfluous, as the article's existing sources have been tagged as insufficient. WT79 (speak to me | editing patterns | what I been doing) 16:23, 2 November 2020 (UTC)
- I think this one is quite simple to do with AWB, as the genfixes will handle the {{Multiple issues}} bundle, and GetZerothSection can handle the lead-only requirement. ProcrastinatingReader (talk) 14:31, 23 November 2020 (UTC)
- @ProcrastinatingReader: Thank you. Is that an offer to do so, or a suggestion that someone else does so? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:41, 19 December 2020 (UTC)
- I'd do it myself but I have little access to AWB (I mostly use macOS, unless I have a reason to boot into Windows) and JWB can't do this. So it's just some thoughts to help someone else do this. ProcrastinatingReader (talk) 14:12, 19 December 2020 (UTC)
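To make the rules in this request concrete, here is a minimal sketch of the decision step only: which templates would be dropped from a page. It is not the AWB approach suggested above. The regular expression handles simple, unnested template calls, the redirect table lists only the two redirects named in the request, and all names are hypothetical. Editing the page and tidying {{Multiple issues}} are not covered.

```python
import re

# Redirects named in the request; a real run would need the full list.
REDIRECTS = {
    "refimprove": "more citations needed",
    "more footnotes": "more footnotes needed",
}

# (template to remove, template that makes it redundant)
PAIRS = [
    ("coi", "autobiography"),
    ("more citations needed", "one source"),
    ("more footnotes needed", "more citations needed"),
    ("more footnotes needed", "one source"),
]

# Matches {{Name}} or {{Name|params}} with no nested braces inside.
TEMPLATE_RE = re.compile(r"\{\{\s*([^{}|]+?)\s*(\|[^{}]*)?\}\}")


def canonical(name: str) -> str:
    """Lower-case the name and resolve known redirects."""
    key = name.strip().lower().replace("_", " ")
    return REDIRECTS.get(key, key)


def article_level_tags(wikitext: str) -> set:
    """Collect cleanup tags, ignoring section-specific uses."""
    tags = set()
    for match in TEMPLATE_RE.finditer(wikitext):
        params = [p.strip().lower() for p in (match.group(2) or "").split("|")[1:]]
        if "section" in params:
            continue
        tags.add(canonical(match.group(1)))
    return tags


def templates_to_remove(wikitext: str) -> set:
    """Return the first template of every pair that is fully present."""
    tags = article_level_tags(wikitext)
    return {first for first, second in PAIRS if first in tags and second in tags}
```

Note that a page carrying all three sourcing tags would lose two of them under a literal reading of the pairs, which may or may not be what is wanted given the objection raised in this thread.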
AFC drafts that have titles identical to titles on other Wikipedias
From user talk:SD0001:
... would there be a way to do a bot report of pending articles that have titles identical to titles on other Wikipedias, with links to those foreign-language Wikipedia pages? If an article exists on another Wikipedia, it's a good indication that the draft should be approved. Thanks, Calliopejen1 (talk) 18:19 pm, 11 November 2020 (UTC)
I don't think using wikidata is an option since AFC drafts are very unlikely to have been linked to wikidata. Is there another way this could be done? – SD0001 (talk) 06:33, 13 November 2020 (UTC)
- Agreed. On an only-slightly related note, how feasible would it be to get a report of all draft pages with a corresponding fully-protected/salted article? Primefac (talk) 13:36, 13 November 2020 (UTC)
- @Primefac: like this? (for protected redirects: [1]) ProcrastinatingReader (talk) 17:15, 13 November 2020 (UTC)
- Pretty much. Something on-wiki might be nice for slightly-easier tracking purposes. Primefac (talk) 17:53, 13 November 2020 (UTC)
- Is it not possible to search Wikidata for an identical title for the Wikidata item (or identical title for the Wikidata link to a foreign-language Wikipedia), even if no links to Wikidata exist? Obviously this would not be perfect, but I think it could still be very useful. Calliopejen1 (talk) 19:40, 16 November 2020 (UTC)
- It might be possible to do that, but I'm not sure how well that would translate to a bot being able to keep a list of drafts-with-the-same-name-as-existing-WD-pages updated. Primefac (talk) 13:34, 17 November 2020 (UTC)
- Yes. Directly querying the database for an exact match is too slow (6+ minutes for a single page itself). Simply using the wikidata search and processing the result seems like the way to go. It's more flexible, but there could be some caveats. – SD0001 (talk) 18:02, 17 November 2020 (UTC)
Using a SQL query this would be a cross database join. My bot "shadows" does something similar where it looks for File:'s that have the same name on Commons and Enwiki. It's pretty fast, even though Commons has 60 million File: pages, which exceeds the total of all mainspace pages in all wikis by a fair amount. The problem is Wikitech is redesigning the SQL servers and cross database joins will soon no longer be available. A Phab is open to try and find a solution. If you would like to follow developments see Phab T267992. -- GreenC 19:03, 17 November 2020 (UTC)
- @GreenC and SD0001: How about just a report for drafts where the draft title matches the wikidata English-language label, if one exists? Would that be easier? It seems like most WD items have English labels, and this simpler (?) report would probably cover most of what I'm interested in. Calliopejen1 (talk) 18:51, 20 November 2020 (UTC)
- @Calliopejen1: I just put together User:VahurzpuBot/Drafts matching non-English Wikipedias. The current code for this only looks at Category:AfC pending submissions by age/0 days ago and selects Wikidata items based on a case-insensitive but exact match on labels or aliases. Additionally, it doesn't update automatically yet. What other features would be useful? Vahurzpu (talk) 23:17, 20 November 2020 (UTC)
- @Vahurzpu: Thanks! This is exactly what I was thinking of, but to be honest it's less helpful than I expected (still a lot of junk in here). Could you run it for one or two more days worth of AFC submissions, so we can see if this was a fluke or not? Thanks! Calliopejen1 (talk) 18:50, 24 November 2020 (UTC)
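The matching rule described for the report (case-insensitive but exact match on labels or aliases) can be illustrated with a small sketch. This is not the code behind User:VahurzpuBot; the data layout, the item IDs in the usage example and the function names are assumptions made for the example.

```python
def normalise(title: str) -> str:
    """Drop a leading Draft: prefix and fold case for comparison."""
    if title.startswith("Draft:"):
        title = title[len("Draft:"):]
    return title.strip().casefold()


def matching_items(draft_title: str, items: dict) -> list:
    """Return IDs of items whose label or any alias equals the draft title.

    `items` maps an item ID to {"label": str, "aliases": [str, ...]}.
    """
    needle = normalise(draft_title)
    hits = []
    for item_id, names in items.items():
        candidates = [names.get("label", "")] + list(names.get("aliases", []))
        if any(c.strip().casefold() == needle for c in candidates if c):
            hits.append(item_id)
    return hits
```

An exact match will still return unrelated items that happen to share a common name, which may explain some of the junk noted above.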
Automatically report highly reverted pages for page protection
The bot would scan recent reverts and inspect the page history. It will then analyse the number of reverts against pre-set thresholds. If one of these thresholds is met, it files an automatic report to WP:RPP requesting page protection.
Example thresholds could be:
- 4 reverts in the last 3 days by non-registered users/non-auto confirmed users - request semi-protection
- 10 reverts in the last 3 days by non-registered users/non-auto confirmed users - request semi-protection
- 3 reverts in the past 24 hours by more than 2 different users where all users are autoconfirmed/extended confirmed - request full protection
I have no programming experience with Wikipedia so unfortunately I won't be able to program this. Eyebeller (talk) 07:59, 18 November 2020 (UTC)
- We do have filters for similar kinds of edit warring, example: 249. In many cases it's a single editor who needs an individual sanction, who are reported to AIV by User:DatBot. Is there evidence that multi-user edit wars happen by new editors, are not caught by filter 249 and hence not reported, and someone doesn't report to WP:RFPP manually? Similar for the third bullet, is there evidence in such a case someone doesn't just go to WP:RFPP manually if needed? Additionally, this overlaps strongly with WP:ANEW, and a bot would not be able to distinguish between genuine content disputes and conduct issues / enforcing consensus etc. Those limits are too close to WP:3RR to allow for flexibility. And I'm not sure a high false positive rate is good; given RFPP's backlogs already often creep up, a high wrong venue rate would not be ideal. ProcrastinatingReader (talk) 10:53, 18 November 2020 (UTC)
- I want to point out that the thresholds are examples and could be changed. Eyebeller (talk) 15:25, 18 November 2020 (UTC)
- The main point is long term vandalism and abuse by multiple users. I should point out to tighten out on false positives we can do a more detailed analysis of those reverts to make sure that they were reverted as disruptive/vandalism. This could include analysing if the revert was made by Cluebot NG, if the revert edit summary contained keywords like "vandalism"/"disruptive" or if it was made with the default Huggle revert summary. Eyebeller (talk) 15:40, 18 November 2020 (UTC)
- I didn't realise this was here, but there's a parallel discussion at AN. Primefac (talk) 15:17, 1 December 2020 (UTC)
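A minimal sketch of how the example thresholds could be evaluated is given below. It assumes the reverts for one page have already been collected, reads "reverts by non-autoconfirmed users" as reverts of edits made by such users, and implements only the 4-in-3-days and 3-in-24-hours examples. The `Revert` record and `suggest_protection` are hypothetical names, and the sketch does nothing about the false positive concerns raised in the replies.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Revert:
    when: datetime          # time of the revert
    reverted_user: str      # account or IP whose edit was reverted
    autoconfirmed: bool     # whether that user is autoconfirmed


def suggest_protection(reverts: list, now: datetime) -> Optional[str]:
    """Return "full", "semi" or None for one page's revert history."""
    last_3d = [r for r in reverts if now - r.when <= timedelta(days=3)]
    last_24h = [r for r in last_3d if now - r.when <= timedelta(hours=24)]

    # 3+ reverts in 24 hours among more than 2 users, all autoconfirmed.
    users_24h = {r.reverted_user for r in last_24h}
    if (
        len(last_24h) >= 3
        and len(users_24h) > 2
        and all(r.autoconfirmed for r in last_24h)
    ):
        return "full"

    # 4+ reverts in 3 days of edits by non-autoconfirmed users.
    if sum(1 for r in last_3d if not r.autoconfirmed) >= 4:
        return "semi"
    return None
```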
Talk page notifications when topic equivalent is promoted to quality status on another project?
Hello! I posted a comment over at the Village Pump and was directed here, so I'll copy here:
I think it'd be cool if a bot could be designed to add Talk page notifications when the subject's article is promoted to Quality status at another Wikipedia project. To pick an arbitrary example, a notification could have been added to en:Talk:G.U.Y. when hu:G.U.Y. was promoted to quality status.
Added benefits could be editors comparing different language versions, encouraging translation efforts, and more editors becoming familiar with Wikidata, depending on the notification's text and bot design. I could also see notifications being posted to WikiProject talk pages, etc.
Thoughts? Concerns? Other feedback? Sorry if this idea has been brought to the table before. ---Another Believer (Talk) 15:38, 24 November 2020 (UTC)
- Another Believer, is it possible to do a bot over multiple wikis? Mr. Heart (talk) 15:39, 24 November 2020 (UTC)
- HeartatSchool, I have no idea! I don't know anything about bots or how they operate on/across projects. I'm just an editor who thinks this would be helpful. Actually, I think so. Notifications are posted to English Wikipedia article Talk pages when an image from Wikimedia Commons has been nominated for deletion, so, this is similar, no? ---Another Believer (Talk) 15:41, 24 November 2020 (UTC)
- What other wikis have article review processes (eg GA) other than enwiki, and I suppose huwiki? ProcrastinatingReader (talk) 17:40, 24 November 2020 (UTC)
- I believe (almost) every wiki has an FA and/or GA process. If an article is an FA/GA on another wiki, an FA/GA icon is shown in the languages section next to the link. – SD0001 (talk) 17:46, 24 November 2020 (UTC)
- SD0001, Right, and that's helpful, but I'd also love to know when an article I'm watchlisting has been promoted in another language. Talk page notifications is one way for this to appear in my watchlist. ---Another Believer (Talk) 18:01, 24 November 2020 (UTC)
- Great idea, I support it. Ludost Mlačani (talk) 23:49, 29 November 2020 (UTC)
I think this is technically difficult to do using a bot. Only reasonable approach I can think of is if we knew the name of the GA template on a given wiki (given that, although we use Legobot, other wikis probably do it manually with differently named templates), we could patrol its recent changes, check for addition of template, and then lookup Wikidata link to find the enwiki article and add a talk page message. Otherwise, this is probably better as a userscript with some kind of "Check other wikis for GA status" button in the toolbar. ProcrastinatingReader (talk) 12:04, 5 December 2020 (UTC)
- Wikidata also keeps quality information as "link badges", you could also listen for changes in that. Majavah (talk!) 14:16, 5 December 2020 (UTC)
- Hmm, is that automatic or manually added? Also does Wikidata allow listening to changes to a specific property only? ProcrastinatingReader (talk) 14:20, 5 December 2020 (UTC)
Bot or other process to keep categories and page renderings up to date
Two related proposals on the Community Wishlist survey have been rejected as out of scope, so I am putting this note here in case there is anyone interested in taking on a project to keep Wikipedia pages and categories up to date.
Basically, pages on Wikipedia are not refreshed often enough, which means that it can take weeks, months, or longer for category membership to update, or for things like age calculation in infoboxes to work correctly.
When a change is made to a template or module that involves category membership, pages that transclude that template or module require a null edit in order to update their category membership. Because of delays in the job queue, such category membership changes can take weeks, or even months. Even worse, changes to the underlying MediaWiki software that apply categories (e.g. those in Special:TrackingCategories) do not force pages into the job queue, which means that category membership for affected pages can take months, years, or forever.
These delays cause outdated information, missing information, and outright errors to be rendered for readers, and cause editors who are working on fixing problems identified by maintenance categories to be delayed in applying those fixes. When a maintenance category should be populated but is empty, it gives editors the false impression that all affected articles are working properly.
One proposed solution/workaround is to set up a background process that tracks all pages based on their last edit time stamp, including null edits. That tracking could be used to make a list of needed null edits for "stale" pages. There is some detail in the phab links below about how to generate such lists and (possibly) how to force pages into the job queue so that a null-edit bot might not be needed.
For details and links to phabricator tickets, see meta:Community Wishlist Survey 2021/Archive/Set maximum delay in updating category membership and meta:Community Wishlist Survey 2021/Archive/Correct wrong tenure lengths. (Actually, I'll just put the phab links here: T132467, T135964, T157670, T159512.) – Jonesey95 (talk) 16:34, 7 December 2020 (UTC)
- Haven't dove deep into the phab tickets. Is the problem here that MediaWiki doesn't have the server resources to keep those millions of pages fresh, or is it that the resources exist but there are no algorithms in Mediawiki to do the purges automatically? Or something in between? – SD0001 (talk) 13:23, 11 December 2020 (UTC)
- It's the latter. There are good comments at T157670 from February 2017 that show tables of pages and their last refresh date. If we could somehow make that report, list the pages, and "expire" or refresh/null-edit (not purge) the most out-of-date pages, that would be a start. We would have to be aware of the effect on the job queue, but I think it would be manageable. – Jonesey95 (talk) 14:15, 11 December 2020 (UTC)
- Perhaps there could be a second job queue, processed only when the main one empties, containing all pages ordered by last refresh date. This would keep the process busy when and only when it has nothing more urgent to do. (In practice, I expect we'd do some sort of "find stalest 1000" query rather than actually maintaining a queue of length 40 million.) Certes (talk) 14:43, 11 December 2020 (UTC)
- I'd be interested in seeing the current results of Legoktm's query (select count(*), SUBSTR(page_links_updated, 1,6) from page group by SUBSTR(page_links_updated, 1,6) order by SUBSTR(page_links_updated, 1,6) desc;), and probably some variations on it, including that same query limited to article and template space. If we could get a reasonable list of the stalest articles and templates, a bot could null-edit them systematically. – Jonesey95 (talk) 16:35, 11 December 2020 (UTC)
- We may also be interested in any pages where page_links_updated IS NULL and page_touched is old. They won't have been re-parsed since creation. Unfortunately, the page table does not seem to be indexed on those columns and I don't see a relevant alternative view. Certes (talk) 17:16, 11 December 2020 (UTC)
- @Jonesey95: see https://people.wikimedia.org/~legoktm/T157670/ - let me know what other queries you want me to run. Legoktm (talk) 18:25, 11 December 2020 (UTC)
- Those queries are helpful. From the NS0 query, it appears to me that we have about 15 million pages in article space (although {{NUMBEROFARTICLES}} gives me 6 million pages, so if someone could explain that, please do), of which 8 million have been refreshed in the last two months. That leaves about 7 million "stale" article pages, if I understand the report (which I clearly do not). If we refresh one article per second with a bot, which doesn't seem like a heavy load, we can do 2.6 million articles every 30 days. How do we get this process started? I think we would need to generate a list of the names of the stale articles somehow.
- If we can get this working for articles, we can look at expanding it to other namespaces. – Jonesey95 (talk) 20:43, 11 December 2020 (UTC)
- The other 9 million are non-article pages such as redirects and dabs. Further reading: Wikipedia:Database reports/Page count by namespace. Certes (talk) 21:31, 11 December 2020 (UTC)
- Thanks! That might make it even easier. If we can get a list of the X thousand most stale articles (non-redirect, non-dabs) and feed them to a null-edit bot at one per second, we might be able to get the whole (actual) article space refreshed in less than a month, and then keep it that way with a background process that null-edits newly stale articles. – Jonesey95 (talk) 23:20, 11 December 2020 (UTC)
- We can go beyond main namespace but need to be a bit careful. (Refreshing Template:Pagetype might take a while!) Certes (talk) 23:48, 11 December 2020 (UTC)
- FWIW, you can purge the links of a page at a rate of around 20/request (each request every 5-10 secs incl delay). Any more and the request times out. So it's closer to 2-4 pages per second you can update. ProcrastinatingReader (talk) 14:32, 19 December 2020 (UTC)
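The "find stalest 1000" idea and the throughput figures quoted in this thread can be sketched as follows. This is a minimal illustration, not an existing bot: the function names are hypothetical, and it assumes the input is a list of (title, page_links_updated) pairs already exported from a query such as Legoktm's, with timestamps as 14-character YYYYMMDDHHMMSS strings or None for pages that were never re-parsed.

```python
def stalest(pages, limit):
    """Return the titles of the `limit` pages with the oldest refresh time.

    `pages` is an iterable of (title, links_updated) pairs. A value of None
    (page_links_updated IS NULL) sorts before every real timestamp, so
    never-reparsed pages come first.
    """
    ordered = sorted(pages, key=lambda page: page[1] or "")
    return [title for title, _ in ordered[:limit]]


def pages_per_period(rate_per_second, days):
    """Number of pages a bot can refresh in `days` at a steady rate."""
    return int(rate_per_second * 86400 * days)


sample = [
    ("A", "20201201000000"),
    ("B", None),
    ("C", "20170201000000"),
]
batch = stalest(sample, 2)          # never-reparsed page first, then the oldest
monthly = pages_per_period(1, 30)   # one null edit per second for 30 days
```

At one page per second this gives 2,592,000 pages in 30 days, which matches the "2.6 million" estimate above. The batch figure quoted later (20 pages per request, one request every 5 to 10 seconds) works out to 2 to 4 pages per second.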
Generating category redirects for species common names
When uploading images to Wikimedia Commons, I often notice that there are no category redirects for the common names of most species, so there are too many redirects that need to be created manually. Is there a bot that could create these missing redirect pages, using data from Wikispecies or Wikidata? For example: commons:Category:Red fox is {{category redirect|Vulpes vulpes}}. Jarble (talk) 18:23, 10 December 2020 (UTC)
- Beware that some common names are ambiguous and require disambiguation pages listing multiple species and/or other meanings (or at least a hatnote from the primary topic). Many such pages exist but some may be missing. Certes (talk) 00:48, 12 December 2020 (UTC)
Cleaning up WantedPages by putting <nowiki/> in red links on talk pages
WantedPages is pretty useless as it is since it considers links from and to talk pages. Does the requested action above help at all? JsfasdF252 (talk) 03:06, 5 January 2021 (UTC)
- A better solution has been discussed and requested in Phabricator. Until then, WP:Most-wanted articles may be a more helpful alternative. Certes (talk) 10:58, 5 January 2021 (UTC)
Replace Template:IPC profile with Template:IPC athlete
There are some 800+ transclusions of Template:IPC profile. They go to an archive page, because the original link doesn't work, but with the first five I checked at random, the archive page doesn't work either: Scot Hollonbeck, Stephen Eaton, Jonas Jacobsson, Sirly Tiik, Konstantin Lisenkov.
It seems possible to replace the template with Template:IPC athlete: {{IPC profile|surname=Tretheway|givenname=Sean}} becomes {{IPC athlete|sean-tretheway}}. It is safer to take the parameter from the article title than from the IPC profile template though: at Jacob Ben-Arie, {{IPC profile|surname=Ben-Arie|givenname=<!--leave blank in this case, given name not listed-->}} should become {{IPC athlete|jacob-ben-arie}}[2].
If the replacement is too complicated, then simply removing the IPC profile one is also an option, as it makes no sense to keep templates around which produce no useful results. Fram (talk) 11:34, 5 January 2021 (UTC)
- @Fram: if I’m understanding you right, is the template totally redundant and should all transclusions be replaced with IPC athlete? If so, you can just TfD the template, then an existing bot with a general TfD authorisation can easily do this task. It’s also probably faster (it’ll probably take at least 7 days for community input + BRFA for the task alone otherwise). ProcrastinatingReader (talk) 00:33, 6 January 2021 (UTC)
- Thanks, I'll bring it up at TfD then, didn't know that their "power" went that far (but it is a good thing). Fram (talk) 08:23, 6 January 2021 (UTC)
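The replacement described above, taking the {{IPC athlete}} parameter from the article title rather than from the old template's parameters, could look roughly like this. It is a sketch under assumptions: the helper names are hypothetical, the slug rule (lowercase, spaces to hyphens, trailing disambiguator dropped) is inferred only from the two examples given in the request, and names with diacritics or other special cases would need checking by hand.

```python
import re

# Matches {{IPC profile}} with any parameters that contain no nested braces.
IPC_PROFILE = re.compile(r"\{\{\s*IPC profile\s*(?:\|[^{}]*)?\}\}", re.IGNORECASE)


def ipc_slug(article_title):
    """Build the assumed IPC athlete slug from an article title."""
    # Drop a trailing disambiguator such as " (athlete)" if present.
    name = re.sub(r"\s*\([^)]*\)$", "", article_title.strip())
    return re.sub(r"\s+", "-", name.lower())


def replace_ipc_profile(wikitext, article_title):
    """Swap every {{IPC profile|...}} for {{IPC athlete|slug}}."""
    replacement = "{{IPC athlete|" + ipc_slug(article_title) + "}}"
    return IPC_PROFILE.sub(lambda match: replacement, wikitext)


before = "* {{IPC profile|surname=Ben-Arie|givenname=<!--leave blank-->}}"
after = replace_ipc_profile(before, "Jacob Ben-Arie")
```

Deriving the slug from the title handles the Jacob Ben-Arie case, where the given name is missing from the old template.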
Need a bot to add/remove contents to wiki pages
Hey I need a simple bot that could be able to add words to the links I send it. Maybe have the option where to add the text, but also have an option to remove all the text that you put in the bot once it comes across one of the words on the links. Might've not expressed myself the best but I hope you guys got my message. — Preceding unsigned comment added by JokerLow (talk • contribs) 23:51, 5 January 2021 (UTC)
- I didn't. What are you trying to accomplish with this bot? Primefac (talk) 00:23, 6 January 2021 (UTC)
Make archive bots assume standard naming
At Wikipedia talk:Moving a page#Updating archive bot settings when moving a page you can learn that PrimeHunter has recently created Category:Pages where archive parameter is not a subpage, and that by far the biggest reason for pages to end up there is that Wiki editors move pages without updating talk page archival bot instructions.
But why should humans have to do menial tasks like that at all?
I assume when the bots were created there were no real standards and practices regarding auto archiving, but now there are. Seems to me we can avoid needless administration (and a lot of pages that don't archive properly) if we change the code of the two main archival bots to assume the standard naming as the default. If the |archive=User talk:Example/Archive %(counter)d parameter (Lowercase Sigmabot III) and the |archiveprefix=User talk:Example/Archive parameter (ClueBot III) could be made optional, we could remove them from the standard instructions while still allowing manual override for the (few) cases where it's needed. This should mean that moving a page would no longer break auto archiving.
Of course, if there were a good reason this wasn't implemented back when, feel free to enlighten your audience :) CapnZapp (talk) 09:59, 7 January 2021 (UTC)
- A page move sometimes fails to move existing archives. It could be messy if archiving automatically starts over with new archive names. PrimeHunter (talk) 10:11, 7 January 2021 (UTC)
- Agree with PrimeHunter; we should not assume that a talk page's archives were moved along with the talk page. Primefac (talk) 10:33, 7 January 2021 (UTC)
- Quite, normal confirmed editors do not have the "Move subpages (up to 100)" option that is provided to admins and page movers, and they may overlook some of the directions at the "Please clean up after your move" page that is displayed following the move. --Redrose64 🌹 (talk) 13:14, 7 January 2021 (UTC)
- Well, the easy solution to this is to check if the value of |archive= (minus the subpage) is a redirect, and if it has any subpages matching the subpage pattern that are non-redirects. So in that way it could be automated. For ones that don't meet the criteria, it's likely post-move cleanup is needed and it could build a report. ProcrastinatingReader (talk) 13:17, 7 January 2021 (UTC)
- The suggestion was to make |archive= optional. A bot cannot check a parameter if it isn't there. It would have to look for moves in those cases. Moves aren't logged at the target name so it would have to examine the page history or incoming redirects. If somebody copy-pastes the talk page instead of moving then there might be no trace. Not demanding a subpage name will also increase the number of poor archive parameters when somebody copies the archive parameters from a random page with very different activity. PrimeHunter (talk) 22:39, 7 January 2021 (UTC)
Thank you all for your consideration so far, @PrimeHunter, Primefac, Redrose64, and ProcrastinatingReader: Are you saying the occasional "overarchiving" (or whatever you feel is an appropriate title for the issue you have brought up) is deemed more disruptive than the (presumably) much larger load on human administration? That a big reason the bot writers mandated the archive name was so nothing was ever archived in the wrong place, even though it added a workload on humans that (from the layperson's perspective) is unnecessary? Perhaps a suggestion of this nature has been discussed previously? Cheers PS. If this place is the wrong venue for taking a holistic approach and here discussion should be limited to only unproblematic suggestions, please direct me to a more appropriate venue and thank you for your time. CapnZapp (talk) 10:29, 8 January 2021 (UTC)
- There are large backlogs in every area requiring human attention. There tends to be skepticism to automating these, in fear of some false positives or errors, and Prime makes a good point above as to possible pitfalls here. Here it seems like you're not requesting a new bot, but rather a tweak to the existing archive bots? In that case, you'd need to communicate with those botops and get them to implement the desired change in their bot. ProcrastinatingReader (talk) 10:31, 8 January 2021 (UTC)
- I don't know the original reasoning for demanding the parameter. I just think there are valid reasons for doing it. Category:Pages where archive parameter is not a subpage currently has 2875 pages (including 710 in userspace) but the tracking was added only a week ago and some of the wrong parameters are more than 10 years old. If maintenance editors with knowledge of archiving get it down to zero and monitor it then wrong parameters should be fixed quickly, often with better results than an archive bot ignoring the wrong parameter. Many of the pages are tiny or empty and don't even need any archiving like [3] PrimeHunter (talk) 10:55, 8 January 2021 (UTC)
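To make the proposal in this thread concrete, here is a minimal sketch of what "assume standard naming unless overridden" could mean, together with the subpage check that the tracking category performs. The function names are hypothetical and this is not the code of Lowercase Sigmabot III or ClueBot III; it also does nothing about the concern raised above that existing archives may not have been moved along with the talk page.

```python
def default_archive(talk_title):
    """Standard archive pattern for a talk page (Lowercase Sigmabot III style)."""
    return talk_title + "/Archive %(counter)d"


def effective_archive(talk_title, archive_value=None):
    """Use the explicit |archive= value if given, else fall back to the default."""
    return archive_value or default_archive(talk_title)


def archive_is_subpage(talk_title, archive_value):
    """True when the archive target is a subpage of the talk page itself."""
    return archive_value.startswith(talk_title + "/")


# A page moved from "Talk:Old name" to "Talk:New name" without updating |archive=
stale_setting = "Talk:Old name/Archive %(counter)d"
needs_cleanup = not archive_is_subpage("Talk:New name", stale_setting)
fallback = effective_archive("Talk:New name")
```

With an optional parameter, the moved page would archive to "Talk:New name/Archive %(counter)d" automatically. Whether that is desirable depends on whether the old archives were moved too, which is the objection raised by PrimeHunter and Primefac.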
Article Alert for WP:WILDFIRE
Hello, I'm here to request a bot to make an article alert page for the WP:WILDFIRE wikiproject, like WP:CALI and WP:USA do. --🔥LightningComplexFire🔥 17:51, 8 January 2021 (UTC)
- Follow the instructions at Wikipedia:Article alerts/Subscribing Majavah (talk!) 17:54, 8 January 2021 (UTC)
Replace Airdisaster.com links
The website airdisaster.com appears to be used in several articles about aviation accidents, but now links to a spam site/domain hoarder, which seems very undesirable for users. Can someone get the direct links removed and, where possible, linked to an archived page? This applies in particular where it is linked as an external link; occurrences in references appear to be fixed already. Pieceofmetalwork (talk) 16:07, 9 January 2021 (UTC)
- @Pieceofmetalwork: Are you suggesting adding {{webarchive}} like this edit? GoingBatty (talk) 18:46, 10 January 2021 (UTC)
- Yes, that would be a good solution. Pieceofmetalwork (talk) 18:48, 10 January 2021 (UTC)
Fixing punctuation before citations
Per MOS:REFPUNCT, citations are supposed to go after punctuation like periods and commas, not before it. This is already included in GENFIXes, but I think it's noticeable enough to readers that it'd be good to have a bot working on it; it's not really WP:COSMETICBOT to my reading. Yobot has an approved task for doing this, but given how many pages I've come across with this issue, I'm guessing it's no longer working. {{u|Sdkb}} talk 20:29, 10 January 2021 (UTC)
- Have you asked the botop why the task is not running? Primefac (talk) 20:33, 10 January 2021 (UTC)
- I gave them a ping above. {{u|Sdkb}} talk 22:27, 10 January 2021 (UTC)
- I can work with it. The task was stopped because there were comments on some bugs pending. -- Magioladitis (talk) 21:51, 11 January 2021 (UTC)
- Thanks for doing that. Probably helps that it's now part of the genfixes. Primefac (talk) 21:52, 11 January 2021 (UTC)
- It always was. But to run properly it has to run with general fixes which means that the edit sometimes is lost within other minor fixes which gives the impression the bot is doing nothing worthy. -- Magioladitis (talk) 21:53, 11 January 2021 (UTC)
I resumed the bot task. If there is any problem, please report it immediately. -- Magioladitis (talk) 09:44, 14 January 2021 (UTC)
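For readers unfamiliar with the fix itself, the following is a simplified sketch of moving a period or comma from after a run of citations to before it, per MOS:REFPUNCT. It is not the AWB general-fixes code that Yobot runs, and it ignores cases a real bot must handle (templates such as {{sfn}}, ellipses, abbreviations, nowiki sections).

```python
import re

# One <ref .../> or <ref ...>...</ref>. The body is not allowed to run past
# its own closing tag, so two separate citations are never merged into one.
REF = r"(?:<ref\b[^>]*?/>|<ref\b[^>]*?(?<!/)>(?:(?!</ref>).)*</ref>)"
REF_THEN_PUNCT = re.compile(r"((?:" + REF + r")+)([.,])", re.DOTALL | re.IGNORECASE)


def fix_ref_punctuation(wikitext):
    """Move a period or comma that follows citations to precede them."""
    return REF_THEN_PUNCT.sub(r"\2\1", wikitext)


fixed = fix_ref_punctuation("A claim<ref>Source</ref>. Next sentence.")
```

Citations that are not directly followed by a period or comma are left alone, including ones that merely appear earlier in the same sentence.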
wikilink to Sansoni (publisher) in Cite book templates
Sansoni (publisher) is an old and important Italian publisher, whose page was recently created.
There are hundreds of pages with Cite book templates for works published by Sansoni.
It would be useful to link them to the publisher page.
So my proposal is that a bot should look for instances of {{Cite book }} where there is one of these parameters:
|publisher=G. C. Sansoni |publisher=G.C. Sansoni |publisher=Sansoni
And replace it respectively with:
|publisher=[[Sansoni (publisher)|G.C. Sansoni]] |publisher=[[Sansoni (publisher)|G.C. Sansoni]] |publisher=[[Sansoni (publisher)|Sansoni]]
The replace should only be done on the first instance in each page, of course, to avoid excessive wikilinks.
Thank you in advance!
--Lou Crazy (talk) 02:21, 14 January 2021 (UTC)
- Out of curiosity, when you say "hundreds", is that low-hundreds or high-hundreds? Just looking for a ballpark figure. Primefac (talk) 02:26, 14 January 2021 (UTC)
- Probably around 200 or slightly more. Searching for the word "Sansoni" yields 395 pages, and many of them use the Cite book template. Some mention instead other people by that surname, some use alternate names for this publisher, so I'd guess about half of that would be replaced by looking for those strings. Opening a few pages at random confirms my estimate. --Lou Crazy (talk) 02:31, 14 January 2021 (UTC)
- Okay. Generally speaking (and I do use even that loosely) a bot run is not really necessary for <500 edits. That of course doesn't preclude someone from filing it, but for something small a) by the time trials etc are done the task is basically finished, and b) AWB has loads of users who would be willing to do this. I'll leave this open for a few days but if you don't get any takers I suggest making a request at Wikipedia:AutoWikiBrowser/Tasks. Primefac (talk) 03:19, 14 January 2021 (UTC)
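The requested replacement, linking only the first matching |publisher= value on each page, can be sketched as below. The mapping is copied from the request as written. The function name is hypothetical, and as a simplification the pattern does not verify that the parameter sits inside a {{Cite book}} template, which an actual AWB or bot run would need to do.

```python
import re

# Replacement values exactly as listed in the request.
LINKED = {
    "G. C. Sansoni": "[[Sansoni (publisher)|G.C. Sansoni]]",
    "G.C. Sansoni": "[[Sansoni (publisher)|G.C. Sansoni]]",
    "Sansoni": "[[Sansoni (publisher)|Sansoni]]",
}

# The lookahead requires the value to end at the next "|" or "}", so longer
# publisher names that merely start with "Sansoni" are not touched.
PUBLISHER = re.compile(
    r"(\|\s*publisher\s*=\s*)(G\. C\. Sansoni|G\.C\. Sansoni|Sansoni)(?=\s*[|}])"
)


def link_first_sansoni(wikitext):
    """Wikilink the first unlinked Sansoni publisher value on the page."""
    if "[[Sansoni (publisher)" in wikitext:
        return wikitext  # already linked once, nothing to do
    return PUBLISHER.sub(
        lambda m: m.group(1) + LINKED[m.group(2)], wikitext, count=1
    )


page = (
    "{{Cite book |title=A |publisher=Sansoni |year=1950}} "
    "{{Cite book |title=B |publisher=G.C. Sansoni}}"
)
linked = link_first_sansoni(page)
```

Passing count=1 is what limits the change to the first instance, and the early return keeps a second run over the same page from adding another link.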
Please remove all files in this category, because it's not necessary (all files in this category are out of copyright since this year). 185.172.241.184 (talk) 09:24, 15 January 2021 (UTC)
- Not done, if I interpret this category correctly, it's meant to mark which pages need their copyright status updated. If and when that happens, the cat will be empty and then it can be deleted. Primefac (talk) 11:47, 15 January 2021 (UTC)
Add ELP links to language info boxes
I got the full list of online language articles maintained by the Endangered Languages Project and request a bot to add links to those articles from transclusions of {{Infobox language}}, parallel to the existing links to other online linguistic resources such as ISO/Ethnologue, Glottolog, AIATSIS.
I did one by hand at Dâw language as an example. There are a bit over 3000 ELP articles to link to. They provide demographic data as an alternative to what is now behind a paywall at Ethnologue. (And in some cases may be a check on Ethnologue, as they often rely on different primary sources.)
The list has ELP codes (actually the identifying part of the article URL), names, and ISO codes.
The bot should add params "ELP" and "ELPname" to the appropriate box. "ELPname" currently doesn't do anything and should be enabled after the bot run. (Some of the names won't display correctly at first and will need to be cleaned up by hand.)
As for how the bot can ID the correct info box, it should follow the ISO redirect (e.g., the rd for ISO code [abc] is ISO 639:abc), and verify that the info box at the article it arrives at does indeed have the param ISO3 = abc or lc[n] = abc (where [n] is a digit). If there isn't a match (and it's been years since we've run a maintenance bot to verify them all), then that ELP code should instead be added to an error list. If there's already an ELP link in the box from a previous pass by the bot, then the bot should add the new one as ELP2 etc., and keep a tabbed list of those so we can code the template to support the article with the largest number of links.
The bot should test that the URL lands on a page. For instance, [4] is a language code in the list, but the URL says "Page not found :(". These should be collected into an error list for submission to the ELP.
A couple weeks ago I mentioned I was planning this at WT:LANG, and just recently I posted a notice on the talk page that I was making this request, and the responses have been positive.
Last time I did something like this I used PotatoBot, but Anypodetos tells me it's no longer working.
Please ping me if you respond. — kwami (talk) 11:14, 16 January 2021 (UTC)
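The matching rules in this request (verify the ISO code against the infobox, fall back to an error list, and use ELP2 etc. when a link is already present) can be sketched as a pure function over already-parsed infobox parameters. Everything here is illustrative: the function names are hypothetical, the infobox is assumed to be available as a plain dict, the numbered name parameter (ELPname2) is an assumption not stated in the request, and the URL liveness check is left out because it needs network access.

```python
import re


def iso_codes(params):
    """Collect the ISO 639-3 codes declared via ISO3 or lc[n] parameters."""
    codes = set()
    for key, value in params.items():
        if key.lower() == "iso3" or re.fullmatch(r"lc\d+", key):
            codes.add(value.strip())
    return codes


def next_elp_param(params):
    """Return "ELP" if unused, otherwise the first free ELP2, ELP3, ..."""
    if "ELP" not in params:
        return "ELP"
    n = 2
    while f"ELP{n}" in params:
        n += 1
    return f"ELP{n}"


def plan_elp_edit(params, iso, elp_code, elp_name):
    """Return the parameters to add, or None if the ISO code does not match.

    A None result means the ELP code belongs on the error list instead.
    """
    if iso not in iso_codes(params):
        return None
    key = next_elp_param(params)
    suffix = key[len("ELP"):]
    return {key: elp_code, "ELPname" + suffix: elp_name}


first_pass = plan_elp_edit({"iso3": "abc"}, "abc", "1234", "Example")
second_pass = plan_elp_edit({"lc1": "abc", "ELP": "1234"}, "abc", "5678", "Other")
mismatch = plan_elp_edit({"iso3": "xyz"}, "abc", "1234", "Example")
```

Keeping the decision separate from page fetching and saving makes it easy to dry-run the whole list first and review the error list and the ELP2+ cases before any edits are made.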