Wikipedia:Bots/Requests for approval/Uncle G's major work 'bot

This is an old revision of this page, as edited by Uncle G (talk | contribs) at 22:31, 8 September 2010 (On the technicalities). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

There's a possibility that Uncle G's major work 'bot (talk · contribs) might wake up to do some major work. In this case it is the mass blanking of roughly ten thousand articles, to help address Wikipedia:Contributor copyright investigations/Darius Dhlomo.

Things are currently still at the discussion stage. Here's the background reading:

  • Main discussion page: Wikipedia:Administrators' noticeboard/Incidents/CCI
  • List of articles to be touched: As listed in earlier revisions of Wikipedia:Contributor copyright investigations/Darius Dhlomo and Wikipedia:Contributor copyright investigations/Darius Dhlomo 2, the list as supplied by VernoWhitney (talk · contribs). The current list is the articles touched by Darius Dhlomo. The original list was the articles created by xem.
  • Notice that the 'bot will blank each article with: User:Moonriddengirl/CCIdf (probably to be renamed Wikipedia:Contributor copyright investigations/Darius Dhlomo/Article notice)
  • Full task explanation, linked to by the 'bot's edit summaries: Wikipedia:Contributor copyright investigations/Darius Dhlomo/Task explanation (piped to something like "What is this 'bot doing?")
  • Further information for editors, linked to from the blanking notice: Wikipedia:Contributor copyright investigations/Darius Dhlomo/How to help

Points discussed and to be discussed:

  • I'm in favour of the template being outside of Template: namespace and in the project namespace, so that Wikipedia mirrors don't mirror the notice. But there are arguments to the contrary. Please discuss at the main discussion page.
  • If this goes ahead, I'm going to be using the same rate limits and whatnot that I used when moving VFD to AFD.
  • The 'bot doesn't have the flag. One could argue that ten thousand or so article blankings will light up a lot of watchlists. But drawing people's attention to a copyright problem with their watched articles is partly the desired outcome.

As I said, things are at the discussion stage. But with this sort of major work I want many people to be forewarned about this. There are currently big unsubtle notices on the Village Pump, the Content Noticeboard, the Administrators' Noticeboard, the 'Bot owners' Noticeboard, and the Centralized Discussion template. Feel free to notify anyone else that you think this misses.

Operator: Uncle G (talk) 16:11, 8 September 2010 (UTC)

Discussion

If you have an opinion on the task, or a better way to do it, please contribute to Wikipedia:Administrators' noticeboard/Incidents/CCI. That's where everyone else is having the discussion. They won't be paying much attention here. ☺ Uncle G (talk) 16:11, 8 September 2010 (UTC)

I have a few unrelated technical questions:

  1. Is this going to be done with code successfully used in the past, or is this new code? If the latter, I'd want to throw a few test pages at it just to double check that things won't blow up.
  2. What exactly are the proposed rate limits? If your bot can handle maxlag, I could certainly support a WP:IAR of the limits in WP:BOTPOL in favor of "as fast as maxlag allows" for this particular task. If you would want to do that, it should of course be discussed at Wikipedia:Administrators' noticeboard/Incidents/CCI.
  3. Note that even if the bot has the bot flag, it is now possible for the bot to not flag its individual edits as bot. Even when not applied to edits, the bot flag still gives some advantages to the bot account that may be useful. Can your bot do this? If you want to test it, you can ask for the flag at WP:BN and do some edits in a userspace sandbox.

Anomie 17:22, 8 September 2010 (UTC)
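
For readers unfamiliar with the mechanisms raised in questions 2 and 3, here is a minimal Python sketch of what a maxlag-aware api.php edit request with an optional per-edit bot mark might look like. It is an illustration only and not the operator's code. The function names `build_edit_params` and `maxlag_wait_seconds` are hypothetical, as are the default of 5 seconds and the choice to pass the Retry-After header value in as a string. The parameter names (`action=edit`, `maxlag`, `bot`) and the `maxlag` error code are taken from the MediaWiki API; nothing here touches the network.

```python
from typing import Dict, Optional


def build_edit_params(title: str, text: str, summary: str, token: str,
                      maxlag: int = 5, mark_as_bot: bool = False) -> Dict[str, str]:
    """Build the POST parameters for one api.php edit (hypothetical helper)."""
    params = {
        "action": "edit",
        "format": "json",
        "title": title,
        "text": text,
        "summary": summary,
        "token": token,
        # Ask the server to refuse the edit when replication lag exceeds this.
        "maxlag": str(maxlag),
    }
    if mark_as_bot:
        # Boolean API parameters count as true when present; omitting the
        # key leaves the edit unmarked even on a bot-flagged account.
        params["bot"] = "1"
    return params


def maxlag_wait_seconds(response: dict, retry_after: Optional[str],
                        default: float = 5.0) -> Optional[float]:
    """Return how long to wait before retrying, or None if this is not a maxlag error."""
    if response.get("error", {}).get("code") != "maxlag":
        return None
    try:
        return float(retry_after) if retry_after is not None else default
    except ValueError:
        # Unparseable header value: fall back to the assumed default.
        return default
```

With `mark_as_bot=False` the edits would still show up on watchlists, which matches the desired outcome described in the request, while the account itself could hold the flag.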

Heh! This is code so old that it predates both api.php and maxlag. (Successful past use includes raking various sandboxes, archiving my talk page, and of course moving that VFD mountain.) Right at the moment I'm working on checking it through and updating it to use api.php where appropriate (and where necessary: index.php functionality has changed since I last used some of the programs). If you think that I wasn't going to throw a few test pages at it before throwing ten thousand pages at it live … ☺

Since the 'bot tools predate it, there's no maxlag. (There's no automated retry logic at all. If an edit fails, it fails.) My very simplistic approach to rate limiting was a hardwired delay of a fixed number of seconds between each operation. If you go back to 2005 in the contributions history, you'll see the delay in operation fairly clearly.
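
The approach described above (a hardwired pause between operations, and no retry when an operation fails) can be sketched roughly as follows. This is a hedged illustration in Python, not the 'bot's actual code: the function name `run_with_fixed_delay`, the `operation` callback, and the injectable `sleep` argument are all assumptions made for the example, and the real delay value is not stated in this discussion.

```python
import time
from typing import Callable, Iterable, List, Tuple


def run_with_fixed_delay(titles: Iterable[str],
                         operation: Callable[[str], bool],
                         delay_seconds: float,
                         sleep: Callable[[float], None] = time.sleep
                         ) -> List[Tuple[str, bool]]:
    """Apply `operation` to each title, pausing a fixed time between operations.

    There is deliberately no retry: a failed operation is recorded as
    failed and the loop moves on to the next title.
    """
    results: List[Tuple[str, bool]] = []
    for index, title in enumerate(titles):
        if index > 0:
            # Fixed, hardwired pause; no adaptation to server load.
            sleep(delay_seconds)
        try:
            ok = bool(operation(title))
        except Exception:
            # If an edit fails, it fails.
            ok = False
        results.append((title, ok))
    return results
```

As a sense of scale only: with a hypothetical delay of 10 seconds, ten thousand operations would spend about 100,000 seconds, or a little under 28 hours, in pauses alone.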

As for the flag: That's a discussion for other people, really. It doesn't affect the operation of any of the things that this 'bot will be doing. There are no queries involved, for instance. (I didn't even have a query-making tool until just recently, when I wrote one to perform an external link query for the GeoCities cleanup discussion.) Uncle G (talk) 22:31, 8 September 2010 (UTC)