Wikipedia:Bots/Requests for approval/Jmax-bot
- The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.
Operator: Jmax-
Automatic or Manually Assisted: Automatic
Programming Language(s): Perl, WWW::Mechanize
Function Summary: Wikipedia:FA counting
Edit period(s) (e.g. Continuous, daily, one time run): Daily
Edit rate requested: 1 edit per day
Already has a bot flag (Y/N): N
Function Details: Currently, it merely counts the number of Featured Articles and places them on User:Jmax-bot/FACounter, as per this request by User:BanyanTree. Perhaps more at a future date, pending any requests.
Discussion
- Quick question - can you explain a little about the whitelist? (And how a newly registered account can't mess it up by vandalizing at the right moment.) Also, if/when it is given the admin flag, I assume it can be retargeted at Template:FA number, right? Raul654 03:43, 11 December 2006 (UTC)
From the example here:
- One question. What happens if an RC patroller reverts the vandalism? Neither UserA nor VandalReverter would be on the whitelist... what would the bot do then? Titoxd(?!?) 04:30, 11 December 2006 (UTC)[reply]
- Another question: non-admin users routinely change the categories or names (to fix redirects) of featured articles: if the last edit is not an admin or a whitelist user, what will the bot do? (Thanks so much for the help, Jmax.) Sandy (Talk) 04:32, 11 December 2006 (UTC)[reply]
- The bot does not currently support any whitelist, as I was expecting to encounter issues, such as these, that I had not previously prepared for, and it would have been a waste of resources to design a system without fully understanding the requirements. So, I was thinking that in the event the current revision was not made by a whitelisted user, the bot would check the last 5 or so revisions, compare them for differences, and attempt to determine if any vandalism had occurred (perhaps someone could clue me in on a good method of determining vandalism). Of course, this introduces some faults: the implementation of the vandalism detection may not, and probably will not, be perfect; there is also the issue of attempting to craft a perfect whitelist for a community-contributed project. Thoughts? --Jmax- 07:16, 11 December 2006 (UTC)
- Hi Jmax-, Thanks for all your work on this. Last night, I came up with an example of how the whitelist might work with a 15 minute wait after an edit by an editor not on the whitelist on the FA talk page, which I'll paste here for convenient reference:
- 11:30: AdminX removes two articles as part of FARC
- 11:58: UserA blanks the page.
- 12:00: The bot checks the page, sees that UserA is not on the whitelist and waits
- 12:02: WhitelistedUserB reverts the vandalism by UserA to the version by AdminX
- 12:15: 15 minutes having elapsed, the bot checks again. It sees that the last editor was whitelisted and updates the count.
- Titoxd and Sandy both make good points above in response. Sandy's point can perhaps be addressed by having the bot check for changes to the FA number, rather than any change at all. So for example: (I have commented out my extended examples in the interests of keeping the viewed page to a manageable size. - BanyanTree 15:14, 12 December 2006 (UTC))
- My thought is that the vandal spoofing feature is to stop ridiculous changes to the number. An additional feature throttling changes to 5% or less may do that, while the whitelist check and pause greatly reduces the chances of a bad change going through at all. (The FA regulars can state if an update has ever involved a greater proportion than that.) I'm not sure that a technical solution comparing a user who removes an article and one who reverts him, and trying to decide who is correct, is worth the effort. Thoughts on this?
- Also, the discussion at Wikipedia talk:Featured articles suggested an update closer to every hour. Is that possible? - BanyanTree 13:30, 11 December 2006 (UTC)[reply]
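The whitelist check and the 5% throttle proposed above can be sketched as one decision function. This is only an illustration in Python, not the bot's actual Perl code: the name `should_update`, its parameters, and the sample usernames are assumptions made for the example.

```python
def should_update(last_editor, old_count, new_count, whitelist, max_change=0.05):
    """Return True if the bot may publish new_count (hypothetical helper).

    Two checks, as proposed in the discussion:
    1. the last editor of the page must be on the whitelist;
    2. the count may not move by more than max_change (5% by default).
    """
    if last_editor not in whitelist:
        # Not whitelisted: do nothing now; the caller re-checks after the wait.
        return False
    if old_count > 0 and abs(new_count - old_count) / old_count > max_change:
        # Implausibly large jump, e.g. a blanked page: refuse the update.
        return False
    return True


whitelist = {"AdminX", "WhitelistedUserB"}
print(should_update("AdminX", 1000, 998, whitelist))  # two articles removed at FARC
print(should_update("UserA", 1000, 0, whitelist))     # page blanked by a non-whitelisted user
```

In the 11:30 to 12:15 example, the 12:00 check would return False (UserA is not whitelisted) and the 12:15 check would return True once WhitelistedUserB is the last editor.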
- I have a simple suggestion which should be relatively simple to code (simpler than most of the above suggestions). If we have to choose between having the count slightly stale, or allowing vandals the opportunity to vandalize it, I think it's obvious everyone here would choose the former. Bearing that in mind, I suggest: run the bot once an hour. When it runs, it goes to the page history, and selects the last version by a trusted user, and generates the count based on that version. (Only three users regularly add/remove articles from that page - myself, Marskell, and Joel. For now, that should suffice as a list) This technique is 100% vandalproof and should produce very accurate results. Raul654 16:55, 11 December 2006 (UTC)[reply]
- I'll think further on it, but Raul's plan seems to make sense, since the only users who change the number (add or delete FAs per FAC or FARC) are Raul, Joelr31 and Marskell, while other users may make other sorts of changes. A recent (but rare) example: an admin deleted his own featured article, although there had been no FAR. The bot (as proposed) would have allowed that (someone caught it manually). If the count is based on Joelr31, Marskell, and Raul, it will stay accurate. I believe. I track the changes, but if I see a problem, I could alert one of them. Sandy (Talk) 18:20, 11 December 2006 (UTC)[reply]
- I'm a little lost having been out of the loop for a while, but as far as I'm understanding this, Raul's suggestion makes sense. The people taking care of the numbers are pretty close knit and careful about things, and I don't think anything will slip by. Marskell 22:36, 11 December 2006 (UTC)[reply]
- Ok, that is what I have gone with. It will search the last 15 revisions for the latest revision by any of Raul654, Marskell, or Joelr31. It will then use that revision as the count. See the updated debug report on User:Jmax-bot/FACounter for an example --Jmax- 09:14, 12 December 2006 (UTC)[reply]
- Looks good - nice work! Sandy (Talk) 09:21, 12 December 2006 (UTC)[reply]
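For readers following along, the approach Jmax- describes (scan the last 15 revisions, take the newest one by a trusted user, otherwise do nothing) might look roughly like the sketch below. It is written in Python for illustration, not taken from the bot's Perl source; the function name, the shape of the `history` list, and the sample revision IDs are all assumptions.

```python
# Trusted editors named in the discussion; the revision limit is the one Jmax- mentions.
TRUSTED_USERS = {"Raul654", "Marskell", "Joelr31"}
MAX_REVISIONS = 15


def latest_trusted_revision(history, trusted=TRUSTED_USERS, limit=MAX_REVISIONS):
    """Return the newest revision made by a trusted user, or None.

    `history` is assumed to be a list of dicts ordered newest first,
    each with a 'user' and a 'revid' key (a hypothetical format).
    """
    for rev in history[:limit]:
        if rev["user"] in trusted:
            return rev
    return None  # caller should leave the published count untouched


history = [
    {"revid": 104, "user": "UserA"},           # possible vandalism
    {"revid": 103, "user": "VandalReverter"},  # revert by a non-listed user
    {"revid": 102, "user": "Marskell"},
    {"revid": 101, "user": "Raul654"},
]
chosen = latest_trusted_revision(history)
print(chosen["revid"] if chosen else "no trusted revision found; doing nothing")  # prints 102
```

The count would then be generated from the text of the chosen revision, so edits by anyone else never affect the published number.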
(reindent)Yes, that is much simpler (aka better) than what I was thinking. Final-ish questions: (1) can you set it to run hourly and (2) do you foresee any problems with adding Template:FA number as another target, since the page with the update times and list of articles is useful. BanyanTree 15:14, 12 December 2006 (UTC)[reply]
- It is currently running hourly, and, adding another target is no problem whatsoever. --Jmax- 03:24, 13 December 2006 (UTC)[reply]
I also have one final question - what happens if it scans through the past 15 revisions and doesn't find one by myself, Joel, or Marskell? Presumably, it should do nothing in that case. Raul654 21:32, 12 December 2006 (UTC)[reply]
- You presume correctly. Perhaps I should go one step further to make it post on the talk page in that case, or some other way to notify you three in that situation (currently it merely prints an error). --Jmax- 22:39, 12 December 2006 (UTC)
- That seems excessive, as it would mean about 23 talk posts a day given that the total on WP:FA changes at best once in 24 hrs. The update will show up on watchlists? That should be oversight enough. Marskell 03:45, 13 December 2006 (UTC)[reply]
- I think Jmax means that if, of the last 15 revisions to Wikipedia:FA, none are by any of you three, to ask at Wikipedia talk:FA? That would be a good idea. The only thing I would like for it is to store in memory the revision_id of the last edit it processed, so if indeed hell freezes over and no "authorized user" edits the page substantially, it doesn't warn over and over again every hour. Titoxd(?!?) 07:48, 14 December 2006 (UTC)
- That's a given. So, what is the status on this? The bot is ready, and has been for some time. Oh, another question: Should I expand the bot to count WP:GA? --Jmax- 09:46, 14 December 2006 (UTC)[reply]
- I would think GA would be pretty hard to count, because anyone can add and delete. If you're offering to expand, Wikipedia:Former featured articles has a count in the text which myself, Joelr31 and Marskell are keeping manually. The structure of the article is very similar to Wikipedia:FA, except that there is a comment field for tally by area, which is a hassle. Sandy (Talk) 12:34, 14 December 2006 (UTC)
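Titoxd's point about remembering the revision_id, so that the bot does not repeat the same warning every hour, could be handled along these lines. This is a hedged Python sketch rather than the bot's real code; `WarningTracker` and its method are hypothetical names, and the state is kept in memory only, as suggested above.

```python
class WarningTracker:
    """Remember which revision was last warned about (in memory only)."""

    def __init__(self):
        self.last_warned_revid = None

    def should_warn(self, newest_revid, found_trusted):
        """Return True only the first time a given newest revision has no
        trusted edit among the revisions the bot scanned."""
        if found_trusted:
            return False  # normal case: the count was updated, nothing to report
        if newest_revid == self.last_warned_revid:
            return False  # already warned about this exact page state
        self.last_warned_revid = newest_revid
        return True


tracker = WarningTracker()
print(tracker.should_warn(500, found_trusted=False))  # first hourly run: warn
print(tracker.should_warn(500, found_trusted=False))  # next run, page unchanged: stay quiet
```

With this, an hourly run produces at most one talk-page note per new revision, rather than the roughly 23 posts a day Marskell was worried about.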
(reindent)It depends on if you want the vandal-spoofing feature, the lack of which is a total deal breaker if the number appears on the Main Page. It would be nice to be able to transclude a FFA count, perhaps to a Template:FFA number with a commented section like {{FA number}}, as a way for final troubleshooting. Otherwise, I'm running out of questions. - BanyanTree 14:05, 14 December 2006 (UTC)[reply]
- I've updated the debug report to include a list of whitelisted users. I have also added FFA counts. See User:Jmax-bot/FFACounter --Jmax-
- Sweet! Can you have the bot put the most recent FFA number into Template:FFA number? We can then transclude the number back into Wikipedia:FFA, so people can avoid the manual update. Thanks! - BanyanTree 01:53, 15 December 2006 (UTC)[reply]
- This is great, Jmax. My count on FFAs is one off from the bot, so as soon as I finish up some other work, I'll run through that and see what's up. Thanks so much. Back to you later, Sandy (Talk) 02:00, 15 December 2006 (UTC)
Please make sure that there are no "red herrings" in the FFA count debug. It only checks for links between two specified boundaries, and could possibly pick up false positives. I'll run it right now for you. --Jmax- 02:07, 15 December 2006 (UTC)[reply]
- I see WP:FA in the bot count, which appears to be coming from the link in the final section header, "Former featured articles that have been re-promoted". - BanyanTree 02:14, 15 December 2006 (UTC)
- I just finished the other stuff I was working on - is it solved, then, or should I have a look? If the bot is going to be counting, I'm going to delete those goofy commented subcounts by section, which are a chore. Sandy (Talk) 03:19, 15 December 2006 (UTC)[reply]
- Yes, that was it; it was counting the WP:FA in the heading. Sandy (Talk) 03:30, 15 December 2006 (UTC)[reply]
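The off-by-one above comes from counting every link between two boundaries, including one inside a section heading. One possible way to avoid that kind of false positive is sketched below in Python; it is not the bot's actual code, and the boundary markers, function name, and sample wikitext are invented for the example.

```python
import re

# Matches the target of a wiki link: [[Target]] or [[Target|label]].
LINK_RE = re.compile(r"\[\[([^\]|#]+)")


def count_article_links(wikitext, start_marker, end_marker):
    """Count wiki links between two markers, ignoring section headings."""
    start = wikitext.index(start_marker) + len(start_marker)
    end = wikitext.index(end_marker, start)
    count = 0
    for line in wikitext[start:end].splitlines():
        if line.lstrip().startswith("="):
            continue  # heading line: links here are not list entries
        count += len(LINK_RE.findall(line))
    return count


sample = """Intro with a [[WP:FFA]] link
<!-- begin list -->
==Art==
[[Article One]], [[Article Two]]
==Former featured articles that have been [[WP:FA|re-promoted]]==
[[Article Three]]
<!-- end list -->
[[Category:Something]]
"""
print(count_article_links(sample, "<!-- begin list -->", "<!-- end list -->"))  # prints 3
```

Skipping heading lines only covers the case found here; other stray links inside the boundaries would still need checking by hand in the debug report.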
Status - As I told Jmax, I'm letting it run for a few days. If nothing problematic happens (and so far, nothing has) then I'm going to set the bot and sysop flags on the Jmax bot account, and then all Jmax has to do is set it to target the FA num template. Raul654 18:52, 14 December 2006 (UTC)[reply]
- Note - I promoted two FAs today and it worked like a charm Raul654 20:17, 14 December 2006 (UTC)[reply]
- Of course! --Jmax- 01:42, 15 December 2006 (UTC)[reply]
Ok, thanks for your help BT! I've made the change and it's updated now. --Jmax- 04:53, 15 December 2006 (UTC)[reply]
Anything w/ sysop privs needs full code published. If anyone read the massive TawkerbotTorA discussion, sysop bots are not exactly an ordinary thing; it's not something any 'crat can just set rights on. At the very least, the code for this one needs to be published -- Tawker 07:50, 15 December 2006 (UTC)
- I am philosophically opposed to allowing a non-admin to control a bot account with admin rights. If Jmax- publishes his code, I would not be opposed to allowing an existing admin to set up a bot account to run it. Or even just have Jmax- run all the code except the part requiring admin rights and have a separate bot controlled by an admin copy over the figures. Dragons flight 08:05, 15 December 2006 (UTC)
- I understand fully. If there are any admins with Perl experience who would like to use (perhaps contribute) the bot and accompanying framework (which is currently rather bare, but shall hopefully grow), then I would prefer that. Alternatively, Werdna (on IRC - #wikipedia) suggested that the count be placed in a sub-page of User:Jmax-bot with the suffix '.js', which would make it only editable by that user, and sysops. This feels like a dirty hack, however, and may not be the best solution. In addition, I will publish the code shortly. --Jmax- 08:24, 15 December 2006 (UTC)[reply]
- I agree that giving a bot +sysop is not to be taken lightly. This is like creating a setuid program. The Trusted Computing Base should be minimized. If the only need for sysop privileges is to update a protected template, I recommend splitting the functionality into two bots: 1) a non-admin bot that updates a non-protected page (or equivalently edits a local file, sends an email, etc.), and 2) an admin bot which does the necessary edit to the protected page with a published, minimal number of lines of code. —Quarl (talk) 2006-12-15 08:41Z
- Would support the dirty hack (it was my first thought as an alternative), doubt I'd support it as an admin bot. As previous RFAs for bot accounts have shown, people are pretty wary of giving out those privileges unless there is a clear benefit outweighing the downside; in the scheme of things I can't see saving an admin a single edit per day (or the FA count being wrong for a while) being considered to be that huge a benefit. --pgk 11:48, 15 December 2006 (UTC)
- I was neutral on TawkerbotTorA, but at least there was a decent rationale (a few hundred blocks per month). This is one edit per day? Opposed to sysopping it. Thatcher131 15:19, 15 December 2006 (UTC)[reply]
- I'd also support the .js subpage, which makes a lot more sense than sysopping the bot. --ais523 16:09, 15 December 2006 (UTC)
The hack is not a good idea. However, I don't have any objections to Jmax supplying the code to someone who does have an admin account, who could schedule it as a cron job on their own admin account. I looked over the code myself a few days ago, and it seems straightforward (or as straightforward as Perl can be, at any rate). Raul654 17:01, 15 December 2006 (UTC)[reply]
- Can you expand on why it is not a good idea? I can't see any technical issues. --pgk 17:59, 15 December 2006 (UTC)[reply]
- I don't see technical issues - it's just a major misuse of the .js file for something it was never intended for. It feels... icky. Raul654 18:06, 15 December 2006 (UTC)[reply]
- That's the feeling of a hack, alright. Anyways, if consensus proves to be to use the hack, I'll gladly do it. --Jmax- 22:18, 15 December 2006 (UTC)[reply]
OK, in that case, Jmax, can you please modify the bot to add the number (and just the number) to the Jmax bot's .js file? Raul654 18:18, 16 December 2006 (UTC)[reply]
Approved and article count put on the main page -- Tawker 08:08, 17 December 2006 (UTC)[reply]
Is the FFA (Wikipedia:Former featured articles) count also approved, and if so, where is it being put? Sandy (Talk) 15:51, 17 December 2006 (UTC)
- I see it's already done - thanks! Sandy (Talk) 16:00, 17 December 2006 (UTC)
Marskell removed an article, and it updated correctly in both places. Sandy (Talk) 17:31, 18 December 2006 (UTC)[reply]
- The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.