Wikipedia:Bot requests/Archive 57
This is an archive of past discussions on Wikipedia:Bot requests. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current main page.
Archive 50 | ← | Archive 55 | Archive 56 | Archive 57 | Archive 58 | Archive 59 | Archive 60 |
MySQL expert needed
The Wikipedia 1.0 project needs someone with experience working with large (en.wikipedia-size) MySQL databases. If interested, contact me or the WP 1.0 Editorial Team.— Wolfgang42 (talk) 02:19, 20 October 2013 (UTC)
- Um, what specifically do you need help with? You'll probably get more people who can help if they know specifically what you need help with. Legoktm (talk) 02:24, 20 October 2013 (UTC)
- Writing various queries to pull information out of the database. The problem is that the en.wikipedia database is very large, and I don't know enough about MySQL to be able to work with a dataset where queries can take weeks to complete. — Wolfgang42 (talk) 18:02, 20 October 2013 (UTC)
- If queries take that long something is wrong with the database configuration. More often than not you need to add and/or use indexes. If you provide the table structure and index list queries shouldn't take more than a few hours at most. Werieth (talk) 18:08, 20 October 2013 (UTC)
- Can you paste the queries? If you're running on labs, there are different databases like revision_userindex which would be faster if you need an index upon user. Legoktm (talk) 21:10, 20 October 2013 (UTC)
- The query (being run on the Labs database) is:
SELECT page_title,
IF ( rd_from = page_id,
rd_title,
/*ELSE*/IF (pl_from = page_id,
pl_title,
/*ELSE*/
NULL -- Can't happen, due to WHERE clause below
))
FROM page, redirect, pagelinks
WHERE (rd_from = page_id OR pl_from = page_id)
AND page_is_redirect = 1
AND page_namespace = 0 /* main */
ORDER BY page_id ASC;
— Wolfgang42 (talk) 22:53, 23 October 2013 (UTC)
- The results from the second column seem odd, mixing varbinary & int data, and the OR in the where clause doesn't help with the performance. What exactly are you wanting to get from the database? -- WOSlinker (talk) 23:39, 23 October 2013 (UTC)
- You're right—I pasted an older version of the code; I've fixed it to be the title both times. (My mistake for not checking that I had the latest copy in version control.) This query is a direct translation of an agglomeration of perl, bash, and C code which was parsing the SQL dumps directly. What it's trying to do is find redirect targets by looking in the redirect table, and falling back to the pagelinks table if that fails.
- I would suspect that the 3-way join isn't helping performance any either, but unfortunately it seems to be needed. If there's a better way to do this, I'd love to see it. — Wolfgang42 (talk) 02:30, 24 October 2013 (UTC)
Try this and see if it works any better. -- WOSlinker (talk) 06:00, 24 October 2013 (UTC)
SELECT page_title, COALESCE(rd_title, pl_title)
FROM page
LEFT JOIN redirect ON page_id = rd_from
LEFT JOIN pagelinks ON page_id = pl_from
WHERE page_is_redirect = 1
AND page_namespace = 0 /* main */
ORDER BY page_id ASC;
- Putting the EXPLAIN keyword in front of the query will return the execution plan, indexes used, etc. --Bamyers99 (talk) 19:40, 24 October 2013 (UTC)
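For illustration, prefixing WOSlinker's rewritten query with EXPLAIN (nothing else changes) makes MySQL print, per table, the join type and which index, if any, it plans to use, which makes a full table scan easy to spot:
EXPLAIN SELECT page_title, COALESCE(rd_title, pl_title)
FROM page
LEFT JOIN redirect ON page_id = rd_from
LEFT JOIN pagelinks ON page_id = pl_from
WHERE page_is_redirect = 1
AND page_namespace = 0 /* main */
ORDER BY page_id ASC;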
Request for a bot for WikiProject Military history article reviews per quarter
G'day, WPMILHIST has a quarterly awards system for editors that complete reviews (positive or negative) of articles that fall within the WikiProject. So far we (the project coordinators) have done this tallying manually, which is pretty labour-intensive. We have recently included GA reviews, and are having difficulty identifying negative GA reviews using the standard tools. We were wondering if someone could build a bot that could tally all FA, FL, A-Class, Peer and GA reviews of articles that fall within WikiProject Military history? In terms of frequency, we usually tally the points and hand out awards in the first week after the end of each quarter (first weeks in January, April, July and October), but it would be useful functionality to be able to run the bot as needed if that is possible. Regards, Peacemaker67 (send... over) 23:36, 13 October 2013 (UTC)
- If this comes up, lemme know, 'kay? Maybe some of the other projects might be able to use it as well. :) John Carter (talk) 23:43, 13 October 2013 (UTC)
- Hi, could someone clarify if I am in the wrong place (ie is this not a bot thing)? Thanks, Peacemaker67 (send... over) 03:09, 18 October 2013 (UTC)
- G'day all, is this something a bot could do? Peacemaker67 (send... over) 19:59, 25 October 2013 (UTC)
New REFBot - feedback on user talkpages
20:11, 17 October 2013 > message A reference problem
- 23:20, 17 October 2013 Thanks for consulting. Cheers,...
05:19, 18 October 2013 > message A reference problem
- 02:57, 21 October 2013 Reply - I did some page splits. I split List of Princeton University people (United States Congress, Supreme Court, Continental Congress and Constitutional Convention) from List of Princeton University people (government). The unlinked references at the former more than likely come from the latter. I have fixed two, and hope to fix the rest over time. Thank you for bringing this to my attention.
11:19, 22 October 2013 > message A reference problem Two replies
- 11:28, 22 October 2013 Not me Squire!
- 11:28, 22 October 2013 OOPS, was me -fixed! (This editor is a Senior Editor II and is entitled to display this Rhodium Editor Star.)
Bot to download and reupload images to resolve AU legal concerns
The discussion is at Wikipedia talk:WikiProject Australian Roads/Shields, but in summary, there are sets of images transferred from Commons to here as {{PD-ineligible-USonly}}. The user that moved the files (downloaded them from Commons then uploaded them here) wants to remove his involvement due to potential legal issues in Australia. Under existing policy, revdel, oversight, and office actions are not appropriate. It was suggested that a bot could upload the same files under a different name and nominate the old ones for deletion per WP:CSD#F1. - Evad37 (talk) 06:42, 26 October 2013 (UTC)
Marking talk pages of Vital articles
Can someone make a bot to mark all talk pages of Vital articles (all levels) with {{VA}}, and fill out its parameters (level, class, topic) if possible. It should also remove such templates from non-VAs.
Ideally this should run on a regular basis, but even a one-off run would be very helpful. -- Ypnypn (talk) 18:48, 28 October 2013 (UTC)
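One building block for both the tagging and the removal passes, sketched as a Labs replica query (standard MediaWiki schema assumed; 'VA' is taken here as the template's base name, and any redirects to it would need the same treatment): pull the set of talk pages currently carrying {{VA}}, then diff that set against the article lists parsed from the WP:VA pages.
SELECT page_namespace, page_title
FROM page
JOIN templatelinks ON tl_from = page_id  -- pages transcluding the template
WHERE tl_namespace = 10                  -- Template namespace
AND tl_title = 'VA';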
Bot to tag "PROD Survivors" and "Recreated Articles"
In this first paragraph, I will summarize my request: It would be good if someone could please create a bot which tags articles which were PRODded but survived (I shall call these "Survivors"). And/or which tags articles which were PROD-deleted then recreated (I shall call these "Recreated Articles"). You may tag them with {{old prod full}}. You may leave all the template's parameters blank, or you may fill some in.
Rationale: Such tags warn us not to re-add another PROD tag. They also make it more obvious to us that perhaps we should consider nominating the page for WP:AfD.
Here are some things you could do, but which I don't recommend: You could download a database dump with history, parse it, and look for Survivors. But such a dump is over ten terabytes of XML once uncompressed.[1] You could download the dump of all logs, download the dump of all page titles, parse the two, and look for Recreated Articles. User:Tim1357 tried parsing a dump,[2] but he didn't succeed: the matter is still on the latest revision of his to-do list. I suspect it may not be worth attempting either of these difficult tasks.
Here is what I do recommend: It would be worthwhile to create a bot to watch Category:Proposed deletion and tag future Survivors. And to watch for new pages and tag Recreated Articles. User:Abductive suggests some optional refinements.[3]
It would be good if someone could please start writing a bot to do either or both of these tasks. It would be even better if they could provide us with a link to their code-in-progress. User:Kingpin13 and User:ThaddeusB have expressed interest,[4] but nobody seems to have actually written any code to do these tasks on the live Wikipedia.
User:Rockfang started tagging Survivors in 2008 using AWB (the wrong tool for the job) but later stopped. S/he wrote that s/he "got distracted".
AnomieBOT already does one related task. If an article is AfDed, then recreated, AnomieBOT's NewArticleAFDTagger task puts {{old AfD multi}} on that article's talk page. The task's open-source[5] code is here. Maybe you could build on it, and maybe you could even ask Anomie to run it for you. Dear User:Anomie: Do you know if you or any bot ever tagged the pages which were recreated in the years before you wrote your bot?
Cheers, —Unforgettableid (talk) 04:32, 16 October 2013 (UTC)
- For the record, I'm a "he". :) Rockfang (talk) 05:17, 16 October 2013 (UTC)
- I do not know of anyone who went back and tagged all articles that had ever been deleted through AfD.
- I considered the recreated-after-prod tagging at one point. But the task would require keeping a list of every article that was ever prodded and then deleted without the prod tag being removed, which I didn't think was worthwhile. The AfD tagging is easier, since the bot can just look for Wikipedia:Articles for deletion/{{PAGENAME}}. Anomie⚔ 11:45, 16 October 2013 (UTC)
- I have investigated and found that probably somewhere between 95% and 100% of PROD-deleted articles have the all-caps string "PROD" somewhere in their deletion logs. So, detecting Recreated Articles would be easier than you think. :) Cheers, —Unforgettableid (talk) 19:22, 16 October 2013 (UTC)
- "somewhere between 95% and 100%"? Which is it? Anomie⚔ 21:04, 16 October 2013 (UTC)
- Out of the couple dozen PROD-deleted articles I checked, each and every one had the string somewhere in their deletion logs. But my sample size was so small that I cannot claim with certainty that 100% of PROD-deleted articles have it in their logs. —Unforgettableid (talk) 00:40, 18 October 2013 (UTC)
- Whether the number is 95%, 100%, or somewhere in between, searching for the string is quite easy and quite effective. ISTM it's the best way to identify Recreated Articles. Dear all: what do you think? —Unforgettableid (talk) 06:43, 4 November 2013 (UTC)
- I think the best way to handle it is to get the All articles proposed for deletion category periodically. If an article was in one iteration and not the next, it was either deleted or the tag was removed. --ThaddeusB (talk) 17:55, 4 November 2013 (UTC)
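To illustrate the snapshot approach, a rough sketch of the periodic query such a bot might run against the Labs database replica (standard MediaWiki table names assumed): diff successive result sets, and anything that drops out between runs was either deleted or had its PROD tag removed, which the bot would then check before tagging.
SELECT page_id, page_title
FROM page
JOIN categorylinks ON cl_from = page_id
WHERE cl_to = 'All_articles_proposed_for_deletion'  -- current PROD candidates
AND page_namespace = 0;                             -- main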
- I have long intended to make a bot to tag PROD survivors... is anyone else planning on programming this? If not, I can try to get started on it next week. --ThaddeusB (talk) 19:25, 18 October 2013 (UTC)
- Dear ThaddeusB: If you do end up writing such a bot, please do let us know. :) Cheers, —Unforgettableid (talk) 06:33, 4 November 2013 (UTC)
- If so, please name it CattlePROD Bot! Headbomb {talk / contribs / physics / books} 17:34, 4 November 2013 (UTC)
- Since no one else seems interested, I will try to get to this by the end of the week. --ThaddeusB (talk) 17:55, 4 November 2013 (UTC)
Redirects in templates after page moves
Per WP:BRINT, redirects are undesirable in templates. Currently after a page move, bots (bless their hearts) sweep up all of the broken or double redirects etc., but the links in templates are left untouched. For instance, a page was moved from here to here in January but the accompanying template was not updated until today. Is it possible for a bot to fix redirects on templates that are on a page that is moved? Rgrds. --64.85.216.235 (talk) 05:51, 4 November 2013 (UTC)
- Possible order of priority here. It may not be necessary to fix the redirect if the template is not actually on the page being moved. For example, if the Professional Fraternity Association changed its name to something else, it would leave a redirect link in Template:Kappa Kappa Psi, since that template links to the PFA, but it wouldn't need to be fixed as urgently since the PFA page doesn't include the Kappa Kappa Psi template. — Preceding unsigned comment added by Naraht (talk • contribs) 17:47, 4 November 2013 (UTC)
- Yes, to reiterate, a bot that can fix redirects on the templates that are currently on the page that is moved is the priority. Templates that link to the article but are not on the moved page are not a priority. Rgrds. (Dynamic IP, will change when I log off.) --64.85.216.79 (talk) 20:55, 4 November 2013 (UTC)
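To make the priority case concrete, here is a sketch against the Labs replica (standard MediaWiki schema; 'New_title' and 'Old_title' are placeholders for the moved page and the redirect it left behind) that lists only the templates transcluded on the moved page which still link to the old title:
SELECT tl_title AS template_to_fix
FROM templatelinks
JOIN page AS tmpl ON tmpl.page_namespace = 10 AND tmpl.page_title = tl_title
JOIN pagelinks ON pl_from = tmpl.page_id       -- links made from the template page
WHERE tl_from = (SELECT page_id FROM page
                 WHERE page_namespace = 0 AND page_title = 'New_title')
AND tl_namespace = 10                          -- only Template-namespace transclusions
AND pl_namespace = 0
AND pl_title = 'Old_title';                    -- the redirect left by the move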
Star Wars Bot needed?
Bot for Star Wars articles needed, maybe? Might help monitor changes. 20-13-rila (talk) 11:19, 5 November 2013 (UTC)
- This is not a proper bot task request that can be implemented, especially without detail. What changes is it supposed to monitor? — HELLKNOWZ ▎TALK 11:17, 5 November 2013 (UTC)
- I was thinking that it might help with the Star Wars WikiProject, which I am a member of. I am not sure if it is needed, which is why I would like to discuss it. 20-13-rila (talk) 11:35, 5 November 2013 (UTC)
- May I suggest you discuss this with the project first and come up with a concrete proposal of what task(s) can be done and how. Without further detail, I doubt you will find many interested parties on this page (which is for requesting specific tasks). — HELLKNOWZ ▎TALK 11:37, 5 November 2013 (UTC)
- Thank you, Rila 20-13-rila (talk) 09:32, 6 November 2013 (UTC)
Mash together two FA-related lists?
We have WP:FANMP (a list of FAs yet to appear on the main page) and WP:WBFAN (a list of FAs and former FAs by nominator). Can someone think of a way to produce a hybrid for me, i.e. a list of FAs yet to appear on the main page by nominator? BencherliteTalk 20:30, 5 November 2013 (UTC)
Shutdown of blogs.amd.com
It seems that the articles have been moved to http://community.amd.com and http://developer.amd.com. I think all links to http://blogs.amd.com should be marked with {{dead link}} at least. Please fix them semi-automatically if you can. --4th-otaku (talk) 12:35, 4 November 2013 (UTC)
- That's fine, I'll do them all quickly enough tomorrow. Rcsprinter (orate) @ 00:09, 6 November 2013 (UTC)
- I can't seem to find any articles which link to blogs.amd.com. Rcsprinter (message) @
- [6] There's not a whole lot. — HELLKNOWZ ▎TALK 17:12, 8 November 2013 (UTC)
- Good find; it'll have to be tomorrow I run the thing. Rcsprinter (post) @ 22:00, 8 November 2013 (UTC)
Help needed tracking recent changes to medical content
User:Femto Bot used to populate Wikipedia:WikiProject Medicine/Recent changes which in turn updated Special:RecentChangesLinked/Wikipedia:WikiProject Medicine/Recent changes. I think that's how it worked. It reported all changes to pages with {{WPMED}} on their talk page. Anyway, it was an awesome tool for patrolling some of Wikipedia's most sensitive content. But since Rich Farmborough was banned from bot work it's stopped working - it only reports recent changes to pages beginning with "A".
This tool aims to do the same thing but it's slow and often times out, and when it works it's running a couple of days behind.
There was also Tim1357's tool, but his account has expired from the Toolserver.
I was wondering if somebody here would be able to provide WP:MED with something to replace these? With something like this a handful of experienced medical editors can effectively patrol all of Wikipedia's medical content. Without it, there's no telling what's happening. --Anthonyhcole (talk · contribs · email) 17:58, 4 November 2013 (UTC)
- It appears that the source code for the bot is not available. I see you have attempted to contact him; I will take it over if you are successful in getting the code, and I can run it if needed. However, I fear he will not be able to run it himself, due to ArbCom. --Mdann52talk to me! 13:23, 5 November 2013 (UTC)
- Should be fairly trivial to write something like this up. Werieth (talk) 13:40, 5 November 2013 (UTC)
- 1. See VPT for current development.
- 2. I have asked for a module solution in Lua (negative for the full automation)
- 3. I am putting a fresh page up manually right now.
- 4. I moved the RELC page to Wikipedia:WikiProject Medicine/List of pages/Articles for future automation and expansion (and because the old page name was incorrect).
- Will be back later on. -DePiep (talk) 14:29, 5 November 2013 (UTC)
- Would something like User:Werieth/sandbox work for you? Werieth (talk) 15:25, 5 November 2013 (UTC)
- Yes. (Just curious: my page is 795k (28,391 articles); yours, without bullets, is 871k - does it use another source category? I started AWB for this, checking four categories deep.)
- Now, the MED people are served and I have little time today & tomorrow. So I'll pick it up later. In short, this is my concept around the bot action:
- A project editor can put a notice (template) on the project page. The template is called something like "{{RELC list: please bot make some RELC lists for this project}}". Parameters are set such as:
|RELC list1 namespace1=Article [space] + Talk [space]
|RELC list2 namespace2=Template + Template talk
|other parameters like 1x/month=
The template is invisible, just like what User:MiszaBot/config does on talk pages to archive.
- The bot sees the request and writes the list on a dedicated "RELC list" page (in its own section, say ==Pages==; the bot is not the only one that writes on that page).
- Systematic page names are built like:
- Wikipedia:WikiProject Medicine/List of pages our top page
- Wikipedia:WikiProject Medicine/List of pages/Articles
- Wikipedia:WikiProject Medicine/List of pages/Articles + Talks 0-9 A-M
- Wikipedia:WikiProject Medicine/List of pages/Articles + Talks N-Z
- Wikipedia:WikiProject Medicine/List of pages/Articles + Talks
- Wikipedia:WikiProject Medicine/List of pages/Templates
- Wikipedia:WikiProject Medicine/List of pages/Templates + Template talks
- Wikipedia:WikiProject Medicine/List of pages/Non-articles
- Wikipedia:WikiProject Medicine/List of pages/Non-articles + non-articles talks
- The naming suggestion is: first, use namespace names in the plural; readers see this at the top of the RELC special page, so a natural page name is valuable. We also need codes for those "all non-articles" and "A-M" requests.
- A template for the project page, now {{RELC list}}, will use these page definitions too (so we must agree on the names and other protocols), and produce the special links on a project page (as {{RELC list}} does for WP:MED now).
- There are also other templates like {{RELC list/Listpage header}}
- FYI, I built such a set, with list pages maintained manually, for WP:ELEMENTS at {{WikiProject Elements page lists}}.
- Trick: the page should contain its own name, so the RELC reader sees: "page was updated on ...".
- Trick: necessary off-topic pages, like the header template, would appear in the RELC view after edits (disturbing the view because they are themselves not on topic). I created and used a redirect, which does not change and so does not appear in the special view.
- Will go writing on the WT:MED page now.
- See you Thursday. -DePiep (talk) 16:50, 5 November 2013 (UTC)
- I'm pretty sure that I can generate a list based on any criteria you need. User:Werieth/sandbox didn't use a category, but rather a list of all pages that had {{WikiProject Medicine}}. Defining how you want the lists generated should be doable; we would just need to define a template setup similar to User:MiszaBot/config. The important factor in getting this going is to clearly and simply define things. Break it down to the very basics of what you are looking for; don't factor in how something is done, just what you want done, and leave the how to me. Werieth (talk) 17:52, 5 November 2013 (UTC)
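For illustration, one way such a list could be pulled from the Labs replica is to follow the talk pages that transclude {{WikiProject Medicine}} back to their subject articles (a sketch only; table and column names are the standard MediaWiki schema, and the actual bot may build the list differently):
SELECT ap.page_title
FROM page AS tp                                  -- talk pages carrying the banner
JOIN templatelinks ON tl_from = tp.page_id
JOIN page AS ap ON ap.page_namespace = 0         -- matching articles
               AND ap.page_title = tp.page_title
WHERE tp.page_namespace = 1
AND tl_namespace = 10
AND tl_title = 'WikiProject_Medicine'
ORDER BY ap.page_title;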
- If you're asking what functionality WP:MED needs, I was very happy with Special:RecentChangesLinked/Wikipedia:WikiProject Medicine/Recent changes in terms of speed and features. --Anthonyhcole (talk · contribs · email) 18:44, 5 November 2013 (UTC)
- re Anthonyhcole. The page you mention had its last update in 2012. That could of course be solved by updating it today. But there is also this: it was 35k in size, which means it listed only a small part of all the WP:MED pages. See this old version of that page. How small? Today the 'updated' page (named Wikipedia:WikiProject Medicine/List of pages/Articles; a big page) has 28,391 MED articles listed and is 700k. That means your page listed only 35/700 = 5%, or about 1,400 pages. It did not serve its purpose: one never saw that B-cell chronic lymphocytic leukemia (the first "B" page) was edited. Different check: today I checked the RELC workings with only the MED articles starting with "A" (2,500 pages, a 70k page). So the old page did not even have the "A" articles complete. It was missing 95% of its target. How was that a good feature?
- About speed: opening the Special page to show the edits (WP:MED Articles - Related changes), the special page we want, has acceptable speed; it is not slow (for me). Anyway, we should not "improve" it by leaving MED articles out at random, should we? It is opening the big list page itself that is slow (700k). That is why I advise readers to leave that page alone and let only the Special RELC page read it (which is fast) to produce the desired overview.
- If I am missing something, or mistaking your point, please tell me. User experiences (good and bad) are best reported at WT:MED. -DePiep (talk) 19:23, 5 November 2013 (UTC)
- re Werieth. OK, what you say is what I meant to say (in a hurry). And yes, I'll define stuff more crisply and clearly. We have a start. I suggest we develop this bot+template project over at Template talk:RELC list from now on. See you there? This thread can be closed I guess, since you picked it up. -DePiep (talk) 19:33, 5 November 2013 (UTC)
- It looks like AnomieBOT's WatchlistUpdater does what is needed here. Anomie, would there be any problems with setting up a watchlist page for WP:MED? — Mr. Stradivarius ♪ talk ♪ 02:12, 6 November 2013 (UTC)
- I'd have to seek approval for that task to edit outside the bot's userspace, and I don't recall offhand if the code is set up to handle 28000+ articles. It sounds like Werieth is working on a bot for this, so I'll leave it to them. Anomie⚔ 02:21, 6 November 2013 (UTC)
- Fair enough - thank you for your quick response, and thank you Werieth for taking this on! — Mr. Stradivarius ♪ talk ♪ 03:09, 6 November 2013 (UTC)
- FWIW, I have a clone of Tim's tool at [7]. (Yes, I'll fix the mixed content issues eventually). Legoktm (talk) 02:25, 6 November 2013 (UTC)
- FYI, the tool has been renamed from "RELC list" to Page reports. -DePiep (talk) 21:51, 8 November 2013 (UTC)
Bot for adding links to OLAC resources about languages
The "OLAC" (Open Language Archives Community) website has consistently helpful pages about resources for the languages of the world, especially the endangered and lesser-taught languages. The OLAC pages use a URL which ends with a three-letter code from the ISO 639-3 language code list, which is found in our language articles infobox. Each OLAC page has a nice descriptive title at the top, such as OLAC resources in and about the Aguaruna language.
Rather than adding several thousand OLAC page links to the External links sections of language articles by hand, couldn't we just write a bot to do this?
I know some languages have multiple language codes in their Wikipedia infobox, due to multiple dialects or language variants. Even if the bot didn't add links for languages with multiple codes, it would still be a big time-saver!
What do you think? Djembayz (talk)
- If you look at ǂKx'ao-ǁ'ae you'll see that it already has a language infobox, and that includes a link off to an external site. Why not modify the infobox to add the language-archives.org link? Josh Parris 00:10, 9 November 2013 (UTC)
Switch Internet Archive links to HTTPS
Without getting too deep into tin foil territory, encryption is one of many essential steps to ensure readers' privacy. Since October 24, 2013, the Internet Archive now uses HTTP Secure (https://) by default [8]. Just this week they updated their server software so it can handle TLS 1.2, the latest version. It is safe to say they encourage their visitors to access their site using an encrypted connection.
In my opinion, Wikipedia should support this effort and switch all outgoing links to the Internet Archive to HTTPS. According to Alexa, Wikipedia currently ranks fourth among upstream sites to archive.org [9]. {{Wayback}} was already updated in that regard, but most of the links to the Wayback Machine are implemented in one of the many citation templates, as encouraged at WP:WBM. I started to fix a lot of those links manually, before realizing it would be a perfect job for a bot.
The Wayback Machine links have a common scheme, e.g. https://web.archive.org/web/20020930123525/http://www.wikipedia.org/. So the task is this: find http://web.archive.org/web/ throughout the article namespace and replace it with https://web.archive.org/web/. That's it. --bender235 (talk) 20:51, 8 November 2013 (UTC)
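For scoping, the affected mainspace pages could be listed from the Labs replica with something like the following sketch (the standard externallinks table is assumed; the LIKE on el_to cannot use an index, so a production run would probably filter on el_index or work from the citation templates instead):
SELECT DISTINCT page_title
FROM externallinks
JOIN page ON page_id = el_from
WHERE page_namespace = 0
AND el_to LIKE 'http://web.archive.org/web/%';  -- plain-HTTP Wayback links only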
- See WP:NOTBROKEN and WP:COSMETICBOT Werieth (talk) 21:08, 8 November 2013 (UTC)
- This is not a cosmetic change. It's not like switching http://archive.org/ to http://web.archive.org/, which would indeed change nothing. But switching to https changes the transport mechanism, from unencrypted to encrypted. Even though it looks simple, it has significant consequences. --bender235 (talk) 21:17, 8 November 2013 (UTC)
- In this case changing the transport protocol doesn't make much of a difference. No data other than the page contents (which can easily be retrieved via both secure and non-secure methods) is being transmitted. Thus the possible intercepted-data risk is null. All it would do is generate a false sense of security. If you really think it should be done, you might look into a Lua replacement module that can be plugged into the citation templates. Werieth (talk) 21:27, 8 November 2013 (UTC)
- Lua is a most clueful suggestion. Josh Parris 23:51, 8 November 2013 (UTC)
- I agree with Werieth, this is a huge number of edits (over 160,000 in mainspace) for something that's not broken. If we really want to do this, it would be better as something like a low-priority (only done in combination with more significant changes) change in another tool like AWB. Mr.Z-man 21:22, 8 November 2013 (UTC)
- Okay, I'll do that. --bender235 (talk) 23:30, 8 November 2013 (UTC)
- I think you've misunderstood. I said it should be done only in combination with more substantial changes. I certainly wasn't saying to go and make 160,000 edits with AWB. Mr.Z-man 00:34, 9 November 2013 (UTC)
- I won't do that. I just added it to the typo-fixing scan I run regularly anyway. --bender235 (talk) 00:36, 9 November 2013 (UTC)
- Note to everyone: I started a discussion on this over at Village Pump. --bender235 (talk) 10:40, 9 November 2013 (UTC)
Adding ISCO (international), SOC (US) and NOC (Canada) job codes to profession infoboxes
Would it be possible to import those 3 standardized codes into the profession infoboxes? --Teolemon
- Importing CNP Codes (Quebec and Canada)
- CNP Occupation Code (en)
- Code for a given occupation in Canada and Québec
- a four-digit number; list at http://www5.rhdcc.gc.ca/cnp/francais/cnp/2011/RechercheRapide.aspx?val65=*
- the value for Librarian is: 5111
- Usage guidelines can be found at http://www5.hrsdc.gc.ca/NOC/English/NOC/2011/Introduction.aspx
- Importing International Standard Classification of Occupations Codes (International)
- International Standard Classification of Occupations (en)
- Standard International Classification code for jobs. "ISCO is a tool for organizing jobs into a clearly defined set of groups according to the tasks and duties undertaken in the job."
- XLS Structure of those codes: http://www.ilo.org/public/english/bureau/stat/isco/index.htm
- Value for Librarian is: 2622 (Librarians and related information professionals)
- Importing SOC Codes (US)
The individual occupation items don't yet have any SOC codes associated with them, but they are in broad occupation categories in enwiki that should make it easier to match:
- The page explaining the SOC system (enwiki)
- Example of one of the broad categories (enwiki)
Here's the list of SOC codes for matching with the existing items.
— Preceding unsigned comment added by 2A01:E35:2EA8:950:5BF:1AF3:3374:F3D0 (talk • contribs)
New REFBot
copied from WP:VPT --Frze > talk 07:17, 18 October 2013 (UTC)
DPL bot and BracketBot are the best inventions of Wikipedia. It's time for a new bot. We need the REFBot.
If a user contributes a broken reference name, incorrect ref formatting (or a missing reflist), please inform the user who caused this error. It is outrageously hard work to correct all these errors afterwards, for someone who does not have the factual knowledge. For example: it took me a week to work through the backlog of Category:Pages with broken reference names - more than 1500 items, some disregarded for more than two years. Searching with WikiBlame for the first entry of the ref, making the changes, informing the users... annoying. Thank you very much. --Frze > talk 12:25, 17 October 2013 (UTC)
- I ask for a little consideration, please. It takes several minutes of work - often just because of a single missing character. Ten times and more per day. For example, see Cite error: The named reference Media2 was invoked but never defined. What's wrong? > Compare selected versions > Fix broken reference name. Why doesn't a bot send a message to the editor who caused the error? Why must other users clean up the mess? With BracketBot and DPL bot it is so easy. --Frze > talk 06:44, 18 October 2013 (UTC)
- Example:
== A reference problem ==
Hi [[User:SpidErxD|SpidErxD]]! Some users have been working hard on [[:Category:Pages with broken reference names]].
[https://en.wikipedia.org/enwiki/w/index.php?title=Nuclear_program_of_Iran&diff=577623223&oldid=577620891 Here] you added new references '''ref name=OPBW''' and '''ref name="status"''' but didn't define it. This has been showing as an error at the bottom of the article. <small>'''''Cite error: The named reference was invoked but never defined.'''''</small> Can you take a look and work out what you were trying to do? Thanks --User:REFBot
- Let's try... See User talk:SpidErxD#A reference problem Thanks --Frze > talk 08:00, 18 October 2013 (UTC)
- A bot-issued message would have to be much vaguer than that. If revision X has no error, and revision X+1 has an error "The named reference Foo was invoked but never defined", then many different things could have gone wrong. The edit could have added the named reference, or deleted the last call to it, or accidentally disabled the last call to it by damaging the syntax of some earlier reference or some template. -- John of Reading (talk) 08:21, 18 October 2013 (UTC)
- I've been working on broken reference name problems for quite a while. A bot along the lines of BracketBot would be very helpful. Many of the "ref name" problems happen when experienced users are working on pages, copying material between articles, or doing cleanup rapidly. (AWB users like to change hyphens or spaces in reference names, for example.) Experienced users often don't slow down to preview pages and look for errors at the bottom. These are the editors who would immediately fix problems if they were informed. AnomieBOT's orphan reference fixer does a great job catching many things, but those put in accidentally by experienced users are often more subtle and harder to track down.
- Unlike BracketBot, it would be hard to give a message that pinpoints the problem. Also, unlike bracket problems, reference errors are easy to see in the article. It would be enough just to put a copy of the generated error message on the user's talk page. StarryGrandma (talk) 18:19, 18 October 2013 (UTC)
- Regardless of the details of how it would work out, I see this proposal as a wonderful idea. If we want a vague thing, we could use something like "$EDIT1 you made to $PAGE2 resulted in a reference coding error. Please return to the page to fix the problem. If you would like assistance with fixing this problem, please visit the Help Desk, where other editors will be happy to assist you." It would be sufficient for the experienced editors, and inexperienced editors would either understand what to do or they'd know where to go to get help. Nyttend (talk) 01:52, 19 October 2013 (UTC)
- I like the idea a lot!
- As for the wording, pooling from Nyttend, 64.40.54.174 and User:BracketBot/inform; how about something like: " Hello, I'm REFBot. I have automatically detected that [your edit] to [page] may have caused a reference coding error. Please take a look at the page and edit it to fix the problem if you can. If you would like assistance with fixing this problem, please visit the Help Desk, where other editors will be happy to assist you. If I misunderstood what happened, or if you have any questions, you can leave a message on [my operator's talk page]. Thanks ~" benzband (talk) 10:37, 22 October 2013 (UTC)
- Agree a bot like this would be a good idea. It might be hard to eliminate false positives though. I would suggest the bot leave a message saying something like "an error was found, please take a look" rather than "you caused an error". Best. 64.40.54.174 (talk) 05:06, 19 October 2013 (UTC)
- That's my opinion too. Just send a reminder to the editor who caused the problem a few minutes ago. Then he could fix the error in one minute, instead of us, who sometimes need 10 minutes or more (in at least these steps: see categories of pages with citation errors > open page > read > edit page in another window > view history in another window > compare selected versions > ponder what's wrong > fix > preview > save > close all pages...). Just now there are 250 pages in all three categories; that means ten hours of work even at just 2 1/2 minutes per page. The message could be just:
"Take a look at the page XYZ. There is a citation error. It could be in the text:
- A <ref> tag is missing the closing </ref>.
- or take a look at the bottom of the page:
- There are <ref> tags on this page, but the references will not show without a {{reflist}} template.
- A named reference was invoked but never defined.
- A reference was defined but isn't used in the text.
Thanks, RefBot talk 10:05, 21 October 2013 (UTC)"
- There is article traffic on pages with citation errors of about 25 views per day (see [10][11][12]). That could mean that about 10 or so users have been working hard on these pages with citation errors. If there were a REFBot, this work could be cut to 25-35% of what it is now. --Frze > talk 10:05, 21 October 2013 (UTC)
- This is a great idea. Categories such as Pages with missing references list and other error cats tend to fill up rather quickly, even with our attempts to maintain them. Other bots try to fix referencing errors, but there are so many different types of errors, all of which can be caused by multiple syntactical errors (here's a small list). It would be hard to program a bot to fix all of them. With this bot, all that is needed is for it to check whether the error is present, and in which edit it first appeared (to notify the correct editor). It does not need to check the specific syntax that caused the error. The editor will be able to see what he did and fix it easily. It's a much simpler solution than programming a bot to fix them itself.
"It might be hard to eliminate false positves though."
There might be a small problem with valid checking if templates are present in the article. I've seen a fair share of error messages in articles that resulted from an edit to a template and not an edit to the article itself. The error message still shows up in the article. i.e. if a user adds a citation to the template and there isn't a{{Reflist}}
template in the article. The bot would have to check for that I assume. — JJJ (say hello) 15:15, 22 October 2013 (UTC)
- I support this request for creation of a RefBot. Pointing editors towards problems in their edits not only reduces the need for other editors to fix these problems later but provides an opportunity for the original editors to become aware of problems they are creating and how to avoid/fix them. - - MrBill3 (talk) 15:40, 22 October 2013 (UTC)
- Hey all, sorry it took me so long to comment on this idea. It seems a great idea, and there looks to be enough support on the issue. I will try and have a look into making this, though the main issue is that I don't have much time with all my university work. (Though stay tuned, because my year project is for Wikipedia.) If anyone could help me, by finding out how to detect the errors in real time, I'll look into modifying BracketBot's code to make ReferenceBot. (Or whatever name you want to vote for ) 930913(Congratulate) 18:31, 27 October 2013 (UTC)
- @Frze: Why do you distract me? D: Anyway, I have made a script to collect the previous day's mistakes.
Yesterday's mistakes (collapsed): Category:Pages with DOI errors
- I need these (a random sample?) checked to ensure that a notice is appropriate, and for each of the categories I'll need a template, or a single template with an insertion for relative phrases. 930913(Congratulate) 01:13, 28 October 2013 (UTC)
- 930913, I've got code that checks ISBN problems. It finds these problems. Yell if you want it.
- The main problem with BracketBot that needs to be fixed ASAP is that people go to 930913's talk page for questions. Huon answers most of the questions, with GoingBatty helping out. Questions for BracketBot and RefBot need to be directed to where more people can help out. Bgwhite (talk) 04:46, 28 October 2013 (UTC)
- 930913: Thanks for your efforts. Here are the categories we are interested in: Category:Pages with citation errors
- Category:Pages with broken reference names
- Category:Pages with incorrect ref formatting
- Category:Pages with missing references list --Frze > talk 04:00, 29 October 2013 (UTC)
- @Frze: Added
Yesterday's Mistakes (collapsed): Arjayay edited Transcendental Meditation technique, causing Category:Pages with ISBN errors, Category:Pages with citations using unsupported parameters, Category:Pages using citations with old-style implicit et al., Category:Pages using citations with accessdate and no URL
- Again, please check for reasons why any of the above shouldn't be included, and come up with wording for each category. Thanks, 930913(Congratulate) 17:52, 29 October 2013 (UTC)
- I am the editor heading the "Yesterday's Mistakes" list with
- Arjayay edited Transcendental Meditation technique causing Category:Pages with ISBN errors, Category:Pages with citations using unsupported parameters, Category:Pages using citations with old-style implicit et al., Category:Pages using citations with accessdate and no URL
- You clearly need to look at these edits more closely, before rushing in with a bot. My "edit" was to reinstate a page which had been blanked, and replaced by an inappropriate redirect. All of the mistakes I am accused of "creating" were, therefore, already there. If editors are going to be chased by a bot every time they reinstate a blanked page, or a blanked section, you are going to be inundated with complaints, and editors will either ignore your notifications, or turn the bot off.
You need to be able to identify when an editor has actually introduced the problem, and when they have merely reverted some vandalism, which includes problem(s) potentially made by numerous editors over a long period of time. Wikipedia has an unfortunate history of launching half tested software, e.g. bracket bot which doesn't understand basic things such as the use of greater or less than symbols, and points editors to the wrong place, so I fear the worst - Arjayay (talk) 18:25, 29 October 2013 (UTC)
- @Arjayay: More closely? At all. I don't have much time, so I'm relying on people like you to raise these issues, so I can properly code the bot. Obviously flagging reverts is undesirable, and will need to be removed for the approved implementation. Thank you for your participation, 930913(Congratulate) 22:32, 29 October 2013 (UTC)
- 930913: Thanks for your efforts again. We are only interested in: Category:Pages with citation errors, nothing else!
- because if there are more than 50 items in a category, it shows as a backlog.
- The categories you mentioned are impossible to work on; please try them later. There is a backlog today of about 60,000 pages:
- Category:Pages with ISBN errors - 7,598 pages
- Category:Pages with citations using unsupported parameters - 10,216 pages
- Category:Pages using citations with old-style implicit et al. - 4,408 pages
- Category:Pages using citations with accessdate and no URL - 42,060 pages
- We have to clean up the Category:Pages with citation errors first, so as not to leave any big red errors on the pages. Thank you for your attention --Frze > talk 21:26, 29 October 2013 (UTC)
- @Frze: Irrelevant, the script would work by notifying anyone who puts a page in those categories (i.e. not those already there). This bot will not clear the backlog; it will slow the backlog's growth, so as to aid your attempts to clear it. 930913(Congratulate) 22:32, 29 October 2013 (UTC)
- A930913: I know that the bot does not clear the backlog. I am not meshugge. But first of all we must not let the backlog in the Category:Pages with citation errors grow again to more than 1,500 pages! It took us more than a week of hard work to clean it up. Do what you want with the other categories; the main thing is to start the bot for Category:Pages with citation errors. In most exquisite gratitude --Frze > talk 23:09, 29 October 2013 (UTC)
- @Frze: The point is, for very little work, we cover a whole load more categories. 930913(Congratulate) 23:21, 29 October 2013 (UTC)
- A930913: The other point is the message to the user: as simple as possible. See benzband's contribution of 10:37, 22 October 2013. Please program the two different bots. --Frze > talk 23:44, 29 October 2013 (UTC)
"Please program the two different bots."
The whole point of this bot is to notify editors. There will only be one bot. — JJJ (say hello) 00:38, 30 October 2013 (UTC)
- @Frze: You're missing the point, each category can have its own message. 930913(Congratulate) 00:37, 30 October 2013 (UTC)
- 930 and JJJ, methinks what Frze was talking about is that there is a prioritized-need for the various use-cases of the bot, with citation-errors being the most critical (since they are very difficult to correct without help from the editor who originally created the trouble). The least critical is the 42k pages that have "cite-w/-accessDate-but-no-URL" which basically means, somebody cited a page-number from a printed book, and then went ahead and specified that their info was 'retrieved on' November 1st of 2013. Which is pointless, since as long as they specify edition/format/isbn of the book, the date they looked up the fact in that book does not matter, the retrieved-on param is only for URL-based cites, since the contents of the URL often suffer from bitrot, whereas deadtree books do not so suffer.
- Anyways, while I understand that there need only be a single bot (or filter, see my suggestion below), I strongly suggest that RefBot should not be turned loose on all the possible error categories, simultaneously. We do not want to put template-spam on the talkpage of a user, which says "you made a reference-coding error" when all they did was harmlessly put the retrieved-on param into a cite from a deadtree book. Why waste their time? But we especially don't want to *rollback* such changes, of course. Point being, although there will only be one bot, I agree with Arjayay about the importance of doing serious careful testing, so that we are dead-sure RefBot handles all the odd corner-cases properly, before we unleash it on millions of unsuspecting editors. We do not want to *discourage* people from adding references! That's hard enough to get them to do in the first place.
- Therefore, at the end of the day I agree with Frze, but for a different reason: strongly suggest that RefBot be implemented so that it can be up-front configured to ignore all categories which are not explicitly specified. That way, we can do a staged rollout, testing RefBot against a small subset of all possible cite-errors, before we expand the category-count. Eventually, of course, RefBot may be so bullet-proof-user-friendly that we can permit it to ignore nothing, and at that point the category-inclusion-exclusion-code can be taken out. But in the first months of RefBot testing, methinks it will prove very valuable to just focus on one sort of cite-error at a time, beginning with Category:Pages with citation errors where we know there are some beta-testers full of wikithusiasm for the wondrous powers of RefBot. :-) Appreciate your time; thanks for improving wikipedia. 74.192.84.101 (talk) 11:33, 4 November 2013 (UTC)
A930913 TheJJJunk I'm looking forward with happy anticipation to the implementation of my idea. Thanks to you all. --Frze > talk 04:28, 30 October 2013 (UTC)
- Question: Is the bot going to have an opt-out option like BracketBot does? Because with the brackets, the edit could have been very minor, not causing large-scale damage to the article. But with this, big red error messages occur because of the mistakes. This is something to think about: Do users get to decide if they get notified about their error, or do they have to be. — JJJ (say hello) 17:13, 30 October 2013 (UTC)
- Comment. There are other possibilities... with xLinkBot, if the editor submits an external link to facebook (or some other greylisted website), xLinkBot will perform a rollback, then notify the editor on their talkpage. This is often problematic, because even if the edit was ten kilobytes, xLinkBot will remove everything -- parsing out just the greylisted hyperlink is too server-intensive and error-prone. Another problem, is that rollback can wipe out a long series of edits, only the last of which added the greylisted link. Given these existing possibilities, and the BracketBot behavior JJJ mentions:
- Forcible-Prevention-Filter. RefBot should be implemented as an edit-filter, and immediately warn the user when they preview or save a busted ref, refusing to let them save it in a broken state (they must fix it first)
- Loud-Warning-Filter. RefBot should be implemented as an edit-filter, and immediately warn the user when they preview or save a busted ref, but permit the user to override and save anyways (in the broken state)... then nothing
- Silent-Warning-Filter. Same as #2. Additionally, RefBot allows the editor to opt-out of receiving RefBot filter-warnings.
- Loud-Fix-It-Later-Filter. RefBot should be implemented as an edit-filter, and immediately warn the user when they preview or save a busted ref, but permit the user to override and save anyways (in the broken state)... however, after their edit is saved, RefBot should rollback that one edit (not rollback the last N edits by the editor in question), and then RefBot should automagically post to the article-talkpage with a diff-link to the attempted-ref-edit that it just reverted
- Silent-Fix-It-Later-Filter. Same as #4. Additionally, RefBot allows the editor to opt-out of receiving RefBot filter-warnings.
- Loud-Warning-Bot. RefBot should be implemented as a bot, and eventually warn the editor on their talkpage, but should leave the article alone (no opt-out capability)
- Silent-Warning-Bot. Same as #6. Additionally, RefBot allows the editor to opt-out of receiving RefBot talkpage-messages.
- Loud-Fix-It-Later-Bot. RefBot should be implemented as a bot, and eventually warn the editor on their talkpage, plus RefBot should rollback that one edit (not rollback the last N edits by the editor in question), and then RefBot should automagically post to the article-talkpage with a diff-link to the attempted-ref-edit that it just reverted. Plus, ideally, RefBot's user-talkpage-message should have a one-click-to-put-my-broken-edit-back hyperlink, which also redirects the editor to the article (this prevents them from needing to manually visit the article, enter the edit-history, manually undo RefBot, and then go back to editing the article). Since the editor might not utilize the one-click-magic 'soon' by standards of how quickly the article in question is changing, prolly the one-click-magic should only work if the sub-section of the article in question has *not* been changed by any editors, since RefBot reverted this editor's work; otherwise, the one-click-magic might do more harm than good.
- Silent-Fix-It-Later-Bot. Same as #8. Additionally, RefBot allows the editor to opt-out of receiving RefBot talkpage-messages.
- Obviously, there are additional variations that are possible, such as #7-less-the-article-talkpage-feature, or whatever. But I think these options cover the main *types* of behaviors that we might want. 74.192.84.101 (talk) 11:04, 4 November 2013 (UTC)
(Arbitrary break for ease of editing)
I support the idea of this bot existing with functionality similar to that of BracketBot. I have specific ideas for a different bot that would actually fix CS1 citation errors, but I will describe that functionality in a separate request.
As stated above by others, I do not think it would be productive to apply this new bot's activity to all of the subcategories of Category:Articles with incorrect citation syntax. That would generate a LOT of error messages on people's Talk pages, and some error messages are not even displayed on the article pages by default, so it will be hard for people to figure out where they made an error or if they have fixed it. I recommend starting with the following categories, each of which has been emptied through diligent work by wikignomes:
- Pages with citations having wikilinks embedded in URL titles
- Pages with citations using conflicting page specifications
- Pages with citations using unnamed parameters
- Pages with DOI errors
- Pages with empty citations
- Pages with OL errors
- Pages with URL errors
Also as requested above, the bot should operate on articles in:
- Pages with broken reference names
- Pages with incorrect ref formatting
- Pages with missing references list
I estimate that a total of 20 to 50 articles are added to all of the categories above (combined) each day; someone here might be able to scrub the logs and get a better count.
The bot should post a message similar to Bracketbot's message on the Talk page of the editor who makes the change. Since these categories are already empty, the situation described above in which a revert reintroduces an error should be a rare case.
Also, the bot should have a built-in waiting period (Bracketbot waits five minutes) to allow editors to fix errors themselves if they notice them. Please contact me if you need help writing the error notification text for each category. – Jonesey95 (talk) 16:23, 6 November 2013 (UTC)
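As a rough sketch of the scan described above (assuming pywikibot's Category.articles(), Page.latest_revision and Site.server_time() behave as expected), a bot could iterate the three categories and skip anything still inside the grace period:

from datetime import timedelta
import pywikibot

site = pywikibot.Site('en', 'wikipedia')
WAIT = timedelta(minutes=10)                 # grace period discussed above
CATEGORIES = [
    'Category:Pages with broken reference names',
    'Category:Pages with incorrect ref formatting',
    'Category:Pages with missing references list',
]

def pages_to_report():
    """Yield (page, last editor, revid) for category members past the grace period."""
    now = site.server_time()                 # the wiki's current time, per pywikibot
    for name in CATEGORIES:
        for page in pywikibot.Category(site, name).articles():
            rev = page.latest_revision
            if now - rev.timestamp < WAIT:
                continue                     # the editor may still be fixing it
            # NB: the last editor is not necessarily who introduced the error;
            # matching the edit that added the category needs history inspection.
            yield page, rev.user, rev.revid

for page, user, revid in pages_to_report():
    print(page.title(), user, revid)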
- @Jonesey95, Frze, and TheJJJunk: Started on the three categories and have come up with this so far.
- "Category:Pages with broken reference names" --> "a [[:Category:Pages_with_broken_reference_names|broken reference name]] <small>([[Help:Cite_errors/Cite_error_references_no_text|help)</small>"
- "Category:Pages with incorrect ref formatting" --> "a [[:Category:Pages_with_incorrect_ref_formatting|cite error]] <small>([[Help:Cite errors|help]])</small>"
- "Category:Pages with missing references list" --> "a [[:Category:Pages with missing references list|missing references list]] <small>([[Help:Cite_errors/Cite_error_refs_without_references|help]] {{!}} [[Help:Cite_errors/Cite_error_group_refs_without_references|help with group references]])</small>"
Examples of what the bot would generate from that:
- On User:Tesfazgi Teklezgi: Please check these pages and fix the errors highlighted. If you think this is a false positive, you can report it to my operator. Thanks, 930913(Congratulate) 16:26, 7 November 2013 (UTC)
- On User:14.139.160.4: Please check this page and fix the errors highlighted. If you think this is a false positive, you can report it to my operator. Thanks, 930913(Congratulate) 16:26, 7 November 2013 (UTC)
- On User:98.230.108.226: Please check this page and fix the errors highlighted. If you think this is a false positive, you can report it to my operator. Thanks, 930913(Congratulate) 16:26, 7 November 2013 (UTC)
- On User:Soetermans: Please check this page and fix the errors highlighted. If you think this is a false positive, you can report it to my operator. Thanks, 930913(Congratulate) 16:26, 7 November 2013 (UTC)
- On User:Chrisd915: Please check this page and fix the errors highlighted. If you think this is a false positive, you can report it to my operator. Thanks, 930913(Congratulate) 16:26, 7 November 2013 (UTC)
- On User:71.173.129.226: Please check this page and fix the errors highlighted. If you think this is a false positive, you can report it to my operator. Thanks, 930913(Congratulate) 16:26, 7 November 2013 (UTC)
- On User:128.8.228.120: Please check this page and fix the errors highlighted. If you think this is a false positive, you can report it to my operator. Thanks, 930913(Congratulate) 16:26, 7 November 2013 (UTC)
- These would be daily reports (BracketBot now uses a ten-minute delay, though its page hasn't been updated), more like DPL bot (signed as ReferenceBot, not with my signature, and placed on the talk page, not the userpage).
- The current templates used are {{User:ReferenceBot/inform/top}}, {{User:ReferenceBot/inform/middle}} and {{User:ReferenceBot/inform/bottom}}. See also User:ReferenceBot.
- I'll apply for approval soon. 930913(Congratulate) 16:26, 7 November 2013 (UTC)
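For what it's worth, the per-editor message could be assembled from those three templates along these lines; the '1=' and '2=' parameters passed to the middle template are invented placeholders, since the templates' real parameters aren't documented in this thread.

def build_message(problem_pages):
    """problem_pages: list of (article title, error description) pairs."""
    parts = ['{{User:ReferenceBot/inform/top}}']
    for title, error in problem_pages:
        # '1=' and '2=' are placeholder parameter names, not the real ones.
        parts.append('{{User:ReferenceBot/inform/middle|1=%s|2=%s}}' % (title, error))
    parts.append('{{User:ReferenceBot/inform/bottom}} ~~~~')
    return '\n'.join(parts)

print(build_message([
    ('Example article',
     'a [[:Category:Pages_with_broken_reference_names|broken reference name]]'),
]))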
- Nice work. I clicked through the links to the help pages and improved some of the help text. We want to be sure that people are able to fix problems once they are alerted to them. – Jonesey95 (talk) 19:21, 7 November 2013 (UTC)
- @A930913: Perhaps once/if the bot gets approved, it should have its own talk page instead of redirecting to your own? Just an idea. This could help isolate problems with this bot, given that BracketBot's talk page also redirects to the same place. — TheJJJunk (say hello) 20:45, 7 November 2013 (UTC)
- Good idea. I like Citation bot's Talk page; it gives an easy way to report problems or make suggestions.
- A daily report sounds fine to me, if you can figure out how to identify which edit introduced the problem and whether the problem still exists. You'll still want some sort of delay to allow people to fix their own edits if they see them. (For example, if the report runs daily at 23:59, it should ignore edits made between 23:49 and 23:59 that day, but include the edits made between 23:49 and 23:59 on the previous day, which the previous run skipped.) – Jonesey95 (talk) 22:51, 7 November 2013 (UTC)
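The windowing described in that parenthetical can be captured with two timestamps. A minimal sketch, assuming a daily run and a ten-minute grace period:

from datetime import datetime, timedelta

GRACE = timedelta(minutes=10)

def report_window(run_time, period=timedelta(days=1)):
    """Window of edits a run at run_time should examine: [start, end)."""
    end = run_time - GRACE          # defer the newest ten minutes to the next run
    return end - period, end

start, end = report_window(datetime(2013, 11, 7, 23, 59))
# start == datetime(2013, 11, 6, 23, 49), end == datetime(2013, 11, 7, 23, 49)
print(start <= datetime(2013, 11, 7, 23, 55) < end)   # False: handled by tomorrow's run
print(start <= datetime(2013, 11, 6, 23, 55) < end)   # True: picked up by today's run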
Move 30 Seconds to Mars links to Thirty Seconds to Mars
After a requested move and a move review, the page 30 Seconds to Mars was moved to Thirty Seconds to Mars, which is the official name of the band. After long discussions, it was decided that all links to 30 Seconds to Mars should be replaced with Thirty Seconds to Mars. Please fix them if you can.--95.245.58.53 (talk) 21:16, 11 November 2013 (UTC)
- I'd like to point out that at the requested move, 30 Seconds to Mars was moved to Thirty Seconds to Mars, and its move review was closed as endorsed. The last discussion is here, where it was definitively decided that the name Thirty Seconds to Mars is correct.--95.245.58.53 (talk) 14:43, 12 November 2013 (UTC)
- All links to 30 Seconds to Mars will redirect to Thirty Seconds to Mars. See WP:NOTBROKEN. — TheJJJunk (say hello) 15:22, 12 November 2013 (UTC)
- That is not the point. Thirty Seconds to Mars is the official name (see discussions), that's why 30 Seconds to Mars should be replaced.--95.245.58.53 (talk) 20:42, 12 November 2013 (UTC)
Thanks for your work. The same thing should be done for MTV Unplugged: 30 Seconds to Mars, Attack (30 Seconds to Mars song), Kings and Queens (30 Seconds to Mars song), Hurricane (30 Seconds to Mars song), Night of the Hunter (30 Seconds to Mars song), Search and Destroy (30 Seconds to Mars song), City of Angels (30 Seconds to Mars song), Do or Die (30 Seconds to Mars song).--Earthh (talk) 20:13, 13 November 2013 (UTC)
- Half done. All of the links have been replaced, and I hit some of the major templates too. The only ones that should remain are the ones linked through templates; once a user makes an edit to such a page, it will drop off the list. If any are still linked, you'll need to check whether any templates still contain the old links. — TheJJJunk (say hello) 23:12, 13 November 2013 (UTC)
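For the record, a simplified pywikibot sketch of this kind of link replacement; piped links, section links, templates and case variants need extra care, so this illustrates the approach rather than what was actually run.

import re
import pywikibot

site = pywikibot.Site('en', 'wikipedia')
OLD, NEW = '30 Seconds to Mars', 'Thirty Seconds to Mars'
LINK_RE = re.compile(r'\[\[30 Seconds to Mars(\|[^\]]*)?\]\]')

def fix_links():
    old_page = pywikibot.Page(site, OLD)
    for page in old_page.backlinks(namespaces=0):
        new_text = LINK_RE.sub(
            # Keep the original display text: [[Old]] becomes [[New|Old]].
            lambda m: '[[%s%s]]' % (NEW, m.group(1) or '|' + OLD), page.text)
        if new_text != page.text:
            page.text = new_text
            page.save(summary='Updating links: %s -> %s' % (OLD, NEW))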
Dead archiveurl detection, repair
The Wayback Machine respects robots.txt across time. If a website has a robots.txt that permits archiving at one point, an editor could archive that page; a subsequent change to robots.txt on that site could lead to an inaccessible archive. For example:
South Park has links to
- http://web.archive.org/web/20050515091804/http://www.peremoga.gov.ua/index.php?2150005000000000020
- http://web.archive.org/web/20050515100506/http://www.peremoga.gov.ua/index.php?2150005000000000070
WebCite doesn't cause us problems in this way.
I believe a bot is required to repair these archive links. They can be detected by running a report against the database for all external links to archive.org and, for each link, checking that it still works (would a HEAD request be sufficient?). Dead archiveurl links would need to be archived at WebCite, or if the original link is unavailable then they need flagging with {{dead}}. Josh Parris 02:22, 16 November 2013 (UTC)
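For illustration, a minimal link check along those lines using the Python requests library. Whether a HEAD request reliably distinguishes robots-blocked snapshots, and which status codes the Wayback Machine returns for them, is an assumption that would need verifying during a trial.

import requests

def archive_link_is_dead(url, timeout=30):
    """True if the Wayback snapshot no longer responds successfully."""
    try:
        resp = requests.head(url, allow_redirects=True, timeout=timeout)
    except requests.RequestException:
        return True                      # network error: treat as dead, recheck later
    return resp.status_code >= 400

for url in [
    'http://web.archive.org/web/20050515091804/http://www.peremoga.gov.ua/index.php?2150005000000000020',
    'http://web.archive.org/web/20050515100506/http://www.peremoga.gov.ua/index.php?2150005000000000070',
]:
    print(url, 'DEAD' if archive_link_is_dead(url) else 'OK')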
Mass diacritics correction
I have a task for bots: all of the cedilla diacritics Ş ş Ţ ţ need to be replaced with the comma-below forms Ș ș Ț ț in articles in categories about Moldova and Romania. In Romanian, the second variant is correct, but because of a bug in Windows XP the wrong cedilla diacritics were often typed instead. I have corrected some of the football articles, but there are many more. A bot that can also rename pages is needed, because sometimes the wrong diacritics appear in the title. Examples:
- https://en.wikipedia.org/enwiki/w/index.php?title=Chi%C5%9Fin%C4%83u&redirect=no
- https://en.wikipedia.org/enwiki/w/index.php?title=Ro%C5%9Fia_Montan%C4%83&redirect=no
I repeat that Ş ş Ţ ţ do not exist in Romanian; those are Turkish diacritics, so you can freely run a bot in the ″Moldova″ and ″Romania″ categories plus all their subcategories. Thanks. XXN (talk) 14:49, 17 November 2013 (UTC)
- See Wikipedia:Bots/Requests for approval/VoxelBot 2. Also see Wikipedia:Bot requests/Archive 39#Cedilla to Comma below bot for articles under Romanian place names and people and Wikipedia:Bot requests/Archive 52#Romanian orthography for past discussion of this sort of thing. In short, someone doing this must be very careful that the words they are altering are in fact Romanian rather than Turkish, Kurdish, Azerbaijani, Crimean Tatar, Gagauz, Tatar, or Turkmen (based on the list at Cedilla#S). Anomie⚔ 17:22, 17 November 2013 (UTC)
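A sketch of the mechanical substitution itself, limited to the four cedilla/comma pairs named in the request, using Python's str.translate. As noted above, the hard part is deciding which text is actually Romanian; that check is assumed to have been done already and is not handled here.

CEDILLA_TO_COMMA = str.maketrans({
    '\u015E': '\u0218',  # Ş -> Ș
    '\u015F': '\u0219',  # ş -> ș
    '\u0162': '\u021A',  # Ţ -> Ț
    '\u0163': '\u021B',  # ţ -> ț
})

def fix_romanian_diacritics(text):
    return text.translate(CEDILLA_TO_COMMA)

print(fix_romanian_diacritics('Chişinău, Roşia Montană'))  # -> Chișinău, Roșia Montană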
Looking for technical mentors and tasks for Google Code-in
Hi, I'm one of the Wikimedia org admins at mw:Google Code-in. We are looking for technical tasks that can be completed by students, e.g. creating or updating a bot, improving its documentation... We also need mentors for these tasks. You can start simple, with one mentor proposing one task, or you can use this program to organize a task force of mentors with the objective of getting dozens of small technical tasks completed. You can check the current Wikimedia tasks here. The program started on Monday, but there is still time to jump in. Give Google Code-in students a chance!--Qgil (talk) 16:11, 21 November 2013 (UTC)
bots
hello, how do i go about getting a bot for my chatroom? — Preceding unsigned comment added by Hannsg8000 (talk • contribs) 19:06, 21 November 2013 (UTC)
- Generally, we can only help with bots related to Wikipedia. However, if it is an IRC related chatroom, meta:wm-bot may be useful to you. --Mdann52talk to me! 13:29, 22 November 2013 (UTC)
Requesting script modification
Hi all, I was recently granted a trial with my bot (see Mdann52 bot BRFA). However, it turned out that the script I was trying to use (mw:Manual:Pywikibot/weblinkchecker.py) did not check links between ref tags, so it was not as useful for the task as I first thought. As my Python skills are not very good at the minute, could someone rewrite the script (or produce a version of it) so that it only checks links between ref tags (and possibly ignores any tagged with {{dead link}})? Thanks --Mdann52talk to me! 13:39, 22 November 2013 (UTC)
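For illustration, a rough sketch of that ref-only filtering in plain Python. The regexes are deliberately simple and will miss edge cases (nested templates, unusual ref markup), so treat this as a starting point for adapting weblinkchecker.py rather than a drop-in replacement.

import re

REF_RE = re.compile(r'<ref[^>/]*>(.*?)</ref>', re.IGNORECASE | re.DOTALL)
URL_RE = re.compile(r'https?://[^\s\]<|}]+')
DEAD_RE = re.compile(r'\{\{\s*dead\s*link', re.IGNORECASE)

def links_in_refs(wikitext):
    """Yield external links that appear inside <ref> tags not marked dead."""
    for match in REF_RE.finditer(wikitext):
        body = match.group(1)
        if DEAD_RE.search(body):
            continue                      # already tagged with {{dead link}}
        yield from URL_RE.findall(body)

sample = ('<ref>[http://example.org/a Source A]</ref> '
          '<ref>[http://example.org/b B] {{dead link}}</ref>')
print(list(links_in_refs(sample)))        # ['http://example.org/a']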
Updating a table with some stuff
There's currently a table of women physicists at User:Headbomb/sandbox2. If someone could code a bot to fetch the articles, and fill in the other columns, that would be nice and much appreciated.
- DOB (Date of Birth): Use YYYY-MM-DD format, or YYYY-MM, or YYYY if the information is incomplete; use "—" if the information is missing.
- DOD (Date of Death): Use YYYY-MM-DD format, or YYYY-MM, or YYYY if the information is incomplete; use "—" if the information is missing.
- Tagged?: List projects that tag the article, alphabetized. That is, if the talk page is tagged by {{WikiProject Physics}} and {{WP Biography}}, list {{WikiProject Biography}}, then {{WikiProject Physics}}, with linebreaks between them.
- Class: Max rating found in banners, i.e. if you find Start and Stub, list Start.
- Link count: How many times the article and its redirects are linked to (mainspace count only, exclude redirects).
For clarity, I've filled in the first line of the table. The request is for a one-time run for now, but a weekly/monthly run could be set up at some point in the future, when the table is hosted at its permanent location. Feel free to do tests directly on my sandbox2. Headbomb {talk / contribs / physics / books} 18:01, 26 November 2013 (UTC)
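A partial sketch covering only the "Tagged?" and "Class" columns, assuming pywikibot and mwparserfromhell: read the talk page, collect WikiProject banner names, and keep the highest class rating seen. The banner-name heuristic and the CLASS_ORDER ranking are assumptions, and DOB/DOD and link counts are not attempted here.

import mwparserfromhell
import pywikibot

site = pywikibot.Site('en', 'wikipedia')
CLASS_ORDER = ['stub', 'start', 'c', 'b', 'ga', 'a', 'fa']   # lowest to highest

def banners_and_class(article_title):
    talk = pywikibot.Page(site, article_title).toggleTalkPage()
    banners, best = [], None
    for tpl in mwparserfromhell.parse(talk.text).filter_templates():
        name = str(tpl.name).strip()
        if not (name.startswith('WikiProject') or name.startswith('WP ')):
            continue                                  # not a project banner
        banners.append(name)
        if tpl.has('class'):
            rating = str(tpl.get('class').value).strip().lower()
            if rating in CLASS_ORDER and (
                    best is None or CLASS_ORDER.index(rating) > CLASS_ORDER.index(best)):
                best = rating
    return sorted(banners), best          # class is returned lowercased

print(banners_and_class('Marie Curie'))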
Project banners for WikiProject Women artists
The newly formed WikiProject Women artists could use a bot to add project banners to the talk pages of articles within certain categories. Gobōnobō + c 03:55, 28 November 2013 (UTC)
- You might want to ask Anome or User:Magioladitis. They're about the only ones left with a bot who might be willing to do that. 108.45.104.69 (talk) 04:00, 28 November 2013 (UTC)
- Great. Thank you 108. Gobōnobō + c
- I can do the tagging only if I get specific instructions. -- Magioladitis (talk) 09:39, 28 November 2013 (UTC)
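By way of illustration, the specific instructions might boil down to something like the following pywikibot sketch: add the project banner to the talk page of every article in an agreed category, skipping pages already tagged. The category name, edit summary and already-tagged check are placeholders to be agreed with the project, and nothing should run before approval.

import pywikibot

site = pywikibot.Site('en', 'wikipedia')
BANNER = '{{WikiProject Women artists}}'

def tag_category(category_name):
    cat = pywikibot.Category(site, category_name)
    for article in cat.articles():
        talk = article.toggleTalkPage()
        text = talk.text if talk.exists() else ''
        if 'WikiProject Women artists' in text:
            continue                          # already tagged
        talk.text = BANNER + '\n' + text
        talk.save(summary='Tagging for WikiProject Women artists (bot request)')

# tag_category('Category:Women artists')      # example call; run only after approval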
Ban violation bot
A bot that finds ban violations (e.g. editing someone's userpage when there is an interaction ban between the two editors, editing during a site ban, etc.) and reports them, and possibly reverts them. 2AwwsomeTell me where I screwed up.See where I screwed up. 20:02, 26 November 2013 (UTC)
- I've honestly been brainstorming a bot like this for months now. But there are a lot of potential issues that would need resolving.—cyberpower ChatOnline 13:03, 27 November 2013 (UTC)
- Topic bans might run off the category system, but then there's the "broadly construed" aspect of many bans. Interaction bans aren't normally just about userpages, are they? Josh Parris 10:49, 28 November 2013 (UTC)
- No, but some would be difficult. And the userpage one was just an example. WikiProjects might be better than categories, unless the topic doesn't have a WikiProject. And there might be something to check whether edits were probably reverts of vandalism. 2AwwsomeTell me where I screwed up.See where I screwed up. 16:40, 28 November 2013 (UTC)
- I've been experimenting with a few setups for potentially running a bot like this, with encouraging results. I might just take this task up.—cyberpower OnlineHappy Thanksgiving 16:44, 28 November 2013 (UTC)
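As a very rough illustration of the detection side only (assuming pywikibot's recentchanges generator yields dicts with 'user', 'title' and 'revid' keys), one could scan recent changes against a manually maintained ban list. Building and maintaining that list, and handling "broadly construed" topic bans, is the genuinely hard part discussed above; the BANS structure below is entirely hypothetical.

import pywikibot

site = pywikibot.Site('en', 'wikipedia')

# Hypothetical ban list: restricted user -> pages they may not edit.
BANS = {
    'ExampleUserA': {'User:ExampleUserB', 'User talk:ExampleUserB'},
}

def recent_violations(limit=500):
    for change in site.recentchanges(total=limit):
        user, title = change.get('user'), change.get('title')
        if user in BANS and title in BANS[user]:
            yield user, title, change.get('revid')

for user, title, revid in recent_violations():
    print('Possible ban violation: %s edited %s (rev %s)' % (user, title, revid))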
Bot for repetitive tasks
I want a bot, automated or semi-automated, for making repetitive edits that would be extremely tedious to do manually; for example, adding the same category or template to 1,000 articles. --DIYAR DESIGN (talk) 18:08, 27 November 2013 (UTC)
- In general that's what bots are supposed to do. Any chance for more details? Hasteur (talk) 19:36, 27 November 2013 (UTC)
- I also want a bot, to do that kind of stuff for me. I want to be able to tell it what to do, and it does it, no questions asked. That's why I chose to become a programmer and a bot-op. Now I have 2 active bots that I can boss around. :p—cyberpower ChatOnline 20:58, 27 November 2013 (UTC)
So you want a bot. What sort of bot? What do you want it to do? The idea is not well explained. 2AwwsomeTell me where I screwed up.See where I screwed up. 17:06, 28 November 2013 (UTC)
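For reference, the kind of edit described in this request (adding one category to a long list of articles) is usually handled with AWB or a short pywikibot script along these lines; the page list and category name below are placeholders, and such a run would still need a bot approval or semi-automated supervision.

import pywikibot

site = pywikibot.Site('en', 'wikipedia')
CATEGORY = '[[Category:Example category]]'             # placeholder
TITLES = ['Example article 1', 'Example article 2']    # placeholder list of pages

for title in TITLES:
    page = pywikibot.Page(site, title)
    if CATEGORY in page.text:
        continue                                       # already categorised
    page.text = page.text.rstrip() + '\n' + CATEGORY + '\n'
    page.save(summary='Adding category (semi-automated, per request)')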