
Wikipedia:Bots/Requests for approval/Lonjers french region rename bot


This is an old revision of this page, as edited by Lonjers (talk | contribs) at 02:40, 31 January 2016 (Lonjers french region rename bot). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Operator: Lonjers (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 03:36, Thursday, January 7, 2016 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: https://github.com/utilitarianexe/wiki_france_region_rename

Function overview: Removes the unused region and department parameters from the French commune infoboxes in articles on French communes.

Links to relevant discussions (where appropriate): https://en.wikipedia.org/wiki/Wikipedia:Bot_requests#New_French_regions_on_1_January https://en.wikipedia.org/wiki/User_talk:AHeneen#help_with_info_box_renaming https://en.wikipedia.org/wiki/Module_talk:Wikidata#Suggested_test_case:_New_French_Regions

Edit period(s): one time

Estimated number of pages affected: 30,000

Exclusion compliant: Yes

Already has a bot flag (Yes/No):

Function details: A fairly simple find-and-replace task. The bot first gets the list of all English Wikipedia articles on French communes. It then looks for the French commune infobox in each article on the list, finds the region and department parameters in that infobox, and removes them. The region and department are currently calculated from the INSEE code, so the leftover non-functional region and department parameters are confusing.
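Illustrative note: the removal step described above might look roughly like the following. This is a minimal sketch, not the code in the linked repository. It assumes one parameter per line, no nested templates inside the removed values, and the template name "Infobox French commune"; the function name `strip_unused_params` is hypothetical.

```python
import re

# Parameters the task proposes to remove (per the function details above).
UNUSED_PARAMS = ("region", "department")

# One whole "|name = value" line for an unused parameter.
# Assumption: one parameter per line, no nested templates or links with pipes in the value.
PARAM_LINE = re.compile(
    r"^[ \t]*\|[ \t]*(?:%s)[ \t]*=[^\n|{}]*\n" % "|".join(UNUSED_PARAMS),
    re.MULTILINE,
)

# The infobox call, from its opening braces to the first line that starts with "}}".
INFOBOX = re.compile(r"\{\{\s*[Ii]nfobox French commune\b.*?\n\}\}", re.DOTALL)


def strip_unused_params(wikitext: str) -> str:
    """Remove region/department lines, but only inside the first infobox found."""

    def clean(match: re.Match) -> str:
        return PARAM_LINE.sub("", match.group(0))

    return INFOBOX.sub(clean, wikitext, count=1)
```

Restricting the substitution to the infobox span keeps the article prose untouched. A wikitext parser such as mwparserfromhell would likely handle nested templates and unusual layouts more robustly than regular expressions.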

Discussion

Hi, User:AHeneen brought your proposal to my attention. You do not need to change the regions in the commune infoboxes because that is done automatically, using the INSEE code. See for instance Largentière: the infobox contains the line "|region = Rhône-Alpes", but this is ignored because the infobox uses the first two numbers of the INSEE code ("07132") to determine it is in the (new) region Auvergne-Rhône-Alpes. I have already updated the regions in all the relevant department, arrondissement and canton articles (infobox and article text). What still needs to be done is change the regions in the article text for the communes. Maybe a bot can help there. But be careful, because not all references to an old region should be changed, for instance Alsace may refer to the traditional region, not the former administrative region. Markussep Talk 08:36, 13 January 2016 (UTC)
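Illustrative note: the lookup described above (department prefix of the INSEE code, then region) can be sketched as follows. This is not the template's actual implementation. The table is deliberately partial, and the function names are hypothetical.

```python
from typing import Optional

# Partial, illustrative table: department code -> region name as of January 2016.
# A real table would cover every department.
REGION_BY_DEPARTMENT = {
    "07": "Auvergne-Rhône-Alpes",  # Ardèche (the Largentière example above)
    "14": "Normandy",              # Calvados
    "45": "Centre-Val de Loire",   # Loiret
    "75": "Île-de-France",         # Paris
}


def department_code(insee: str) -> str:
    """Return the department prefix of a five-character INSEE commune code."""
    insee = insee.strip().upper()
    if len(insee) != 5:
        raise ValueError(f"expected a five-character INSEE code, got {insee!r}")
    # Overseas departments use a three-character prefix (971, 972, ...);
    # Corsica uses 2A and 2B, which the two-character slice already covers.
    return insee[:3] if insee.startswith("97") else insee[:2]


def region_from_insee(insee: str) -> Optional[str]:
    """Look up the region, or return None if the department is not in the table."""
    return REGION_BY_DEPARTMENT.get(department_code(insee))
```

With this table, `region_from_insee("07132")` gives the new region for Largentière regardless of what a stale `|region =` line says.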

Context-sensitive changes are very tricky for bots. We can try to come up with a restrictive replacement, maybe, but (and I'm sorry to say this) a manually-assisted AWB job might be the way to go here. Needs a closer look to see how these articles are structured. — Earwig talk 08:44, 13 January 2016 (UTC)
Thanks for the responses. Yikes, that must have taken some time to edit all those manually. But better in the end because you fixed it in the article text too. I was avoiding that because of the context problems. User:The Earwig I think you could still edit the text in all the commune articles if you restricted it to just the line at the top of the articles. Nearly all of them have the form "in the *** region" at the top of the article, and you could just skip the ones that don't exactly match that in the first sentence. I do agree though that changing it anywhere else would require manual checking. Let me know if you think that is a good idea to try and I will modify my code for that task. Kinda just trying to find a way to still use the code that I wrote. But if you don't think it is a good idea, that is OK too; it was still good practice. Lonjers (talk) 21:03, 13 January 2016 (UTC)
There's an important lesson to be learned here about work being spent on bots that later turn out to be unnecessary. It happens, although for your first task it's a bit unfortunate. We can give what you are proposing a shot. I am thinking of some additional conditions, like skipping articles that already include the new region name (which have likely been migrated already) or have "was" in the same sentence as the region-to-replace, but it's still tricky to get right. You might want to start by going through the relevant articles and building a list of which ones the bot would definitely change, so we can get a sense of the number of edits and do some spot-checks. — Earwig talk 22:29, 13 January 2016 (UTC)
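Illustrative note: the restrictive replacement discussed in the two comments above could be sketched like this. It is a rough sketch under simplifying assumptions: the text is plain prose without link markup, the first sentence ends at the first ". ", and the function name `propose_first_sentence` is hypothetical. It returns None for every article it would skip, so the results can be collected into a list for spot-checks before any edit is made.

```python
import re
from typing import Optional


def propose_first_sentence(text: str, old: str, new: str) -> Optional[str]:
    """Return the edited text, or None if the article should be skipped."""
    # Skip articles that already mention the new name (likely migrated already).
    if new in text:
        return None
    # Naive sentence split: everything up to the first ". ".
    first, sep, rest = text.partition(". ")
    phrase = f"in the {old} region"
    # Only touch the exact "in the *** region" form in the first sentence.
    if phrase not in first:
        return None
    # "was" in the same sentence suggests a historical reference; leave it alone.
    if re.search(r"\bwas\b", first):
        return None
    return first.replace(phrase, f"in the {new} region", 1) + sep + rest
```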
Did some checking and actually not very many of the articles match a standard template. In general it seems most of the articles don't even include the region in the text. The French Wikipedia versions of the articles usually do, but those seem to already be updated. I guess we should close this request for now. Still looking for little programming tasks to do on Wikipedia if you have any suggestions. Lonjers (talk) 23:19, 16 January 2016 (UTC)
I did not know that the template ignored the Region parameter and used the INSEE number. Sorry for your wasted effort @Lonjers:. Changing the article links within the prose is a huge task and cannot be easily done with a bot, as mentioned above. Also, the region names are only temporary for a few months. The regional governments must choose a new name by 1 July and the national government then has until October to recognize or reject the new region names. Except for Normandy, all of the new region names in the prose of the articles must be changed again when the official name is approved. AHeneen (talk) 03:32, 14 January 2016 (UTC)
No worries Lonjers (talk) 23:19, 16 January 2016 (UTC)

@Lonjers: Following comments above, do you wish to proceed with this BRFA in any manner? —  HELLKNOWZ  ▎TALK 15:55, 17 January 2016 (UTC)

Let's wait to see how the discussion below with Rich goes. I probably do not want to proceed. But I will send you an update when I know for sure. Lonjers (talk) 22:44, 19 January 2016 (UTC)
@Hellknowz: So I think my plan now is to use this to just remove the unused region name parameter from the articles. Should be simple to update the code to work like this. Let me know if you think this is a good idea. Sorry for being so long getting back to you. Lonjers (talk) 22:37, 25 January 2016 (UTC)
Could you update the function details to the exact latest spec that you want to run so we know what the bot intends to do? —  HELLKNOWZ  ▎TALK 23:03, 25 January 2016 (UTC)
Done. I will update the code today too. Lonjers (talk) 21:19, 27 January 2016 (UTC)
Code is now updated to just do this small task. @Hellknowz: Lonjers (talk) 06:11, 29 January 2016 (UTC)
  • Let me just say that the region hack is just that: a hack. Once the new names are finalised updating the infoboxen would be a good idea.
Hmmm, can you explain why it would be a better solution to use a region name explicitly? Seemingly the template editors chose to do it this way for a good reason, as the template code looks pretty intense. I would be happy to use this to remove the now unused region parameters. But if there is a good reason, let me know and I will try to contact the people who made the template and we can change the template to use the explicit region. Lonjers (talk) 22:44, 19 January 2016 (UTC)
Let's suppose, for example, that someone, wittingly or unwittingly, changes the INSEE. A random editor seeing the wrong region would be at a loss to fix it. A better solution might be to calculate the region and compare it with the given region, adding the article to a hidden tracking category if they don't match. It would also be better to encapsulate the region calculation in a reusable manner, such as {{French region name from INSEE code}}. All the best: Rich Farmbrough, 22:45, 20 January 2016 (UTC).
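Illustrative note: the compare-and-track idea above would in practice live in the template itself, but its logic can be modelled in a few lines. This is only a sketch; the category name and the function names are hypothetical, and the calculated region is passed in by the caller rather than derived here.

```python
from typing import Optional

# Hypothetical name for the hidden tracking category.
TRACKING_CATEGORY = "Category:French communes with mismatched region"


def _normalise(name: str) -> str:
    """Collapse whitespace and ignore case so trivial differences do not count."""
    return " ".join(name.split()).casefold()


def tracking_category(given: Optional[str], calculated: Optional[str]) -> Optional[str]:
    """Return the tracking category if the given region disagrees with the calculated one."""
    # Nothing to compare if either value is missing.
    if not given or not calculated:
        return None
    if _normalise(given) == _normalise(calculated):
        return None
    return TRACKING_CATEGORY
```

An editor who finds an article in the tracking category then knows that either the INSEE code or the explicit region needs fixing.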
It certainly confused me how the template works. I think that removing the region param from the current pages would be a good first step. And then changing the template to include the new template you mentioned would make things much cleaner. That template could then be used in other places as well. Lonjers (talk) 22:37, 25 January 2016 (UTC)
  • I also think it would be a perfect pilot task to fix the "Centre" to "Centre-Val de Loire" (in the French commune infobox) now, so maybe continue this BRFA on that basis?
All the best: Rich Farmbrough, 21:20, 17 January 2016 (UTC).
So it is actually not obvious to me how the Centre gets in there instead of Centre-Val de Loire. It is not something in the markup of each commune article in the region. It is somehow being generated incorrectly by the template. I am going to look into fixing this in the template. Or if we do decide to explicitly add the region to each page's markup, the template should pull it from there, I think. @Rich Farmbrough: Lonjers (talk) 22:44, 19 January 2016 (UTC)
The infobox should show "Centre-Val de Loire", not Centre. It does for the communes I checked, except the caption of the detailed map, that's corrected now (may take some time / null-edit to show). For the new regions, these maps are not available yet, and maybe we should change them for department maps instead (when available). Indeed the parameter fields "department" and "region" in the infoboxes are not used anymore, so they can be removed (but don't have to). Markussep Talk 09:19, 20 January 2016 (UTC)

Updated task

What are the "appropriate info boxes" -- you have to give the exact list. Is it just {{Infobox French commune}}? Are redirects included? Should you also remove |department= per "The fields "region" and "department" are no longer used."? Please also leave a message with Wikipedia talk:WikiProject France to make sure no one has unforeseen objections. —  HELLKNOWZ  ▎TALK 13:27, 29 January 2016 (UTC)

The list of articles to edit is not found by looking for all articles with the template. It is based on Lists of communes of France. The bot only modifies them if they have Infobox French commune. All the articles I have seen use that one, at least on the English Wikipedia. Some of the other redirects don't seem to be used, at least in the few hundred articles I have looked through. But if they do come up when I go through all of them, I will edit those too. I was not planning on editing the commune articles on the French and Japanese Wikipedias. Those contain the other redirects. But I guess it would make sense to do those too. I just was not sure how to ask permission to also edit those, as I don't know the languages. I do plan on editing out the department parameter too. I edited the code and request for this. Going to leave a message on the project page now. Let me know if this makes sense. Lonjers (talk) 02:39, 31 January 2016 (UTC)