
Wikipedia:Bots/Requests for approval/Lonjers french region rename bot


This is an old revision of this page, as edited by Lonjers (talk | contribs) at 21:06, 13 January 2016 (Discussion). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Operator: Lonjers (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 03:36, Thursday, January 7, 2016 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: https://github.com/utilitarianexe/wiki_france_region_rename

Function overview: Renames regions in the infoboxes of French department, commune, and arrondissement articles to reflect France's consolidation of its regions as of 1 January 2016.

Links to relevant discussions (where appropriate): https://en.wikipedia.org/wiki/Wikipedia:Bot_requests#New_French_regions_on_1_January https://en.wikipedia.org/wiki/User_talk:AHeneen#help_with_info_box_renaming https://en.wikipedia.org/wiki/Module_talk:Wikidata#Suggested_test_case:_New_French_Regions

Edit period(s): one time

Estimated number of pages affected: 30,000

Exclusion compliant (Yes/No): Not yet; will add the code tonight

Already has a bot flag (Yes/No):

Function details: A fairly simple find-and-replace task. The bot first gets the list of all French department, commune, and arrondissement articles. It then searches each article in the list for the appropriate infobox and reads from it the region the article is currently mapped to. That region is then mapped to the new region it belongs to. Because the regions are simply being consolidated, not rearranged, the new region name can simply replace the old one. An example of an already corrected article (these are skipped automatically by the bot) is https://en.wikipedia.org/wiki/Ard%C3%A8che; one that is not fixed yet is https://en.wikipedia.org/wiki/Lot_%28department%29. It was suggested to use Wikidata properties instead of a simple name replace. This is a good idea but would require much more work, and many of the pages that need to be corrected do not even have Wikidata items yet. I plan on working on this too, but I think the simple name replace should be done first. There may be cases where my regexes don't properly match the infoboxes of some articles. Before making any edits, I plan to run the program in a check-only mode to find any such cases, since it is not possible to do this manually, and I will fix the code if any problems are found. But even this check requires looking at many articles, so I think I need approval first. This is my first time trying this, so please be patient.
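
The find-and-replace step described above could look roughly like the sketch below. It is not the code from the linked repository: the function name `rename_region`, the status strings, the regex, and the deliberately partial `OLD_TO_NEW` table (only the Auvergne and Rhône-Alpes merger) are assumptions made for illustration. The returned status is what a check-only run could log to find articles whose infobox does not match the regex.

```python
import re

# Illustrative, deliberately partial mapping (assumption): only one merger is listed.
OLD_TO_NEW = {
    "Rhône-Alpes": "Auvergne-Rhône-Alpes",
    "Auvergne": "Auvergne-Rhône-Alpes",
}
NEW_NAMES = set(OLD_TO_NEW.values())

# Matches one infobox parameter line such as "|region = Rhône-Alpes".
# Group 2 is the parameter value without trailing spaces or tabs.
REGION_PARAM = re.compile(
    r"^([ \t]*\|[ \t]*region[ \t]*=[ \t]*)([^\n]*?)([ \t]*)$",
    re.MULTILINE,
)


def rename_region(wikitext):
    """Return (wikitext, status); the text changes only when status is 'renamed'."""
    match = REGION_PARAM.search(wikitext)
    if match is None:
        return wikitext, "no_region_param"   # regex did not match: needs a human look
    value = match.group(2)
    if value in NEW_NAMES:
        return wikitext, "already_current"   # already corrected: skip
    if value not in OLD_TO_NEW:
        return wikitext, "unrecognized"      # unexpected value: needs a human look
    # Splice the new name in place of the old value, leaving everything else intact.
    updated = wikitext[:match.start(2)] + OLD_TO_NEW[value] + wikitext[match.end(2):]
    return updated, "renamed"
```

In a check-only run, the bot would call `rename_region` on every page, save nothing, and list the pages that come back as `no_region_param` or `unrecognized`.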

Discussion

Hi, User:AHeneen brought your proposal to my attention. You do not need to change the regions in the commune infoboxes because that is done automatically, using the INSEE code. See for instance Largentière: the infobox contains the line "|region = Rhône-Alpes", but this is ignored because the infobox uses the first two numbers of the INSEE code ("07132") to determine it is in the (new) region Auvergne-Rhône-Alpes. I have already updated the regions in all the relevant department, arrondissement and canton articles (infobox and article text). What still needs to be done is change the regions in the article text for the communes. Maybe a bot can help there. But be careful, because not all references to an old region should be changed, for instance Alsace may refer to the traditional region, not the former administrative region. Markussep Talk 08:36, 13 January 2016 (UTC)
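
The INSEE-based lookup described in the comment above can be pictured with the following sketch. It is only an illustration in Python, not the infobox template's actual implementation: the function names, the `INSEE` dictionary key, and the one-entry `DEPARTMENT_TO_REGION` table (filled in only for the "07" example) are assumptions. Department codes that are not two plain digits would need extra handling, which the sketch ignores.

```python
# Hypothetical lookup table; only the department from the example is filled in.
DEPARTMENT_TO_REGION = {
    "07": "Auvergne-Rhône-Alpes",  # Ardèche
}


def region_from_insee(insee_code):
    """Return the region for a commune INSEE code, or None if the department is unknown."""
    department = insee_code.strip()[:2]  # first two characters identify the department
    return DEPARTMENT_TO_REGION.get(department)


def effective_region(infobox_params):
    """Prefer the INSEE-derived region; fall back to the stored region value."""
    derived = region_from_insee(infobox_params.get("INSEE", ""))
    if derived is not None:
        return derived
    return infobox_params.get("region")
```

With the Largentière values, `effective_region({"region": "Rhône-Alpes", "INSEE": "07132"})` ignores the stale `region` entry, which is why a bot edit to that parameter would have no visible effect.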

Context-sensitive changes are very tricky for bots. We can try to come up with a restrictive replacement, maybe, but (and I'm sorry to say this) a manually-assisted AWB job might be the way to go here. Needs a closer look to see how these articles are structured. — Earwig talk 08:44, 13 January 2016 (UTC)
Thanks for the responses. Yikes, that must have taken some time to edit all those manually. But it is better in the end because you fixed it in the article text too. I was avoiding that because of the context problems. User:The Earwig, I think you could still edit the text in all the commune articles if you restricted the change to just the first sentence at the top of each article. Nearly all of them have the form "in the *** region" there, and you could just skip the ones whose first sentence doesn't exactly match that. I do agree, though, that changing it anywhere else would require manual checking. Let me know if you think that is a good idea to try and I will modify my code for that task. Kinda just trying to find a way to still use the code that I wrote. But if you don't think it is a good idea, that is OK too; it was still good practice. Lonjers (talk) 21:03, 13 January 2016 (UTC)
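
A minimal sketch of the restricted replacement proposed above: only the first sentence is touched, and only when it contains exactly one "in the <old region> region" phrase. Everything here is an assumption for illustration (the function names, the naive sentence split on the first period, and the partial `OLD_TO_NEW` table). Real article text contains wikilinks and abbreviations that this plain-text version does not handle.

```python
import re

# Illustrative, deliberately partial mapping (assumption): only one merger is listed.
OLD_TO_NEW = {
    "Rhône-Alpes": "Auvergne-Rhône-Alpes",
    "Auvergne": "Auvergne-Rhône-Alpes",
}

# Built from the known old names, so only an exact "in the <old name> region" matches.
_names = "|".join(re.escape(n) for n in sorted(OLD_TO_NEW, key=len, reverse=True))
IN_THE_REGION = re.compile(r"\bin the (" + _names + r") region\b")


def first_sentence_end(text):
    """Naive split: index just past the first period followed by whitespace or end of text."""
    match = re.search(r"\.(?:\s|$)", text)
    return len(text) if match is None else match.start() + 1


def rename_in_lead(text):
    """Return (text, changed); only the first sentence is ever modified."""
    end = first_sentence_end(text)
    lead, rest = text[:end], text[end:]
    matches = list(IN_THE_REGION.finditer(lead))
    if len(matches) != 1:
        return text, False  # no exact match, or an ambiguous one: skip the article
    m = matches[0]
    new_lead = lead[:m.start(1)] + OLD_TO_NEW[m.group(1)] + lead[m.end(1):]
    return new_lead + rest, True
```

Mentions of an old region name later in the article are left alone, which keeps the context-sensitive cases (such as a traditional region that shares a name with a former administrative one) out of the bot's reach.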