
Wikipedia:Link rot/URL change requests/Archives/2020/September

From Wikipedia, the free encyclopedia


Memento Mori

Hello Wikipedians!

Today I was browsing the Memento Mori page for a debate, and I noticed that this URL for a Bible passage is dead. Since I do not know much about the Bible, could anyone find a replacement URL for it?

Thanks in advance to anyone who helps! Friendly greetings from Dutch Wikipedia user 90.145.57.18 (talk) 09:49, 11 September 2020 (UTC) (User: Backupje)

That verse does not appear to exist, but here's a link to the whole chapter. – Jonesey95 (talk) 13:20, 11 September 2020 (UTC)

TSSZ News

The site was shut down in June 2020, but the domain is still live with a closure notice. The URL, http://www.tssznews.com, should be marked dead, and links to it on Wikipedia (such as in Sonic & Sega All-Stars Racing) should be changed to Wayback Machine archives. ❤︎PrincessPandaWiki (talk | contribs) 01:23, 15 September 2020 (UTC)

User:PrincessPandaWiki. Done. -- GreenC 02:46, 18 September 2020 (UTC)

Racing Post horse profiles

There has been an internal reorganisation of the horse profiles database at Racing Post (formerly at bloodstock.racingpost.com), so a number of citations (around a couple of thousand, almost all using {{cite web}}) should change to match. Fortunately the mapping from the old to the new URLs is fairly straightforward. Following advice from Help_talk:Citation_Style_1, the planned approach is to futureproof the citations by using a new specific-source wrapper for {{cite web}} which I've set up: {{cite Racing Post horse profile}}. Citations need to be converted from the form
{{cite web|url=http://bloodstock.racingpost.com/stallionbook/stallion.sd?horse_id=531769|title=Galileo stud record |publisher=Bloodstock.racingpost.com |accessdate=2014-05-26}}
to
{{cite Racing Post horse profile|horse-id=531769|title=Galileo stud record|access-date=2014-05-26}}

These three parameters are the only ones needed (|access-date= is optional), and are derived from the original citation; any other parameters in the original citation - |publisher= and |date= and of course |url= - can be discarded.

Matching URLs may also include one or both of the suffixes popup=1 and tab=stud, either before or after the horse_id; these aren't needed for the new-style citation (e.g. a citation might include a URL such as http://bloodstock.racingpost.com/stallionbook/stallion.sd?horse_id=589690&popup=1&tab=stud).

Also, a number of citations have been marked dead, but can be recovered using the new citation format e.g.
{{cite web|url=http://bloodstock.racingpost.com/stallionbook/stallion.sd?popup=1&horse_id=303737 |archive-url=https://web.archive.org/web/20110724030812/http://bloodstock.racingpost.com/stallionbook/stallion.sd?horse_id=303737&popup=1 |url-status=dead |archive-date=2011-07-24 |title=Green Desert Stud Record |publisher=Bloodstock.racingpost.com |accessdate=2011-09-22}}
should convert to
{{cite Racing Post horse profile|horse-id=303737|title=Green Desert Stud Record}}.

Colonies Chris (talk) 15:32, 23 September 2020 (UTC)
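The conversion requested above can be sketched in Python. This is only an illustration under stated assumptions, not the tool that was eventually run: the function name `to_horse_profile` is hypothetical, the parsing is a naive split on `|` that only copes with flat citations (no nested templates or piped links), and keeping |access-date= whenever one is present is a choice made here, since the request describes it as optional.

```python
import re

# Assumption: horse_id is a 1-6 digit query parameter anywhere in the URL.
HORSE_ID = re.compile(r"[?&]horse_id=(\d{1,6})\b")

def to_horse_profile(cite):
    """Rewrite a flat {{cite web}} for bloodstock.racingpost.com; else return it unchanged."""
    text = cite.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        return cite
    name, *parts = text[2:-2].split("|")
    params = {}
    for part in parts:
        key, _, value = part.partition("=")  # split on the first "=" only
        params[key.strip()] = value.strip()
    url = params.get("url", "")
    match = HORSE_ID.search(url)
    if (name.strip().lower() != "cite web"
            or "bloodstock.racingpost.com" not in url
            or match is None):
        return cite
    # |publisher=, |date=, |url= and archive fields are deliberately dropped.
    out = ["cite Racing Post horse profile",
           "horse-id=" + match.group(1),
           "title=" + params.get("title", "")]
    access = params.get("access-date") or params.get("accessdate")
    if access:
        out.append("access-date=" + access)
    return "{{" + "|".join(out) + "}}"
```

Citations that are not {{cite web}}, or whose URL has no recognisable horse_id, are returned untouched so that they can be reviewed by hand.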

Colonies Chris, I am curious what is the thinking behind creating a custom template? It will make it invisible to processes and tools that are not specially programmed for it. For example, IABot will not save any of the URLs at Wayback Machine since it can't translate the ID number into a URL. The URLs may or may not show up in the External links dump which some processes rely on including analysis. It doesn't really solve link rot because when sites change they usually leave some dead, some live and some change. This makes the top-level URL prone to breakage in the future. Then there would need to be a tool capable of decoding the custom template and override URLs on a per cite and/or add archive URLs and/or unwind the template back to a cite web. It's usually better to use standard templates with literal URLs unless there is reason to accept the downside of adding a layer of customized abstraction. -- GreenC 16:26, 23 September 2020 (UTC)
I'm following the advice I was given at Help_talk:Citation_Style_1#Horse_profile_citations. I understand this is a common way of accessing databases - there is a long list of such custom templates at Category:Citation_Style_1_specific-source_templates. Colonies Chris (talk) 16:34, 23 September 2020 (UTC)
It's too much work with custom templates. I'm willing to do the following because the tool is designed for it:
  • Retain the existing citation in use and convert the URL. This includes operations on standard CS1|2 templates, square and bare links.
  • It will check and verify that the new URL is working; if not, it will add an |archive-url= or {{webarchive}} (assuming none exists)
  • If there is an archive URL, it will be removed (if the new URL is working)
  • If there is a {{dead link}}, it will be removed (if the new URL is working)
  • If it is a bare link, it will be converted to a square link with a generic link title
  • Change other fields at your request such as |publisher=, |access-date=, |date=
  • Clean up any noticeable patterns of bad data in the title field.
  • Report here statistics and can include full logging if you want it.
If this is acceptable, let me know. Be wary of anyone offering a simple search/replace, as most of these features would not get done. -- GreenC 00:20, 24 September 2020 (UTC)
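The per-citation decisions in the list above could be laid out roughly as follows. This is a hedged sketch of the stated rules only, not the actual tool: the function `plan_actions`, its boolean inputs, and the action strings are all hypothetical names introduced for illustration.

```python
def plan_actions(new_url_works, has_archive, has_dead_tag, is_bare_link):
    """Return the list of edits implied by the rules above for one citation."""
    actions = ["convert URL"]
    if new_url_works:
        # A working new URL makes archive links and dead-link tags obsolete.
        if has_archive:
            actions.append("remove archive URL")
        if has_dead_tag:
            actions.append("remove {{dead link}}")
    elif not has_archive:
        # New URL is not working and no archive exists yet: add one.
        actions.append("add archive URL")
    if is_bare_link:
        actions.append("convert bare link to square link")
    return actions
```

Whether the new URL works has to be established per link, which is the step a plain search/replace would skip.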
Thank you, that sounds very suitable. The destination URL is https://www.racingpost.com/profile/horse/{{{horse-id}}} (1–6 digits, extracted from the source URL), and the |publisher= should be set to Racing Post. Colonies Chris (talk) 13:23, 24 September 2020 (UTC)
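For illustration, the URL mapping just described might look like this in Python. It is a sketch under assumptions: the function name `new_profile_url` is hypothetical, and only the bloodstock.racingpost.com host with a single 1-6 digit horse_id is handled; anything else returns None for manual review.

```python
import re
from urllib.parse import urlsplit, parse_qs

NEW_PROFILE = "https://www.racingpost.com/profile/horse/{}"

def new_profile_url(old_url):
    """Map an old bloodstock URL to the new profile URL, or return None."""
    parts = urlsplit(old_url)
    if parts.netloc.lower() != "bloodstock.racingpost.com":
        return None
    # parse_qs ignores parameter order, so popup=1 and tab=stud may come
    # before or after horse_id without affecting the result.
    ids = parse_qs(parts.query).get("horse_id", [])
    if len(ids) != 1 or not re.fullmatch(r"[0-9]{1,6}", ids[0]):
        return None
    return NEW_PROFILE.format(ids[0])
```

Parsing the query string rather than matching the whole URL keeps the mapping independent of where horse_id appears.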
Colonies Chris: A couple of updates. It will remove all |publisher= and |website= and replace them with |work=Racing Post (blue link). The question of publisher vs. work has been religious on Wikipedia, but the consensus seems to be to use work (or website, which is an alias). Work has the added benefit of being italicized, since Racing Post is a publication. It is also checking every link in the subdomain bloodstock.racingpost.com, including those without a horse_id=, and if they are 404 it adds an archive URL. Some are soft-404 (reporting 200 status but not returning usable content), so it can't do much for those, which require manual attention, but it can save the 404s at least. Diff. -- GreenC 00:36, 25 September 2020 (UTC)
Seeing a pattern. For example, in Galileo (horse) the URL [1] redirects to [2], which is a soft-404. Will create a rule: for any bloodstock URL that is 301->200, check the HTML page size; since the soft-404 is a blank page, the size shows whether it's really a 301->404. -- GreenC 01:05, 25 September 2020 (UTC)
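A rough sketch of such a rule, in Python, is given here for illustration. The function name `classify`, the labels it returns, and the 512-byte threshold are all assumptions; the comment above only says that the page size is checked because the redirect target is blank.

```python
def classify(status_chain, body_length, min_body_length=512):
    """Classify a fetch from its HTTP status chain and final body size in bytes.

    status_chain lists the status codes in order, e.g. [301, 200].
    min_body_length is an assumed cut-off for "blank page".
    """
    if not status_chain:
        raise ValueError("status_chain must not be empty")
    final = status_chain[-1]
    redirected = any(300 <= code < 400 for code in status_chain[:-1])
    if final == 404:
        return "dead"
    # Redirect followed by 200 with an (almost) empty body: treat as 301->404.
    if final == 200 and redirected and body_length < min_body_length:
        return "soft-404"
    if final == 200:
        return "live"
    return "unknown"
```

The size check is applied only to redirected responses, matching the 301->200 pattern described; a direct 200 is treated as live regardless of size.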
Hi GreenC - yes, some existing links redirect correctly to the new URL, and some just redirect to a hanging blank page. But in all cases where I've tried transferring the horse-id into the new-style URL, the link works fine. As for Racing Post, the article distinguishes between "Racing Post" the publishing company and Racing Post the newspaper published by that company. There are many citations to news items that rightly specify work=Racing Post, but citations to this database should not do that - it is not a part of the newspaper, it's another product of the publishing company. Colonies Chris (talk) 08:45, 25 September 2020 (UTC)
Ok good, converted those back to publisher. Also, found a new type of link here where it's a mix of the new domain and old url path. -- GreenC 17:39, 25 September 2020 (UTC)
The race_id links appear to be broken, redirecting to a soft404 (from Slade Power). -- GreenC 18:41, 25 September 2020 (UTC)
Thanks for this. I hadn't looked at any links other than the bloodstock.racingpost.com ones. I noticed that in one modified citation in Erhaab, there is still a publisher=Bloodstock.racingpost.com unconverted? Colonies Chris (talk) 19:12, 25 September 2020 (UTC)
Yes, if the target is |publisher= and the publisher field already exists, it didn't change the value. This is an oversight but not too critical at the moment; it could be fixed in a later pass. I think every URL type in the www.racingpost.com domain (e.g. race_id) should be evaluated for solutions; there appear to be a lot of broken links on the site. -- GreenC 19:55, 25 September 2020 (UTC)
Results
The work as discussed above is done. It edited 1,007 unique articles. Converted 1,083 bloodstock.racingpost.com URLs with a "horse_id=#" to www.racingpost.com "/profile/horse/#". Made 5,139 changes to metadata (publisher, work, etc). Added about 510 archive URLs. Removed 41 archive URLs. Converted 28 bare links to square links. Plus some other things. -- GreenC 01:47, 26 September 2020 (UTC)
Thanks for this. I've noticed in Akeed Mofeed there is yet another form of link that redirects to the new URL format - http://www.racingpost.com/horses/horse_home.sd?horse_id=786709. Can those (I imagine there are more) be fixed too? Colonies Chris (talk) 09:35, 26 September 2020 (UTC)