Wikipedia:WikiProject Newspapers/Wikidata

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by G. Moore (talk | contribs) at 22:46, 16 February 2020 (Infoboxen: add example fields for infobox). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Introduction to linking a Wikipedia article with its Wikidata entry, or starting a new Wikidata entry.

Wikidata is a sister site to Wikipedia; it is a hybrid between a wiki and a database, so it is much more structured than Wikipedia. Each item is essentially a data entry with links to other data entries, so the item for the New York Times will have statements like "instance of ... newspaper" and "located in ... New York, New York". (Sample Wikidata item for the New York Times: this one has a ton of info, whereas a small local paper might have only 3 or 4 statements.)

Wikidata is expected, over time, to play a greater and greater role in how information is organized on the Internet. Other web services can query it as a database, and pull out structured information.

Adding databases to Wikidata

Wikidata can offer great value simply by linking existing online databases (often websites). For instance, if one web site has a page for every lawyer in Nebraska, another has a page for every female published author in the U.S., and another has a page for everyone buried in a U.S. cemetery, then the Wikidata item for a deceased female lawyer-author from Nebraska could have an "identifier" linking to each of those pages, making it easier in the future for both humans and automated processes to "link" the scattered bits of online information about her.
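The idea of one item carrying an identifier for each external database can be sketched in a few lines of Python. This is only an illustration: the database names, the `example.org` URL templates, and the identifier values below are all hypothetical, and real Wikidata identifiers are stored as properties on the item rather than in a Python dictionary.

```python
from dataclasses import dataclass, field

# Hypothetical URL templates, one per external database.
# Real databases define their own URL schemes.
URL_TEMPLATES = {
    "nebraska_lawyers": "https://lawyers.example.org/person/{id}",
    "us_women_authors": "https://authors.example.org/author/{id}",
    "us_burials": "https://burials.example.org/record/{id}",
}


@dataclass
class Item:
    """A toy stand-in for a Wikidata item with external identifiers."""
    label: str
    identifiers: dict = field(default_factory=dict)


def external_links(item):
    """Return one URL per identifier that has a known URL template."""
    return {
        name: URL_TEMPLATES[name].format(id=value)
        for name, value in item.identifiers.items()
        if name in URL_TEMPLATES
    }


# Made-up identifier values for a made-up person.
person = Item(
    label="Example lawyer-author",
    identifiers={
        "nebraska_lawyers": "1234",
        "us_women_authors": "A-77",
        "us_burials": "998",
    },
)
links = external_links(person)
for name, url in links.items():
    print(name, url)
```

Because every database is reached through the same item, a person or a script that finds one record can follow the identifiers to the others.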

99of9 has added the USNPL identifier to Wikidata, and linked thousands of U.S. newspapers to their USNPL pages. (For instance: visit the Portland Tribune's Wikidata entry and scroll down to near the bottom; then click the "2595" link.)

What other databases can we add? Here are some national ones:

  • USNPL  Done
  • Chronicling America, a project of the U.S. Library of Congress, which seems to use the LCCN identifier in its URL scheme (as do some other online databases)
  • Mondo Times
  • SmallTownPapers.com (appears to be a commercial archiving venture; presumably behind archiving projects like this one)
  • Google's newspaper archive (not sure how useful it is as a data source, though it has tons of content)
  • Newspapers.com is pay-to-play, but seems to have a strong URL scheme for its pages, and they have a ton of archives. (They're also a Wikipedia Library partner, so there might be valuable lines of communication available.)
  • Podunk.com - many newspapers listed, requires more research to see how much useful info it has.
  • Echo Media, same - needs more research.

Oregon

  • Oregon Historical Newspapers archive (Univ. of Oregon) (uses LCCN as unique ID)
  • Oregon Newspaper Publishers Association - this one could be problematic; curious what data folks think. Tons of useful info, but it only has separate pages for General Members (not for Associate or Collegiate members, or non-members). So what happens over time if a newspaper drops its membership? Presumably the record dies. Not sure how to handle.  Done

Infoboxen

One important example of how Wikidata will shift the way that information is organized is evident within the Wikimedia world: Wikidata is increasingly used in managing the kind of infoboxen that are a high priority for this WikiProject.

There is an Infobox Tutorial on Wikidata that might be worth reviewing.

As of February 16, 2020, 8,408 articles used {{Infobox newspaper}}. See Link for the current count and Special:WhatLinksHere/Template:Infobox_newspaper for the articles currently using this template. At minimum, the infobox should include: name=, type= (daily, weekly or monthly newspaper), foundation=, language=, ceased publication= (for defunct newspapers), headquarters= (address of newspaper), publishing_city=, publishing_country=, ISSN= (when known), oclc= (when known), and website= (when known).
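As a rough aid, the minimum field list above can be turned into a small checker. The sketch below assumes the infobox parameters have already been extracted into a Python dictionary; that extraction step, the function name `missing_fields`, and the sample values are all hypothetical. Fields marked "when known" are not treated as required.

```python
# Fields the paragraph above lists without a qualifier.
ALWAYS_REQUIRED = [
    "name",
    "type",
    "foundation",
    "language",
    "headquarters",
    "publishing_city",
    "publishing_country",
]

# Fields to fill in only when the value is known; never reported as missing.
WHEN_KNOWN = ["ISSN", "oclc", "website"]


def missing_fields(infobox, defunct=False):
    """List required infobox fields that are absent or blank.

    `infobox` maps parameter names to string values. Set `defunct=True`
    for newspapers that have ceased publication.
    """
    required = list(ALWAYS_REQUIRED)
    if defunct:
        required.append("ceased publication")
    return [name for name in required if not infobox.get(name, "").strip()]


# Made-up example: a partly filled infobox.
sample = {
    "name": "Example Gazette",
    "type": "Weekly newspaper",
    "language": "English",
    "publishing_city": " ",  # blank values count as missing
}
print(missing_fields(sample))
print(missing_fields(sample, defunct=True))
```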

Query retrieval

Sample image taken from the query listed here. Map legend: Wikidata item, no WP article; WP article, no infobox; WP article with infobox.
The map is generated by this Wikidata query. Visit the link to zoom in on cities with more than one paper, etc. Map generated August 8, 2018.

When facts are stored in databases, you can ask questions about the whole set of facts at once. One way this is done on Wikidata is by using the Wikidata Query Service.

Here are some examples of queries relevant to this project:

  1. Map of all newspapers on Wikidata that have a recorded place of publication whose coordinates are also recorded. The map is colour-coded according to whether there is an en-wiki article; if there is, the link is available by clicking on the point.
  2. USA newspapers without a place of publication. Please provide P291 (place of publication) if you can find it.
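For readers who want to see roughly what the second query involves, here is a minimal Python sketch that builds a similar SPARQL query and the URL that would send it to the query service. It is not the query linked above. It assumes that newspapers are marked with "instance of" (P31) newspaper (Q11032) and that the country is recorded with P17 set to Q30 (United States); items that use a subclass of newspaper or a different country property would be missed. The function names are hypothetical, and the sketch only constructs the request without sending it.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"


def missing_place_query(country_qid="Q30", limit=100):
    """Build a query for newspapers in a country that lack P291.

    Assumption: newspapers carry P31 = Q11032 and P17 = the country item.
    """
    return f"""
SELECT ?paper ?paperLabel WHERE {{
  ?paper wdt:P31 wd:Q11032 ;
         wdt:P17 wd:{country_qid} .
  FILTER NOT EXISTS {{ ?paper wdt:P291 ?place . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {int(limit)}
""".strip()


def request_url(query):
    """Return a GET URL asking the endpoint for JSON results."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


query = missing_place_query()
url = request_url(query)
print(query)
print(url)
```

The printed query text can also be pasted directly into the query service's web editor, which is usually the easier way to experiment.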

You can customize the queries above, or make your own. A tutorial and examples are available to kick you off.

Personalized automatically updating lists

If there is a specific subset of newspapers that you are interested in, and you can specify this with a query, you can get a personalized automatically updating list.

Here is an example by wikidata:User:Sic19 that lists a whole lot of information stored in Wikidata about all Welsh newspapers. --99of9 (talk) 07:56, 10 August 2018 (UTC)

Things to do

  • Every newspaper (whether or not it's notable enough for a Wikipedia article) should have a Wikidata entry.
  • There is now a Mix'n'match set (1655) for Australian newspapers that you'd be welcome to help with. --99of9 (talk) 01:43, 7 August 2018 (UTC)

There is a closely related WikiProject on Wikidata; please consider reviewing their pages and/or joining that project.

(work in progress...please feel free to build out this page)