
Data hub

From Wikipedia, the free encyclopedia

Revision as of 18:39, 29 June 2013

A data hub (data management system, or DMS) is software for collaborating on gathering, sharing and using data.[1]

The term usually refers to the newer, web-based generation of such products. These can be either general platforms that handle many different kinds of data, or vertical products that specialise in one particular field.

Features

At its core, a DMS is a list of datasets with diverse schemas.
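
As a rough illustration of "a list of datasets with diverse schemas", the following Python sketch keeps a separate schema on each dataset rather than one schema for the whole hub. The names `Dataset`, `add_row` and `catalogue`, the column-to-type convention, and the sample rows are all assumptions made for this example; they are not taken from CKAN or any other product.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Dataset:
    """One catalogue entry: a name, its own schema, and its rows."""
    name: str
    schema: dict[str, type]  # column name -> Python type (hypothetical convention)
    rows: list[dict[str, Any]] = field(default_factory=list)

    def add_row(self, row: dict[str, Any]) -> None:
        # Validate against this dataset's own schema, not a hub-wide one.
        if set(row) != set(self.schema):
            raise ValueError(
                f"columns {sorted(row)} do not match schema {sorted(self.schema)}"
            )
        for column, expected in self.schema.items():
            if not isinstance(row[column], expected):
                raise TypeError(f"{column!r} should be {expected.__name__}")
        self.rows.append(row)


# The "hub" is just a list of datasets; each one has a different shape.
catalogue: list[Dataset] = [
    Dataset("rainfall", {"station": str, "mm": float}),
    Dataset("schools", {"name": str, "pupils": int}),
]

# Made-up sample rows, for illustration only.
catalogue[0].add_row({"station": "Station A", "mm": 12.5})
catalogue[1].add_row({"name": "Example Primary", "pupils": 210})
```

A real system would persist and index the datasets; the point here is only that nothing forces two entries to share columns.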

Beyond that, users expect the following features, or tight integration with tools that provide them:[2]

  • Load and update data from any source (ETL)
  • Store datasets and index them for querying
  • View, analyse and update data in a tabular interface (spreadsheet)
  • Visualise data, for example with charts or maps
  • Analyse data, for example with statistics and machine learning
  • Organise many people to enter or correct data (crowd-sourcing)
  • Measure and ensure the quality of data, and its provenance
  • Set permissions, so that data can be open, private or shared
  • Find datasets, and organise them to help others find them
  • Sell data, sharing processing costs between users
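
A few of these features (loading, querying, permissions and finding datasets) can be sketched in a handful of lines of Python. This is a toy, in-memory illustration under stated assumptions: the classes `Hub`, `Entry` and `Visibility`, the user names, and the sample rows are hypothetical and do not describe how any particular data hub is implemented.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable


class Visibility(Enum):
    OPEN = "open"
    PRIVATE = "private"
    SHARED = "shared"


@dataclass
class Entry:
    name: str
    owner: str
    rows: list[dict[str, Any]]
    visibility: Visibility = Visibility.OPEN
    shared_with: set[str] = field(default_factory=set)
    tags: set[str] = field(default_factory=set)

    def readable_by(self, user: str) -> bool:
        # Open data and the owner's own data are always readable.
        if self.visibility is Visibility.OPEN or user == self.owner:
            return True
        return self.visibility is Visibility.SHARED and user in self.shared_with


class Hub:
    def __init__(self) -> None:
        self._entries: dict[str, Entry] = {}

    def load(self, entry: Entry) -> None:
        # Loading the same name again replaces the old copy (a crude "update").
        self._entries[entry.name] = entry

    def find(self, keyword: str, user: str) -> list[str]:
        # Match on a substring of the name or on an exact tag.
        keyword = keyword.lower()
        return sorted(
            e.name
            for e in self._entries.values()
            if e.readable_by(user)
            and (keyword in e.name.lower() or keyword in {t.lower() for t in e.tags})
        )

    def query(
        self, name: str, user: str, where: Callable[[dict[str, Any]], bool]
    ) -> list[dict[str, Any]]:
        entry = self._entries[name]
        if not entry.readable_by(user):
            raise PermissionError(f"{user} may not read {name}")
        return [row for row in entry.rows if where(row)]


hub = Hub()
hub.load(Entry("rainfall", "alice",
               [{"station": "A", "mm": 12.5}, {"station": "B", "mm": 0.0}],
               tags={"weather"}))
hub.load(Entry("salaries", "alice", [{"grade": 1, "pay": 100}],
               Visibility.SHARED, shared_with={"bob"}))

wet = hub.query("rainfall", "carol", lambda row: row["mm"] > 1.0)
```

Here `carol` can query the open `rainfall` dataset but not `salaries`, which is shared only with `bob`. Visualisation, crowd-sourcing, provenance and selling data are left out of the sketch.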

List of data hubs

  • CKAN[2]
  • InfoChimps
  • DataMarket

A desktop operating system (e.g. Unix, OS X, Windows) can be regarded as the legacy DMS: it is what is currently used for tasks that a good DMS would do better.[2]

References

  1. "Data Hubs, Data Management Systems and CKAN | OKFN Notebook". Open Knowledge Foundation. 2011-04-27. Retrieved 2012-03-08.
  2. "From CMS to DMS: C is for Content, D is for Data". ScraperWiki. 2012-03-09. Retrieved 2012-03-12.