Data hub
A data hub (data management system, or DMS) is software for collaborating on gathering, sharing and using data.[1]
The term usually refers to the newer, web-based generation of such products. A data hub can be a general platform that handles many different kinds of data, or a vertical product that specialises in one particular field.
Features
At its core, a DMS is a catalogue of datasets with diverse schemas.
On top of that, users expect the following features, or tight integration with tools that provide them:[2]
- Load and update data from any source (ETL)
- Store datasets and index them for querying
- View, analyse and update data in a tabular interface (spreadsheet)
- Visualise data, for example with charts or maps
- Analyse data, for example with statistics and machine learning
- Organise many people to enter or correct data (crowd-sourcing)
- Measure and ensure the quality of data, and its provenance
- Control permissions, so that data can be open, private or shared
- Find datasets, and organise them to help others find them
- Sell data, sharing processing costs between users
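A few of these features (a catalogue of datasets with differing schemas, permissions, and finding datasets) can be illustrated with a minimal in-memory sketch. It is not taken from any of the products listed below; the names `DataHub`, `Dataset`, `Visibility`, `load` and `find`, and the sample data, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Visibility(Enum):
    OPEN = "open"        # readable by anyone
    PRIVATE = "private"  # readable by the owner only
    SHARED = "shared"    # readable by the owner and named users


@dataclass
class Dataset:
    name: str
    owner: str
    rows: list[dict]  # each dataset carries its own schema
    visibility: Visibility = Visibility.PRIVATE
    shared_with: set[str] = field(default_factory=set)
    tags: set[str] = field(default_factory=set)

    def schema(self) -> set[str]:
        """Infer the column names from the rows actually stored."""
        return {key for row in self.rows for key in row}

    def readable_by(self, user: str) -> bool:
        """Apply the open / private / shared rule for one user."""
        if self.visibility is Visibility.OPEN or user == self.owner:
            return True
        return self.visibility is Visibility.SHARED and user in self.shared_with


class DataHub:
    """Hypothetical in-memory hub: a catalogue of datasets keyed by name."""

    def __init__(self) -> None:
        self._datasets: dict[str, Dataset] = {}

    def load(self, dataset: Dataset) -> None:
        """Add a dataset, or replace one with the same name."""
        self._datasets[dataset.name] = dataset

    def find(self, user: str, tag: str) -> list[str]:
        """Return names of datasets with this tag that the user may read."""
        return sorted(
            d.name for d in self._datasets.values()
            if tag in d.tags and d.readable_by(user)
        )


hub = DataHub()
hub.load(Dataset("rainfall", "alice", [{"station": "A1", "mm": 12.5}],
                 Visibility.OPEN, tags={"weather"}))
hub.load(Dataset("sensors", "alice", [{"id": 7, "temp_c": 21.0, "ok": True}],
                 Visibility.SHARED, shared_with={"bob"}, tags={"weather"}))
hub.load(Dataset("drafts", "alice", [{"note": "todo"}], tags={"weather"}))

print(hub.find("bob", "weather"))    # ['rainfall', 'sensors']
print(hub.find("carol", "weather"))  # ['rainfall']
```

The three datasets have unrelated columns, which is the "diverse schema" property; a real hub would add persistent storage, indexing, and the loading, visualisation and quality features from the list.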
List of data hubs
- CKAN[2]
- Quandl
- InfoChimps
- DataMarket
- ScraperWiki[2]
- BuzzData
- Kasabi
- Factual
- GeoIQ
- Socrata
- Windows Azure Marketplace
- Avoiding Mass Extinctions Engine
- PANDA project[2]
A desktop operating system (e.g. Unix, OS X, Windows) has been described as the legacy DMS: it is what people currently use for tasks that a good DMS would handle better.[2]
References
- [1] "Data Hubs, Data Management Systems and CKAN | OKFN Notebook". Open Knowledge Foundation. 2011-04-27. Retrieved 2012-03-08.
- [2] "From CMS to DMS: C is for Content, D is for Data". ScraperWiki. 2012-03-09. Retrieved 2012-03-12.