Head/tail breaks
Revision as of 08:30, 26 July 2014
Head/tail breaks is a clustering scheme for data with a heavy-tailed distribution, such as power-law and lognormal distributions. A heavy-tailed distribution here refers to the scaling pattern in which there are far more small things than large ones. The classification divides things into large ones (the head) and small ones (the tail) around the arithmetic mean, and then recursively repeats the division for the head until the condition of far more small things than large ones no longer holds, or until only more or less similar things remain.[1]
Motivation
Head/tail breaks is mainly motivated by the inability of conventional classification methods, such as equal intervals, quantiles, geometric progressions, standard deviation, and Jenks natural breaks optimization, to reveal the underlying scaling pattern of far more small things than large ones. The notion of far more small things than large ones refers not only to geometric properties but also to topological and semantic properties. In this sense, it can also be interpreted as far more unpopular (or less-connected) things than popular (or well-connected) ones, or far more meaningless things than meaningful ones.
Method
Given a variable X that follows a heavy-tailed distribution, there are far more small values x than large ones. Take the average of all xi to obtain the first mean m1. Then take the average of those xi greater than m1 to obtain the second mean m2. In the same recursive way, compute m3 from the values greater than m2, and so on, until the condition of far more small x than large ones no longer holds for the remaining values. For simplicity, assume there are three means, m1, m2, and m3. This classification leads to four classes: [minimum, m1], (m1, m2], (m2, m3], (m3, maximum]. The resulting number of classes is referred to as the ht-index, an alternative index to fractal dimension for characterizing the complexity of fractals or geographic features: the higher the ht-index, the more complex the fractals.[2]
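The recursive procedure can be illustrated with a minimal Python sketch. It is not taken from the cited papers or from any particular implementation. The function names (head_tail_breaks, ht_index, classify) are hypothetical, and the stopping rule is an assumption: the text above does not quantify "far more small things than large ones", so the sketch treats the condition as met only while the head holds at most 40% of the remaining values (the head_limit parameter).

```python
from bisect import bisect_left
from statistics import mean


def head_tail_breaks(values, head_limit=0.4):
    """Return the means m1, m2, ... used as class breaks.

    head_limit is an assumed threshold: a split is accepted only
    while the head (values above the mean) makes up at most this
    share of the values being split.
    """
    breaks = []
    data = list(values)
    while len(data) > 1:
        m = mean(data)
        head = [x for x in data if x > m]
        # Stop when there is no head, or when the head is no longer
        # a clear minority (assumed reading of the ending condition).
        if not head or len(head) / len(data) > head_limit:
            break
        breaks.append(m)
        data = head  # recurse on the large values only
    return breaks


def ht_index(values, head_limit=0.4):
    """Number of classes: one more than the number of breaks."""
    return len(head_tail_breaks(values, head_limit)) + 1


def classify(x, breaks):
    """Class index of x for [min, m1], (m1, m2], ..., (mk, max]."""
    return bisect_left(breaks, x)


if __name__ == "__main__":
    data = [1] * 8 + [2] * 4 + [4] * 2 + [8]
    breaks = head_tail_breaks(data)
    print(breaks)
    print(ht_index(data))
    print([classify(x, breaks) for x in data])
```

With the sample data in the sketch, the first mean is 32/15 and the second is 16/3, giving three classes. For evenly spread values such as 1 to 10, the head contains half of the values, so under the assumed threshold no split is made and all values fall into a single class.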
Applications
Far more small things than large ones, rather than more or less similar things, surround us. Given the ubiquity of this scaling pattern, head/tail breaks has been applied to statistical mapping, map generalization, cognitive mapping, and even the perception of beauty.[3][4][5] It helps visualize the underlying scaling pattern of far more small things than large ones.
References
- ^ Jiang, Bin (2013a). "Head/tail breaks: A new classification scheme for data with a heavy-tailed distribution", The Professional Geographer, 65(3), 482–494.
- ^ Jiang, Bin and Yin, Junjun (2014). "Ht-index for quantifying the fractal or scaling structure of geographic features", Annals of the Association of American Geographers, 104(3), 530–541.
- ^ Jiang, Bin, Liu, Xintao and Jia, Tao (2013). "Scaling of geographic space as a universal rule for map generalization", Annals of the Association of American Geographers, 103(4), 844–855.
- ^ Jiang, Bin (2013b). "The image of the city out of the underlying scaling of city artifacts or locations", Annals of the Association of American Geographers, 103(6), 1552–1566.
- ^ Jiang, Bin and Sui, Daniel (2013). "A new kind of beauty out of the underlying scaling of geographic space", The Professional Geographer, DOI: 10.1080/00330124.2013.852037
Further reading
- Jiang, Bin (2014), Head/tail breaks for visualizing the fractal or scaling structure of geographic features, http://www.ucl.ac.uk/spacetimelab/stlab-news-publication/bin-jiang-geospatial
- Lin, Yue (2013), A comparison study on natural and head/tail breaks involving digital elevation models. http://www.diva-portal.org/smash/record.jsf?pid=diva2:658963
External links
Head/tail breaks implementation on single data array, https://github.com/digmaa/HeadTailBreaks