Talk:Cluster analysis
{{Talk header}}
{{notice|{{Graph:PageViews|365}}|heading=Daily page views |center=y |image=Open data small color.png}}
{{WikiProject banner shell|
1={{WPDATABASE|importance=high|class=C}}
{{WikiProject Computer science|importance=high|class=C}}
{{WikiProject Robotics|class=C|importance=mid|attention=yes}}
{{WPStatistics|importance=high|class=C}}
}}
{{User:MiszaBot/config
| algo=old(365d)
| archive=Talk:Cluster analysis/Archive %(counter)d
| counter=1
| maxarchivesize=75K
| archiveheader={{Automatic archive navigator}}
| minthreadsleft=5
| minthreadstoarchive=1
}}
{{Copied
|diff4 = http://en.wikipedia.org/enwiki/w/index.php?title=Cluster_analysis&diff=453684361&oldid=453662528
}}
{{backwardscopy|url=http://files.aiscience.org/journal/article/html/70110028.html|title=What is Data Mining Methods with Different Group of Clustering and Classification|org=American Institute of Science, American Journal of Mobile Systems, Applications and Services|year=2015|monthday=October|comments=The authors even copied the sentence: 'An overview of algorithms explained in Wikipedia can be found in the list of statistics algorithms.', and the content on Wikipedia significantly predates this publication.}}
{{Ticket confirmation|source=https://github.com/eXascaleInfolab/clubmark/tree/master/docs|id=2019021110001288|license=dual|note=Also available under [https://creativecommons.org/licenses/by/4.0/ Creative Commons Attribution 4.0] and [https://www.apache.org/licenses/LICENSE-2.0 Apache 2.0]}}


==Wiki Education Foundation-supported course assignment==
[[File:Sciences humaines.svg|40px]] This article was the subject of a Wiki Education Foundation-supported course assignment, between <span class="mw-formatted-date" title="2020-09-06">6 September 2020</span> and <span class="mw-formatted-date" title="2020-12-06">6 December 2020</span>. Further details are available [[Wikipedia:Wiki_Ed/New_York_University/Research_Process_and_Methodology_-_RPM_FA_2020_-_MASY1-GC_1260_200_Thu_(Fall_2020)|on the course page]]. Student editor(s): [[User:Rc4230|Rc4230]].

{{small|Above undated message substituted from [[Template:Dashboard.wikiedu.org assignment]] by [[User:PrimeBOT|PrimeBOT]] ([[User talk:PrimeBOT|talk]]) 17:53, 16 January 2022 (UTC)}}
== Infinity-norm ==




Can someone please make infinity-norm a link: infinity-norm

(The article is currently locked.)

== Sabotage ==

This page appears to have been deliberately vandalised.

Please unlock this page.

== V-means clustering ==

A Google search for "V-means clustering" only returns this Wikipedia article. Can someone provide a citation for this?

For future reference, this is the V-means paragraph that was removed:

V-means clustering

V-means clustering utilizes cluster analysis and nonparametric statistical tests to key researchers into segments of data that may contain distinct homogeneous sub-sets. The methodology embraced by V-means clustering circumvents many of the problems that traditionally beleaguer standard techniques for categorizing data. First, instead of relying on analyst predictions for the number of distinct sub-sets (k-means clustering), V-means clustering generates a Pareto-optimal number of sub-sets. V-means clustering is calibrated to a user-defined confidence level p, whereby the algorithm divides the data and then recombines the resulting groups until the probability that any given group belongs to the same distribution as either of its neighbors is less than p.

Second, V-means clustering makes use of repeated iterations of the nonparametric Kolmogorov-Smirnov test. Standard methods of dividing data into its constituent parts are often entangled in definitions of distances (distance measure clustering) or in assumptions about the normality of the data (expectation maximization clustering), but nonparametric analysis draws inference from the distribution functions of sets.

Third, the method is conceptually simple. Some methods combine multiple techniques in sequence in order to produce more robust results. From a practical standpoint this muddles the meaning of the results and frequently leads to conclusions typical of “data dredging.”
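
For illustration only, here is a rough sketch of the split-and-recombine procedure the removed paragraph seems to describe. Since no citation for V-means has turned up, every detail below (the initial segmentation, the merge order, the use of SciPy's two-sample Kolmogorov-Smirnov test, the stopping rule) is my guess at what was meant, not a documented algorithm:

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import ks_2samp

def vmeans_like(values, p=0.05, n_init=20):
    """Guessed reconstruction of the split-and-recombine idea described above.

    Sort the 1-D data, cut it into n_init contiguous segments, then repeatedly
    merge adjacent segments that a two-sample Kolmogorov-Smirnov test cannot
    distinguish at level p. Stops when every pair of neighbouring segments
    differs with KS p-value below p.
    """
    x = np.sort(np.asarray(values, dtype=float))
    groups = [g for g in np.array_split(x, n_init) if len(g) > 0]
    merged = True
    while merged and len(groups) > 1:
        merged = False
        for i in range(len(groups) - 1):
            # If the two neighbours look like draws from the same distribution,
            # recombine them and start scanning again.
            if ks_2samp(groups[i], groups[i + 1]).pvalue >= p:
                groups[i:i + 2] = [np.concatenate(groups[i:i + 2])]
                merged = True
                break
    return groups
</syntaxhighlight>

The number of segments this ends up with would play the role of the "Pareto optimal number of sub-sets" mentioned above, but without a source there is no way to check whether it matches what the original author intended.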

== Fuzzy c-means clarification ==

I believe there is a typo at "typological analysis"; it should be "topological".

The explanation of the fuzzy c-means algorithm seems quite difficult to follow: the actual order of the bullet points is correct, but which bit is to be repeated, and when, is misleading; see the sketch after the quoted steps below.

"The fuzzy c-means algorithm is greatly similar to the k-means algorithm:

  • Choose a number of clusters
  • Assign randomly to each point coefficients for being in the clusters
  • Repeat until the algorithm has converged (that is, the coefficients' change between two iterations is no more than ε, the given sensitivity threshold) :
    • Compute the centroid for each cluster, using the formula above
    • For each point, compute its coefficients of being in the clusters, using the formula above"
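
To make the loop structure concrete, here is a minimal sketch of those steps as I read them; the code and variable names are mine (the centroid and membership updates are the standard fuzzy c-means formulas, not text taken from the article):

<syntaxhighlight lang="python">
import numpy as np

def fuzzy_c_means(X, c, m=2.0, eps=1e-4, max_iter=100, seed=None):
    """Minimal fuzzy c-means sketch. X is an (n, d) array, c the number of clusters."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # 1-2. Choose c clusters and assign each point random membership coefficients
    #      (each row sums to 1).
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        U_old = U.copy()
        # 3a. Compute the centroid of each cluster as the membership-weighted mean.
        W = U ** m
        centroids = (W.T @ X) / W.sum(axis=0)[:, None]
        # 3b. Recompute every point's coefficients from its distances to the centroids.
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against a point sitting exactly on a centroid
        inv = dist ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
        # It is both updates (3a and 3b) that get repeated, until the coefficients
        # change by no more than eps between two iterations.
        if np.abs(U - U_old).max() <= eps:
            break
    return centroids, U
</syntaxhighlight>

As m approaches 1 the memberships become effectively hard assignments and the procedure behaves like ordinary k-means, which is essentially the relationship asked about just below.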

Also, aren't c-means and k-means just different names for the same thing? If so, can they be made consistent throughout?



The c-means clustering relates only to the fuzzy logic clustering algorithm. You could say that k-means is what c-means clustering reduces to under ordinary (crisp) logic, rather than fuzzy logic.

== Remove or update grid-based clustering? ==

The grid-based clustering section has no real references and is poorly described in comparison to the rest of the article.