Talk:Data mining

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 112.135.57.93 (talk) at 02:19, 23 April 2011 (suggestion for new section). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

suggestion for new section

This article covers many applications of data mining, but not data mining in meteorology. I'm interested in writing an article on that topic and would welcome suggestions. --Preceding unsigned comment added by Inoshika (talk | contribs) 14:18, 16 April 2011 (UTC)

In my opinion, all the "notable" uses of data mining should be moved into a separate article. The current article is a huge mess IMHO. May I suggest just starting a new article somewhere, e.g. "User:Inoshika/Data mining in meteorology", and then having it moved and linked at some point? The other "uses" paragraphs could be cloned into appropriate articles, too. So we can leave all those business-marketing-analysis people to their own article mess... --87.174.61.197 (talk) 17:53, 16 April 2011 (UTC)
Ya. That's what I thought too. Write a separate article called "Data mining in meteorology" and link it under "Applications". --112.135.1.146 (talk) 00:38, 17 April 2011 (UTC)
I added that article. It's Data mining in meteorology, and it is linked under Applications. --112.135.57.93 (talk) 02:19, 23 April 2011 (UTC)

2009

Data mining vs Information extraction

pdfpdf added this to the top of the article, I have moved it here as it seems like more a discussion topic.

Note: "Data mining" is a quite different process to "Information extraction"

Dmmd123 (talk) 00:00, 9 January 2009 (UTC)[reply]

It was intended to be a "disambiguation-type" entry.
It is quite common for those not involved to assume that "Data mining" means "Information extraction".
The article makes no mention of this common incorrect assumption, hence I added the "hat-note".
Although my edit was perhaps not the best way to address the issue, I disagree that it is a "discussion topic".
What is a better way to do it? Cheers, Pdfpdf (talk) 12:21, 9 January 2009 (UTC)[reply]
As there has been no response or discussion, I have reinstated my edit. Pdfpdf (talk) 10:44, 13 January 2009 (UTC)[reply]
There are two reasons I do not think this should go at the top of the article.
1. Linking at the top of the page like that is normally for disambiguation - when someone searches for one thing but wanted the other. At the very least it should conform to the style guide.
2. The reason that confusion arises is because this article poorly defines datamining. I think we both agree Datamining and Information extraction are two very unique things. This article basically gives a whole lot of fairly cloudy examples of where datamining might be used:
It has been suggested that both the Central Intelligence Agency and the Canadian Security Intelligence Service have employed this method.
When there are much clearer examples where datamining has been used.
Dmmd123 (talk) 23:05, 13 January 2009 (UTC)[reply]
Regarding your second reason:
Yes, I agree with your second reason, that "I think we both agree Datamining and Information extraction are two" quite different things. I also agree that "this article poorly defines datamining" and also that the examples given are indeed, (to be generous), "a bit vague".
However, I'm not quite sure how that relates to the hat-note.
(Unless you are implying something like: "If the article was clear, then we wouldn't need the hat-note"). If that is your intent, then I don't quite agree - the hat-note says it before the reader starts on the article; even if the article were clear, the reader would still have to read the article before they realised what is already stated in the hat-note.)
Regarding your first reason:
I thought I had already addressed that. I agree that: "Linking at the top of the page like that is normally for disambiguation - when someone searches for one thing but wanted the other." As I said above, that is my intention here - i.e. someone comes here thinking that "data mining" is a synonym for "information extraction". The first thing they encounter is a line saying: "Data mining" is a quite different process to "Information extraction".
My view is that I don't see any point in slavishly conforming to the style guide when strictly following the guide is not applicable; this certainly conforms to the intent of the style guide. Further, the guide is a guide, it is not a set of rules.
Your thoughts? Cheers, Pdfpdf (talk) 10:15, 14 January 2009 (UTC)[reply]
Carrying on from my original question "What is a better way to do it?", it would seem that this is the answer. Pdfpdf (talk) 12:36, 14 January 2009 (UTC)[reply]
Why does the article seem to assume that data == personal data? Particularly, it says "As more data are gathered, with the amount of data doubling every three years", where I can think of a number of scenarios where the amount of data is not doubled every three years. If data mining is specific to a type of data, the article should say that - my personal understanding (which could be wrong of course) is that data mining is the act of representing seemingly random data in a meaningful form in an attempt to highlight patterns for a number of purposes. So for example you may take data (temperature, humidity, air pressure) from hundreds of weather stations and 'mine' it to produce a forecast, or to better understand how weather works. Am I incorrect here? 90.208.217.227 (talk) 22:44, 9 December 2009 (UTC)[reply]

I'm not out to start a religious war here, but it's my understanding that knowledge can only exist between two ears; by "definition", machines can not create "knowledge" - only people can create "knowledge". Machines can only turn data (and information) into information.
What are other people's opinions? Pdfpdf (talk) 09:06, 14 January 2009 (UTC)[reply]

In my experience the Datamining/KDD field uses a fairly precise vocabulary. Information is a term which is normally avoided as it is too vague, it can describe a fact, a pattern, piece of knowledge. Normally Datamining algorithms are described as returning patterns, this is stated in the opening sentence. The sentence in question states that data mining is becoming an increasingly important tool to transform this data into (information/knowledge). Datamining transforms data into knowledge through the KDD process (Knowledge Discovery in Databases) - which should be explained in the article as KDD redirects here. The final step of KDD is for a human to interpret the datamining patterns into knowledge - the stuff between our two ears. Perhaps this sentence needs to be clarified, but Datamining is definitely a tool which assists in the creation of knowledge. Dmmd123 (talk) 22:42, 14 January 2009 (UTC)
Thanks for replying - I find it useful to read a second opinion.
Let's do the easy one first: I completely agree that "Datamining is definitely a tool which assists in the creation of knowledge."
(Conversely, I vehemently disagree that datamining creates knowledge - fortunately for me, you didn't say that.)
KDD redirects here, as does Knowledge Discovery in Databases. I'm afraid that's not helpful in telling me what KDD is "supposed" to mean, or if (or how) it is different to datamining (or not), so I can't make any comment about KDD.
"In my experience the Datamining/KDD field uses a fairly precise vocabulary." - My experience has been more varied than yours - i.e., some papers and texts are indeed quite specific; however, I have also come across others that are, to be generous, "vague and non-specific". So that doesn't help much either.
"Information is a term which is normally avoided as it is too vague, it can describe a fact, a pattern, piece of knowledge." - Is normally avoided by whom? I could accurately make the same comment about data, knowledge and dozens of other terms (and provide a mountain of supporting evidence). I'm not sure what point you are trying to make here.
"Normally Datamining algorithms are described as returning patterns" - Agreed.
"this is stated in the opening sentence" - Agreed.
"The sentence in question states that data mining is becoming an increasingly important tool to transform this data into (information/knowledge)." - Agreed. However, I would be more comfortable if the sentence said "important tool to assist in the transformation of this data".
"Datamining transforms data into knowledge through the KDD process (Knowledge Discovery in Databases) - which should be explained in the article as KDD redirects here." - Whoa Nelly! You make a number of points here.
  • "Datamining transforms data into knowledge through the KDD process" - I have my doubts here, not the least of which is the first one that I don't know what KDD is supposed to be, but my first reaction (based on ignorance) is that I sincerely doubt that KDD "transforms data into knowledge". (However, I would have no problem with "KDD assists with the transformation of data into knowledge".)
  • "which should be explained in the article" - Agreed. In fact: Strongly agreed !!
  • "as KDD redirects here" - Is KDD really an indistinguishable synonym for data mining? Somehow, I doubt it - what would be the point of yet-another-name when the perfectly good name "data mining" already exists? i.e. I suspect that KDD must have some characteristics of its own which distinguish it from data mining. Again, it would seem that a definition of KDD would be useful ...
"The final step of KDD is for a human to interpret the datamining patterns into knowledge - the stuff between our two ears." - That is what I would expect, which therefore, in the terminology that I have been using, suggests that KDD creates information, not knowledge. By the way, I'm not trying to split hairs and be pedantic here - I'm trying to work out what other people mean by "information" and "knowledge".
"Perhaps this sentence needs to be clarified" - I would classify that sentence as a major understatement! ;-) My biased opinion is that, until such clarification is achieved, the conversation will go around in circles.
Thank you for the food for thought. Cheers, Pdfpdf (talk) 12:39, 15 January 2009 (UTC)[reply]
Let me add that I also use the term knowledge to refer to what is "between the ears"--a human construction. Information, a term I find no less precise than knowledge, is what can be externalized by humans and transferred to other humans (who construct their knowledge). I do not say that only humans can construct knowledge, but it is much better to say that both machines and people are processing information and that the end result of human processing is called "knowledge" and the end result of (all current) machines is further information. —Preceding unsigned comment added by Robotczar (talkcontribs) 22:15, 12 May 2010 (UTC)[reply]

KDD vs DM?

Is there a difference between KDD and DM?
What is the difference between KDD and DM?

I've been looking for a "good" definition of KDD.

One that keeps cropping up is Knowledge discovery is defined as "the non-trivial extraction of implicit, unknown, and potentially useful information from data" Frawley, W.J., Piatetsky-Shapiro, G., and Matheus, C. Knowledge Discovery In Databases: An Overview. In Knowledge Discovery In Databases, eds. G. Piatetsky-Shapiro, and W. J. Frawley, AAAI Press/MIT Press, Cambridge, MA., 1991, pp. 1-30.

Personally, I'm not sure if or how that is different from Data mining.

Yet, in [1], Peggy Wright says (circa 1997) that:

  • in Fayyad, U.M., Piatetsky-Shapiro, G., and Smyth, P. From Data Mining To Knowledge Discovery: An Overview. In Advances In Knowledge Discovery And Data Mining , eds. U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, AAAI Press/The MIT Press, Menlo Park, CA., 1996, pp. 1-34.
  • a clear distinction between data mining and knowledge discovery is drawn. Under their conventions, the knowledge discovery process takes the raw results from data mining (the process of extracting trends or patterns from data) and carefully and accurately transforms them into useful and understandable information. This information is not typically retrievable by standard techniques but is uncovered through the use of AI techniques.

That was 10 years ago. Has anything changed?
What do others think?
Can anyone supply links to current definitions of DM & KDD?
Thanks, Pdfpdf (talk) 05:18, 26 January 2009 (UTC)[reply]

If you read the full article from Fayyad you will see that he goes on to talk about KDD as an overall process and DM as a particular step in that process. This might help: http://seclab.cs.ucdavis.edu/projects/misuse/meetings/KDD.html
In current colloquial usage the term DM is commonly used to mean KDD, although this is incorrect. (No source apart from my own opinion on that one.)
131.170.90.2 (talk) 02:42, 28 January 2009 (UTC) actually by me but not logged in Dmmd123 (talk) 04:28, 29 January 2009 (UTC)[reply]
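
To make the "KDD is the overall process, DM is one step in it" reading concrete, here is a minimal sketch in Python. The step names (selection, preprocessing, transformation, data mining, interpretation) follow the commonly cited outline from Fayyad et al.; the weather-station records, the field names and the 25 C threshold are all hypothetical and chosen only for illustration. It is one possible way to picture the distinction, not a definitive implementation of KDD.

```python
from statistics import mean

def select(records):
    """Selection: keep only the fields of interest."""
    return [(r["station"], r["temp_c"]) for r in records]

def preprocess(rows):
    """Preprocessing: drop rows with missing readings."""
    return [(s, t) for s, t in rows if t is not None]

def transform(rows):
    """Transformation: group readings by station."""
    grouped = {}
    for station, temp in rows:
        grouped.setdefault(station, []).append(temp)
    return grouped

def mine(grouped):
    """Data mining step: extract a simple pattern (mean reading per station)."""
    return {station: mean(temps) for station, temps in grouped.items()}

def interpret(patterns, threshold=25.0):
    """Interpretation: present the patterns so that a person can judge them."""
    return [
        f"{station}: mean {value:.1f} C" + (" (warm)" if value > threshold else "")
        for station, value in sorted(patterns.items())
    ]

# Hypothetical input records (assumption, not taken from any real data set).
records = [
    {"station": "A", "temp_c": 20.0},
    {"station": "A", "temp_c": 22.0},
    {"station": "B", "temp_c": None},
    {"station": "B", "temp_c": 30.0},
]

patterns = mine(transform(preprocess(select(records))))
report = interpret(patterns)
```

In this picture only `mine` is "data mining"; the remaining functions are the surrounding KDD steps, and the final judgement about what the report means is still left to the human reader.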

"longitudinal changes"??

What the heck is "longitudinal changes"?? 65.13.73.143 (talk) 15:18, 2 March 2009 (UTC)[reply]

Too obvious

This sentence has twice been removed because it is too obvious: "However, while it can be used to uncover hidden patterns in data that have been collected, it can neither uncover patterns that are not already present in the data, nor can it uncover patterns in data that have not been collected."

pdfpdf undid it the first time because: a) If I didn't think it added value, I wouldn't have put it in there. b) "unnecessary" is subjective. c) You would be surprised (appalled?) by how the obvious isn't obvious to those who don't think.

I have just added it again on the basis of argument C. I think this is an important sentence as it clarifies that datamining is not some crystal ball which can magically tell you the future - it is a data analysis technique. It also helps clarify the distinction between data collection and datamining.

Dmmd123 (talk) Revision as of 18:10, 21 March 2009

As far as argument C -- in the articles on databases, we don't say "Obviously a database can't store data that doesn't exist". I agree with you that a lot of people seem to lack common sense, but I don't feel that this justifies overloading wikipedia with each piece of common sense knowledge that our brains contain. If some people are too stupid to realize that a piece of software can't manipulate data that doesn't exist, then that should be their problem, and shouldn't become a problem for everyone else here who is not stupid. The intelligent reader should not have to go through droves of commonsense knowledge to learn about a topic they might not understand. Anyhow, that's why I have removed it. Unfortunately, I am going to have to be away from wikipedia for a few weeks here, so I won't be able to participate further in this discussion until I return, but I just wanted to throw that in before I left. --- Jrtayloriv (talk) 18:10, 23 March 2009 (UTC)[reply]
I agree with most of what you say, but you are only addressing a small part of the issue. You don't seem to be addressing the issue that Dmmd123 has raised above, nor the issue that I have raised below. As I said below, perhaps what's there doesn't do the job it is intended to do well enough, but the solution to that is not to remove it; the solution is to change it so that it does the job better. Pdfpdf (talk) 22:15, 23 March 2009 (UTC)
I agree entirely with Jrtayloriv. This is not an "important point" for the very simple reason that it is not a point at all. For something to be a point, it has to be in some way controvertible. The statement in question is not. Pdfpdf's argument a (with all due respect) is not an argument. Every single person who contributes to wikipedia in good faith can make the statement that they would not have contributed if they didn't think what they were writing added value; that does not impact at all on the question of whether their additions actually *do* add value. Likewise, argument b is a basically meaningless statement. Yes, "unnecessary" is subjective. It's one of the common, even standard, subjective terms with which we all routinely work. One hopes to use it in a way that most people will also, subjectively, find reasonable. Jrtayloriv points to the fact that most reasonable people would find the statement in question unnecessary. S/he has already addressed argument c.

In response to Dmmd123, and your point below, Pdfpdf, that many people engage in magical thinking when it comes to data-mining, I am making a change in the article itself. Peace. --Preceding unsigned comment added by 156.56.192.250 (talk) 06:21, 3 April 2009 (UTC)

Purpose of the lead paragraph

The purpose of the lead paragraph is to summarise the article and highlight the important points.
The issue should not be whether the statement is obvious or not; the issue should be whether the statement summarises an important point.
Both Dmmd123 and I think it is an important point. The question is, "Does it summarise it?" I suspect the answer may be, "Not very well."
All contributions towards improving the summary are welcome. Complete removal of the summary, without replacing it with something better, is not welcome. Pdfpdf (talk) 09:03, 21 March 2009 (UTC)[reply]

External links, in the External links section or not, should follow the appropriate policies and guidelines. Usually this means WP:EL, WP:SPAM, and WP:NOTLINK. Sometimes these links are just references are not formatted properly, so WP:RS, WP:SELFPUB, and related policies and guidelines apply. --Ronz (talk) 16:28, 7 April 2009 (UTC)[reply]

>External links ... should follow the appropriate policies and guidelines.
Yes. And these do. If you disagree, please state specifically what the problem(s) is/are.
>Sometimes these links are just references are not formatted properly
I'm sorry, I don't understand.
Also, you say, "please discuss". Certainly. What do you want to discuss. Cheers, Pdfpdf (talk) 16:39, 7 April 2009 (UTC)[reply]

In the Data mining tools and vendors section, the Gartner study acts as the ref for the listing of the vendors and products. Adding offsite links directly to the products should be avoided. It would be better to find additional third party articles about those entries and create their own articles (there should be enough out there for those three to easily establish notability to support their own articles). --- Barek (talk | contribs) - 17:31, 7 April 2009 (UTC)

That sounds good to me. (i.e. I agree with you.) --Pdfpdf (talk) 13:57, 8 April 2009 (UTC)[reply]

Misleading edit comment

Regarding this edit and this edit, your edit comment is misleading.
You have said: "unsourced - comes off as an advertisement"
This directly implies: "If you supply a source for this, and if you make it more objective and less like an advertisement, then it will be an acceptable contribution."
This is misleading.

The real problem with that bit of text is that it is not an example of data mining.

Even if it were perfectly written and compliant with all wiki standards and guidelines, someone with knowledge of the field would have quickly removed it, because it is simply irrelevant. An edit comment that implies that if the contributor "fixes it up" it will be OK is quite misleading for the contributor. If the contributor does "fix it up", they will have every justification for being annoyed when their work is once again removed, this time for the real reason.

--Pdfpdf (talk) 00:28, 11 April 2009 (UTC) [reply]

Extended content

I don't want to sound ungrateful, but this and other edits currently being performed by User:Ronz are a level of intervention which, at this stage of the development of the article, is proving to be counter-productive. This is because this intervention is getting in the way of the development of the article, and distracting people from the job of developing the article.

As an analogy, it is like someone has walked into a car repair shop and insisted that a car's flat tyres be inflated, without considering whether the tyres have sufficient tread on them to be used safely. Yes, the tyres will need inflating before the car is roadworthy, but let's make sure that we have the right tyres on the car first, and then that the tyres are in roadworthy condition, before we think about inflating them.

So User:Ronz, what do you think of the idea of you waiting a few months until the contributors have got their facts straight, and then come along and address the wiki-concerns? Given that you are complaining that you are busy and don't have the time to do the job properly anyway, I would have thought such a suggestion would be attractive to you. Pdfpdf (talk) 00:50, 9 April 2009 (UTC)[reply]

Please follow WP:NPA, WP:BATTLE, and WP:TALK. Thanks! --Ronz (talk) 02:15, 9 April 2009 (UTC)[reply]
For someone who professes to know everything about everything, and is always right, you have appalling manners.
To use your style of address: "Please follow WP:CIVIL and WP:AGF."
You do not appear to have any interest in any opinion but your own. When someone, (and looking at your talk page, there have been many. Very many.), points this out to you, you hide behind your self-righteous self-opinion, and delete their contributions. Of the dozen or so questions I have asked you, you have only answered one.
For your information, YOU are the ONLY person who cares if you are busy. Your "busy-ness" is YOUR choice, YOUR problem, and YOUR responsibility. Not mine. Not anyone else's.
If you are "too busy" to edit wikipedia, then there is a simple solution - DON'T spend your "valuable time" editing wikipedia.
For your further information:
It is extremely rude to alter another editor's contributions, and thus change the intent of their statements and hence misrepresent their arguments. Don't do it.
It is slanderous to make false, misleading and unsubstantiated accusations.
It is cowardice to not take responsibility for one's own statements, and to hide behind a string of vague, non-specific generalisations.
It is far from useful to nitpick about the trivial detail of the finished product when the basic foundations are being laid.
It is inappropriate and unproductive to make uninformed irrelevant comments about a topic where you demonstrate you have no knowledge.
And in any society, it is completely inappropriate to insult people who have been elected by their peers to represent them.
I have made (at least) five attempts to politely and logically bring this information to your attention. My five attempts have NEVER involved ANY aspects of the non-specific alphabet soup you seem to take great solace in quoting. In doing so, I have asked you a number of questions in order to better understand your concerns. Of those five attempts, you have reverted three without either comment or explanation, to another you have replied "I don't have time for this", and to the one above, you have responded with alphabet soup.
Now, I will repeat myself by saying, your edits are disruptive, uninformed, negative, and NOT useful.
Despite your view of the world, they are NOT adding value.
In fact, not only are they not adding value, but they are distracting people from making useful contributions to the development of this article, and wasting their time by forcing them to deal with issues that, at this stage of development of the article, are irrelevant.
Further, as I have politely described above, they are misleading.
If you are too busy and don't have the time to inform yourself properly and do a proper job of editing here, then please, DON'T edit here. Go somewhere else where you have knowledge of the domain and can make a POSITIVE contribution. I will point out that you have not made ONE POSITIVE contribution here - EVERY one of your contributions here has been negative.
If any of the above is NOT clear to you, please ask for clarification - I am only too happy to reply to specific questions about specific issues. --Pdfpdf (talk) 14:08, 9 April 2009 (UTC)[reply]

Discussion: The nature of useful resources in the fields of Data Mining and KDD

It seems there are problems with some of the useful resources in the "External links" section and other places in the article, and they have been removed. Sadly, some of them contravene wikipedia guidelines.

This is a problem in the field of Computer Science due to the nature of the field. Unlike many fields that have an established body of reliable literature, and where new information arises infrequently and in small increments, Computer Science is rapidly evolving on a large number of fronts. Often the best, and the only, sources of information are on sites which, for other reasons, contravene the guidelines.

So on WP we find ourselves in the situation of "throwing the baby out with the bathwater", because the guidelines require the whole thing to be thrown out, not just "the bathwater".

I can not immediately think of a solution to this problem.
I invite and welcome positive contributions and discussion towards the solution of this problem.
Also, this problem cannot be unique to this page. Pointers to how other communities address this situation would be useful. --Pdfpdf (talk) 01:47, 11 April 2009 (UTC)

List of relevant wikipedia guidelines

The following is a list of pointers to relevant WP guidelines. If you feel others are relevant, please add them to the list.
(This list is sorted by the Wikipedia shortcut.)

Discussion

Ignoring the WP:COI problems for the moment, the following two links should be removed from the article per WP:ELNO #4 & 13 and WP:NOTLINK. They certainly are useful for finding references that could be used to verify information currently in the article and for future expansion.

--Ronz (talk) 17:29, 14 April 2009 (UTC)[reply]

Thanks for the info - will try to reply before midnight "real soon", but not today I'm afraid. Pdfpdf (talk) 15:41, 15 April 2009 (UTC)[reply]

data mining

Data mining is the process of extracting patterns from data. Data mining is becoming an increasingly important tool to transform these data into information. It is commonly used in a wide range of profiling practices, such as marketing, surveillance, fraud detection and scientific discovery.

Data mining can be used to uncover patterns in data but is often carried out only on samples of data. The mining process will be ineffective if the samples are not a good representation of the larger body of data. Data mining cannot discover patterns that may be present in the larger body of data if those patterns are not present in the sample being "mined". Inability to find patterns may become a cause for some disputes between customers and service providers. Therefore data mining is not foolproof but may be useful if sufficiently representative data samples are collected. The discovery of a particular pattern in a particular set of data does not necessarily mean that the pattern is found elsewhere in the larger data from which that sample was drawn. An important part of the process is the verification and validation of patterns on other samples of data.
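
The point about unrepresentative samples can be illustrated with a small sketch. Everything here is hypothetical (the basket contents, the 900/100 split, the helper name `co_occurrence_rate`); it only assumes that a pattern exists in one part of the data and that a sample taken from another part cannot reveal it.

```python
def co_occurrence_rate(records, a, b):
    """Fraction of records that contain both item a and item b."""
    if not records:
        return 0.0
    hits = sum(1 for r in records if a in r and b in r)
    return hits / len(records)

# Hypothetical population: 900 weekday baskets followed by 100 weekend baskets.
# Only the weekend baskets contain the pattern {"charcoal", "sausages"}.
population = [{"bread", "milk"}] * 900 + [{"charcoal", "sausages"}] * 100

biased_sample = population[:100]      # first 100 records: weekdays only
systematic_sample = population[::10]  # every 10th record: covers both parts

rate_population = co_occurrence_rate(population, "charcoal", "sausages")
rate_biased = co_occurrence_rate(biased_sample, "charcoal", "sausages")
rate_systematic = co_occurrence_rate(systematic_sample, "charcoal", "sausages")
```

The biased sample contains no weekend baskets, so the pattern is invisible in it even though it is present in the full data, while the systematic sample recovers the same rate as the population. Checking a pattern against a second, differently drawn sample is the kind of validation the paragraph above describes.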

The term data mining has also been used to describe data dredging and data snooping. However, dredging and snooping can be (and sometimes are) used as exploratory tools when developing and clarifying hypotheses.

Data mining commonly involves four classes of task:[10]

   * Classification - Arranges the data into predefined groups. For example an email program might attempt to classify an email as legitimate or spam. Common algorithms include Decision Tree Learning, Nearest neighbor, naive Bayesian classification and Neural network.
   * Clustering - Is like classification but the groups are not predefined, so the algorithm will try to group similar items together.
   * Regression - Attempts to find a function which models the data with the least error.
   * Association rule learning - Searches for relationships between variables. For example a supermarket might gather data on customer purchasing habits. Using association rule learning, the supermarket can determine which products are frequently bought together and use this information for marketing purposes. This is sometimes referred to as market basket analysis.
   * See also structured data analysis.
--Preceding unsigned comment added by 203.223.189.2 (talk) 11:19, 20 February 2010 (UTC)
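
As a rough illustration of the association rule learning task in the list above (the supermarket example), the following sketch counts which pairs of products are bought together and reports support and confidence for each rule. The baskets, the function name `pair_rules` and the 0.5 support threshold are assumptions made for the example; real market basket analysis would typically use a dedicated algorithm such as Apriori rather than this brute-force pair count.

```python
from collections import Counter
from itertools import combinations

def pair_rules(baskets, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for frequent pairs."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate and fix the pair ordering
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        # Confidence of "a -> b" is P(b | a), estimated from the counts.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return rules

# Hypothetical purchase data.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "milk"],
    ["milk"],
]

rules = pair_rules(baskets, min_support=0.5)
```

With these baskets, every basket that contains butter also contains bread, so the rule "butter -> bread" has confidence 1.0, while the pair butter and milk falls below the support threshold and produces no rule.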

new method for DM

What are you doing in the area? --Preceding unsigned comment added by 219.143.128.241 (talk) 07:26, 25 September 2010 (UTC)

Focus

I think the majority of this article focused on the negative uses of data mining so I added examples of data mining where it was not negative. —Preceding unsigned comment added by 98.243.168.193 (talk) 02:51, 17 January 2011 (UTC)[reply]


Isn't data mining also a part of statistics?

Should I look for references for that? Talgalili (talk) 21:33, 25 January 2011 (UTC)[reply]

What do you propose exactly? The data analysis category is already in statistics, so this article is also in category statistics. Obviously, statistics is used in data mining a lot (so is math in general, databases, ...). Many of the data mining methods have moved away from a sound statistical model and essentially define their output by what the algorithm finds, and not some statistical reasoning. Data mining of course has its roots in statistics, though. --Chire (talk) 10:51, 27 January 2011 (UTC)[reply]
Hi Chire. I'm asking in order to know whether I can add "statistics" to the first sentence in the article, which states:
Data mining, a branch of computer science[1] and artificial intelligence,[2] (and I think we should add - "statistics")
Talgalili (talk) 12:36, 27 January 2011 (UTC)[reply]
Statistics definitely should be mentioned in the first paragraph. I like the definition of Encyclopedia Britannica ([1] in the article): "data mining ... combines tools from statistics and artificial intelligence ... with database management ..." [2]. Because there are many aspects that go beyond statistics. We should rewrite the paragraph to convey this. If I have time, I'll give it a go. --Chire (talk) 15:48, 31 January 2011 (UTC)[reply]
Thanks for the rewrite :) Talgalili (talk) 17:07, 31 January 2011 (UTC)[reply]

Rexer DM survey notability dispute

I would like to dispute the notability tag added by User:Melcombe to the Rexer's Annual Data Miner Survey article. Unlike statistics, data mining is a relatively young and interdisciplinary field. Not as much has been written about it. I should think it significant that 735 participants from 60 countries participated in the most recent 2010 survey. Moreover, more people become involved with the survey each year. Compare this with other surveys that have articles on Wikipedia. --Luke145 (talk) 20:16, 22 April 2011 (UTC)