Classifier (mathematics): Difference between revisions

From Wikipedia, the free encyclopedia
Revision as of 18:32, 27 November 2009

A classifier is a function that takes a set of attributes as input and outputs a class label (tag). One way to build a classifier is to take a set of tagged examples and derive from them a rule that can assign a tag to any new input.


Introduction

Owing to technological progress and people's desire to be surrounded by as much information as possible, the number of digital multimedia files is growing very rapidly. This calls for efficient methods that make it possible to retrieve relevant audiovisual information quickly.

Learning and Data Mining

Both data mining and machine learning are techniques for processing large amounts of data.

Data mining tries to extract patterns or models from the collected data.

Machine learning is what the different types of existing classifiers have in common. The basic idea of learning is to use perceptions not only to act but also to improve an agent's ability to act in the future.

There are different types of learning techniques:

Supervised learning

Supervised learning involves learning a function from tagged examples, so as to establish a correspondence between the inputs and the desired outputs of the system. This type of training is not always possible, because the expected output for each training input has to be known. The learning system tries to tag (classify) a set of vectors by choosing one of several categories (classes) for each.
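As an illustration of learning from tagged examples, the following minimal sketch uses a nearest-centroid rule: it averages the training vectors of each class and tags a new vector with the class whose average is closest. The rule, the function names and the sample data are assumptions chosen for brevity, not a method prescribed by this article.

```python
import math
from collections import defaultdict


def train_nearest_centroid(examples):
    """Compute one centroid per tag from (feature_vector, tag) pairs."""
    sums = {}
    counts = defaultdict(int)
    for features, tag in examples:
        if tag not in sums:
            sums[tag] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[tag][i] += value
        counts[tag] += 1
    return {tag: [s / counts[tag] for s in total] for tag, total in sums.items()}


def classify(centroids, features):
    """Return the tag whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda tag: math.dist(centroids[tag], features))


# Hypothetical tagged examples: two well-separated classes "A" and "B".
training_set = [
    ([1.0, 1.2], "A"), ([0.8, 1.0], "A"), ([1.2, 0.9], "A"),
    ([4.0, 4.2], "B"), ([4.1, 3.8], "B"), ([3.9, 4.0], "B"),
]
centroids = train_nearest_centroid(training_set)
prediction_low = classify(centroids, [1.1, 1.0])
prediction_high = classify(centroids, [3.8, 4.1])
```

The tags in `training_set` play the role of the known expected outputs; without them the centroids could not be computed.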

Unsupervised learning

Unsupervised learning consists of learning from input patterns with no output values specified. The main problem of this technique is how to decide between the patterns provided. The system treats the input objects as a set of random variables and builds a density model for that data set.
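A minimal sketch of building a density model from untagged data is given below. It assumes, purely for illustration, that the inputs are one-dimensional and that a single Gaussian is an adequate model; the function names and the observations are hypothetical.

```python
import math


def fit_gaussian(samples):
    """Maximum-likelihood mean and variance of one-dimensional samples."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n
    return mean, variance


def gaussian_density(x, mean, variance):
    """Value of the fitted Gaussian density at x."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)


# Hypothetical untagged observations: no output values are supplied.
observations = [4.8, 5.1, 5.0, 4.9, 5.2]
mean, variance = fit_gaussian(observations)
density_near = gaussian_density(5.0, mean, variance)
density_far = gaussian_density(6.0, mean, variance)
```

Points close to the observations receive a higher density than points far from them, which is the only information the model can extract without tags.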

Reinforcement Learning

Reinforcement learning is a way of learning by interacting with the world: the agent observes the outcome of its actions, typically as a reward or a penalty, and adjusts its behaviour accordingly.

Learning consists of building a function from observed input and output behaviour. Learning methods can be understood as a search through a space of hypotheses to find the appropriate function.

Types of classifiers

Bayesian classifier

A Bayesian classifier is a pattern classifier based on statistical learning theory. Bayesian learning calculates the probability of each hypothesis given the data and makes predictions on this basis. It is close to optimal, but it is computationally expensive because the hypothesis space is usually very large and may even be infinite.
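The idea of weighting every hypothesis by its probability can be sketched with a deliberately tiny hypothesis space. The example below assumes three hypothetical hypotheses about the bias of a coin and a uniform prior; the names and the observation string are illustrative only.

```python
def posterior(hypotheses, prior, observations):
    """Posterior probability of each hypothesis given a string of 'H'/'T'."""
    unnormalised = []
    for h, p in zip(hypotheses, prior):
        likelihood = 1.0
        for obs in observations:
            likelihood *= h if obs == "H" else (1.0 - h)
        unnormalised.append(p * likelihood)
    evidence = sum(unnormalised)
    return [u / evidence for u in unnormalised]


def predict_heads(hypotheses, post):
    """Prediction averaged over all hypotheses, weighted by their posterior."""
    return sum(h * p for h, p in zip(hypotheses, post))


# Each hypothesis is an assumed probability of heads.
hypotheses = [0.1, 0.5, 0.9]
prior = [1 / 3, 1 / 3, 1 / 3]
post = posterior(hypotheses, prior, "HHTH")
most_probable = hypotheses[post.index(max(post))]
p_next_heads = predict_heads(hypotheses, post)
```

With only three hypotheses the sums are trivial; the cost noted above arises because the same sums must run over every hypothesis in a very large or infinite space.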

Parzen classifier

This classifier is based on a histogram-like estimate of the data: it estimates the density of each class from the training samples.
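A minimal sketch of this approach follows. It assumes one-dimensional data, a Gaussian window and a hand-picked bandwidth of 0.5; all names and sample values are hypothetical.

```python
import math


def parzen_density(x, samples, bandwidth):
    """Parzen-window density estimate at x using a Gaussian window."""
    norm = bandwidth * math.sqrt(2 * math.pi)
    total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
    return total / (len(samples) * norm)


def parzen_classify(x, samples_by_class, bandwidth=0.5):
    """Pick the class with the largest prior-weighted density at x."""
    n_total = sum(len(s) for s in samples_by_class.values())

    def score(tag):
        samples = samples_by_class[tag]
        return (len(samples) / n_total) * parzen_density(x, samples, bandwidth)

    return max(samples_by_class, key=score)


# Hypothetical training samples for two classes.
samples_by_class = {
    "low": [1.0, 1.3, 0.8, 1.1],
    "high": [3.0, 3.4, 2.9, 3.2],
}
tag_a = parzen_classify(1.2, samples_by_class)
tag_b = parzen_classify(3.1, samples_by_class)
```

The bandwidth controls how smooth the estimate is, much as the bin width does for a histogram.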

Backpropagation classifier

Simple models, such as parametric density or histogram models, do not always give the desired results. In these cases, more sophisticated density models are needed. Neural networks are a useful approximation technique for building parametric density models. The usual neural network trained with the backpropagation algorithm consists of an input layer with as many nodes as there are inputs, a hidden layer with a variable number of nodes that depends on the problem, and an output layer with as many nodes as there are possible outputs.
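The three-layer structure described above can be sketched as follows. This is an illustrative toy, not a reference implementation: it assumes sigmoid units, a squared-error loss, four hidden nodes and, as a simplification for a two-class problem, a single output node. The XOR data, the learning rate and all names are assumptions.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_weights(n_in, n_hidden, seed=0):
    """Random weights; the last entry of every row is a bias term."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    return w_hidden, w_out


def forward(x, w_hidden, w_out):
    """Propagate an input through the hidden layer to the output node."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, output


def mean_squared_error(data, w_hidden, w_out):
    return sum((forward(x, w_hidden, w_out)[1] - t) ** 2 for x, t in data) / len(data)


def train(data, w_hidden, w_out, epochs=5000, rate=0.5):
    """Online gradient descent; the weights are updated in place."""
    for _ in range(epochs):
        for x, target in data:
            hidden, output = forward(x, w_hidden, w_out)
            # Error signal at the output, then propagated back to the hidden layer.
            delta_out = (output - target) * output * (1 - output)
            delta_hidden = [delta_out * w_out[j] * h * (1 - h) for j, h in enumerate(hidden)]
            for j, h in enumerate(hidden + [1.0]):
                w_out[j] -= rate * delta_out * h
            for j, delta in enumerate(delta_hidden):
                for i, v in enumerate(x + [1.0]):
                    w_hidden[j][i] -= rate * delta * v


# XOR: a classic problem that needs the hidden layer.
xor_data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
w_hidden, w_out = init_weights(n_in=2, n_hidden=4)
loss_before = mean_squared_error(xor_data, w_hidden, w_out)
train(xor_data, w_hidden, w_out)
loss_after = mean_squared_error(xor_data, w_hidden, w_out)
```

Comparing `loss_before` with `loss_after` shows whether training reduced the error; how well the network ends up fitting XOR depends on the random initial weights and the number of epochs.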

Classifier with PCA

In principal component analysis (PCA), the new features are linear functions of the old ones. A data set is taken and a lower-dimensional linear subspace is built from it.
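A minimal sketch of building a one-dimensional subspace is shown below. It assumes power iteration as a simple way to find the leading principal direction; practical implementations usually use a full eigenvalue or singular value decomposition instead. The names and the data are hypothetical.

```python
import math


def covariance_matrix(points):
    """Return the mean vector and the covariance matrix of the points."""
    n = len(points)
    dim = len(points[0])
    mean = [sum(p[i] for p in points) / n for i in range(dim)]
    centered = [[p[i] - mean[i] for i in range(dim)] for p in points]
    cov = [[sum(row[i] * row[j] for row in centered) / n for j in range(dim)]
           for i in range(dim)]
    return mean, cov


def leading_component(cov, iterations=100):
    """Power iteration for the eigenvector with the largest eigenvalue.

    Assumes the all-ones start vector is not orthogonal to that eigenvector.
    """
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(len(v))) for i in range(len(cov))]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(point, mean, direction):
    """New feature: a linear function of the old features."""
    return sum((point[i] - mean[i]) * direction[i] for i in range(len(point)))


# Hypothetical two-dimensional data lying exactly on the line y = 2x.
points = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
mean, cov = covariance_matrix(points)
direction = leading_component(cov)
coordinates = [project(p, mean, direction) for p in points]
```

Each two-dimensional point is reduced to a single coordinate along `direction`, which here follows the line on which the data lie.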

Support vector machine

The support vector machine (SVM) is a simple technique that is very useful when a classifier is to be built from examples. Unlike neural networks, which aim to build a model of the posterior probability, an SVM tries to find the decision boundary directly. Its simplicity is an advantage, because only the geometry of the boundary has to be encoded.
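The sketch below illustrates the idea of learning only the boundary. It assumes a linear SVM trained by batch subgradient descent on the regularised hinge loss, which is one of several possible training methods; standard solvers use quadratic programming instead. The step size, regularisation constant, names and data are all hypothetical.

```python
def train_linear_svm(examples, rate=0.01, reg=0.01, epochs=1000):
    """Fit w and b of the boundary w.x + b = 0 from (vector, +1/-1) pairs."""
    dim = len(examples[0][0])
    n = len(examples)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [reg * wi for wi in w]  # gradient of the regularisation term
        grad_b = 0.0
        for x, y in examples:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # only points inside the margin contribute
                for i in range(dim):
                    grad_w[i] -= y * x[i] / n
                grad_b -= y / n
        w = [wi - rate * gi for wi, gi in zip(w, grad_w)]
        b -= rate * grad_b
    return w, b


def predict(w, b, x):
    """Side of the boundary on which x falls."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical linearly separable examples tagged +1 or -1.
examples = [
    ([2.0, 2.0], 1), ([3.0, 3.0], 1), ([2.0, 3.0], 1),
    ([-2.0, -2.0], -1), ([-3.0, -3.0], -1), ([-2.0, -3.0], -1),
]
w, b = train_linear_svm(examples)
label_pos = predict(w, b, [2.5, 2.5])
label_neg = predict(w, b, [-2.5, -2.5])
```

Only `w` and `b`, which describe the geometry of the boundary, are kept after training; no density of either class is estimated.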

Applications

The applications of classifiers are wide-ranging. They find use in medicine (drug trial analysis, MRI data analysis), finance (share analysis, index prediction), mobile phones (signal decoding, error correction), computer vision (face recognition, target tracking), voice recognition, data mining (supermarket purchasing analysis, retail customer analysis) and countless other areas.

An example is a classifier that accepts a person's salary details, age, marital status, home address and credit history, and classifies the person as acceptable or unacceptable to receive a new credit card or loan.

For a list of classifier applications and classifier technologies, please see statistical classification.

See also