Geoffrey Hinton

Geoffrey Hinton FRS (born December 6, 1947) is a British-born computer scientist most noted for his work on the mathematics and applications of neural networks and their relationship to information theory. Hinton has contributed significantly to the fields of neural computation and cognitive science. His work in artificial intelligence has improved our understanding of how the human brain functions and, more specifically, how it learns. Some of his contributions include the Boltzmann machine, backpropagation training, distributed representations, the Helmholtz machine, and Product of Experts. Currently, his main interest lies in unsupervised learning in intelligent agents and neural networks. Hinton comes from a family rich in scientific and mathematical study: he is the great-great-grandson of the logician and philosopher George Boole and the son of the entomologist Howard E. Hinton.

Biography

Early Life

Geoffrey Hinton was born on December 6, 1947, in Wimbledon, United Kingdom, to parents Howard and Margaret Hinton. From the ages of 7 to 17 he attended Clifton College in Bristol, although he disliked it. Hinton first became passionate about psychology at the age of 20, while attending Cambridge University. He describes his main interest as understanding how the brain computes: more specifically, how a large number of neurons can interact and learn to do all the things the human brain does.

Education

Hinton graduated from Cambridge University in 1970 with a Bachelor of Arts in experimental psychology. He went on to obtain a PhD in Artificial Intelligence from the University of Edinburgh in 1978. During his years at Edinburgh, he studied relaxation methods that intelligent agents can use to visualize and identify a whole object amid smaller confusing elements, or “noise”. His doctoral thesis is entitled “Relaxation and its role in vision”.

Career

Hinton did postdoctoral work at Sussex University and the University of California, San Diego, and spent five years as a faculty member in the Computer Science department at Carnegie Mellon University. During the 1980s, Hinton did a great deal of research on parallel computing architectures for neural networks. It had become apparent before much of his work that the brain does not compute in an entirely linear fashion: many aspects of intelligent behaviour cannot feasibly be carried out unless multiple operations take place in many different parts of the brain at the same time. This led to his work on various AI machines. Throughout his career, Hinton devised many theories and models in an attempt to explain and understand how the brain works. He began in 1983 with the Boltzmann machine and from there amended this architecture, each time reaching a closer approximation.

The Boltzmann Machine

The Boltzmann machine was co-designed by Hinton in order to better understand how the human brain works using an artificial neural network. It is a network of neuron-like units, connected symmetrically with one another, which make stochastic decisions about whether to be activated or not (on or off). The binary nature of the units represents a true/false relationship with regard to the information presented to each neural unit. In essence, the Boltzmann machine is a learning machine. Like a Hopfield network, the Boltzmann machine is a network of units with an “energy” defined for the network as a whole; unlike the Hopfield network, its binary units operate in a stochastic manner. The global energy E in a Boltzmann machine is identical in form to that of a Hopfield network:

E = -\left( \sum_{i<j} w_{ij}\, s_i\, s_j + \sum_i \theta_i\, s_i \right)

Where:
• w_{ij} is the connection strength between unit j and unit i.
• s_i is the state, s_i \in \{0, 1\}, of unit i.
• \theta_i is the threshold of unit i.
The connections in a Boltzmann machine have two restrictions:
• w_{ii} = 0 for all i. (No unit has a connection with itself.)
• w_{ij} = w_{ji} for all i, j. (All connections are symmetric.)
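
As an illustration only (not part of the original draft), the following Python sketch computes this global energy for a given binary state vector. The names W, theta and s are assumptions standing in for w_{ij}, \theta_i and s_i above.

import numpy as np

def boltzmann_energy(W, theta, s):
    """Global energy of a Boltzmann machine.

    W     : symmetric weight matrix with zero diagonal (w_ij = w_ji, w_ii = 0)
    theta : threshold of each unit
    s     : binary state vector (0/1) of all units
    """
    # The sum over i < j counts each symmetric connection once;
    # with a symmetric, zero-diagonal W this equals 0.5 * s.W.s.
    pairwise = 0.5 * s @ W @ s
    return -(pairwise + theta @ s)

# Hypothetical example: three units, one strongly coupled active pair.
W = np.array([[0.0, 1.0, -0.5],
              [1.0, 0.0, 0.3],
              [-0.5, 0.3, 0.0]])
theta = np.array([0.1, -0.2, 0.0])
s = np.array([1.0, 1.0, 0.0])
print(boltzmann_energy(W, theta, s))  # -(1.0 + (0.1 - 0.2)) = -0.9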

There are two kinds of problems Boltzmann machines are used to solve: a search problem and a learning problem. In the former, the weights between nodes are fixed and the stochastic process is used like a cost function: the network samples state vectors that score low on that cost function. In the latter, the weights are not fixed; the machine is instead shown a set of binary data vectors and must learn weights that allow it to generate those vectors with high probability. Unfortunately, learning in a general Boltzmann machine can be quite slow. This can be overcome by restricting the connectivity and simplifying the learning algorithm; the result is called a “restricted Boltzmann machine”. The primary restriction in this architecture is that the visible and hidden units form two separate layers with no visible-visible or hidden-hidden connections.
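
Again purely as an assumed illustration, and not Hinton's own procedure, here is a minimal sketch of the stochastic search process described above: each unit in turn switches on with a probability given by the logistic function of its energy gap, using the same W, theta and s conventions as the previous sketch.

import numpy as np

def gibbs_sweep(W, theta, s, T=1.0, rng=np.random.default_rng(0)):
    """One sweep of stochastic updates over all units of a Boltzmann machine.

    A unit turns on with probability sigmoid(delta_E / T), where
    delta_E = sum_j w_ij * s_j + theta_i is the unit's energy gap and
    T is a temperature parameter controlling the randomness.
    """
    s = s.copy()
    for i in range(len(s)):
        delta_E = W[i] @ s + theta[i]
        p_on = 1.0 / (1.0 + np.exp(-delta_E / T))
        s[i] = 1.0 if rng.random() < p_on else 0.0
    return s

# Repeated sweeps at a fixed temperature tend to visit low-energy states.
W = np.array([[0.0, 1.0], [1.0, 0.0]])
theta = np.array([0.0, 0.0])
s = np.array([1.0, 0.0])
for _ in range(10):
    s = gibbs_sweep(W, theta, s)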

Backpropagation

Short for “backward propagation of errors”, backpropagation was developed and popularized by Hinton and colleagues and is a common method for training artificial neural networks (ANNs). Backpropagation is a goal-oriented procedure in which the ANN works towards a desired output, learning from many inputs along the way. It works much the way a child learns to fully understand a particular category of things (e.g. dogs or cars). The procedure repeatedly adjusts the weights assigned to each nodal connection so that the neural network keeps learning.

Backpropagation works in an ANN with an input layer of units, any number of intermediate (hidden) layers, and an output layer of units. Initially, a stimulus is presented to the nodes in the input layer. These nodes then send activation through one or more of the hidden layers. The nodes in the hidden layers send signals to the output layer, and the pattern of activation seen in the output layer is the network’s response to the stimulus. The difference between the actual and desired outputs is called the “error signal”; it is computed at the output layer and propagated backwards through the network, where it is used to modify the weights of each nodal link as needed. The next time the network is exposed to that stimulus, the modified weights generate a response that is closer to the one desired. Through this process the architecture learns to recognize the objects with which it is presented. Simply put, the goal of backpropagation is to find a function that best maps its inputs to the correct output.

Backpropagation training has been a tremendous contribution to the fields of neural computation and cognitive psychology. It was with backpropagation that the artificial neural network NETtalk was designed to learn to pronounce written English text. NETtalk was unique for its time in that it was capable of learning the correct pronunciation of a given word after it had been supplied only a few examples.
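
The following is a hedged sketch, not the original authors' implementation: a tiny Python network with one hidden layer trained by backpropagation on the toy XOR problem. The layer sizes, learning rate and number of training passes are arbitrary choices made for the example.

import numpy as np

rng = np.random.default_rng(0)

# Toy task: learn XOR with one hidden layer of sigmoid units.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Weights for the input->hidden and hidden->output connections.
W1 = rng.normal(0, 1, (2, 4))
b1 = np.zeros(4)
W2 = rng.normal(0, 1, (4, 1))
b2 = np.zeros(1)
lr = 0.5

for epoch in range(5000):
    # Forward pass: activation flows from input to hidden to output.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Error signal at the output layer (actual minus desired).
    err = out - y

    # Backward pass: propagate the error signal toward the input,
    # computing how much each weight contributed to it.
    d_out = err * out * (1 - out)       # gradient at the output units
    d_h = (d_out @ W2.T) * h * (1 - h)  # gradient at the hidden units

    # Adjust the weight on each nodal link in proportion to its blame.
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

print(np.round(out, 2))  # responses move toward the desired [0, 1, 1, 0]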

Contrastive Divergence

Due to the inherent limitations of backpropagation training, such as its slow speed and the fact that convergence is not guaranteed, something further was needed to better explain the way in which the brain learns. Hinton and collaborators therefore devised fast “contrastive divergence” learning algorithms for a machine previously known as the “Harmonium”, which came to be known, following Hinton’s work, as the restricted Boltzmann machine (RBM). RBMs are a constrained version of the original Boltzmann machine in that their neurons must form a bipartite graph: there cannot be any connections between two visible units or between two hidden units. This restriction means the units within a layer do not feed back on one another, which makes the networks more efficient to train. An RBM can be trained in a supervised or unsupervised manner, depending on the task at hand.
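
Purely as an assumed illustration (not taken from the draft or from Hinton's published code), here is a minimal Python sketch of one contrastive-divergence (CD-1) update for a binary RBM; the toy data, layer sizes and learning rate are invented for the example.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_step(W, b_vis, b_hid, v0, lr=0.1):
    """One contrastive-divergence (CD-1) update for a binary RBM.

    W     : visible-by-hidden weight matrix (bipartite connections only)
    b_vis : visible-unit biases, b_hid : hidden-unit biases
    v0    : batch of binary training vectors (one per row)
    """
    # Positive phase: hidden activations driven by the data.
    p_h0 = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)

    # Negative phase: one step of reconstruction from the hidden sample.
    p_v1 = sigmoid(h0 @ W.T + b_vis)
    p_h1 = sigmoid(p_v1 @ W + b_hid)

    # Move the weights toward making the data more probable than the reconstruction.
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / len(v0)
    b_vis += lr * (v0 - p_v1).mean(axis=0)
    b_hid += lr * (p_h0 - p_h1).mean(axis=0)
    return W, b_vis, b_hid

# Tiny example: 6 visible units, 3 hidden units, two binary patterns.
W = 0.01 * rng.normal(size=(6, 3))
b_vis, b_hid = np.zeros(6), np.zeros(3)
data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1]], dtype=float)
for _ in range(1000):
    W, b_vis, b_hid = cd1_step(W, b_vis, b_hid, data)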

Deep Learning

The next step towards a better approximation of how the brain learns is a concept called “deep learning”. Deep learning, in the context of artificial neural networks, is a sub-field of machine learning in which representations are arranged in a hierarchy ranging from low-level to high-level concepts. It involves a number of assumptions about the human brain: first, that the representations seen by the observer result from the interactions of multiple factors, some conscious and some subconscious, and that the brain makes generalizations in an attempt to learn about what is not directly presented to it; second, that these factors are organized into many layers with varying levels of composition and abstraction.
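
As a loose, assumed illustration of the idea of a hierarchy of representations (not a training method from the draft), the Python sketch below passes an input up a stack of layers, each re-encoding the output of the layer below into fewer, more abstract features; the layer sizes are arbitrary and the weights are random placeholders rather than learned ones.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical layer sizes: raw input -> low-level -> mid-level -> high-level.
layer_sizes = [64, 32, 16, 4]
weights = [rng.normal(0, 0.1, (m, n)) for m, n in zip(layer_sizes, layer_sizes[1:])]

def hierarchy_of_representations(x):
    """Pass an input up the stack, keeping the representation at each level.

    Lower layers hold many simple features; higher layers hold fewer,
    more abstract features composed from the layer below.
    """
    reps = [x]
    for W in weights:
        reps.append(sigmoid(reps[-1] @ W))
    return reps

x = rng.random(64)  # a stand-in for raw sensory input
for level, r in enumerate(hierarchy_of_representations(x)):
    print(f"level {level}: {r.shape[0]} features")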

Geometric Transformations

In 2010, Hinton came to the conclusion that the brain learns in a slightly different way than he had previously thought. He concluded, through extensive research, that the brain identifies images and objects by using geometric transformations to internally visualize and manipulate the stimulus.

Google Acquisition of DNN Research

In March 2013, Google struck a deal with Hinton to buy his company DNN Research, which he had set up with two of his graduate students, Alex Krizhevsky and Ilya Sutskever. DNN Research was founded within the computer science department of the University of Toronto and specializes in object recognition. Hinton says that he will now divide his time between his work with Google and his work at the university.

Honours and Awards

This section will detail his honours and awards and any relevant information regarding them.

Notable Contributions

In this section I will provide an outline of some of Hinton's most notable academic contributions to the scientific community.

See Also

Here, I will provide a number of links to various Wikipedia articles on relevant topics.


Here, I will provide links outside of Wikipedia which may be of use to those learning about Geoffrey Hinton or his psychological research.