Neural Turing machine

From Wikipedia, the free encyclopedia
Revision as of 16:40, 25 April 2023

A neural Turing machine (NTM) is a recurrent neural network model of a Turing machine. The approach was published by Alex Graves et al. in 2014.[1] NTMs combine the fuzzy pattern matching capabilities of neural networks with the algorithmic power of programmable computers.

An NTM has a neural network controller coupled to external memory resources, which it interacts with through attentional mechanisms. The memory interactions are differentiable end-to-end, making it possible to optimize them using gradient descent.[2] An NTM with a long short-term memory (LSTM) network controller can infer simple algorithms such as copying, sorting, and associative recall from examples alone.[1]
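The differentiable memory interaction described above can be illustrated with a minimal sketch of content-based addressing: the controller emits a key vector, attention weights are computed as a softmax over similarities between the key and each memory row, and the read vector is the weighted sum of rows. Everything here is a smooth function of its inputs, which is what makes end-to-end gradient descent possible. The function names, the toy memory, and the sharpening parameter `beta` are illustrative, not taken from the original paper's code (which was never released).

```python
import numpy as np

def cosine_similarity(key, memory):
    # Similarity between the controller's key and each memory row.
    num = memory @ key
    denom = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    return num / denom

def content_addressing(key, memory, beta):
    # Softmax over scaled similarities gives differentiable attention weights.
    scores = beta * cosine_similarity(key, memory)
    e = np.exp(scores - scores.max())
    return e / e.sum()

def read(memory, weights):
    # Read vector: attention-weighted sum of memory rows.
    return weights @ memory

# Toy memory with 4 slots of width 3.
M = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])
k = np.array([1.0, 0.0, 0.0])          # key emitted by the controller
w = content_addressing(k, M, beta=10.0) # attention concentrates on slot 0
r = read(M, w)                          # read vector close to [1, 0, 0]
```

Because the softmax never produces hard zero/one weights, the read touches every slot a little; gradients therefore flow to all memory locations, unlike a conventional computer's discrete memory access.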

The authors of the original NTM paper did not publish their source code.[1] The first stable open-source implementation was published in 2018 at the 27th International Conference on Artificial Neural Networks, where it received a best-paper award.[3][4][5] Other open-source implementations of NTMs exist, but as of 2018 they were not sufficiently stable for production use.[6][7][8][9][10][11][12] Their developers report that gradients sometimes become NaN during training for unknown reasons, causing training to fail;[10][11][9] report slow convergence;[7][6] or do not report their implementation's learning speed.[12][8]

Differentiable neural computers are an outgrowth of neural Turing machines, with attention mechanisms that control where the memory is active, improving performance.[13]

References

  1. ^ a b c Graves, Alex; Wayne, Greg; Danihelka, Ivo (2014). "Neural Turing Machines". arXiv:1410.5401 [cs.NE].
  2. ^ "Deep Minds: An Interview with Google's Alex Graves & Koray Kavukcuoglu". Retrieved May 17, 2016.
  3. ^ Collier, Mark; Beel, Joeran (2018), "Implementing Neural Turing Machines", Artificial Neural Networks and Machine Learning – ICANN 2018, Springer International Publishing, pp. 94–104, arXiv:1807.08518, Bibcode:2018arXiv180708518C, doi:10.1007/978-3-030-01424-7_10, ISBN 9783030014230, S2CID 49908746
  4. ^ "MarkPKCollier/NeuralTuringMachine". GitHub. Retrieved 2018-10-20.
  5. ^ Beel, Joeran (2018-10-20). "Best-Paper Award for our Publication "Implementing Neural Turing Machines" at the 27th International Conference on Artificial Neural Networks | Prof. Joeran Beel (TCD Dublin)". Trinity College Dublin, School of Computer Science and Statistics Blog. Retrieved 2018-10-20.
  6. ^ a b "snowkylin/ntm". GitHub. Retrieved 2018-10-20.
  7. ^ a b "chiggum/Neural-Turing-Machines". GitHub. Retrieved 2018-10-20.
  8. ^ a b "yeoedward/Neural-Turing-Machine". GitHub. 2017-09-13. Retrieved 2018-10-20.
  9. ^ a b "camigord/Neural-Turing-Machine". GitHub. Retrieved 2018-10-20.
  10. ^ a b "carpedm20/NTM-tensorflow". GitHub. Retrieved 2018-10-20.
  11. ^ a b "snipsco/ntm-lasagne". GitHub. Retrieved 2018-10-20.
  12. ^ a b "loudinthecloud/pytorch-ntm". GitHub. Retrieved 2018-10-20.
  13. ^ Administrator. "DeepMind's Differentiable Neural Network Thinks Deeply". www.i-programmer.info. Retrieved 2016-10-20.