Neural Turing machine

From Wikipedia, the free encyclopedia
Revision as of 16:40, 25 April 2023

A neural Turing machine (NTM) is a recurrent neural network model of a Turing machine. The approach was published by Alex Graves et al. in 2014.[1] NTMs combine the fuzzy pattern matching capabilities of neural networks with the algorithmic power of programmable computers.

An NTM has a neural network controller coupled to external memory resources, which it interacts with through attentional mechanisms. The memory interactions are differentiable end-to-end, making it possible to optimize them using gradient descent.[2] An NTM with a long short-term memory (LSTM) network controller can infer simple algorithms such as copying, sorting, and associative recall from examples alone.[1]
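The differentiable memory interaction described above can be illustrated with a minimal sketch of content-based addressing: the controller emits a key vector, attention weights are computed as a softmax over similarities between the key and each memory row, and the read vector is the weighted sum of rows. Everything here is a smooth function of its inputs, which is what makes end-to-end gradient descent possible. The function names, the toy memory, and the sharpening parameter `beta` are illustrative, not taken from the original paper's code (which was never released).

```python
import numpy as np

def cosine_similarity(key, memory):
    # Similarity between the controller's key and each memory row.
    num = memory @ key
    denom = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    return num / denom

def content_addressing(key, memory, beta):
    # Softmax over scaled similarities gives differentiable attention weights.
    scores = beta * cosine_similarity(key, memory)
    e = np.exp(scores - scores.max())
    return e / e.sum()

def read(memory, weights):
    # Read vector: attention-weighted sum of memory rows.
    return weights @ memory

# Toy memory with 4 slots of width 3.
M = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])
k = np.array([1.0, 0.0, 0.0])          # key emitted by the controller
w = content_addressing(k, M, beta=10.0) # attention concentrates on slot 0
r = read(M, w)                          # read vector close to [1, 0, 0]
```

Because the softmax never produces hard zero/one weights, the read touches every slot a little; gradients therefore flow to all memory locations, unlike a conventional computer's discrete memory access.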

The authors of the original NTM paper did not publish their source code.[1] The first stable open-source implementation was published in 2018 at the 27th International Conference on Artificial Neural Networks, where it received a best-paper award.[3][4][5] Other open-source implementations of NTMs exist, but as of 2018 they were not sufficiently stable for production use.[6][7][8][9][10][11][12] Their developers report that gradients sometimes become NaN during training for unknown reasons, causing training to fail;[10][11][9] report slow convergence;[7][6] or do not report their implementation's learning speed.[12][8]

Differentiable neural computers are an outgrowth of neural Turing machines, with attention mechanisms that control where the memory is active, improving performance.[13]

References

  1. ^ a b c Graves, Alex; Wayne, Greg; Danihelka, Ivo (2014). "Neural Turing Machines". arXiv:1410.5401 [cs.NE].
  2. ^ "Deep Minds: An Interview with Google's Alex Graves & Koray Kavukcuoglu". Retrieved May 17, 2016.
  3. ^ Collier, Mark; Beel, Joeran (2018), "Implementing Neural Turing Machines", Artificial Neural Networks and Machine Learning – ICANN 2018, Springer International Publishing, pp. 94–104, arXiv:1807.08518, Bibcode:2018arXiv180708518C, doi:10.1007/978-3-030-01424-7_10, ISBN 9783030014230, S2CID 49908746
  4. ^ "MarkPKCollier/NeuralTuringMachine". GitHub. Retrieved 2018-10-20.
  5. ^ Beel, Joeran (2018-10-20). "Best-Paper Award for our Publication "Implementing Neural Turing Machines" at the 27th International Conference on Artificial Neural Networks | Prof. Joeran Beel (TCD Dublin)". Trinity College Dublin, School of Computer Science and Statistics Blog. Retrieved 2018-10-20.
  6. ^ a b "snowkylin/ntm". GitHub. Retrieved 2018-10-20.
  7. ^ a b "chiggum/Neural-Turing-Machines". GitHub. Retrieved 2018-10-20.
  8. ^ a b "yeoedward/Neural-Turing-Machine". GitHub. 2017-09-13. Retrieved 2018-10-20.
  9. ^ a b "camigord/Neural-Turing-Machine". GitHub. Retrieved 2018-10-20.
  10. ^ a b "carpedm20/NTM-tensorflow". GitHub. Retrieved 2018-10-20.
  11. ^ a b "snipsco/ntm-lasagne". GitHub. Retrieved 2018-10-20.
  12. ^ a b "loudinthecloud/pytorch-ntm". GitHub. Retrieved 2018-10-20.
  13. ^ Administrator. "DeepMind's Differentiable Neural Network Thinks Deeply". www.i-programmer.info. Retrieved 2016-10-20.