Free energy principle

{{Short description|Hypothesis in neuroscience}}
{{Distinguish|Thermodynamic free energy}}
The '''free energy principle''' is a theoretical framework suggesting that the brain reduces [[surprisal|surprise]] or [[uncertainty]] by making predictions based on [[Mental model|internal models]] and updating them using [[Stimulus (physiology)|sensory input]]. It highlights the brain's objective of aligning its internal model with the external world to enhance [[Accuracy and precision|prediction accuracy]]. This principle integrates [[Bayesian inference]] with [[active inference]], where actions are guided by predictions and [[sensory feedback]] refines them. It has wide-ranging implications for comprehending [[brain function]], [[perception]], and [[Action (philosophy)|action]].<ref>{{cite journal | last1=Bruineberg | first1=Jelle | last2=Kiverstein | first2=Julian | last3=Rietveld | first3=Erik | title=The anticipating brain is not a scientist: the free-energy principle from an ecological-enactive perspective | journal=Synthese | volume=195 | issue=6 | pages=2417–2444 | year=2018 | doi=10.1007/s11229-016-1239-1 | pmid=30996493 | pmc=6438652 }}</ref> In this framework, living and non-living systems remain in [[non-equilibrium thermodynamics|non-equilibrium steady-states]] by restricting themselves to a limited number of states.
== Overview ==

The free energy principle has been criticized for being very difficult to understand, even for experts,<ref>{{cite journal | last=Freed | first=Peter | title=Research Digest | journal=Neuropsychoanalysis | volume=12 | issue=1 | year=2010 | issn=1529-4145 | doi=10.1080/15294145.2010.10773634 | pages=103–106 | s2cid=220306712 }}</ref> and the mathematical consistency of the theory has been questioned by recent studies.<ref>{{cite journal | last1=Aguilera | first1=Miguel | last2=Millidge | first2=Beren | last3=Tschantz | first3=Alexander | last4=Buckley | first4=Christopher L. | title=How particular is the physics of the free energy principle? | journal=Physics of Life Reviews | year=2021 | doi=10.1016/j.plrev.2021.11.001}}</ref><ref>{{cite journal | last1=Biehl | first1=Martin | last2=Pollock | first2=Felix | last3=Kanai | first3=Ryota | title=A Technical Critique of Some Parts of the Free Energy Principle | journal=Entropy | year=2021 | doi=10.3390/e23030293}}</ref> Discussions of the principle have also been criticized as invoking [[metaphysics|metaphysical]] assumptions far removed from a testable scientific prediction, making the principle unfalsifiable.<ref name="First principles in the life scienc">{{cite journal | last1=Colombo | first1=Matteo | last2=Wright | first2=Cory | title=First principles in the life sciences: the free-energy principle, organicism, and mechanism | journal=Synthese | date=2018-09-10 | volume=198 | pages=3463–3488 | issn=0039-7857 | doi=10.1007/s11229-018-01932-w | doi-access=free}}</ref> In a 2018 interview, Friston acknowledged that the free energy principle is not properly [[Falsifiability|falsifiable]]: "the free energy principle is what it is — a [[principle]]. Like [[Hamilton's principle|Hamilton's principle of stationary action]], it cannot be falsified. It cannot be disproven."<ref name="ALIUS2018">{{Cite journal|last=Friston|first=Karl|date=2018|title=Of woodlice and men: A Bayesian account of cognition, life and consciousness. An interview with Karl Friston (by Martin Fortier & Daniel Friedman)|url=https://www.aliusresearch.org/bulletin02.html|journal=ALIUS Bulletin|volume=2|pages=17–43}}</ref>
In [[biophysics]] and [[cognitive science]], the free energy principle is a mathematical principle describing a [[Mathematical logic|formal]] account of the representational capacities of physical systems: that is, why things that exist look as if they track properties of the systems to which they are coupled.<ref>{{cite journal | last1=Friston | first1=Karl | title=The free-energy principle: a unified brain theory? | journal=Nature Reviews Neuroscience | volume=11 | issue=2 | pages=127–138 | year=2010 | doi=10.1038/nrn2787 | pmid=20068583 | s2cid=5053247 | url=https://www.nature.com/articles/nrn2787 | access-date=July 9, 2023}}</ref>

It establishes that the dynamics of physical systems minimise a quantity known as ''[[surprisal]]'' (which is the negative log probability of some outcome); or equivalently, its variational upper bound, called ''[[variational free energy|free energy]]''. The principle is used especially in [[Bayesian approaches to brain function]], but also in some approaches to [[artificial intelligence]]; it is formally related to [[variational Bayesian methods]] and was originally introduced by [[Karl Friston]] as an explanation for embodied perception-action loops in [[neuroscience]],<ref>{{cite journal | last1=Friston | first1=Karl | last2=Kilner | first2=James | last3=Harrison | first3=Lee | title=A free energy principle for the brain | journal=Journal of Physiology-Paris | volume=100 | issue=1–3 | year=2006 | doi=10.1016/j.jphysparis.2006.10.001 | pmid=17097864 | pages=70–87 | s2cid=637885 | url=http://www.fil.ion.ucl.ac.uk/~karl/A%20free%20energy%20principle%20for%20the%20brain.pdf}}</ref> where it is also known as '''active inference'''.
The free energy principle models the behaviour of systems that are distinct from, but coupled to, another system (e.g., an embedding environment), where the degrees of freedom that implement the interface between the two systems are known as a [[Markov blanket]]. More formally, the free energy principle says that if a system has a "particular partition" (i.e., into particles, with their Markov blankets), then subsets of that system will track the statistical structure of other subsets (which are known as internal and external states or paths of a system).
The free energy principle is based on the Bayesian idea of the brain as an "[[inference engine]]". Under the free energy principle, systems pursue paths of '''least surprise''', or equivalently, minimize the difference between predictions based on their model of the world and their [[sense]] and associated [[perception]]. This difference is quantified by variational free energy and is minimized by continuous correction of the system's world model, or by making the world more like the model's predictions. By actively changing the world to make it closer to the expected state, systems can also minimize their free energy. Friston assumes this to be the principle of all biological reactions.<ref name="wired20181112">Shaun Raviv: [https://www.wired.com/story/karl-friston-free-energy-principle-artificial-intelligence/ The Genius Neuroscientist Who Might Hold the Key to True AI]. Wired, 13 November 2018.</ref> Friston also believes his principle applies to [[mental disorder]]s as well as to [[artificial intelligence]]. AI implementations based on the active inference principle have shown advantages over other methods.<ref name="wired20181112" />
The free energy principle is a mathematical principle of information physics: much like the principle of maximum entropy or the principle of least action, it is true on mathematical grounds. To attempt to falsify the free energy principle is a category mistake, akin to trying to falsify [[calculus]] by making empirical observations. (One cannot invalidate a mathematical theory in this way; instead, one would need to derive a formal contradiction from the theory.) In a 2018 interview, Friston explained what it entails for the free energy principle to not be subject to [[Falsifiability|falsification]]: "I think it is useful to make a fundamental distinction at this point—that we can appeal to later. The distinction is between a state and process theory; i.e., the difference between a normative principle that things may or may not conform to, and a process theory or hypothesis about how that principle is realized. Under this distinction, the free energy principle stands in stark distinction to things like [[predictive coding]] and the Bayesian brain hypothesis. This is because the free energy principle is what it is — a [[principle]]. Like [[Hamilton's principle|Hamilton's principle of stationary action]], it cannot be falsified. It cannot be disproven. In fact, there’s not much you can do with it, unless you ask whether measurable systems conform to the principle. On the other hand, hypotheses that the brain performs some form of Bayesian inference or predictive coding are what they are—hypotheses. These hypotheses may or may not be supported by empirical evidence."<ref name="ALIUS2018" /> There are many examples of these hypotheses being supported by empirical evidence.<ref>{{Cite book|last1=Parr|first1=Thomas|last2=Pezzulo|first2=Giovanni|last3=Friston|first3=Karl|date=2022|title=Active Inference: The Free Energy Principle in Mind, Brain, and Behavior|publisher=MIT Press |isbn=9780262045353 |url=https://books.google.com/books?id=KXZ_zgEACAAJ&q=Table+9.1}}</ref>
== Background ==
The notion that [[self-organisation|self-organising]] biological systems – like a cell or brain – can be understood as minimising variational free energy is based upon [[Hermann von Helmholtz|Helmholtz]]’s work on [[unconscious inference]]<ref name="Helmholtz">Helmholtz, H. (1866/1962). Concerning the perceptions in general. In Treatise on physiological optics (J. Southall, Trans., 3rd ed., Vol. III). New York: Dover. Available at https://web.archive.org/web/20180320133752/http://poseidon.sunyopt.edu/BackusLab/Helmholtz/</ref> and subsequent treatments in psychology<ref>{{cite journal | title=Perceptions as hypotheses | journal=Philosophical Transactions of the Royal Society of London. B, Biological Sciences | volume=290 | issue=1038 | date=1980-07-08 | doi=10.1098/rstb.1980.0090 | pmid=6106237 | bibcode=1980RSPTB.290..181G | pages=181–197 | jstor=2395424 | last1=Gregory | first1=R. L. }}</ref> and machine learning.<ref name="Dayan">{{cite journal | last1=Dayan | first1=Peter | last2=Hinton | first2=Geoffrey E. | last3=Neal | first3=Radford M. | last4=Zemel | first4=Richard S. | title=The Helmholtz Machine | journal=Neural Computation | volume=7 | issue=5 | year=1995 | doi=10.1162/neco.1995.7.5.889 | pmid=7584891 | pages=889–904 | s2cid=1890561 | url=http://www.gatsby.ucl.ac.uk/~dayan/papers/hm95.pdf | hdl=21.11116/0000-0002-D6D3-E | hdl-access=free }}</ref> Variational free energy is a function of observations and a probability density over their hidden causes. This [[Calculus of variations|variational]] density is defined in relation to a probabilistic model that generates predicted observations from hypothesized causes. In this setting, free energy provides an approximation to [[Marginal likelihood|Bayesian model evidence]].<ref>Beal, M. J. (2003). [http://www.cse.buffalo.edu/faculty/mbeal/papers/beal03.pdf Variational Algorithms for Approximate Bayesian Inference]. Ph.D. Thesis, University College London.</ref> Therefore, its minimisation can be seen as a Bayesian inference process. When a system actively makes observations to minimise free energy, it implicitly performs active inference and maximises the evidence for its model of the world.
However, free energy is also an upper bound on the [[self-information]] of outcomes, where the long-term average of [[Self-information|surprise]] is entropy. This means that if a system acts to minimise free energy, it will implicitly place an upper bound on the entropy of the outcomes – or sensory states – it samples.<ref name="Towards a Geometry and Analysis for">{{cite arXiv | last1=Sakthivadivel | first1=Dalton | title=Towards a Geometry and Analysis for Bayesian Mechanics | year=2022 | class=math-ph | eprint=2204.11900}}</ref><ref name="On Bayesian mechanics: A physics of">{{cite journal | last1=Ramstead | first1=Maxwell | last2=Sakthivadivel | first2=Dalton | last3=Heins | first3=Conor | last4=Koudahl | first4=Magnus | last5=Millidge | first5=Beren | last6=Da Costa | first6=Lancelot | last7=Klein | first7=Brennan | last8=Friston | first8=Karl | title=On Bayesian mechanics: A physics of and by beliefs | journal=Interface Focus | year=2023 | volume=13 | issue=3 | doi=10.1098/rsfs.2022.0029 | pmid=37213925 | pmc=10198254 | arxiv=2205.11543 | s2cid=249017997 }}</ref>
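These relationships can be illustrated with a short numerical sketch. The following Python fragment uses a toy two-state discrete model – the distributions and numbers are illustrative assumptions, not taken from the literature – to compute variational free energy in its energy-minus-entropy form and to confirm that it upper-bounds surprise, with the bound becoming tight when the variational density equals the exact posterior:

<syntaxhighlight lang="python">
import numpy as np

# Toy generative model p(s, psi) over one sensory state s and one hidden state psi.
prior = np.array([0.7, 0.3])          # p(psi) for psi in {0, 1}
likelihood = np.array([[0.9, 0.1],    # p(s | psi=0) for s in {0, 1}
                       [0.2, 0.8]])   # p(s | psi=1)

s = 1                                 # an observed sensory state
joint = prior * likelihood[:, s]      # p(s, psi) as a function of psi
evidence = joint.sum()                # model evidence p(s | m)
surprise = -np.log(evidence)          # surprisal -log p(s | m)
posterior = joint / evidence          # exact posterior p(psi | s, m)

def free_energy(q):
    """F = E_q[-log p(s, psi | m)] - H[q], the energy-minus-entropy form."""
    return np.sum(q * -np.log(joint)) - (-np.sum(q * np.log(q)))

print(free_energy(np.array([0.5, 0.5])) >= surprise)   # True: F bounds surprise
print(np.isclose(free_energy(posterior), surprise))    # True: tight at the posterior
</syntaxhighlight>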
=== Relationship to other theories ===
Active inference is closely related to the [[Good Regulator|good regulator theorem]]<ref>{{cite journal | doi=10.1080/00207727008920220 | title=Every good regulator of a system must be a model of that system | year=1970 | last1=Conant | first1=Roger C. | last2=Ross Ashby | first2=W. | journal=International Journal of Systems Science | volume=1 | issue=2 | pages=89–97 }}</ref> and related accounts of [[self-organisation]],<ref>Kauffman, S. (1993). [https://books.google.com/books?id=lZcSpRJz0dgC&dq=%22The+Origins+of+Order%3A+Self-Organization+and+Selection+in+Evolution%22&pg=PR13 The Origins of Order: Self-Organization and Selection in Evolution]. Oxford: Oxford University Press.</ref><ref>Nicolis, G., & Prigogine, I. (1977). Self-organization in non-equilibrium systems. New York: John Wiley.</ref> such as [[self-assembly]], [[pattern formation]], [[autopoiesis]]<ref>Maturana, H. R., & Varela, F. (1980). [http://topologicalmedialab.net/xinwei/classes/readings/Maturana/autopoesis_and_cognition.pdf Autopoiesis: the organization of the living]. In V. F. Maturana HR (Ed.), Autopoiesis and Cognition. Dordrecht, Netherlands: Reidel.</ref> and [[practopoiesis]].<ref>{{cite journal | doi=10.1016/j.jtbi.2015.03.003 | title=Practopoiesis: Or how life fosters a mind | year=2015 | last1=Nikolić | first1=Danko | journal=Journal of Theoretical Biology | volume=373 | pages=40–61 | pmid=25791287 | arxiv=1402.5332 | bibcode=2015JThBi.373...40N | s2cid=12680941 }}</ref> It addresses the themes considered in [[cybernetics]], [[Synergetics (Haken)|synergetics]]<ref>Haken, H. (1983). Synergetics: An introduction. Non-equilibrium phase transition and self-organisation in physics, chemistry and biology (3rd ed.). Berlin: Springer Verlag.</ref> and [[embodied cognition]]. Because free energy can be expressed as the expected energy of observations under the variational density minus its entropy, it is also related to the [[maximum entropy principle]].<ref>{{cite journal |doi=10.1103/PhysRev.106.620 |url=http://bayes.wustl.edu/etj/articles/theory.1.pdf|title=Information Theory and Statistical Mechanics |year=1957 |last1=Jaynes |first1=E. T. |journal=Physical Review |volume=106 |issue=4 |pages=620–630 |bibcode=1957PhRv..106..620J |s2cid=17870175 }}</ref> Finally, because the time average of energy is action, the principle of minimum variational free energy is a [[principle of least action]]. Active inference allowing for scale invariance has also been applied to other theories and domains. For instance, it has been applied to sociology,<ref>{{Cite journal |last1=Veissière |first1=Samuel P. L. |last2=Constant |first2=Axel |last3=Ramstead |first3=Maxwell J. D. |last4=Friston |first4=Karl J. |last5=Kirmayer |first5=Laurence J. |date=2020 |title=Thinking through other minds: A variational approach to cognition and culture |url=https://www.cambridge.org/core/journals/behavioral-and-brain-sciences/article/abs/thinking-through-other-minds-a-variational-approach-to-cognition-and-culture/9A10399BA85F428D5943DD847092C14A |journal=Behavioral and Brain Sciences |language=en |volume=43 |pages=e90 |doi=10.1017/S0140525X19001213 |pmid=31142395 |s2cid=169038428 |issn=0140-525X}}</ref><ref>{{Cite journal |last1=Ramstead |first1=Maxwell J. D. |last2=Constant |first2=Axel |last3=Badcock |first3=Paul B. |last4=Friston |first4=Karl J. |date=2019-12-01 |title=Variational ecology and the physics of sentient systems |journal=Physics of Life Reviews |series=Physics of Mind |language=en |volume=31 |pages=188–205 |doi=10.1016/j.plrev.2018.12.002 |pmid=30655223 |pmc=6941227 |bibcode=2019PhLRv..31..188R |issn=1571-0645}}</ref><ref>{{Cite journal |last1=Albarracin |first1=Mahault |last2=Demekas |first2=Daphne |last3=Ramstead |first3=Maxwell J. D. |last4=Heins |first4=Conor |date=April 2022 |title=Epistemic Communities under Active Inference |journal=Entropy |language=en |volume=24 |issue=4 |pages=476 |doi=10.3390/e24040476 |pmid=35455140 |pmc=9027706 |bibcode=2022Entrp..24..476A |issn=1099-4300 |doi-access=free }}</ref><ref>{{Cite journal |last1=Albarracin |first1=Mahault |last2=Constant |first2=Axel |last3=Friston |first3=Karl J. |last4=Ramstead |first4=Maxwell James D. |date=2021 |title=A Variational Approach to Scripts |journal=Frontiers in Psychology |volume=12 |page=585493 |doi=10.3389/fpsyg.2021.585493 |pmid=34354621 |pmc=8329037 |issn=1664-1078 |doi-access=free }}</ref> linguistics and communication,<ref>{{Cite journal |last1=Friston |first1=Karl J. |last2=Parr |first2=Thomas |last3=Yufik |first3=Yan |last4=Sajid |first4=Noor |last5=Price |first5=Catherine J. |last6=Holmes |first6=Emma |date=2020-11-01 |title=Generative models, linguistic communication and active inference |journal=Neuroscience & Biobehavioral Reviews |language=en |volume=118 |pages=42–64 |doi=10.1016/j.neubiorev.2020.07.005 |pmid=32687883 |pmc=7758713 |issn=0149-7634}}</ref><ref>{{Cite journal |last1=Tison |first1=Remi |last2=Poirier |first2=Pierre |date=2021-10-02 |title=Communication as Socially Extended Active Inference: An Ecological Approach to Communicative Behavior |url=https://doi.org/10.1080/10407413.2021.1965480 |journal=Ecological Psychology |volume=33 |issue=3–4 |pages=197–235 |doi=10.1080/10407413.2021.1965480 |s2cid=238703201 |issn=1040-7413}}</ref><ref>{{Cite journal |last1=Friston |first1=Karl J. |last2=Frith |first2=Christopher D. |date=2015-07-01 |title=Active inference, communication and hermeneutics |journal=Cortex |series=Special issue: Prediction in speech and language processing |language=en |volume=68 |pages=129–143 |doi=10.1016/j.cortex.2015.03.025 |pmid=25957007 |pmc=4502445 |issn=0010-9452}}</ref> semiotics,<ref>{{Cite journal |last=Kerusauskaite |first=Skaiste |date=2023-06-01 |title=Role of Culture in Meaning Making: Bridging Semiotic Cultural Psychology and Active Inference |url=https://doi.org/10.1007/s12124-022-09744-x |journal=Integrative Psychological and Behavioral Science |language=en |volume=57 |issue=2 |pages=432–443 |doi=10.1007/s12124-022-09744-x |pmid=36585542 |s2cid=255366405 |issn=1936-3567}}</ref><ref>{{Cite book |last1=García |first1=Adolfo M. |url=https://books.google.com/books?id=hPCKEAAAQBAJ&dq=active+inference+semiotics&pg=PT90 |title=The Routledge Handbook of Semiosis and the Brain |last2=Ibáñez |first2=Agustín |date=2022-11-14 |publisher=Taylor & Francis |isbn=978-1-000-72877-4 |language=en}}</ref> and epidemiology,<ref>{{Cite journal |last1=Bottemanne |first1=Hugo |last2=Friston |first2=Karl J. |date=2021-12-01 |title=An active inference account of protective behaviours during the COVID-19 pandemic |url=https://doi.org/10.3758/s13415-021-00947-0 |journal=Cognitive, Affective, & Behavioral Neuroscience |language=en |volume=21 |issue=6 |pages=1117–1129 |doi=10.3758/s13415-021-00947-0 |issn=1531-135X |pmc=8518276 |pmid=34652601}}</ref> among others.
Negative free energy is formally equivalent to the [[evidence lower bound]], which is commonly used in [[machine learning]] to train [[Generative model|generative models]], such as [[Variational autoencoder|variational autoencoders]].
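For a model in which the prior, likelihood, and variational posterior are all Gaussian, the bound can be evaluated in closed form, which is essentially the quantity a variational autoencoder optimises per data point. A minimal sketch (the one-dimensional model and all of its parameters are illustrative assumptions):

<syntaxhighlight lang="python">
import numpy as np

x, sigma2 = 1.5, 0.5     # observation and likelihood variance (illustrative)

def elbo(m, s2):
    """Evidence lower bound (negative variational free energy) for
    p(z) = N(0, 1), p(x|z) = N(z, sigma2), q(z) = N(m, s2)."""
    expected_loglik = -0.5 * np.log(2 * np.pi * sigma2) \
                      - ((x - m) ** 2 + s2) / (2 * sigma2)
    kl = 0.5 * (s2 + m ** 2 - 1 - np.log(s2))    # KL[q(z) || p(z)]
    return expected_loglik - kl

# True log evidence: x ~ N(0, 1 + sigma2) under this model.
log_evidence = -0.5 * np.log(2 * np.pi * (1 + sigma2)) - x ** 2 / (2 * (1 + sigma2))

print(elbo(0.0, 1.0) <= log_evidence)                 # True: a poor q gives a loose bound
m_opt, s2_opt = x / (1 + sigma2), sigma2 / (1 + sigma2)
print(np.isclose(elbo(m_opt, s2_opt), log_evidence))  # True: tight at the exact posterior
</syntaxhighlight>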
== Definition ==
[[Image:MarokovBlanketFreeEnergyFigure.jpg|500px|'''Figure 1:''' These schematics illustrate the partition of states into the internal states <math>\mu(t)</math> and external (hidden, latent) states <math>\psi(t)</math> that are separated by a Markov blanket – comprising sensory states <math>s(t)</math> and active states <math>a(t)</math>. The upper panel shows exactly the same dependencies but rearranged so that the internal states are associated with the intracellular states of a cell, while the sensory states become the surface states of the cell membrane overlying active states (e.g., the actin filaments of the cytoskeleton). The lower panel shows this partition as it would be applied to action and perception in the brain; where active and internal states minimise a free energy functional of sensory states. The ensuing self-organisation of internal states then corresponds to perception, while action couples brain states back to external states.|alt=These schematics illustrate the partition of states into internal and hidden or external states that are separated by a Markov blanket comprising sensory and active states. The upper panel shows the partition applied to a cell, with internal states as intracellular states and sensory states as surface states of the cell membrane overlying active states. The lower panel shows the partition applied to action and perception in the brain, where active and internal states minimise a free energy functional of sensory states and the ensuing self-organisation of internal states corresponds to perception.|thumb]]
'''Definition''' (continuous formulation): Active inference rests on the tuple <math>(\Omega,\Psi,S,A,R,q,p)</math>:

* ''A sample space'' <math>\Omega</math> – from which random fluctuations <math>\omega \in \Omega</math> are drawn
* ''Hidden or external states'' <math>\Psi:\Psi\times A \times \Omega \to \mathbb{R}</math> – that cause sensory states and depend on action
* ''Sensory states'' <math>S:\Psi \times A \times \Omega \to \mathbb{R}</math> – a probabilistic mapping from action and hidden states
* ''Action'' <math>A:S\times R \to \mathbb{R}</math> – that depends on sensory and internal states
* ''Internal states'' <math>R:R\times S \to \mathbb{R}</math> – that cause action and depend on sensory states
* ''Generative density'' <math>p(s, \psi \mid m)</math> – over sensory and hidden states under a generative model <math>m</math>
* ''Variational density'' <math>q(\psi \mid \mu)</math> – over hidden states <math>\psi \in \Psi</math> that is parameterised by internal states <math>\mu \in R</math>

Active inference applies the techniques of [[Approximate Bayesian computation|approximate Bayesian inference]] to infer the causes of sensory data from a [[Generative model|'generative' model]] of how that data is caused, and then uses these inferences to guide action.
=== Action and perception ===

[[Bayes' theorem|Bayes' rule]] characterizes the probabilistically optimal inversion of such a causal model, but applying it is typically computationally intractable, leading to the use of approximate methods. In active inference, the leading class of such approximate methods are [[Variational Bayesian methods|variational methods]], for both practical and theoretical reasons: practical, as they often lead to simple inference procedures; and theoretical, because they are related to fundamental physical principles, as discussed above.

These variational methods proceed by minimizing an upper bound on the divergence between the Bayes-optimal inference (or '[[Posterior probability|posterior]]') and its approximation according to the method. This upper bound is known as the ''free energy'', and we can accordingly characterize perception as the minimization of the free energy with respect to inbound sensory information, and action as the minimization of the same free energy with respect to outbound action information. This holistic dual optimization is characteristic of active inference, and the free energy principle is the hypothesis that all systems which perceive and act can be characterized in this way.

In order to exemplify the mechanics of active inference via the free energy principle, a generative model must be specified, and this typically involves a collection of [[probability density function]]s which together characterize the causal model. One such specification is as follows. The system is modelled as inhabiting a state space <math>X</math>, in the sense that its states form the points of this space. The state space is then factorized according to <math>X = \Psi\times S\times A\times R</math>, where <math>\Psi</math> is the space of 'external' states that are 'hidden' from the agent (in the sense of not being directly perceived or accessible), <math>S</math> is the space of sensory states that are directly perceived by the agent, <math>A</math> is the space of the agent's possible actions, and <math>R</math> is a space of 'internal' states that are private to the agent.

Keeping with Figure 1, note that in the following <math>\dot{\psi}, \psi, s, a</math> and <math>\mu</math> are functions of (continuous) time <math>t</math>. The generative model is the specification of the following density functions:

* A sensory model, <math>p_S:S \times \Psi\times A \to \mathbb{R}</math>, often written as <math>p_S(s \mid \psi, a)</math>, characterizing the likelihood of sensory data given external states and actions;
* a stochastic model of the environmental dynamics, <math>p_\Psi: \Psi \times \Psi \times A \to \mathbb{R}</math>, often written <math>p_\Psi(\dot{\psi} \mid \psi, a)</math>, characterizing how the external states are expected by the agent to evolve over time <math>t</math>, given the agent's actions;
* an action model, <math>p_A: A \times R \times S \to \mathbb{R}</math>, written <math>p_A(a \mid \mu, s)</math>, characterizing how the agent's actions depend upon its internal states and sensory data; and
* an internal model, <math>p_R: R \times S \to \mathbb{R}</math>, written <math>p_R(\mu \mid s)</math>, characterizing how the agent's internal states depend upon its sensory data.

These density functions determine the factors of a "[[Joint probability distribution|joint model]]", which represents the complete specification of the generative model, and which can be written as

: <math>p_{\text{Bayes}}(\dot{\psi}, s, a, \mu \mid \psi) = p_S(s \mid \psi, a)\,p_\Psi(\dot{\psi} \mid \psi, a)\,p_A(a \mid \mu, s)\,p_R(\mu \mid s).</math>

[[Bayes' theorem|Bayes' rule]] then determines the "posterior density" <math>p(\dot{\psi} \mid s, a, \mu, \psi)</math>, which expresses a probabilistically optimal belief about the external state <math>\psi</math> given the preceding state and the agent's actions, sensory signals, and internal states. Since computing <math>p_{\text{Bayes}}</math> is computationally intractable, the free energy principle asserts the existence of a "variational density" <math>q(\dot{\psi} \mid s, a, \mu, \psi)</math>, where <math>q</math> is an approximation to <math>p_{\text{Bayes}}</math>. One then defines the free energy as

: <math>\begin{align}
\underset{\mathrm{free-energy}} {\underbrace{F(\mu, a\, ; s)}} &= \underset{\text{expected energy}} {\underbrace{ \mathbb{E}_{q(\dot{\psi})}[-\log p(\dot{\psi}, s, a, \mu \mid \psi)]}} - \underset{\mathrm{entropy}} {\underbrace{ \mathbb{H}[q(\dot{\psi} \mid s, a, \mu, \psi)]}}\\
&= \underset{\mathrm{surprise}} {\underbrace{ -\log p(s)}} + \underset{\mathrm{divergence}} {\underbrace{ D_{\mathrm{KL}}[q(\dot{\psi} \mid s, a, \mu, \psi) \parallel p_{\text{Bayes}}(\dot{\psi} \mid s, a, \mu, \psi)]}} \\
&\geq \underset{\mathrm{surprise}} {\underbrace{ -\log p(s)}}
\end{align}</math>

and defines action and perception as the joint optimization problem

: <math> \begin{align}
\mu^* &= \underset{\mu}{\operatorname{arg\,min}} \{ F(\mu, a \,;\, s) \} \\
a^* &= \underset{a}{\operatorname{arg\,min}} \{ F(\mu^*, a \,;\, s) \}
\end{align}</math>

where the internal states <math>\mu</math> are typically taken to encode the parameters of the 'variational' density <math>q</math> and hence the agent's "best guess" about the posterior belief over <math>\Psi</math>. Note that the free energy is also an upper bound on a measure of the agent's ([[Marginal distribution|marginal]], or average) sensory [[Information content|surprise]], and hence free energy minimization is often motivated by the minimization of surprise.
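This dual optimisation can be simulated directly. The sketch below uses a deliberately minimal one-dimensional model under a quadratic, fixed-precision free energy; the set-point, precisions, step sizes, and the assumed sensitivity of sensation to action are all illustrative assumptions. It performs gradient descent on the same free energy with respect to an internal state (perception) and an action (which moves the hidden external state):

<syntaxhighlight lang="python">
import numpy as np

eta = 4.0               # prior expectation about the hidden state (a set-point)
pi_s, pi_p = 1.0, 1.0   # precisions of the sensory mapping and the prior
dt = 0.1
rng = np.random.default_rng(0)

psi, mu, a = 0.0, 0.0, 0.0   # external state, internal state, action

for _ in range(2000):
    s = psi + 0.05 * rng.standard_normal()       # sensory sample of psi
    # F(s, mu) = 0.5 * (pi_s*(s - mu)**2 + pi_p*(mu - eta)**2), up to constants.
    dF_dmu = -pi_s * (s - mu) + pi_p * (mu - eta)
    mu -= dt * dF_dmu                  # perception: internal state minimises F
    ds_da = 1.0                        # assumed sensitivity of sensation to action
    dF_da = pi_s * (s - mu) * ds_da    # chain rule through the sensory mapping
    a -= dt * dF_da                    # action: reflex-like minimisation of F
    psi += dt * a                      # the world responds to action

print(round(psi, 2), round(mu, 2))     # both settle near the prior expectation eta
</syntaxhighlight>

Perception pulls the internal state toward a precision-weighted compromise between sensation and prior; action then pulls sensation, and hence the external state, toward that belief, so the simulated system ends up in its expected state.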
== Free energy minimisation ==
=== Free energy minimisation and self-organisation ===
Free energy minimisation has been proposed as a hallmark of self-organising systems when cast as [[random dynamical system]]s.<ref>{{cite journal |doi=10.1007/BF01193705 |url=https://www.researchgate.net/publication/227072665|title=Attractors for random dynamical systems |year=1994 |last1=Crauel |first1=Hans |last2=Flandoli |first2=Franco |journal=Probability Theory and Related Fields |volume=100 |issue=3 |pages=365–393 |s2cid=122609512 |doi-access=free }}</ref> This formulation rests on a [[Markov blanket]] (comprising action and sensory states) that separates internal and external states. If internal states and action minimise free energy, then they place an upper bound on the entropy of sensory states:
: <math> \lim_{T\to\infty} \frac{1}{T} \underset{\text{free-action}} {\underbrace{\int_0^T F(s(t),\mu (t))\,dt}} \ge
\lim_{T\to\infty} \frac{1}{T} \int_0^T \underset{\text{surprise}}{\underbrace{-\log p(s(t)\mid m)}} \, dt = H[p(s\mid m)] </math>
This is because – under [[Ergodic theory|ergodic]] assumptions – the long-term average of surprise is entropy. This bound resists a natural tendency to disorder – of the sort associated with the [[second law of thermodynamics]] and the [[fluctuation theorem]]. However, formulating a unifying principle for the life sciences in terms of concepts from statistical physics, such as random dynamical systems, non-equilibrium steady states and ergodicity, places substantial constraints on the theoretical and empirical study of biological systems, with the risk of obscuring all the features that make biological systems interesting kinds of self-organizing systems.<ref>{{cite journal | doi=10.1007/s10539-021-09818-x | title=Non-equilibrium thermodynamics and the free energy principle in biology | year=2021 | last1=Colombo | first1=Matteo | last2=Palacios | first2=Patricia | journal=Biology & Philosophy | volume=36 | issue=5 | s2cid=235803361 | doi-access=free }}</ref>
=== Free energy minimisation and Bayesian inference ===
All Bayesian inference can be cast in terms of free energy minimisation<ref>{{cite journal |url=http://authors.library.caltech.edu/13697/1/ROWnc99.pdf |doi=10.1162/089976699300016674|title=A Unifying Review of Linear Gaussian Models |year=1999 |last1=Roweis |first1=Sam |last2=Ghahramani |first2=Zoubin |journal=Neural Computation |volume=11 |issue=2 |pages=305–345 |pmid=9950734 |s2cid=2590898 }}</ref>{{Failed verification|date=April 2020}}. When free energy is minimised with respect to internal states, the [[Kullback–Leibler divergence]] between the variational and posterior density over hidden states is minimised. This corresponds to approximate [[Bayesian inference]] – when the form of the variational density is fixed – and exact Bayesian inference otherwise. Free energy minimisation therefore provides a generic description of Bayesian inference and filtering (e.g., [[Kalman filter]]ing). It is also used in Bayesian [[model selection]], where free energy can be usefully decomposed into complexity and accuracy:
: <math> \underset{\text{free-energy}} {\underbrace{ F(s,\mu)}} = \underset{\text{complexity}} {\underbrace{ D_\mathrm{KL}[q(\psi\mid\mu)\parallel p(\psi\mid m)]}} - \underset{\mathrm{accuracy}} {\underbrace{E_q[\log p(s\mid\psi,m)]}}</math>
Models with minimum free energy provide an accurate explanation of data, under complexity costs (cf. [[Occam's razor]] and more formal treatments of computational costs<ref>{{cite journal | url=http://rspa.royalsocietypublishing.org/content/469/2153/20120683 | doi=10.1098/rspa.2012.0683 | title=Thermodynamics as a theory of decision-making with information-processing costs | year=2013 | last1=Ortega | first1=Pedro A. | last2=Braun | first2=Daniel A. | journal=Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences | volume=469 | issue=2153 | arxiv=1204.6481 | bibcode=2013RSPSA.46920683O | s2cid=28080508 }}</ref>). Here, complexity is the divergence between the variational density and prior beliefs about hidden states (i.e., the effective degrees of freedom used to explain the data).
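This decomposition can be checked numerically in a few lines (a sketch reusing a discrete toy model of the kind shown earlier; the densities are invented for illustration):

<syntaxhighlight lang="python">
import numpy as np

prior = np.array([0.7, 0.3])    # p(psi | m)
lik = np.array([0.1, 0.8])      # p(s | psi, m) at the observed s
q = np.array([0.4, 0.6])        # variational density q(psi | mu)

complexity = np.sum(q * np.log(q / prior))   # KL[q || prior]
accuracy = np.sum(q * np.log(lik))           # E_q[log p(s | psi, m)]

# The same free energy in its energy-minus-entropy form:
energy = np.sum(q * -np.log(prior * lik))    # E_q[-log p(s, psi | m)]
entropy = -np.sum(q * np.log(q))             # H[q]

print(np.isclose(complexity - accuracy, energy - entropy))   # True
</syntaxhighlight>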
=== Free energy minimisation and thermodynamics ===
Variational free energy is an information-theoretic functional and is distinct from thermodynamic (Helmholtz) [[Helmholtz free energy|free energy]].<ref>{{cite journal |url=http://rscweb.anu.edu.au/~evans/papers/NEFET.pdf |doi=10.1080/0026897031000085173|title=A non-equilibrium free energy theorem for deterministic systems |year=2003 |last1=Evans |first1=Denis J. |journal=Molecular Physics |volume=101 |issue=10 |pages=1551–1554 |bibcode=2003MolPh.101.1551E |s2cid=15129000 }}</ref> However, the complexity term of variational free energy shares the same fixed point as Helmholtz free energy (under the assumption the system is thermodynamically closed but not isolated). This is because if sensory perturbations are suspended (for a suitably long period of time), complexity is minimised (because accuracy can be neglected). At this point, the system is at equilibrium and internal states minimise Helmholtz free energy, by the [[principle of minimum energy]].<ref>{{cite journal | arxiv=cond-mat/9610209 | doi=10.1103/PhysRevLett.78.2690 | title=Nonequilibrium Equality for Free Energy Differences | year=1997 | last1=Jarzynski | first1=C. | journal=Physical Review Letters | volume=78 | issue=14 | pages=2690–2693 | bibcode=1997PhRvL..78.2690J | s2cid=16112025 }}</ref>
=== Free energy minimisation and information theory ===
Free energy minimisation is equivalent to maximising the [[mutual information]] between sensory states and internal states that parameterise the variational density (for a fixed-entropy variational density). This relates free energy minimisation to the principle of minimum redundancy.<ref name="Towards a Geometry and Analysis for" /><ref name="On Bayesian mechanics: A physics of"/>
== Free energy minimisation in neuroscience ==
Free energy minimisation provides a useful way to formulate normative (Bayes optimal) models of neuronal inference and learning under uncertainty<ref>{{cite journal |url=http://www.fil.ion.ucl.ac.uk/~karl/The%20free-energy%20principle%20A%20unified%20brain%20theory.pdf |doi=10.1038/nrn2787|title=The free-energy principle: A unified brain theory? |year=2010 |last1=Friston |first1=Karl |journal=Nature Reviews Neuroscience |volume=11 |issue=2 |pages=127–138 |pmid=20068583 |s2cid=5053247 }}</ref> and therefore subscribes to the [[Bayesian brain]] hypothesis.<ref>{{cite journal |doi=10.1016/j.tins.2004.10.007 |url=http://mrl.isr.uc.pt/pub/bscw.cgi/d27540/ReviewKnillPouget2.pdf |title=The Bayesian brain: The role of uncertainty in neural coding and computation |year=2004 |last1=Knill |first1=David C. |last2=Pouget |first2=Alexandre |journal=Trends in Neurosciences |volume=27 |issue=12 |pages=712–719 |pmid=15541511 |s2cid=9870936 |access-date=2013-05-31 |archive-date=2016-03-04 |archive-url=https://web.archive.org/web/20160304044221/http://mrl.isr.uc.pt/pub/bscw.cgi/d27540/ReviewKnillPouget2.pdf |url-status=dead }}</ref> The neuronal processes described by free energy minimisation depend on the nature of hidden states: <math> \Psi = X \times \Theta \times \Pi </math> that can comprise time-dependent variables, time-invariant parameters and the precision (inverse variance or temperature) of random fluctuations. Minimising variables, parameters, and precision corresponds to inference, learning, and the encoding of uncertainty, respectively.
=== Perceptual inference and categorisation ===
Free energy minimisation formalises the notion of [[unconscious inference]] in perception<ref name="Helmholtz" /><ref name="Dayan" /> and provides a normative (Bayesian) theory of neuronal processing. The associated process theory of neuronal dynamics is based on minimising free energy through gradient descent. This corresponds to [[Generalized filtering|generalised Bayesian filtering]] (where ~ denotes a variable in generalised coordinates of motion and <math>D</math> is a derivative matrix operator):<ref>{{cite journal | doi=10.1155/2010/621670 | doi-access=free | title=Generalised Filtering | year=2010 | last1=Friston | first1=Karl | last2=Stephan | first2=Klaas | last3=Li | first3=Baojuan | last4=Daunizeau | first4=Jean | journal=Mathematical Problems in Engineering | volume=2010 | pages=1–34 }}</ref>
: <math>\dot{\tilde{\mu}} = D \tilde{\mu} - \partial_{\mu}F(s,\mu)\Big|_{\mu = \tilde{\mu}}</math>
Usually, the generative models that define free energy are non-linear and hierarchical (like cortical hierarchies in the brain). Special cases of generalised filtering include [[Kalman filter]]ing, which is formally equivalent to [[predictive coding]]<ref>{{cite journal |doi=10.1016/j.tins.2004.10.007 |url=https://www.cs.utexas.edu/users/dana/nn.pdf|title=The Bayesian brain: The role of uncertainty in neural coding and computation |year=2004 |last1=Knill |first1=David C. |last2=Pouget |first2=Alexandre |journal=Trends in Neurosciences |volume=27 |issue=12 |pages=712–719 |pmid=15541511 |s2cid=9870936 }}</ref> – a popular metaphor for message passing in the brain. Under hierarchical models, predictive coding involves the recurrent exchange of ascending (bottom-up) prediction errors and descending (top-down) predictions<ref name="Mumford">{{cite journal |doi=10.1007/BF00198477 |url=http://cs.brown.edu/people/tld/projects/cortex/course/suggested_reading_list/supplements/documents/MumfordBC-92.pdf|title=On the computational architecture of the neocortex |year=1992 |last1=Mumford |first1=D. |journal=Biological Cybernetics |volume=66 |issue=3 |pages=241–251 |pmid=1540675 |s2cid=14303625 }}</ref> that is consistent with the anatomy and physiology of sensory<ref>{{cite journal | doi=10.1016/j.neuron.2012.10.038 | title=Canonical Microcircuits for Predictive Coding | year=2012 | last1=Bastos | first1=Andre M. | last2=Usrey | first2=W. Martin | last3=Adams | first3=Rick A. | last4=Mangun | first4=George R. | last5=Fries | first5=Pascal | last6=Friston | first6=Karl J. | journal=Neuron | volume=76 | issue=4 | pages=695–711 | pmid=23177956 | pmc=3777738 }}</ref> and motor systems.<ref>{{cite journal | doi=10.1007/s00429-012-0475-5 | title=Predictions not commands: Active inference in the motor system | year=2013 | last1=Adams | first1=Rick A. | last2=Shipp | first2=Stewart | last3=Friston | first3=Karl J. | journal=Brain Structure and Function | volume=218 | issue=3 | pages=611–643 | pmid=23129312 | pmc=3637647 }}</ref>
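The following fragment sketches this message-passing scheme for a single linear level (the weights, precisions, and learning rate are illustrative assumptions, not a model of any particular cortical circuit). An ascending prediction error and a descending prediction jointly drive a gradient descent of the state estimate on free energy:

<syntaxhighlight lang="python">
import numpy as np

W = np.array([[1.0, 0.5],       # generative weights: prediction = W @ mu
              [0.2, 1.0]])
mu_prior = np.zeros(2)          # descending prediction from the level above
pi_s, pi_p = 4.0, 1.0           # precisions of sensory and prior errors

s = np.array([1.0, 0.5])        # sensory input
mu = mu_prior.copy()            # state estimate at this level

for _ in range(200):
    eps_s = s - W @ mu          # ascending (bottom-up) prediction error
    eps_p = mu - mu_prior       # error against the descending (top-down) prediction
    # Gradient descent on free energy: precision-weighted errors drive the update.
    mu += 0.05 * (pi_s * W.T @ eps_s - pi_p * eps_p)

print(mu)   # converges to the MAP estimate balancing data against the prior
</syntaxhighlight>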
=== Perceptual learning and memory ===
=== Perceptual precision, attention and salience ===
Optimizing the precision parameters corresponds to optimizing the gain of prediction errors (cf. Kalman gain). In neuronally plausible implementations of predictive coding,<ref name="Mumford"/> this corresponds to optimizing the excitability of superficial pyramidal cells and has been interpreted in terms of attentional gain.<ref name="Feldman">{{cite journal | doi=10.3389/fnhum.2010.00215 | doi-access=free | title=Attention, Uncertainty, and Free-Energy | year=2010 | last1=Friston | first1=Karl J. | last2=Feldman | first2=Harriet | journal=Frontiers in Human Neuroscience | volume=4 | page=215 | pmid=21160551 | pmc=3001758 }}</ref>
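Precision optimisation can itself be written as gradient descent on free energy. For a single Gaussian prediction-error stream, the free energy contribution of a precision <math>\pi</math> is <math>\tfrac{1}{2}(\pi \varepsilon^2 - \log \pi)</math> per error, whose minimum is the inverse of the average squared error. A sketch (the error statistics are simulated, not empirical):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)
errors = 0.5 * rng.standard_normal(10_000)   # prediction errors with variance 0.25

log_pi = 0.0                    # optimise log-precision so pi stays positive
for _ in range(500):
    pi = np.exp(log_pi)
    dF_dpi = 0.5 * np.mean(errors ** 2) - 0.5 / pi
    log_pi -= 0.1 * dF_dpi * pi       # chain rule through pi = exp(log_pi)

print(np.exp(log_pi), 1 / np.var(errors))    # both approximately 4
</syntaxhighlight>

The learned precision acts as a gain on the corresponding prediction-error units, which is the sense in which attention is interpreted as precision weighting.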
[[File:PESAIM.jpg|thumb|Simulation of the results achieved from a selective attention task carried out by the Bayesian reformulation of the SAIM entitled PE-SAIM in a multiple-objects environment. The graphs show the time course of the activation for the FOA and the two template units in the Knowledge Network.]]
With regard to the top-down vs. bottom-up controversy, which has been addressed as a major open problem of attention, a computational model has succeeded in illustrating the circular nature of the interplay between top-down and bottom-up mechanisms. Using an established emergent model of attention, namely SAIM, the authors proposed a model called PE-SAIM, which, in contrast to the standard version, approaches selective attention from a top-down position. The model takes into account the transmission of prediction errors to the same level or a level above, in order to minimise the energy function that indicates the difference between the data and its cause, or, in other words, between the generative model and the posterior. To increase validity, they also incorporated neural competition between stimuli into their model. A notable feature of this model is the reformulation of the free energy function only in terms of prediction errors during task performance:

<math>\dfrac{\partial E^{total}(Y^{VP},X^{SN},x^{CN},y^{KN})}{\partial y^{SN}_{mn}}=x^{CN}_{mn}-b^{CN}\varepsilon^{CN}_{nm}+b^{CN}\sum_{k}(\varepsilon^{KN}_{knm})</math>
where <math>E^{total}</math> is the total [[energy function]] that the neural networks entail, and <math>\varepsilon^{KN}_{knm}</math> is the prediction error between the generative model (prior) and the posterior, changing over time.<ref name="Abadi">{{cite journal | doi=10.1098/rsif.2018.0344 | title=Excitatory versus inhibitory feedback in Bayesian formulations of scene construction | year=2019 | last1=Abadi | first1=Alireza Khatoon | last2=Yahya | first2=Keyvan | last3=Amini | first3=Massoud | last4=Friston | first4=Karl | last5=Heinke | first5=Dietmar | journal=Journal of the Royal Society Interface | volume=16 | issue=154 | pmid=31039693 | pmc=6544897 }}</ref>
Comparing the two models reveals a notable similarity between their respective results while also highlighting a remarkable discrepancy, whereby – in the standard version of the SAIM – the model's focus is mainly upon the excitatory connections, whereas in the PE-SAIM, the inhibitory connections are leveraged to make an inference. The model has also proved to be fit to predict the EEG and fMRI data drawn from human experiments with high precision. In the same vein, Yahya et al. also applied the free energy principle to propose a computational model for template matching in covert selective visual attention that mostly relies on SAIM.<ref name="Yahya"> |
Comparing the two models reveals a notable similarity between their respective results while also highlighting a remarkable discrepancy, whereby – in the standard version of the SAIM – the model's focus is mainly upon the excitatory connections, whereas in the PE-SAIM, the inhibitory connections are leveraged to make an inference. The model has also proved to be fit to predict the EEG and fMRI data drawn from human experiments with high precision. In the same vein, Yahya et al. also applied the free energy principle to propose a computational model for template matching in covert selective visual attention that mostly relies on SAIM.<ref name="Yahya">{{cite journal | doi=10.1007/s10339-013-0597-6 | title=12th Biannual Conference of the German Cognitive Science Society (KogWis 2014) | journal=Cognitive Processing | year=2014 | volume=15 | page=107 | s2cid=10121398 | doi-access=free }}</ref> According to this study, the total free energy of the whole state-space is reached by inserting top-down signals in the original neural networks, whereby we derive a dynamical system comprising both feed-forward and backward prediction error. |
== Active inference == |
When gradient descent is applied to action <math> \dot{a} = -\partial_aF(s,\tilde{\mu}) </math>, motor control can be understood in terms of classical reflex arcs that are engaged by descending (corticospinal) predictions. This provides a formalism that generalizes the equilibrium point solution (to the [[degrees of freedom problem]]<ref>{{cite journal |doi=10.1017/S0140525X0004070X |url=http://e.guigon.free.fr/rsc/article/FeldmanLevin95.pdf |title=The origin and use of positional frames of reference in motor control |year=1995 |last1=Feldman |first1=Anatol G. |last2=Levin |first2=Mindy F. |journal=Behavioral and Brain Sciences |volume=18 |issue=4 |pages=723–744 |s2cid=145164477 |access-date=2013-05-31 |archive-date=2014-03-29 |archive-url=https://web.archive.org/web/20140329220749/http://e.guigon.free.fr/rsc/article/FeldmanLevin95.pdf |url-status=dead }}</ref>) to movement trajectories.
=== Active inference and optimal control === |
Active inference is related to [[optimal control]] by replacing value or cost-to-go functions with prior beliefs about state transitions or flow.<ref>{{cite journal |doi=10.1016/j.neuron.2011.10.018 |url=http://www.fil.ion.ucl.ac.uk/~karl/What%20Is%20Optimal%20about%20Motor%20Control.pdf|title=What is Optimal about Motor Control? |year=2011 |last1=Friston |first1=Karl |journal=Neuron |volume=72 |issue=3 |pages=488–498 |pmid=22078508 |s2cid=13912462 }}</ref> This exploits the close connection between Bayesian filtering and the solution to the [[Bellman equation]]. However, active inference starts with (priors over) flow <math> f = \Gamma \cdot \nabla V + \nabla \times W </math> that are specified with scalar <math> V(x) </math> and vector <math> W(x) </math> value functions of state space (c.f., the [[Helmholtz decomposition]]). Here, <math> \Gamma </math> is the amplitude of random fluctuations and cost is <math> c(x) = f \cdot \nabla V + \nabla \cdot \Gamma \cdot V</math>. The priors over flow <math> p(\tilde{x}\mid m) </math> induce a prior over states <math> p(x\mid m) = \exp (V(x)) </math> that is the solution to the appropriate forward [[Kolmogorov equations]].<ref>{{cite journal | doi=10.1155/2012/937860 | doi-access=free | title=Free Energy, Value, and Attractors | year=2012 | last1=Friston | first1=Karl | last2=Ao | first2=Ping | journal=Computational and Mathematical Methods in Medicine | volume=2012 | pages=1–27 | pmid=22229042 | pmc=3249597 }}</ref> In contrast, optimal control optimises the flow, given a cost function, under the assumption that <math> W = 0 </math> (i.e., the flow is curl free or has detailed balance). Usually, this entails solving backward [[Kolmogorov equations]].<ref>{{cite journal | arxiv=physics/0505066 | doi=10.1088/1742-5468/2005/11/P11011 | title=Path integrals and symmetry breaking for optimal control theory | year=2005 | last1=Kappen | first1=H. J. | journal=Journal of Statistical Mechanics: Theory and Experiment | volume=2005 | issue=11 | pages=P11011 | bibcode=2005JSMTE..11..011K | s2cid=87027 }}</ref> |
=== Active inference and optimal decision (game) theory === |
[[Optimal decision]] problems (usually formulated as [[partially observable Markov decision process]]es) are treated within active inference by absorbing [[Utility| utility functions]] into prior beliefs. In this setting, states that have a high utility (low cost) are states an agent expects to occupy. By equipping the generative model with hidden states that model control, policies (control sequences) that minimise variational free energy lead to high utility states.<ref>{{cite journal |doi=10.1007/s00422-012-0512-8 |doi-access=free|title=Active inference and agency: Optimal control without cost functions |year=2012 |last1=Friston |first1=Karl |last2=Samothrakis |first2=Spyridon |last3=Montague |first3=Read |journal=Biological Cybernetics |volume=106 |issue=8–9 |pages=523–541 |pmid=22864468 |hdl=10919/78836 |hdl-access=free }}</ref> |
Neurobiologically, neuromodulators such as [[dopamine]] are considered to report the precision of prediction errors by modulating the gain of principal cells encoding prediction error.<ref name="Friston_a">{{cite journal | doi=10.1371/journal.pcbi.1002327 | title=Dopamine, Affordance and Active Inference | year=2012 | last1=Friston | first1=Karl J. | last2=Shiner | first2=Tamara | last3=Fitzgerald | first3=Thomas | last4=Galea | first4=Joseph M. | last5=Adams | first5=Rick | last6=Brown | first6=Harriet | last7=Dolan | first7=Raymond J. | last8=Moran | first8=Rosalyn|author8-link=Rosalyn Moran | last9=Stephan | first9=Klaas Enno | last10=Bestmann | first10=Sven | journal=PLOS Computational Biology | volume=8 | issue=1 | pages=e1002327 | pmid=22241972 | pmc=3252266 | bibcode=2012PLSCB...8E2327F | doi-access=free }}</ref> This is closely related to – but formally distinct from – the role of dopamine in reporting prediction errors ''per se''<ref>{{cite journal |doi=10.1126/science.1077349 |url=http://e.guigon.free.fr/rsc/article/FiorilloEtAl03.pdf |title=Discrete Coding of Reward Probability and Uncertainty by Dopamine Neurons |year=2003 |last1=Fiorillo |first1=Christopher D. |last2=Tobler |first2=Philippe N. |last3=Schultz |first3=Wolfram |journal=Science |volume=299 |issue=5614 |pages=1898–1902 |pmid=12649484 |bibcode=2003Sci...299.1898F |s2cid=2363255 |access-date=2013-05-31 |archive-date=2016-03-04 |archive-url=https://web.archive.org/web/20160304045504/http://e.guigon.free.fr/rsc/article/FiorilloEtAl03.pdf |url-status=dead }}</ref> and related computational accounts.<ref>{{cite journal |doi=10.1162/0898929052880093 |url=http://ski.cog.brown.edu/papers/Frank_JOCN.pdf|title=Dynamic Dopamine Modulation in the Basal Ganglia: A Neurocomputational Account of Cognitive Deficits in Medicated and Nonmedicated Parkinsonism |year=2005 |last1=Frank |first1=Michael J. |journal=Journal of Cognitive Neuroscience |volume=17 |issue=1 |pages=51–72 |pmid=15701239 |s2cid=7414727 }}</ref> |
=== Active inference and cognitive neuroscience === |
Active inference has been used to address a range of issues in [[cognitive neuroscience]], brain function and neuropsychiatry, including action observation,<ref>{{cite journal |doi=10.1007/s00422-011-0424-z |url=http://www.fil.ion.ucl.ac.uk/~karl/Action%20understanding%20and%20active%20inference.pdf|title=Action understanding and active inference |year=2011 |last1=Friston |first1=Karl |last2=Mattout |first2=Jérémie |last3=Kilner |first3=James |journal=Biological Cybernetics |volume=104 |issue=1–2 |pages=137–160 |pmid=21327826 |pmc=3491875 }}</ref> mirror neurons,<ref>{{cite journal |doi=10.1007/s10339-007-0170-2 |url=http://www.fil.ion.ucl.ac.uk/~karl/Predictive%20coding%20an%20account%20of%20the%20mirror%20neuron%20system.pdf|title=Predictive coding: An account of the mirror neuron system |year=2007 |last1=Kilner |first1=James M. |last2=Friston |first2=Karl J. |last3=Frith |first3=Chris D. |journal=Cognitive Processing |volume=8 |issue=3 |pages=159–166 |pmid=17429704 |pmc=2649419 }}</ref> saccades and visual search,<ref>{{cite journal | doi=10.3389/fpsyg.2012.00151 | doi-access=free | title=Perceptions as Hypotheses: Saccades as Experiments | year=2012 | last1=Friston | first1=Karl | last2=Adams | first2=Rick A. | last3=Perrinet | first3=Laurent | last4=Breakspear | first4=Michael | journal=Frontiers in Psychology | volume=3 | page=151 | pmid=22654776 | pmc=3361132 }}</ref><ref>{{cite journal | doi=10.1371/journal.pone.0190429 | doi-access=free | title=Human visual exploration reduces uncertainty about the sensed world | year=2018 | last1=Mirza | first1=M. Berk | last2=Adams | first2=Rick A. | last3=Mathys | first3=Christoph | last4=Friston | first4=Karl J. | journal=PLOS ONE | volume=13 | issue=1 | pages=e0190429 | pmid=29304087 | pmc=5755757 | bibcode=2018PLoSO..1390429M }}</ref> eye movements,<ref>{{cite journal | doi=10.1007/s00422-014-0620-8 | title=Active inference, eye movements and oculomotor delays | year=2014 | last1=Perrinet | first1=Laurent U. | last2=Adams | first2=Rick A. | last3=Friston | first3=Karl J. | journal=Biological Cybernetics | volume=108 | issue=6 | pages=777–801 | pmid=25128318 | pmc=4250571 }}</ref> sleep,<ref>{{cite journal |doi=10.1016/j.pneurobio.2012.05.003 |doi-access=free|title=Waking and dreaming consciousness: Neurobiological and functional considerations |year=2012 |last1=Hobson |first1=J.A. |last2=Friston |first2=K.J. |journal=Progress in Neurobiology |volume=98 |issue=1 |pages=82–98 |pmid=22609044 |pmc=3389346 }}</ref> illusions,<ref>{{cite journal | doi=10.3389/fpsyg.2012.00043 | doi-access=free | title=Free-Energy and Illusions: The Cornsweet Effect | year=2012 | last1=Brown | first1=Harriet | last2=Friston | first2=Karl J. 
| journal=Frontiers in Psychology | volume=3 | page=43 | pmid=22393327 | pmc=3289982 }}</ref> attention,<ref name="Feldman" /> action selection,<ref name="Friston_a" /> consciousness,<ref>{{Cite journal|last1=Rudrauf|first1=David|last2=Bennequin|first2=Daniel|last3=Granic|first3=Isabela|last4=Landini|first4=Gregory|last5=Friston|first5=Karl|last6=Williford|first6=Kenneth|date=2017-09-07|title=A mathematical model of embodied consciousness|journal=Journal of Theoretical Biology|volume=428|pages=106–131|doi=10.1016/j.jtbi.2017.05.032|pmid=28554611|bibcode=2017JThBi.428..106R |url=http://discovery.ucl.ac.uk/10057795/1/DR_et_al_A_math_model_of_embodied_consciousness_JTBiol_final_revision_for_submission.pdf}}</ref><ref>{{Cite journal|title=The Projective Consciousness Model and Phenomenal Selfhood|journal = Frontiers in Psychology|volume = 9|pages = 2571|last1=K|first1=Williford|last2=D|first2=Bennequin|date=2018-12-17|language=en|pmid=30618988|pmc = 6304424|last3=K|first3=Friston|last4=D|first4=Rudrauf|doi = 10.3389/fpsyg.2018.02571|doi-access = free}}</ref> hysteria<ref>{{cite journal |doi=10.1093/brain/aws129 |url=http://www.fil.ion.ucl.ac.uk/~karl/A%20Bayesian%20account%20of%20hysteria.pdf|title=A Bayesian account of 'hysteria' |year=2012 |last1=Edwards |first1=M. J. |last2=Adams |first2=R. A. |last3=Brown |first3=H. |last4=Parees |first4=I. |last5=Friston |first5=K. J. |journal=Brain |volume=135 |issue=11 |pages=3495–3512 |pmid=22641838 |pmc=3501967 }}</ref> and psychosis.<ref>{{cite journal | doi=10.1371/journal.pone.0047502 | doi-access=free | title=Smooth Pursuit and Visual Occlusion: Active Inference and Oculomotor Control in Schizophrenia | year=2012 | last1=Adams | first1=Rick A. | last2=Perrinet | first2=Laurent U. | last3=Friston | first3=Karl | journal=PLOS ONE | volume=7 | issue=10 | pages=e47502 | pmid=23110076 | pmc=3482214 | bibcode=2012PLoSO...747502A }}</ref> Explanations of action in active inference often depend on the idea that the brain has 'stubborn predictions' that it cannot update, leading to actions that cause these predictions to come true.<ref>{{Cite journal|last1=Yon|first1=Daniel|last2=Lange|first2=Floris P. de|last3=Press|first3=Clare|date=2019-01-01|title=The Predictive Brain as a Stubborn Scientist|url=https://www.cell.com/trends/cognitive-sciences/abstract/S1364-6613(18)30239-0|journal=Trends in Cognitive Sciences|language=en|volume=23|issue=1|pages=6–8|doi=10.1016/j.tics.2018.10.003|pmid=30429054|s2cid=53280000}}</ref> |
== See also == |
* {{annotated link|Autopoiesis}} |
* {{annotated link|Bayesian approaches to brain function}} |
* [[Adrian Bejan|Constructal law]] – Law of design evolution in nature, animate and inanimate
* {{annotated link|Decision theory}} |
* {{annotated link|Embodied cognition}} |
* {{annotated link|Entropic force}} |
* {{annotated link|Principle of minimum energy}} |
* {{annotated link|Info-metrics}} |
* {{annotated link|Optimal control}} |
* {{annotated link|Adaptive system}} |
* {{annotated link|Predictive coding}} |
* {{annotated link|Self-organization}} |
* {{annotated link|Surprisal}} |
* {{annotated link|Synergetics (Haken)}} |
* {{annotated link|Variational Bayesian methods}} |
== References == |
[[Category:Computational neuroscience]] |
[[Category:Mathematical and theoretical biology]] |
[[Category:Theoretical accounts of general intelligence]] |
Latest revision as of 13:24, 22 December 2024
The free energy principle is a theoretical framework suggesting that the brain reduces surprise or uncertainty by making predictions based on internal models and updating them using sensory input. It highlights the brain's objective of aligning its internal model with the external world to enhance prediction accuracy. This principle integrates Bayesian inference with active inference, where actions are guided by predictions and sensory feedback refines them. It has wide-ranging implications for comprehending brain function, perception, and action.[1]
Overview
In biophysics and cognitive science, the free energy principle is a mathematical principle describing a formal account of the representational capacities of physical systems: that is, why things that exist look as if they track properties of the systems to which they are coupled.[2]
It establishes that the dynamics of physical systems minimise a quantity known as surprisal (which is the negative log probability of some outcome); or equivalently, its variational upper bound, called free energy. The principle is used especially in Bayesian approaches to brain function, but also some approaches to artificial intelligence; it is formally related to variational Bayesian methods and was originally introduced by Karl Friston as an explanation for embodied perception-action loops in neuroscience.[3]
The free energy principle models the behaviour of systems that are distinct from, but coupled to, another system (e.g., an embedding environment), where the degrees of freedom that implement the interface between the two systems are known as a Markov blanket. More formally, the free energy principle says that if a system has a "particular partition" (i.e., into particles, with their Markov blankets), then subsets of that system will track the statistical structure of other subsets (which are known as internal and external states or paths of a system).
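The conditional-independence structure that a Markov blanket induces can be checked numerically. The following sketch is illustrative only (the three-variable chain and all probabilities are assumptions, not taken from the cited literature): once the joint density factorises through a blanket state, the internal state carries no information about the external state beyond what the blanket already provides.
<syntaxhighlight lang="python">
import numpy as np

# Toy "particular partition": external state psi -> blanket state b -> internal state mu.
# If the joint factorises as p(psi, b, mu) = p(psi) p(b | psi) p(mu | b),
# then mu is conditionally independent of psi given the blanket b.
p_psi = np.array([0.3, 0.7])            # p(psi)
p_b_given_psi = np.array([[0.9, 0.1],   # p(b | psi); rows index psi
                          [0.2, 0.8]])
p_mu_given_b = np.array([[0.6, 0.4],    # p(mu | b); rows index b
                         [0.1, 0.9]])

# Joint density p(psi, b, mu), indexed [psi, b, mu]
joint = (p_psi[:, None, None]
         * p_b_given_psi[:, :, None]
         * p_mu_given_b[None, :, :])

# p(mu | b, psi) is the same for every psi, i.e. equal to p(mu | b)
for psi in range(2):
    for b in range(2):
        cond = joint[psi, b, :] / joint[psi, b, :].sum()
        print(f"p(mu | psi={psi}, b={b}) = {cond}")
</syntaxhighlight>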
The free energy principle is based on the Bayesian idea of the brain as an “inference engine.” Under the free energy principle, systems pursue paths of least surprise or, equivalently, minimize the difference between predictions based on their model of the world and their sensations and associated perceptions. This difference is quantified by variational free energy and is minimized by continuous correction of the system's world model, or by making the world more like the system's predictions. By actively changing the world to make it closer to the expected state, systems can also minimize their free energy. Friston assumes this to be the principle of all biological reaction.[4] Friston also believes his principle applies to mental disorders as well as to artificial intelligence. AI implementations based on the active inference principle have shown advantages over other methods.[4]
The free energy principle is a mathematical principle of information physics: much like the principle of maximum entropy or the principle of least action, it is true on mathematical grounds. To attempt to falsify the free energy principle is a category mistake, akin to trying to falsify calculus by making empirical observations. (One cannot invalidate a mathematical theory in this way; instead, one would need to derive a formal contradiction from the theory.) In a 2018 interview, Friston explained what it entails for the free energy principle to not be subject to falsification: "I think it is useful to make a fundamental distinction at this point—that we can appeal to later. The distinction is between a state and process theory; i.e., the difference between a normative principle that things may or may not conform to, and a process theory or hypothesis about how that principle is realized. Under this distinction, the free energy principle stands in stark distinction to things like predictive coding and the Bayesian brain hypothesis. This is because the free energy principle is what it is — a principle. Like Hamilton's principle of stationary action, it cannot be falsified. It cannot be disproven. In fact, there’s not much you can do with it, unless you ask whether measurable systems conform to the principle. On the other hand, hypotheses that the brain performs some form of Bayesian inference or predictive coding are what they are—hypotheses. These hypotheses may or may not be supported by empirical evidence."[5] There are many examples of these hypotheses being supported by empirical evidence.[6]
Background
The notion that self-organising biological systems – like a cell or brain – can be understood as minimising variational free energy is based upon Helmholtz’s work on unconscious inference[7] and subsequent treatments in psychology[8] and machine learning.[9] Variational free energy is a function of observations and a probability density over their hidden causes. This variational density is defined in relation to a probabilistic model that generates predicted observations from hypothesized causes. In this setting, free energy provides an approximation to Bayesian model evidence.[10] Therefore, its minimisation can be seen as a Bayesian inference process. When a system actively makes observations to minimise free energy, it implicitly performs active inference and maximises the evidence for its model of the world.
However, free energy is also an upper bound on the self-information of outcomes, where the long-term average of surprise is entropy. This means that if a system acts to minimise free energy, it will implicitly place an upper bound on the entropy of the outcomes – or sensory states – it samples.[11][12]
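This bound can be verified directly in a toy discrete setting. In the sketch below, all numbers are illustrative assumptions; the free energy of every candidate density <math>q</math> is at least the surprise <math>-\log p(s)</math>, with equality exactly when <math>q</math> is the posterior.
<syntaxhighlight lang="python">
import numpy as np

# Two hidden causes psi, one observed outcome s.
p_psi = np.array([0.5, 0.5])          # prior over hidden causes
p_s_given_psi = np.array([0.9, 0.3])  # likelihood of the observed s under each cause

p_joint = p_psi * p_s_given_psi       # p(s, psi)
evidence = p_joint.sum()              # p(s), the model evidence
surprise = -np.log(evidence)          # self-information of the outcome

def free_energy(q):
    # F = E_q[ln q(psi) - ln p(s, psi)] = KL[q || p(psi | s)] + surprise
    return np.sum(q * (np.log(q) - np.log(p_joint)))

posterior = p_joint / evidence
for q in [np.array([0.5, 0.5]), np.array([0.9, 0.1]), posterior]:
    print(f"F = {free_energy(q):.4f} >= surprise = {surprise:.4f}")
# The last iteration prints equality: F matches the surprise at the exact posterior.
</syntaxhighlight>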
Relationship to other theories
Active inference is closely related to the good regulator theorem[13] and related accounts of self-organisation,[14][15] such as self-assembly, pattern formation, autopoiesis[16] and practopoiesis.[17] It addresses the themes considered in cybernetics, synergetics[18] and embodied cognition. Because free energy can be expressed as the expected energy of observations under the variational density minus its entropy, it is also related to the maximum entropy principle.[19] Finally, because the time average of energy is action, the principle of minimum variational free energy is a principle of least action. Active inference allowing for scale invariance has also been applied to other theories and domains. For instance, it has been applied to sociology,[20][21][22][23] linguistics and communication,[24][25][26] semiotics,[27][28] and epidemiology,[29] among others.
Negative free energy is formally equivalent to the evidence lower bound, which is commonly used in machine learning to train generative models, such as variational autoencoders.
Action and perception
Active inference applies the techniques of approximate Bayesian inference to infer the causes of sensory data from a 'generative' model of how that data is caused and then uses these inferences to guide action. Bayes' rule characterizes the probabilistically optimal inversion of such a causal model, but applying it is typically computationally intractable, leading to the use of approximate methods. In active inference, the leading class of such approximate methods are variational methods, for both practical and theoretical reasons: practical, as they often lead to simple inference procedures; and theoretical, because they are related to fundamental physical principles, as discussed above.
These variational methods proceed by minimizing an upper bound on the divergence between the Bayes-optimal inference (or 'posterior') and its approximation according to the method. This upper bound is known as the free energy, and we can accordingly characterize perception as the minimization of the free energy with respect to inbound sensory information, and action as the minimization of the same free energy with respect to outbound action information. This holistic dual optimization is characteristic of active inference, and the free energy principle is the hypothesis that all systems which perceive and act can be characterized in this way.
In order to exemplify the mechanics of active inference via the free energy principle, a generative model must be specified, and this typically involves a collection of probability density functions which together characterize the causal model. One such specification is as follows. The system is modelled as inhabiting a state space <math>X</math>, in the sense that its states form the points of this space. The state space is then factorized according to <math>X = \Psi \times S \times A \times R</math>, where <math>\Psi</math> is the space of 'external' states that are 'hidden' from the agent (in the sense of not being directly perceived or accessible), <math>S</math> is the space of sensory states that are directly perceived by the agent, <math>A</math> is the space of the agent's possible actions, and <math>R</math> is a space of 'internal' states that are private to the agent.
In keeping with Figure 1, note that in the following <math>\psi, s, a, \mu</math> are functions of (continuous) time <math>t</math>. The generative model is the specification of the following density functions:
- A sensory model, <math>p_S</math>, often written as <math>p_S(s \mid \psi, a)</math>, characterizing the likelihood of sensory data given external states and actions;
- a stochastic model of the environmental dynamics, <math>p_\Psi</math>, often written <math>p_\Psi(\psi \mid a)</math>, characterizing how the external states are expected by the agent to evolve over time <math>t</math>, given the agent's actions;
- an action model, <math>p_A</math>, written <math>p_A(a \mid \mu, s)</math>, characterizing how the agent's actions depend upon its internal states and sensory data; and
- an internal model, <math>p_R</math>, written <math>p_R(\mu \mid s)</math>, characterizing how the agent's internal states depend upon its sensory data.
These density functions determine the factors of a "joint model", which represents the complete specification of the generative model, and which can be written as
- <math>p(\psi, s, a, \mu) = p_S(s \mid \psi, a) \, p_\Psi(\psi \mid a) \, p_A(a \mid \mu, s) \, p_R(\mu \mid s)</math>.
Bayes' rule then determines the "posterior density" <math>p(\psi \mid s, a, \mu)</math>, which expresses a probabilistically optimal belief about the external state given the preceding state and the agent's actions, sensory signals, and internal states. Since computing <math>p(\psi \mid s, a, \mu)</math> is computationally intractable, the free energy principle asserts the existence of a "variational density" <math>q(\psi \mid \mu)</math>, where <math>q(\psi \mid \mu)</math> is an approximation to <math>p(\psi \mid s, a, \mu)</math>. One then defines the free energy as
<math>F(\mu, a; s) = \operatorname{E}_{q(\psi \mid \mu)}\left[\log q(\psi \mid \mu) - \log p(\psi, s, a, \mu)\right]</math>
and defines action and perception as the joint optimization problem
<math>\mu^{*} = \underset{\mu}{\arg\min}\, F(\mu, a; s)</math>
<math>a^{*} = \underset{a}{\arg\min}\, F(\mu, a; s)</math>
where the internal states <math>\mu</math> are typically taken to encode the parameters of the 'variational' density and hence the agent's "best guess" about the posterior belief over <math>\Psi</math>. Note that the free energy is also an upper bound on a measure of the agent's (marginal, or average) sensory surprise, and hence free energy minimization is often motivated by the minimization of surprise.
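The following is a minimal numerical sketch of this joint optimization (scalar states, a point-mass variational density, unit variances and a noiseless world are all simplifying assumptions, not part of the formulation above). Perception performs gradient descent on the free energy with respect to the internal state; action performs gradient descent through its effect on sensations.
<syntaxhighlight lang="python">
# Assumed generative model: p(s | psi) = N(s; psi, 1), p(psi) = N(psi; prior_mu, 1).
# Under a point-mass q at mu, F(s, mu) = (s - mu)^2 / 2 + (mu - prior_mu)^2 / 2 + const.
prior_mu = 1.0      # the agent "expects" the external state near 1
world_psi = -1.0    # the actual external state
eta = 0.1           # step size for both gradient flows

mu, a = 0.0, 0.0
for _ in range(200):
    s = world_psi + a       # action perturbs the external state (noiseless here)
    dF_dmu = -(s - mu) + (mu - prior_mu)
    dF_da = (s - mu)        # chain rule through s, with ds/da = 1
    mu -= eta * dF_dmu      # perception: update the internal state
    a -= eta * dF_da        # action: change the world
print(f"mu = {mu:.2f}, a = {a:.2f}, s = {world_psi + a:.2f}")
# Both routes minimise F: sensations come to match the prior expectation (s ~ mu ~ 1).
</syntaxhighlight>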
Free energy minimisation
Free energy minimisation and self-organisation
Free energy minimisation has been proposed as a hallmark of self-organising systems when cast as random dynamical systems.[30] This formulation rests on a Markov blanket (comprising action and sensory states) that separates internal and external states. If internal states and action minimise free energy, then they place an upper bound on the entropy of sensory states:
<math>\lim_{T \to \infty} \frac{1}{T} \int_0^T F(s(t), \mu(t)) \, dt \;\ge\; \lim_{T \to \infty} \frac{1}{T} \int_0^T -\log p(s(t) \mid m) \, dt = H[p(s \mid m)]</math>
This is because – under ergodic assumptions – the long-term average of surprise is entropy. This bound resists a natural tendency to disorder – of the sort associated with the second law of thermodynamics and the fluctuation theorem. However, formulating a unifying principle for the life sciences in terms of concepts from statistical physics, such as random dynamical system, non-equilibrium steady state and ergodicity, places substantial constraints on the theoretical and empirical study of biological systems with the risk of obscuring all features that make biological systems interesting kinds of self-organizing systems.[31]
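A quick Monte Carlo illustration of this ergodic claim (with i.i.d. sampling standing in for ergodicity, and an arbitrary outcome distribution):
<syntaxhighlight lang="python">
import numpy as np

# The long-run average of surprisal -ln p(s) converges to the entropy H[p(s)].
rng = np.random.default_rng(1)
p = np.array([0.1, 0.2, 0.7])          # distribution over sensory states
entropy = -np.sum(p * np.log(p))

samples = rng.choice(len(p), size=100_000, p=p)
avg_surprise = np.mean(-np.log(p[samples]))
print(f"average surprise = {avg_surprise:.4f}, entropy = {entropy:.4f}")
</syntaxhighlight>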
Free energy minimisation and Bayesian inference
All Bayesian inference can be cast in terms of free energy minimisation[32][failed verification]. When free energy is minimised with respect to internal states, the Kullback–Leibler divergence between the variational and posterior density over hidden states is minimised. This corresponds to approximate Bayesian inference – when the form of the variational density is fixed – and exact Bayesian inference otherwise. Free energy minimisation therefore provides a generic description of Bayesian inference and filtering (e.g., Kalman filtering). It is also used in Bayesian model selection, where free energy can be usefully decomposed into complexity and accuracy:
<math>\underset{\text{free energy}}{\underbrace{F(s, \mu)}} = \underset{\text{complexity}}{\underbrace{D_{\mathrm{KL}}[q(\psi \mid \mu) \parallel p(\psi \mid m)]}} - \underset{\text{accuracy}}{\underbrace{\operatorname{E}_{q}[\log p(s \mid \psi, m)]}}</math>
Models with minimum free energy provide an accurate explanation of data, under complexity costs (cf. Occam's razor and more formal treatments of computational costs[33]). Here, complexity is the divergence between the variational density and prior beliefs about hidden states (i.e., the effective degrees of freedom used to explain the data).
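The decomposition is easy to confirm numerically. In this sketch all densities are illustrative assumptions; the direct definition of free energy agrees with complexity minus accuracy:
<syntaxhighlight lang="python">
import numpy as np

# Discrete check: E_q[ln q - ln p(s, psi)] = KL[q || p(psi)] - E_q[ln p(s | psi)].
p_psi = np.array([0.4, 0.6])           # prior over hidden states
p_s_given_psi = np.array([0.8, 0.25])  # likelihood of the observed s
q = np.array([0.7, 0.3])               # variational density

F_direct = np.sum(q * (np.log(q) - np.log(p_psi * p_s_given_psi)))
complexity = np.sum(q * (np.log(q) - np.log(p_psi)))  # KL[q || prior]
accuracy = np.sum(q * np.log(p_s_given_psi))          # expected log likelihood
print(np.isclose(F_direct, complexity - accuracy))    # True
</syntaxhighlight>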
Free energy minimisation and thermodynamics
Variational free energy is an information-theoretic functional and is distinct from thermodynamic (Helmholtz) free energy.[34] However, the complexity term of variational free energy shares the same fixed point as Helmholtz free energy (under the assumption the system is thermodynamically closed but not isolated). This is because if sensory perturbations are suspended (for a suitably long period of time), complexity is minimised (because accuracy can be neglected). At this point, the system is at equilibrium and internal states minimise Helmholtz free energy, by the principle of minimum energy.[35]
Free energy minimisation and information theory
Free energy minimisation is equivalent to maximising the mutual information between sensory states and internal states that parameterise the variational density (for a fixed entropy variational density). This relates free energy minimization to the principle of minimum redundancy.[36][12]
Free energy minimisation in neuroscience
Free energy minimisation provides a useful way to formulate normative (Bayes optimal) models of neuronal inference and learning under uncertainty[37] and therefore subscribes to the Bayesian brain hypothesis.[38] The neuronal processes described by free energy minimisation depend on the nature of hidden states, which can comprise time-dependent variables, time-invariant parameters and the precision (inverse variance or temperature) of random fluctuations. Minimising variables, parameters, and precision corresponds to inference, learning, and the encoding of uncertainty, respectively.
Perceptual inference and categorisation
Free energy minimisation formalises the notion of unconscious inference in perception[7][9] and provides a normative (Bayesian) theory of neuronal processing. The associated process theory of neuronal dynamics is based on minimising free energy through gradient descent. This corresponds to generalised Bayesian filtering (where ~ denotes a variable in generalised coordinates of motion and <math>D</math> is a derivative matrix operator):[39]
<math>\dot{\tilde{\mu}} = D \tilde{\mu} - \partial_{\mu} F(s, \mu) \Big|_{\mu = \tilde{\mu}}</math>
Usually, the generative models that define free energy are non-linear and hierarchical (like cortical hierarchies in the brain). Special cases of generalised filtering include Kalman filtering, which is formally equivalent to predictive coding[40] – a popular metaphor for message passing in the brain. Under hierarchical models, predictive coding involves the recurrent exchange of ascending (bottom-up) prediction errors and descending (top-down) predictions[41] that is consistent with the anatomy and physiology of sensory[42] and motor systems.[43]
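The sketch below illustrates this style of message passing in an assumed two-level linear Gaussian hierarchy (not a model from the cited papers): ascending prediction errors and descending predictions are exchanged until the expectations settle on the posterior means.
<syntaxhighlight lang="python">
# Predictive-coding sketch: s = x1 + noise, x1 = x2 + noise, x2 ~ N(prior, 1),
# with all precisions set to 1 for simplicity. Gradient descent on free energy
# updates each level with the balance of bottom-up and top-down errors.
s, prior = 2.0, 0.0
mu1, mu2 = 0.0, 0.0    # expectations at levels 1 and 2
eta = 0.05
for _ in range(2000):
    eps_s = s - mu1    # sensory prediction error (ascending)
    eps_1 = mu1 - mu2  # error between levels 1 and 2
    eps_2 = mu2 - prior
    mu1 += eta * (eps_s - eps_1)
    mu2 += eta * (eps_1 - eps_2)
print(f"mu1 = {mu1:.3f}, mu2 = {mu2:.3f}")  # 4/3 and 2/3: between datum and prior
</syntaxhighlight>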
Perceptual learning and memory
In predictive coding, optimising model parameters through a gradient descent on the time integral of free energy (free action) reduces to associative or Hebbian plasticity and is associated with synaptic plasticity in the brain.
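A sketch of how such a gradient reduces to an associative rule, for an assumed one-parameter linear generative model (the learning rate and noise level are arbitrary): the parameter update is a product of presynaptic activity and postsynaptic prediction error.
<syntaxhighlight lang="python">
import numpy as np

# Learn theta in the assumed model s = theta * x + noise by descending the
# accumulated squared prediction error (free action, with fixed precisions).
rng = np.random.default_rng(2)
theta_true, theta = 1.5, 0.0
eta = 0.01
for _ in range(5000):
    x = rng.normal()                    # presynaptic activity (the cause)
    s = theta_true * x + 0.1 * rng.normal()
    eps = s - theta * x                 # postsynaptic prediction error
    theta += eta * eps * x              # Hebbian-like: error times activity
print(f"learned theta = {theta:.3f} (true value 1.5)")
</syntaxhighlight>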
Perceptual precision, attention and salience
Optimizing the precision parameters corresponds to optimizing the gain of prediction errors (cf. Kalman gain). In neuronally plausible implementations of predictive coding,[41] this corresponds to optimizing the excitability of superficial pyramidal cells and has been interpreted in terms of attentional gain.[44]
With regard to the top-down vs. bottom-up controversy, which has been recognised as a major open problem of attention, a computational model has succeeded in illustrating the circular nature of the interplay between top-down and bottom-up mechanisms. Using an established emergent model of attention, namely SAIM, the authors proposed a model called PE-SAIM, which, in contrast to the standard version, approaches selective attention from a top-down perspective. The model takes into account the transmission of prediction errors to the same level or a level above, in order to minimise the energy function that indicates the difference between the data and its cause or, in other words, between the generative model and the posterior. To increase validity, they also incorporated neural competition between stimuli into their model. A notable feature of this model is the reformulation of the free energy function only in terms of prediction errors during task performance:
<math>\dfrac{\partial E^{total}(Y^{VP},X^{SN},x^{CN},y^{KN})}{\partial y^{SN}_{mn}}=x^{CN}_{mn}-b^{CN}\varepsilon^{CN}_{nm}+b^{CN}\sum_{k}(\varepsilon^{KN}_{knm})</math>
where <math>E^{total}</math> is the total energy function of the neural networks, and <math>\varepsilon^{KN}_{knm}</math> is the prediction error between the generative model (prior) and the posterior, changing over time.[45] Comparing the two models reveals a notable similarity between their respective results while also highlighting a remarkable discrepancy: in the standard version of the SAIM, the model's focus is mainly upon the excitatory connections, whereas in the PE-SAIM, the inhibitory connections are leveraged to make an inference. The model has also proved able to predict the EEG and fMRI data drawn from human experiments with high precision. In the same vein, Yahya et al. also applied the free energy principle to propose a computational model for template matching in covert selective visual attention that mostly relies on SAIM.[46] According to this study, the total free energy of the whole state-space is reached by inserting top-down signals into the original neural networks, whereby we derive a dynamical system comprising both feed-forward and backward prediction errors.
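The quoted gradient can be transcribed directly as array arithmetic. In the sketch below the array shapes and numerical values are assumptions for illustration; it does not reproduce the published PE-SAIM implementation.
<syntaxhighlight lang="python">
import numpy as np

# dE/dy^SN_{mn} = x^CN_{mn} - b^CN * eps^CN_{nm} + b^CN * sum_k eps^KN_{knm}
M, N, K = 4, 4, 3
rng = np.random.default_rng(3)
x_CN = rng.normal(size=(M, N))       # contents-network activity, indexed [m, n]
eps_CN = rng.normal(size=(N, M))     # prediction errors, indexed [n, m]
eps_KN = rng.normal(size=(K, N, M))  # prediction errors, indexed [k, n, m]
b_CN = 0.5                           # gain on the error terms

grad = x_CN - b_CN * eps_CN.T + b_CN * eps_KN.sum(axis=0).T
y_SN = np.zeros((M, N))
y_SN -= 0.1 * grad                   # one gradient-descent step on the energy
print(grad.shape)                    # (4, 4), matching y^SN
</syntaxhighlight>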
Active inference
When gradient descent is applied to action <math>\dot{a} = -\partial_a F(s, \tilde{\mu})</math>, motor control can be understood in terms of classical reflex arcs that are engaged by descending (corticospinal) predictions. This provides a formalism that generalizes the equilibrium point solution (to the degrees of freedom problem[47]) to movement trajectories.
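A minimal reflex-arc reading of this gradient scheme (the setpoint, gain and plant dynamics are illustrative assumptions): a descending prediction acts as a setpoint, and action moves the limb until the proprioceptive prediction error is cancelled.
<syntaxhighlight lang="python">
# Descending prediction mu acts as a setpoint for a limb angle; proprioception
# s reads the actual angle; action a = -eta * dF/da suppresses the error.
mu_setpoint = 30.0   # descending (corticospinal) prediction, in degrees
angle = 0.0          # actual limb angle
eta = 0.2
for _ in range(100):
    s = angle                  # proprioceptive sensation
    eps = s - mu_setpoint      # prediction error on the reflex arc
    a = -eta * eps             # action opposes the error (dF/da = eps * ds/da)
    angle += a                 # acting moves the limb
print(f"final angle = {angle:.2f} (setpoint 30.0)")
</syntaxhighlight>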
Active inference and optimal control
Active inference is related to optimal control by replacing value or cost-to-go functions with prior beliefs about state transitions or flow.[48] This exploits the close connection between Bayesian filtering and the solution to the Bellman equation. However, active inference starts with (priors over) flow <math>f = \Gamma \cdot \nabla V + \nabla \times W</math> that are specified with scalar <math>V(x)</math> and vector <math>W(x)</math> value functions of state space (cf. the Helmholtz decomposition). Here, <math>\Gamma</math> is the amplitude of random fluctuations and cost is <math>c(x) = f \cdot \nabla V + \nabla \cdot \Gamma \cdot V</math>. The priors over flow <math>p(\tilde{x} \mid m)</math> induce a prior over states <math>p(x \mid m) = \exp(V(x))</math> that is the solution to the appropriate forward Kolmogorov equations.[49] In contrast, optimal control optimises the flow, given a cost function, under the assumption that <math>W = 0</math> (i.e., the flow is curl free or has detailed balance). Usually, this entails solving backward Kolmogorov equations.[50]
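The sketch below constructs such a flow on a two-dimensional grid. Treating the vector value function as a scalar stream function in two dimensions is a simplifying assumption; the gradient part of the flow ascends the scalar value function whose exponential gives the prior over states, while the curl part is divergence-free.
<syntaxhighlight lang="python">
import numpy as np

# Flow f = Gamma * grad(V) + curl(W) on a grid, with V the log of a Gaussian
# steady-state density and W a stream function for the solenoidal part.
n = 64
xs = np.linspace(-2.0, 2.0, n)
X, Y = np.meshgrid(xs, xs, indexing="ij")
Gamma = 0.5
V = -(X**2 + Y**2) / 2.0      # scalar value function, p(x | m) ~ exp(V)
W = X * Y                     # assumed stream function

dVx, dVy = np.gradient(V, xs, xs)
dWx, dWy = np.gradient(W, xs, xs)
f_x = Gamma * dVx + dWy       # 2-D curl of a stream function: (dW/dy, -dW/dx)
f_y = Gamma * dVy - dWx

p_states = np.exp(V)          # unnormalised prior over states
p_states /= p_states.sum()
print(f_x.shape, round(float(p_states.max()), 6))
</syntaxhighlight>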
Active inference and optimal decision (game) theory
Optimal decision problems (usually formulated as partially observable Markov decision processes) are treated within active inference by absorbing utility functions into prior beliefs. In this setting, states that have a high utility (low cost) are states an agent expects to occupy. By equipping the generative model with hidden states that model control, policies (control sequences) that minimise variational free energy lead to high utility states.[51]
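A toy version of this idea, with two hypothetical policies and arbitrary numbers: preferences enter as a prior over outcomes, and the policy whose predicted outcomes diverge least from that prior is selected.
<syntaxhighlight lang="python">
import numpy as np

# Prior preferences over outcomes (high-utility outcomes are a priori likely)
preferences = np.array([0.8, 0.15, 0.05])
# Predicted outcome distributions under each candidate policy
policies = {
    "stay": np.array([0.2, 0.4, 0.4]),
    "move": np.array([0.7, 0.2, 0.1]),
}

def kl(q, p):
    # Kullback-Leibler divergence KL[q || p]
    return np.sum(q * (np.log(q) - np.log(p)))

scores = {name: kl(pred, preferences) for name, pred in policies.items()}
best = min(scores, key=scores.get)
print(scores, "->", best)   # "move": its predicted outcomes best match preferences
</syntaxhighlight>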
Neurobiologically, neuromodulators such as dopamine are considered to report the precision of prediction errors by modulating the gain of principal cells encoding prediction error.[52] This is closely related to – but formally distinct from – the role of dopamine in reporting prediction errors per se[53] and related computational accounts.[54]
Active inference and cognitive neuroscience
Active inference has been used to address a range of issues in cognitive neuroscience, brain function and neuropsychiatry, including action observation,[55] mirror neurons,[56] saccades and visual search,[57][58] eye movements,[59] sleep,[60] illusions,[61] attention,[44] action selection,[52] consciousness,[62][63] hysteria[64] and psychosis.[65] Explanations of action in active inference often depend on the idea that the brain has 'stubborn predictions' that it cannot update, leading to actions that cause these predictions to come true.[66]
See also
- Action-specific perception – Psychological theory that people perceive their environment and events within it
- Affordance – Possibility of an action on an object or environment
- Autopoiesis – Systems concept which entails automatic reproduction and maintenance
- Bayesian approaches to brain function – Explaining the brain's abilities through statistical principles
- Constructal law – Law of design evolution in nature, animate and inanimate
- Decision theory – Branch of applied probability theory
- Embodied cognition – Interdisciplinary theory
- Entropic force – Physical force that originates from thermodynamics instead of fundamental interactions
- Principle of minimum energy – Thermodynamic formulation based on the second law
- Info-metrics – Interdisciplinary approach to scientific modelling and information processing
- Optimal control – Mathematical way of attaining a desired output from a dynamic system
- Adaptive system – System that can adapt to the environment
- Predictive coding – Theory of brain function
- Self-organization – Process of creating order by local interactions
- Surprisal – Basic quantity derived from the probability of a particular event occurring from a random variable
- Synergetics (Haken) – A school of thought on thermodynamics and systems phenomena developed by Hermann Haken
- Variational Bayesian methods – Mathematical methods used in Bayesian inference and machine learning
References
- ^ Bruineberg, Jelle; Kiverstein, Julian; Rietveld, Erik (2018). "The anticipating brain is not a scientist: the free-energy principle from an ecological-enactive perspective". Synthese. 195 (6): 2417–2444. doi:10.1007/s11229-016-1239-1. PMC 6438652. PMID 30996493.
- ^ Friston, Karl (2010). "The free-energy principle: a unified brain theory?". Nature Reviews Neuroscience. 11 (2): 127–138. doi:10.1038/nrn2787. PMID 20068583. S2CID 5053247. Retrieved July 9, 2023.
- ^ Friston, Karl; Kilner, James; Harrison, Lee (2006). "A free energy principle for the brain" (PDF). Journal of Physiology-Paris. 100 (1–3): 70–87. doi:10.1016/j.jphysparis.2006.10.001. PMID 17097864. S2CID 637885.
- ^ a b Shaun Raviv: The Genius Neuroscientist Who Might Hold the Key to True AI. In: Wired, 13 November 2018
- ^ Friston, Karl (2018). "Of woodlice and men: A Bayesian account of cognition, life and consciousness. An interview with Karl Friston (by Martin Fortier & Daniel Friedman)". ALIUS Bulletin. 2: 17–43.
- ^ Friston, Karl (2022). Active Inference: The Free Energy Principle in Mind, Brain, and Behavior. MIT Press. ISBN 9780262045353.
- ^ a b Helmholtz, H. (1866/1962). Concerning the perceptions in general. In Treatise on physiological optics (J. Southall, Trans., 3rd ed., Vol. III). New York: Dover. Available at https://web.archive.org/web/20180320133752/http://poseidon.sunyopt.edu/BackusLab/Helmholtz/
- ^ Gregory, R. L. (1980-07-08). "Perceptions as hypotheses". Philosophical Transactions of the Royal Society of London. B, Biological Sciences. 290 (1038): 181–197. Bibcode:1980RSPTB.290..181G. doi:10.1098/rstb.1980.0090. JSTOR 2395424. PMID 6106237.
- ^ a b Dayan, Peter; Hinton, Geoffrey E.; Neal, Radford M.; Zemel, Richard S. (1995). "The Helmholtz Machine" (PDF). Neural Computation. 7 (5): 889–904. doi:10.1162/neco.1995.7.5.889. hdl:21.11116/0000-0002-D6D3-E. PMID 7584891. S2CID 1890561.
- ^ Beal, M. J. (2003). Variational Algorithms for Approximate Bayesian Inference. Ph.D. Thesis, University College London.
- ^ Sakthivadivel, Dalton (2022). "Towards a Geometry and Analysis for Bayesian Mechanics". arXiv:2204.11900 [math-ph].
- ^ a b Ramstead, Maxwell; Sakthivadivel, Dalton; Heins, Conor; Koudahl, Magnus; Millidge, Beren; Da Costa, Lancelot; Klein, Brennan; Friston, Karl (2023). "On Bayesian mechanics: A physics of and by beliefs". Interface Focus. 13 (3). arXiv:2205.11543. doi:10.1098/rsfs.2022.0029. PMC 10198254. PMID 37213925. S2CID 249017997.
- ^ Conant, Roger C.; Ross Ashby, W. (1970). "Every good regulator of a system must be a model of that system". International Journal of Systems Science. 1 (2): 89–97. doi:10.1080/00207727008920220.
- ^ Kauffman, S. (1993). The Origins of Order: Self-Organization and Selection in Evolution. Oxford: Oxford University Press.
- ^ Nicolis, G., & Prigogine, I. (1977). Self-organization in non-equilibrium systems. New York: John Wiley.
- ^ Maturana, H. R., & Varela, F. (1980). Autopoiesis: the organization of the living. In V. F. Maturana HR (Ed.), Autopoiesis and Cognition. Dordrecht, Netherlands: Reidel.
- ^ Nikolić, Danko (2015). "Practopoiesis: Or how life fosters a mind". Journal of Theoretical Biology. 373: 40–61. arXiv:1402.5332. Bibcode:2015JThBi.373...40N. doi:10.1016/j.jtbi.2015.03.003. PMID 25791287. S2CID 12680941.
- ^ Haken, H. (1983). Synergetics: An introduction. Non-equilibrium phase transition and self-organisation in physics, chemistry and biology (3rd ed.). Berlin: Springer Verlag.
- ^ Jaynes, E. T. (1957). "Information Theory and Statistical Mechanics" (PDF). Physical Review. 106 (4): 620–630. Bibcode:1957PhRv..106..620J. doi:10.1103/PhysRev.106.620. S2CID 17870175.
- ^ Veissière, Samuel P. L.; Constant, Axel; Ramstead, Maxwell J. D.; Friston, Karl J.; Kirmayer, Laurence J. (2020). "Thinking through other minds: A variational approach to cognition and culture". Behavioral and Brain Sciences. 43: e90. doi:10.1017/S0140525X19001213. ISSN 0140-525X. PMID 31142395. S2CID 169038428.
- ^ Ramstead, Maxwell J. D.; Constant, Axel; Badcock, Paul B.; Friston, Karl J. (2019-12-01). "Variational ecology and the physics of sentient systems". Physics of Life Reviews. Physics of Mind. 31: 188–205. Bibcode:2019PhLRv..31..188R. doi:10.1016/j.plrev.2018.12.002. ISSN 1571-0645. PMC 6941227. PMID 30655223.
- ^ Albarracin, Mahault; Demekas, Daphne; Ramstead, Maxwell J. D.; Heins, Conor (April 2022). "Epistemic Communities under Active Inference". Entropy. 24 (4): 476. Bibcode:2022Entrp..24..476A. doi:10.3390/e24040476. ISSN 1099-4300. PMC 9027706. PMID 35455140.
- ^ Albarracin, Mahault; Constant, Axel; Friston, Karl J.; Ramstead, Maxwell James D. (2021). "A Variational Approach to Scripts". Frontiers in Psychology. 12: 585493. doi:10.3389/fpsyg.2021.585493. ISSN 1664-1078. PMC 8329037. PMID 34354621.
- ^ Friston, Karl J.; Parr, Thomas; Yufik, Yan; Sajid, Noor; Price, Catherine J.; Holmes, Emma (2020-11-01). "Generative models, linguistic communication and active inference". Neuroscience & Biobehavioral Reviews. 118: 42–64. doi:10.1016/j.neubiorev.2020.07.005. ISSN 0149-7634. PMC 7758713. PMID 32687883.
- ^ Tison, Remi; Poirier, Pierre (2021-10-02). "Communication as Socially Extended Active Inference: An Ecological Approach to Communicative Behavior". Ecological Psychology. 33 (3–4): 197–235. doi:10.1080/10407413.2021.1965480. ISSN 1040-7413. S2CID 238703201.
- ^ Friston, Karl J.; Frith, Christopher D. (2015-07-01). "Active inference, communication and hermeneutics". Cortex. Special issue: Prediction in speech and language processing. 68: 129–143. doi:10.1016/j.cortex.2015.03.025. ISSN 0010-9452. PMC 4502445. PMID 25957007.
- ^ Kerusauskaite, Skaiste (2023-06-01). "Role of Culture in Meaning Making: Bridging Semiotic Cultural Psychology and Active Inference". Integrative Psychological and Behavioral Science. 57 (2): 432–443. doi:10.1007/s12124-022-09744-x. ISSN 1936-3567. PMID 36585542. S2CID 255366405.
- ^ García, Adolfo M.; Ibáñez, Agustín (2022-11-14). The Routledge Handbook of Semiosis and the Brain. Taylor & Francis. ISBN 978-1-000-72877-4.
- ^ Bottemanne, Hugo; Friston, Karl J. (2021-12-01). "An active inference account of protective behaviours during the COVID-19 pandemic". Cognitive, Affective, & Behavioral Neuroscience. 21 (6): 1117–1129. doi:10.3758/s13415-021-00947-0. ISSN 1531-135X. PMC 8518276. PMID 34652601.
- ^ Crauel, Hans; Flandoli, Franco (1994). "Attractors for random dynamical systems". Probability Theory and Related Fields. 100 (3): 365–393. doi:10.1007/BF01193705. S2CID 122609512.
- ^ Colombo, Matteo; Palacios, Patricia (2021). "Non-equilibrium thermodynamics and the free energy principle in biology". Biology & Philosophy. 36 (5). doi:10.1007/s10539-021-09818-x. S2CID 235803361.
- ^ Roweis, Sam; Ghahramani, Zoubin (1999). "A Unifying Review of Linear Gaussian Models" (PDF). Neural Computation. 11 (2): 305–345. doi:10.1162/089976699300016674. PMID 9950734. S2CID 2590898.
- ^ Ortega, Pedro A.; Braun, Daniel A. (2013). "Thermodynamics as a theory of decision-making with information-processing costs". Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences. 469 (2153). arXiv:1204.6481. Bibcode:2013RSPSA.46920683O. doi:10.1098/rspa.2012.0683. S2CID 28080508.
- ^ Evans, Denis J. (2003). "A non-equilibrium free energy theorem for deterministic systems" (PDF). Molecular Physics. 101 (10): 1551–1554. Bibcode:2003MolPh.101.1551E. doi:10.1080/0026897031000085173. S2CID 15129000.
- ^ Jarzynski, C. (1997). "Nonequilibrium Equality for Free Energy Differences". Physical Review Letters. 78 (14): 2690–2693. arXiv:cond-mat/9610209. Bibcode:1997PhRvL..78.2690J. doi:10.1103/PhysRevLett.78.2690. S2CID 16112025.
- ^ Sakthivadivel, Dalton (2022). "Towards a Geometry and Analysis for Bayesian Mechanics". arXiv:2204.11900 [math-ph].
- ^ Friston, Karl (2010). "The free-energy principle: A unified brain theory?" (PDF). Nature Reviews Neuroscience. 11 (2): 127–138. doi:10.1038/nrn2787. PMID 20068583. S2CID 5053247.
- ^ Knill, David C.; Pouget, Alexandre (2004). "The Bayesian brain: The role of uncertainty in neural coding and computation" (PDF). Trends in Neurosciences. 27 (12): 712–719. doi:10.1016/j.tins.2004.10.007. PMID 15541511. S2CID 9870936. Archived from the original (PDF) on 2016-03-04. Retrieved 2013-05-31.
- ^ Friston, Karl; Stephan, Klaas; Li, Baojuan; Daunizeau, Jean (2010). "Generalised Filtering". Mathematical Problems in Engineering. 2010: 1–34. doi:10.1155/2010/621670.
- ^ Knill, David C.; Pouget, Alexandre (2004). "The Bayesian brain: The role of uncertainty in neural coding and computation" (PDF). Trends in Neurosciences. 27 (12): 712–719. doi:10.1016/j.tins.2004.10.007. PMID 15541511. S2CID 9870936.
- ^ a b Mumford, D. (1992). "On the computational architecture of the neocortex" (PDF). Biological Cybernetics. 66 (3): 241–251. doi:10.1007/BF00198477. PMID 1540675. S2CID 14303625.
- ^ Bastos, Andre M.; Usrey, W. Martin; Adams, Rick A.; Mangun, George R.; Fries, Pascal; Friston, Karl J. (2012). "Canonical Microcircuits for Predictive Coding". Neuron. 76 (4): 695–711. doi:10.1016/j.neuron.2012.10.038. PMC 3777738. PMID 23177956.
- ^ Adams, Rick A.; Shipp, Stewart; Friston, Karl J. (2013). "Predictions not commands: Active inference in the motor system". Brain Structure and Function. 218 (3): 611–643. doi:10.1007/s00429-012-0475-5. PMC 3637647. PMID 23129312.
- ^ a b Friston, Karl J.; Feldman, Harriet (2010). "Attention, Uncertainty, and Free-Energy". Frontiers in Human Neuroscience. 4: 215. doi:10.3389/fnhum.2010.00215. PMC 3001758. PMID 21160551.
- ^ Abadi, Alireza Khatoon; Yahya, Keyvan; Amini, Massoud; Friston, Karl; Heinke, Dietmar (2019). "Excitatory versus inhibitory feedback in Bayesian formulations of scene construction". Journal of the Royal Society Interface. 16 (154). doi:10.1098/rsif.2018.0344. PMC 6544897. PMID 31039693.
- ^ "12th Biannual Conference of the German Cognitive Science Society (KogWis 2014)". Cognitive Processing. 15: 107. 2014. doi:10.1007/s10339-013-0597-6. S2CID 10121398.
- ^ Feldman, Anatol G.; Levin, Mindy F. (1995). "The origin and use of positional frames of reference in motor control" (PDF). Behavioral and Brain Sciences. 18 (4): 723–744. doi:10.1017/S0140525X0004070X. S2CID 145164477. Archived from the original (PDF) on 2014-03-29. Retrieved 2013-05-31.
- ^ Friston, Karl (2011). "What is Optimal about Motor Control?" (PDF). Neuron. 72 (3): 488–498. doi:10.1016/j.neuron.2011.10.018. PMID 22078508. S2CID 13912462.
- ^ Friston, Karl; Ao, Ping (2012). "Free Energy, Value, and Attractors". Computational and Mathematical Methods in Medicine. 2012: 1–27. doi:10.1155/2012/937860. PMC 3249597. PMID 22229042.
- ^ Kappen, H. J. (2005). "Path integrals and symmetry breaking for optimal control theory". Journal of Statistical Mechanics: Theory and Experiment. 2005 (11): P11011. arXiv:physics/0505066. Bibcode:2005JSMTE..11..011K. doi:10.1088/1742-5468/2005/11/P11011. S2CID 87027.
- ^ Friston, Karl; Samothrakis, Spyridon; Montague, Read (2012). "Active inference and agency: Optimal control without cost functions". Biological Cybernetics. 106 (8–9): 523–541. doi:10.1007/s00422-012-0512-8. hdl:10919/78836. PMID 22864468.
- ^ a b Friston, Karl J.; Shiner, Tamara; Fitzgerald, Thomas; Galea, Joseph M.; Adams, Rick; Brown, Harriet; Dolan, Raymond J.; Moran, Rosalyn; Stephan, Klaas Enno; Bestmann, Sven (2012). "Dopamine, Affordance and Active Inference". PLOS Computational Biology. 8 (1): e1002327. Bibcode:2012PLSCB...8E2327F. doi:10.1371/journal.pcbi.1002327. PMC 3252266. PMID 22241972.
- ^ Fiorillo, Christopher D.; Tobler, Philippe N.; Schultz, Wolfram (2003). "Discrete Coding of Reward Probability and Uncertainty by Dopamine Neurons" (PDF). Science. 299 (5614): 1898–1902. Bibcode:2003Sci...299.1898F. doi:10.1126/science.1077349. PMID 12649484. S2CID 2363255. Archived from the original (PDF) on 2016-03-04. Retrieved 2013-05-31.
- ^ Frank, Michael J. (2005). "Dynamic Dopamine Modulation in the Basal Ganglia: A Neurocomputational Account of Cognitive Deficits in Medicated and Nonmedicated Parkinsonism" (PDF). Journal of Cognitive Neuroscience. 17 (1): 51–72. doi:10.1162/0898929052880093. PMID 15701239. S2CID 7414727.
- ^ Friston, Karl; Mattout, Jérémie; Kilner, James (2011). "Action understanding and active inference" (PDF). Biological Cybernetics. 104 (1–2): 137–160. doi:10.1007/s00422-011-0424-z. PMC 3491875. PMID 21327826.
- ^ Kilner, James M.; Friston, Karl J.; Frith, Chris D. (2007). "Predictive coding: An account of the mirror neuron system" (PDF). Cognitive Processing. 8 (3): 159–166. doi:10.1007/s10339-007-0170-2. PMC 2649419. PMID 17429704.
- ^ Friston, Karl; Adams, Rick A.; Perrinet, Laurent; Breakspear, Michael (2012). "Perceptions as Hypotheses: Saccades as Experiments". Frontiers in Psychology. 3: 151. doi:10.3389/fpsyg.2012.00151. PMC 3361132. PMID 22654776.
- ^ Mirza, M. Berk; Adams, Rick A.; Mathys, Christoph; Friston, Karl J. (2018). "Human visual exploration reduces uncertainty about the sensed world". PLOS ONE. 13 (1): e0190429. Bibcode:2018PLoSO..1390429M. doi:10.1371/journal.pone.0190429. PMC 5755757. PMID 29304087.
- ^ Perrinet, Laurent U.; Adams, Rick A.; Friston, Karl J. (2014). "Active inference, eye movements and oculomotor delays". Biological Cybernetics. 108 (6): 777–801. doi:10.1007/s00422-014-0620-8. PMC 4250571. PMID 25128318.
- ^ Hobson, J.A.; Friston, K.J. (2012). "Waking and dreaming consciousness: Neurobiological and functional considerations". Progress in Neurobiology. 98 (1): 82–98. doi:10.1016/j.pneurobio.2012.05.003. PMC 3389346. PMID 22609044.
- ^ Brown, Harriet; Friston, Karl J. (2012). "Free-Energy and Illusions: The Cornsweet Effect". Frontiers in Psychology. 3: 43. doi:10.3389/fpsyg.2012.00043. PMC 3289982. PMID 22393327.
- ^ Rudrauf, David; Bennequin, Daniel; Granic, Isabela; Landini, Gregory; Friston, Karl; Williford, Kenneth (2017-09-07). "A mathematical model of embodied consciousness" (PDF). Journal of Theoretical Biology. 428: 106–131. Bibcode:2017JThBi.428..106R. doi:10.1016/j.jtbi.2017.05.032. PMID 28554611.
- ^ K, Williford; D, Bennequin; K, Friston; D, Rudrauf (2018-12-17). "The Projective Consciousness Model and Phenomenal Selfhood". Frontiers in Psychology. 9: 2571. doi:10.3389/fpsyg.2018.02571. PMC 6304424. PMID 30618988.
- ^ Edwards, M. J.; Adams, R. A.; Brown, H.; Parees, I.; Friston, K. J. (2012). "A Bayesian account of 'hysteria'" (PDF). Brain. 135 (11): 3495–3512. doi:10.1093/brain/aws129. PMC 3501967. PMID 22641838.
- ^ Adams, Rick A.; Perrinet, Laurent U.; Friston, Karl (2012). "Smooth Pursuit and Visual Occlusion: Active Inference and Oculomotor Control in Schizophrenia". PLOS ONE. 7 (10): e47502. Bibcode:2012PLoSO...747502A. doi:10.1371/journal.pone.0047502. PMC 3482214. PMID 23110076.
- ^ Yon, Daniel; Lange, Floris P. de; Press, Clare (2019-01-01). "The Predictive Brain as a Stubborn Scientist". Trends in Cognitive Sciences. 23 (1): 6–8. doi:10.1016/j.tics.2018.10.003. PMID 30429054. S2CID 53280000.