
Inception score: Difference between revisions

From Wikipedia, the free encyclopedia


== Definition ==
Let there be two spaces, the space of images <math>\Omega_X</math> and the space of labels <math>\Omega_Y</math>. The space of labels is finite.


Let <math>p_{gen}</math> be a probability distribution over <math>\Omega_X</math> that we wish to judge.


Let a discriminator be a function of type <math display="block">p_{dis}:\Omega_X \to M(\Omega_Y)</math>where <math>M(\Omega_Y)</math> is the set of all probability distributions on <math>\Omega_Y</math>. For any image <math>x</math>, and any label <math>y</math>, let <math>p_{dis}(y|x)</math> be the probability that image <math>x</math> has label <math>y</math>, according to the discriminator. It is usually implemented as an Inception-v3 network trained on ImageNet.


The '''Inception Score''' of <math>p_{gen}</math> relative to <math>p_{dis}</math> is<math display="block">IS(p_{gen}, p_{dis}) := \exp\left( \mathbb E_{x\sim p_{gen}}\left[ D_{KL}\left( p_{dis}(\cdot | x) \,\Big\|\, \int p_{dis}(\cdot | \tilde x) p_{gen}(\tilde x) d\tilde x \right) \right]\right)</math>Equivalent rewrites include<math display="block">\ln IS(p_{gen}, p_{dis}) =
H_y[\mathbb E_{x\sim p_{gen}}[p_{dis}(y | x)]] -
\mathbb E_{x\sim p_{gen}}[ H_y[p_{dis}(y | x)]]</math>To show that this is nonnegative, use [[Jensen's inequality]].

'''Pseudocode:'''<blockquote>'''INPUT''' discriminator <math>p_{dis}</math>.

'''INPUT''' generator <math>g</math>.

Sample images <math>x_i</math> from generator.

Compute <math>p_{dis}(\cdot | x_i)</math>, the probability distribution over labels conditional on image <math>x_i</math>.

Average the results to obtain <math>\hat p</math>, an empirical estimate of <math>\int p_{dis}(\cdot | x) p_{gen}(x)dx</math>.

Sample more images <math>x_i</math> from generator, and for each, compute <math>D_{KL} \left(p_{dis}(\cdot | x_i) \| \hat p\right)</math>.

Average the results, and take its exponential.

Return the result.</blockquote>
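The pseudocode can be sketched in NumPy. This is a minimal illustration, not a reference implementation: the function name <code>inception_score</code> is hypothetical, it assumes the discriminator's softmax outputs have already been collected into an array of shape <code>(n_images, n_labels)</code>, and it reuses one batch both for the marginal estimate <math>\hat p</math> and for the KL terms, whereas the pseudocode draws two batches.

```python
import numpy as np

def inception_score(probs: np.ndarray, eps: float = 1e-12) -> float:
    """Compute IS from discriminator outputs.

    probs[i] is p_dis(. | x_i), the label distribution the
    discriminator assigns to the i-th generated image.
    """
    probs = np.asarray(probs, dtype=np.float64)
    # Empirical marginal over labels: hat_p estimates E_x[p_dis(. | x)]
    hat_p = probs.mean(axis=0)
    # Per-image KL divergence D_KL(p_dis(. | x_i) || hat_p);
    # eps guards log(0) for one-hot rows
    kl = np.sum(probs * (np.log(probs + eps) - np.log(hat_p + eps)), axis=1)
    # IS = exp of the average KL divergence
    return float(np.exp(kl.mean()))
```

As a sanity check, perfectly confident and balanced predictions (an identity matrix over <math>N</math> labels) give a score of about <math>N</math>, while identical uniform rows give a score of 1.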


=== Interpretation ===
A higher inception score is interpreted as "better", as it means that <math>p_{gen}</math> is a "sharp and distinct" collection of pictures.


<math>\ln IS(p_{gen}, p_{dis}) \in [0, \ln N]</math>, where <math>N</math> is the total number of possible labels.
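These bounds can be seen from the entropy form of the score:<math display="block">\ln IS(p_{gen}, p_{dis}) = H_y\left[\mathbb E_{x\sim p_{gen}}[p_{dis}(y | x)]\right] - \mathbb E_{x\sim p_{gen}}\left[H_y[p_{dis}(y | x)]\right].</math>The first term is the entropy of a probability distribution over <math>N</math> labels, so it lies in <math>[0, \ln N]</math>; the second term is an average of entropies and is nonnegative. Subtracting a nonnegative term bounds <math>\ln IS</math> above by <math>\ln N</math>, and nonnegativity of <math>\ln IS</math> itself follows from [[Jensen's inequality]]: entropy is concave, so the entropy of the average distribution is at least the average of the entropies.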


<math>\ln IS(p_{gen}, p_{dis}) = 0</math> iff, for almost all <math>x\sim p_{gen}</math>,<math display="block">p_{dis}(\cdot | x) = \int p_{dis}(\cdot | x) p_{gen}(x)dx</math>That means <math>p_{gen}</math> is completely "indistinct". That is, for any image <math>x</math> sampled from <math>p_{gen}</math>, the discriminator returns exactly the same label predictions <math>p_{dis}(\cdot | x)</math>.



The highest inception score <math>N</math> is achieved if and only if the two conditions are both true:
* For almost all <math>x\sim p_{gen}</math>, the distribution <math>p_{dis}(y|x)</math> is concentrated on one label. That is, <math>H_y[p_{dis}(y|x)] = 0</math>. That is, every image sampled from <math>p_{gen}</math> is exactly classified by the discriminator.
* For every label <math>y</math>, the proportion of generated images labelled as <math>y</math> is exactly <math>\mathbb E_{x\sim p_{gen}}[p_{dis}(y | x)] = \frac 1 N</math>. That is, the generated images are equally distributed over all labels.
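The two extremes can be checked numerically. The snippet below is a hypothetical sketch (the helper name <code>score</code> is not from the article): it evaluates the definition directly on strictly positive rows, so the exactly-one-hot case is approximated by near-one-hot rows.

```python
import numpy as np

def score(probs):
    # IS = exp of the mean KL divergence between each row and the row-average
    hat_p = probs.mean(axis=0)
    kl = np.sum(probs * np.log(probs / hat_p), axis=1)  # rows must be > 0
    return float(np.exp(kl.mean()))

N = 4
# Completely "indistinct": every image gets the same label distribution,
# so ln IS = 0 and the score is 1
uniform = np.full((8, N), 1.0 / N)
# Nearly one-hot and balanced over all N labels: score approaches N
eps = 1e-9
near_one_hot = np.where(np.eye(N) > 0, 1.0 - (N - 1) * eps, eps)
```

Here <code>score(uniform)</code> returns 1, while <code>score(near_one_hot)</code> falls just below <math>N = 4</math>, consistent with the bound above.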


== References ==
<references />

[[Category:Machine learning]]
[[Category:Computer graphics]]

Revision as of 02:25, 16 July 2022

The Inception Score (IS) is an algorithm used to assess the quality of images created by a generative model, like a generative adversarial network (GAN).[1] It has been somewhat superseded by the Fréchet inception distance.[2]


References

  1. Salimans, Tim; Goodfellow, Ian; Zaremba, Wojciech; Cheung, Vicki; Radford, Alec; Chen, Xi; Chen, Xi (2016). "Improved Techniques for Training GANs". Advances in Neural Information Processing Systems. 29. Curran Associates, Inc.
  2. Borji, Ali (2022). "Pros and cons of GAN evaluation measures: New developments". Computer Vision and Image Understanding. 215: 103329. doi:10.1016/j.cviu.2021.103329.