Ground truth: Difference between revisions

From Wikipedia, the free encyclopedia
references are needed if something can be doubted - this is a basic article so WTF
 
(90 intermediate revisions by 74 users not shown)
Line 1: Line 1:
{{short description|Information provided by direct observation}}
{{Dablink|For other uses, see [[Ground truth (disambiguation)]].}}
{{for|the documentary|The Ground Truth}}
{{refimprove|date=June 2015}}
'''Ground truth''' is information that is known to be real or true, provided by direct observation and measurement (i.e. [[empirical evidence]]) as opposed to information provided by [[inference]].


==Etymology==
'''Ground truth''' is a term used in [[cartography]], [[meteorology]], analysis of [[aerial photography|aerial photographs]], [[satellite imagery]] and a range of other [[remote sensing]] techniques in which data are gathered at a distance. Ground truth refers to information that is collected "on location." In remote sensing, this is especially important in order to relate image data to real features and materials on the ground. The collection of ground-truth data enables calibration of remote-sensing data, and aids in the interpretation and analysis of what is being sensed.
The ''[[Oxford English Dictionary]]'' (s.v. ''ground truth'') records the use of the word ''Groundtruth'' in the sense of 'fundamental truth' from Henry Ellison's poem "The Siberian Exile's Tale", published in 1833.<ref>
{{cite book
| last1 = Ellison
| first1 = Henry
| title = Mad moments, or first verse attempts by a born natural
| url = https://books.google.com/books?id=nzRcAAAAcAAJ
| publication-date = 1833
| page = 362
| access-date = 2014-10-24
| quote = As the Groundtruth of her own Existence it must be regarded, thro' Him in its highest, purest Aspect shown!
| year = 1833
}}
</ref>


== Statistics and machine learning ==
More specifically, ground truth may refer to a process in which a [[pixel]] on a [[satellite]] image is compared to what is there in reality (at the present time) in order to verify the contents of the pixel on the image. In the case of a classified image, it allows supervised classification to help determine the accuracy of the classification performed by the remote sensing software and therefore minimize errors in the classification such as errors of commission and errors of omission.
"Ground truth" may be seen as a conceptual term relative to the knowledge of the truth concerning a specific question. It is the ideal expected result.<ref>{{cite book |last1=Lemoigne |first1=Yves |last2=Caner |first2=Alessandra |date=2006 |title=Molecular Imaging: Computer Reconstruction and Practice}}</ref> This is used in [[statistical model]]s to prove or disprove [[research]] [[hypothesis|hypotheses]]. The term "ground truthing" refers to the process of gathering the proper [[Objectivity (philosophy)|objective]] (provable) data for this test. Compare with [[gold standard (test)|gold standard]].
For example, suppose we are testing a [[computer stereo vision|stereo vision]] system to see how well it can estimate 3D positions. The "ground truth" might be the positions given by a laser rangefinder which is known to be much more accurate than the camera system.


[[Bayesian spam filtering]] is a common example of supervised learning. In this system, the algorithm is manually taught the differences between spam and non-spam. This depends on the ''ground truth'' of the messages used to train the algorithm &ndash; inaccuracies in the ground truth will correlate to inaccuracies in the resulting spam/non-spam verdicts.
Ground truth is usually done on site, performing surface observations and measurements of various properties of the features of the ground resolution cells that are being studied on the remotely sensed digital image. It also involves taking geographic coordinates of the ground resolution cell with GPS technology and comparing those with the coordinates of the pixel being studied provided by the remote sensing software to understand and analyze the location errors and how it may affect a particular study.


== Remote sensing ==
Ground truth is important in the initial supervised classification of an image. When the identity and location of land cover types are known through a combination of field work, maps, and personal experience these areas are known as training sites. The spectral characteristics of these areas are used to train the remote sensing software using decision rules for classifying the rest of the image. These decision rules such as Maximum Likelihood Classification, Parallelepiped Classification, and Minimum Distance Classification offer different techniques to classify an image. Additional ground truth sites allow the remote sensor to establish an error matrix which validates the accuracy of the classification method used. Different classification methods may have different percentages of error for a given classification project. It is important that the remote sensor chooses a classification method that works best with the number of classifications used while providing the least amount of error.
In [[remote sensing]], "ground truth" refers to information collected at the imaged location. Ground truth allows image data to be related to real features and materials on the ground. The collection of ground truth data enables calibration of remote-sensing data, and aids in the interpretation and analysis of what is being sensed. Examples include [[cartography]], [[meteorology]], analysis of [[aerial photography|aerial photographs]], [[satellite imagery]] and other techniques in which data are gathered at a distance.


More specifically, ground truth may refer to a process in which "[[pixel]]s"<ref>{{cite journal|last1=Fisher|first1=P|title=The Pixel: A Snare and a Delusion|journal=International Journal of Remote Sensing|date=1997|volume=18|issue=15|pages=679–685|doi=10.1080/014311697219015|bibcode=1997IJRS...18..679F}}</ref> on a [[satellite]] image are compared to what is imaged (at the time of capture) in order to verify the contents of the "pixels" in the image (noting that the concept of "pixel" is imaging-system-dependent). In the case of a classified image, supervised classification can help to determine the accuracy of the classification by the remote sensing system which can minimize error in the classification.
Ground truth also helps with atmospheric correction. Since images from satellites obviously have to pass through the atmosphere, they can get distorted because of absorption in the atmosphere. So ground truth can help fully identify objects in satellite photos.


Ground truth is usually done on site, correlating what is known with surface observations and measurements of various properties of the features of the ground resolution cells under study in the remotely sensed digital image. The process also involves taking geographic coordinates of the ground resolution cell with GPS technology and comparing those with the coordinates of the "pixel" being studied provided by the remote sensing software to understand and analyze the location errors and how it may affect a particular study.
==Errors of commission==
An example of an error of commission is when a pixel reports the presence of a feature (such as trees) that, in reality, is absent (no trees are actually present). Ground truthing ensures that the error matrices have a higher accuracy percentage than would be the case if no pixels were ground truthed.


Ground truth is important in the initial supervised classification of an image. When the identity and location of land cover types are known through a combination of field work, maps, and personal experience these areas are known as training sites. The spectral characteristics of these areas are used to train the remote sensing software using decision rules for classifying the rest of the image. These decision rules such as Maximum Likelihood Classification, Parallelopiped Classification, and Minimum Distance Classification offer different techniques to classify an image. Additional ground truth sites allow the remote sensor to establish an error matrix that validates the accuracy of the classification method used. Different classification methods may have different percentages of error for a given classification project. It is important that the remote sensor chooses a classification method that works best with the number of classifications used while providing the least amount of error.
==Errors of omission==
An example of an error of omission is when pixels of a certain thing, for example maple trees, are not classified as maple trees. The process of ground truthing helps to ensure that the pixel is classified correctly and the error matrices are more accurate.


Ground truth also helps with [[atmospheric correction]]. Since images from satellites have to pass through the atmosphere, they can get distorted because of absorption in the atmosphere. So ground truth can help fully identify objects in satellite photos.
==Military Usage==
In US military slang, "ground truth" is used to describe the reality of a tactical situation as
opposed to what intelligence reports and mission plans assert the reality to be. The term is
reflected in the title of the 2006 Iraq War documentary ''[[The Ground Truth]]'' and is used in military publications, for example ''[[Stars and Stripes (newspaper)|Stars and Stripes]]'' saying "Stripes decided to figure out what the ground truth was in Iraq."


===Errors of commission===
The military usage of the term is long-standing but its origins are obscure. It is plausible but difficult to prove that "ground truth" began life as military terminology and then was applied to other domains such as remote sensing control.
An example of an error of commission is when a pixel reports the presence of a feature (such a tree) that, in reality, is absent (no tree is actually present). Ground truthing ensures that the error matrices have a higher accuracy percentage than would be the case if no pixels were ground-truthed. This value is the inverse of the user's accuracy, i.e. Commission Error = 1 - user's accuracy.


===Errors of omission===
==Vernacular Usage==
An example of an error of omission is when pixels of a certain type, for example, maple trees, are not classified as maple trees. The process of ground-truthing helps to ensure that the pixel is classified correctly and the error matrices are more accurate. This value is the inverse of the producer's accuracy, i.e. Omission Error = 1 - producer's accuracy
Somewhat analogous to military slang, "ground truth" may be used in the scientific vernacular to refer to the presumed objective reality of a statement or situation, in contrast with some predicted or computed expectation of reality. "Is (x) ground truth?" is a way of asking "Is x '''really''' what is going on, objectively, versus some evidential claims we are making?"
Ground truthing (verb) in science is gathering objective data to assess whether a theoretical model is accurate. This is the most basic test of the scientific method.


== Geographical information systems ==
==External links==
[[File:GroundTruth processModel01.png|thumb|420px|The ''ground truth representations'' are the GIS elements (fields or objects), and each element is representing (by a cartographic process) a real world object.]]
* [http://forest.esf.edu/unsupervisedClass.html Forestry Organization Remote Sensing Technology Project] (includes an example of an error matrix)


In GIS the spatial data is modeled as ''field'' (like in [[#Remote sensing|remote sensing raster images]]) or as ''object'' (like in [[Vector Map|vectorial map]] representation).<ref>Goodchild, M., "[http://www.geog.ucsb.edu/~good/papers/172.pdf Geographical data modeling]". Computers & Geosciences, vol. 18, no.4, pp. 401-408, 1992.</ref> They are modeled from the real world (also named ''geographical reality''), typically by a cartographic process (illustrated).
[[Category:Satellite meteorology and remote sensing]]
[[Category:Visualization (graphic)]]


[[Geographic information system]]s such as GIS, GPS, and GNSS, have become so widespread that the term "ground truth" has taken on special meaning in that context. If the location coordinates returned by a location method such as GPS are an estimate of a location, then the "ground truth" is the actual location on Earth. A smart phone might return a set of estimated location coordinates such as 43.87870,-103.45901. The ground truth being estimated by those coordinates is the tip of George Washington's nose on [[Mount Rushmore]]. The accuracy of the estimate is the maximum distance between the location coordinates and the ground truth. We could say in this case that the estimate accuracy is 10 meters, meaning that the point on earth represented by the location coordinates is thought to be within 10 meters of George's nose—the ground truth. In slang, the coordinates indicate where we think George Washington's nose is located, and the ground truth is where it really is. In practice a smart phone or hand-held GPS unit is routinely able to estimate the ground truth within 6–10 meters. Specialized instruments can reduce GPS measurement error to under a centimeter.<ref>
[[de:Ground Truth]]
{{cite book
| last1 = Pickles
| first1 = John
| title = Ground Truth: The Social Implications of Geographical Information Systems
| date = 1995| page = 179
}}
</ref>

==Military usage==
US [[military slang]] uses "ground truth" to refer to the facts comprising a tactical situation—as opposed to intelligence reports, mission plans, and other descriptions reflecting the conative or policy-based projections of the industrial·military complex. The term appears in the title of the [[Iraq War]] documentary film ''[[The Ground Truth]]'' (2006), and also in military publications, for example ''[[Stars and Stripes (newspaper)|Stars and Stripes]]'' saying: "Stripes decided to figure out what the ground truth was in Iraq."{{citation needed|date=October 2014}}

==See also==
*[[Baseline (medicine)|Baseline (science)]]
*[[Calibration]]
*[[Foundationalism]]

==References==
{{Reflist}}

==External links==
* [https://web.archive.org/web/20060911012029/http://forest.esf.edu/unsupervisedClass.html Forestry Organization Remote Sensing Technology Project] (includes an example of an error matrix)
[[Category:Optical character recognition| ]]
[[Category:Applications of computer vision]]
[[Category:Automatic identification and data capture]]
[[Category:Computational linguistics]]
[[Category:Machine learning task]]
[[Category:Satellite meteorology]]

Latest revision as of 19:13, 23 June 2024

Ground truth is information that is known to be real or true, provided by direct observation and measurement (i.e. empirical evidence) as opposed to information provided by inference.

Etymology

The Oxford English Dictionary (s.v. ground truth) records the use of the word Groundtruth in the sense of 'fundamental truth' from Henry Ellison's poem "The Siberian Exile's Tale", published in 1833.[1]

Statistics and machine learning

In statistics and machine learning, "ground truth" is a conceptual term for what is known to be true about a specific question: the ideal expected result.[2] Statistical models use it to test research hypotheses. The term "ground truthing" refers to the process of gathering the objective (provable) data needed for such a test. Compare with gold standard. For example, suppose we are testing a stereo vision system to see how well it can estimate 3D positions. The "ground truth" might be the positions given by a laser rangefinder, which is known to be much more accurate than the camera system.
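
The stereo vision comparison can be illustrated with a minimal sketch. It assumes that both systems report the same points in the same coordinate frame, and it uses root-mean-square error as the summary measure, which is one common choice rather than anything prescribed by the sources. The function name and all coordinates are hypothetical.

```python
import math

def rmse_3d(estimates, ground_truth):
    """Root-mean-square Euclidean error between estimated and reference 3D points."""
    if len(estimates) != len(ground_truth):
        raise ValueError("point lists must have the same length")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est, ref))
        for est, ref in zip(estimates, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical positions in meters; the rangefinder readings act as ground truth.
rangefinder = [(0.0, 0.0, 2.0), (1.0, 0.0, 3.0), (0.0, 1.0, 4.0)]
stereo = [(0.0, 0.0, 2.1), (1.0, 0.0, 2.9), (0.0, 1.0, 4.2)]

error = rmse_3d(stereo, rangefinder)
print(f"RMSE of stereo estimates: {error:.3f} m")
```

A smaller value means the stereo estimates lie closer to the reference positions; the figure is only as trustworthy as the rangefinder itself.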

Bayesian spam filtering is a common example of supervised learning. In this system, the algorithm is manually taught the differences between spam and non-spam. This depends on the ground truth of the messages used to train the algorithm: inaccuracies in the ground truth tend to produce inaccuracies in the resulting spam/non-spam verdicts.
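
The dependence on labeled training messages can be shown with a toy naive Bayes classifier. This is a sketch under stated assumptions, not the design of any particular spam filter: it uses whitespace tokenization and add-one smoothing, and the messages, labels, and function names are all hypothetical.

```python
import math
from collections import Counter

def train(messages):
    """Count words per class from (text, label) pairs; label is 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, doc_counts

def classify(text, word_counts, doc_counts):
    """Return the class with the higher naive Bayes log score."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        # Log prior plus log likelihoods with add-one (Laplace) smoothing.
        score = math.log(doc_counts[label] / total_docs)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical hand-labeled messages: the labels are the ground truth.
labeled = [
    ("win cash prize now", "spam"),
    ("cheap prize offer", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
verdict = classify("cash prize offer", *train(labeled))

# The same messages with every label flipped, simulating bad ground truth.
flipped = [(text, "ham" if label == "spam" else "spam") for text, label in labeled]
bad_verdict = classify("cash prize offer", *train(flipped))

print(verdict, bad_verdict)
```

With correct labels the test message is judged spam; with the labels flipped the same message is judged non-spam, because the classifier can only reproduce what its ground truth tells it.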

Remote sensing

In remote sensing, "ground truth" refers to information collected at the imaged location. Ground truth allows image data to be related to real features and materials on the ground. The collection of ground truth data enables calibration of remote-sensing data, and aids in the interpretation and analysis of what is being sensed. Examples include cartography, meteorology, analysis of aerial photographs, satellite imagery and other techniques in which data are gathered at a distance.

More specifically, ground truth may refer to a process in which "pixels"[3] on a satellite image are compared to what is imaged (at the time of capture) in order to verify the contents of the "pixels" in the image (noting that the concept of "pixel" is imaging-system-dependent). In the case of a classified image, it allows supervised classification to help determine the accuracy of the classification performed by the remote sensing system, which can minimize classification errors.

Ground truthing is usually done on site, correlating what is known with surface observations and measurements of various properties of the features of the ground resolution cells under study in the remotely sensed digital image. The process also involves taking the geographic coordinates of the ground resolution cell with GPS technology and comparing them with the coordinates of the "pixel" being studied, as provided by the remote sensing software, in order to understand and analyze the location errors and how they may affect a particular study.
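
The coordinate comparison can be sketched as follows. The example assumes that the image is georeferenced by a six-parameter affine transform and that both the image and the GPS fix are expressed in the same projected coordinate system in meters; the parameter layout, function names, and all numbers are hypothetical choices for illustration.

```python
import math

def pixel_center_to_map(col, row, transform):
    """Map coordinates of a pixel center under a six-parameter affine transform.

    transform = (x_origin, dx_per_col, dx_per_row, y_origin, dy_per_col, dy_per_row)
    """
    x0, dx_col, dx_row, y0, dy_col, dy_row = transform
    c, r = col + 0.5, row + 0.5  # shift from the pixel corner to the pixel center
    return (x0 + c * dx_col + r * dx_row, y0 + c * dy_col + r * dy_row)

def location_error(pixel_xy, gps_xy):
    """Planar distance in meters between the pixel center and the GPS fix."""
    return math.hypot(pixel_xy[0] - gps_xy[0], pixel_xy[1] - gps_xy[1])

# Hypothetical north-up image with 30 m pixels.
transform = (500000.0, 30.0, 0.0, 4860000.0, 0.0, -30.0)
pixel_xy = pixel_center_to_map(col=10, row=20, transform=transform)

# Hypothetical GPS fix taken in the field for the same ground resolution cell.
gps_xy = (500318.0, 4859381.0)

offset = location_error(pixel_xy, gps_xy)
print(f"pixel center {pixel_xy}, offset from GPS fix {offset:.1f} m")
```

In this made-up case the offset is well under half a pixel width, so the field measurement falls inside the cell it is meant to describe; a larger offset would suggest that the observation may belong to a neighboring pixel.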

Ground truth is important in the initial supervised classification of an image. When the identity and location of land cover types are known through a combination of field work, maps, and personal experience, these areas are known as training sites. The spectral characteristics of these areas are used to train the remote sensing software using decision rules for classifying the rest of the image. These decision rules, such as Maximum Likelihood Classification, Parallelepiped Classification, and Minimum Distance Classification, offer different techniques to classify an image. Additional ground truth sites allow the remote sensor to establish an error matrix that validates the accuracy of the classification method used. Different classification methods may have different percentages of error for a given classification project. It is important that the remote sensor chooses a classification method that works best with the number of classifications used while producing the least error.
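
Of the decision rules named above, minimum distance classification is the simplest to sketch: each class is summarized by the mean spectral vector of its training sites, and every other pixel is assigned to the class whose mean is nearest. The example below assumes Euclidean distance and two spectral bands; the class names and band values are hypothetical.

```python
import math

def class_means(training_sites):
    """Mean spectral vector per class from ground-truthed training pixels."""
    means = {}
    for name, samples in training_sites.items():
        n_bands = len(samples[0])
        means[name] = tuple(
            sum(sample[band] for sample in samples) / len(samples)
            for band in range(n_bands)
        )
    return means

def classify_pixel(pixel, means):
    """Assign the class whose mean vector is closest in Euclidean distance."""
    return min(means, key=lambda name: math.dist(pixel, means[name]))

# Hypothetical two-band values sampled at training sites.
training_sites = {
    "water": [(10, 5), (12, 7)],
    "forest": [(30, 60), (34, 64)],
    "bare soil": [(80, 70), (84, 74)],
}
means = class_means(training_sites)

# Pixels from the rest of the image, classified using the trained means.
labels = [classify_pixel(p, means) for p in [(11, 8), (35, 58), (79, 75)]]
print(labels)
```

Maximum likelihood and parallelepiped classification replace the nearest-mean rule with a probability model or per-band value ranges, respectively, but rely on training sites in the same way.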

Ground truth also helps with atmospheric correction. Since the radiation recorded in satellite images has to pass through the atmosphere, the images can be distorted by atmospheric absorption. Ground truth can therefore help to identify objects in satellite photos more reliably.

Errors of commission

An example of an error of commission is when a pixel reports the presence of a feature (such as a tree) that, in reality, is absent (no tree is actually present). Ground truthing ensures that the error matrices have a higher accuracy percentage than would be the case if no pixels were ground-truthed. This value is the complement of the user's accuracy, i.e. Commission Error = 1 - user's accuracy.

Errors of omission

An example of an error of omission is when pixels of a certain type, for example maple trees, are not classified as maple trees. The process of ground-truthing helps to ensure that the pixel is classified correctly and the error matrices are more accurate. This value is the complement of the producer's accuracy, i.e. Omission Error = 1 - producer's accuracy.
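
Both error types can be read off an error matrix. The sketch below assumes one common layout, with rows for the class assigned by the classifier and columns for the class found by ground truthing; some texts transpose this layout, so the orientation should be checked before reuse. The counts are hypothetical.

```python
def accuracy_report(matrix, classes):
    """Per-class accuracy figures from an error matrix.

    matrix[i][j] = number of pixels classified as classes[i]
                   whose ground truth is classes[j].
    """
    report = {}
    for k, name in enumerate(classes):
        correct = matrix[k][k]
        classified_total = sum(matrix[k])                 # row: mapped as this class
        reference_total = sum(row[k] for row in matrix)   # column: truly this class
        users = correct / classified_total
        producers = correct / reference_total
        report[name] = {
            "users_accuracy": users,
            "commission_error": 1 - users,
            "producers_accuracy": producers,
            "omission_error": 1 - producers,
        }
    return report

classes = ["maple", "other"]
error_matrix = [
    [40, 10],  # classified as maple: 40 truly maple, 10 truly other
    [20, 30],  # classified as other: 20 truly maple, 30 truly other
]

report = accuracy_report(error_matrix, classes)
for name, stats in report.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

In this example, 10 of the 50 pixels mapped as maple are not maple (a commission error of 20%), while 20 of the 60 pixels that really are maple were missed (an omission error of about 33%).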

Geographical information systems

Figure: The ground truth representations are the GIS elements (fields or objects); each element represents a real-world object through a cartographic process.

In GIS, spatial data is modeled either as a field (as in remote sensing raster images) or as an object (as in a vector map representation).[4] Both are modeled from the real world (also called geographical reality), typically by a cartographic process (illustrated).

Geographic information systems (GIS) and satellite positioning systems such as GPS and other GNSS have become so widespread that the term "ground truth" has taken on a special meaning in that context. If the location coordinates returned by a location method such as GPS are an estimate of a location, then the "ground truth" is the actual location on Earth. A smartphone might return a set of estimated location coordinates such as 43.87870,-103.45901. The ground truth being estimated by those coordinates is the tip of George Washington's nose on Mount Rushmore. The accuracy of the estimate is the maximum distance between the location coordinates and the ground truth. We could say in this case that the estimate accuracy is 10 meters, meaning that the point on Earth represented by the location coordinates is thought to be within 10 meters of George's nose, the ground truth. Informally, the coordinates indicate where we think George Washington's nose is located, and the ground truth is where it really is. In practice, a smartphone or hand-held GPS unit is routinely able to estimate the ground truth to within 6–10 meters. Specialized instruments can reduce GPS measurement error to under a centimeter.[5]
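
The distance between an estimate and its ground truth can be approximated with the haversine formula. The sketch below treats the Earth as a sphere with a mean radius of 6,371 km, which is a simplification; the estimated coordinates are those quoted above, while the "surveyed" ground truth position is a hypothetical value chosen only to make the example run.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

estimate = (43.87870, -103.45901)      # coordinates reported by the device
ground_truth = (43.87875, -103.45905)  # hypothetical surveyed position

error_m = haversine_m(*estimate, *ground_truth)
within_accuracy = error_m <= 10.0      # the 10-meter accuracy used in the text

print(f"error: {error_m:.1f} m, within 10 m: {within_accuracy}")
```

For centimeter-level work the spherical approximation is too coarse, and an ellipsoidal distance calculation would be needed instead.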

Military usage

US military slang uses "ground truth" to refer to the facts comprising a tactical situation, as opposed to intelligence reports, mission plans, and other descriptions reflecting the conative or policy-based projections of the military-industrial complex. The term appears in the title of the Iraq War documentary film The Ground Truth (2006), and also in military publications, for example Stars and Stripes saying: "Stripes decided to figure out what the ground truth was in Iraq."[citation needed]

See also

  * Baseline (science)
  * Calibration
  * Foundationalism

References

  1. ^ Ellison, Henry (1833). Mad moments, or first verse attempts by a born natural. p. 362. Retrieved 2014-10-24. As the Groundtruth of her own Existence it must be regarded, thro' Him in its highest, purest Aspect shown!
  2. ^ Lemoigne, Yves; Caner, Alessandra (2006). Molecular Imaging: Computer Reconstruction and Practice.
  3. ^ Fisher, P (1997). "The Pixel: A Snare and a Delusion". International Journal of Remote Sensing. 18 (15): 679–685. Bibcode:1997IJRS...18..679F. doi:10.1080/014311697219015.
  4. ^ Goodchild, M. (1992). "Geographical data modeling". Computers & Geosciences. 18 (4): 401–408.
  5. ^ Pickles, John (1995). Ground Truth: The Social Implications of Geographical Information Systems. p. 179.