{{short description|Compact encoding of digital data}}
{{redirect|Source coding|the term in computer programming|Source code}}
{{Use American English|date=March 2021}}


In [[information theory]], '''data compression''', '''source coding''',<ref name="Wade"/> or '''bit-rate reduction''' is the process of encoding [[information]] using fewer [[bit]]s than the original representation.<ref name="mahdi53"/> Any particular compression is either [[lossy]] or [[lossless]]. Lossless compression reduces bits by identifying and eliminating [[Redundancy (information theory)|statistical redundancy]]. No information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or less important information.<ref name="PujarKadlaskar"/> Typically, a device that performs data compression is referred to as an encoder, and one that performs the reversal of the process (decompression) as a decoder.


The process of reducing the size of a [[data file]] is often referred to as data compression. In the context of [[data transmission]], it is called source coding: encoding is done at the source of the data before it is stored or transmitted.<ref name="Salomon"/> Source coding should not be confused with [[channel coding]], for error detection and correction, or [[line coding]], the means for mapping data onto a signal.


Data compression algorithms present a [[Space–time tradeoff|space–time complexity trade-off]] between the bytes needed to store or transmit information and the [[computational resource]]s needed to perform the encoding and decoding. The design of data compression schemes involves balancing the degree of compression, the amount of distortion introduced (when using [[lossy data compression]]), and the computational resources or time required to compress and decompress the data.<ref name="Tank"/>


== Lossless ==
[[Lossless data compression]] [[algorithm]]s usually exploit [[Redundancy (information theory)|statistical redundancy]] to represent data without losing any [[Self-information|information]], so that the process is reversible. Lossless compression is possible because most real-world data exhibits statistical redundancy. For example, an image may have areas of color that do not change over several pixels; instead of coding "red pixel, red pixel, ..." the data may be encoded as "279 red pixels". This is a basic example of [[run-length encoding]]; there are many schemes to reduce file size by eliminating redundancy.
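
The run-length idea above is simple enough to sketch directly. The following is a minimal illustrative encoder and decoder, not any particular file format's RLE variant:

<syntaxhighlight lang="python">
# Minimal run-length encoding: store (count, value) pairs instead of repeated values.
def rle_encode(data):
    runs, i = [], 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1                      # extend the current run
        runs.append((j - i, data[i]))   # (run length, value)
        i = j
    return runs

def rle_decode(runs):
    return [value for count, value in runs for _ in range(count)]

pixels = ["red"] * 279 + ["blue"] * 3
encoded = rle_encode(pixels)
print(encoded)                          # [(279, 'red'), (3, 'blue')]
assert rle_decode(encoded) == pixels    # lossless: the original is recovered exactly
</syntaxhighlight>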


The [[Lempel–Ziv]] (LZ) compression methods are among the most popular algorithms for lossless storage.<ref name="Optimized LZW"/> [[DEFLATE]] is a variation on LZ optimized for decompression speed and compression ratio,<ref>{{Cite book |title=Document Management - Portable document format - Part 1: PDF1.7 |date=July 1, 2008 |publisher=Adobe Systems Incorporated |year=2008 |edition=1st |language=English}}</ref> but compression can be slow. In the mid-1980s, following work by [[Terry Welch]], the [[Lempel–Ziv–Welch]] (LZW) algorithm rapidly became the method of choice for most general-purpose compression systems. LZW is used in [[GIF]] images, programs such as [[PKZIP]], and hardware devices such as modems.<ref>{{Cite book|last=Stephen|first=Wolfram|url=https://www.wolframscience.com/nks/p1069--data-compression/|title=New Kind of Science|year=2002|isbn=1-57955-008-8|location=Champaign, IL|pages=1069}}</ref> LZ methods use a table-based compression model where table entries are substituted for repeated strings of data. For most LZ methods, this table is generated dynamically from earlier data in the input. The table itself is often [[Huffman coding|Huffman encoded]]. [[Grammar-based codes]] like this can compress highly repetitive input extremely effectively, for instance, a biological [[data collection]] of the same or closely related species, a huge versioned document collection, internet archival, etc. The basic task of grammar-based codes is constructing a context-free grammar deriving a single string. Other practical grammar compression algorithms include [[Sequitur algorithm|Sequitur]] and [[Re-Pair]].
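
As a sketch of how such a table-based method builds its dictionary from earlier input, the following is a bare-bones LZW encoder; real implementations add variable code widths, dictionary resets and a matching decoder:

<syntaxhighlight lang="python">
# Bare-bones LZW: the table starts with all single bytes and grows as longer strings recur.
def lzw_encode(data: bytes):
    table = {bytes([i]): i for i in range(256)}   # initial dictionary: every single byte
    next_code, w, codes = 256, b"", []
    for byte in data:
        wc = w + bytes([byte])
        if wc in table:
            w = wc                                # keep extending the current match
        else:
            codes.append(table[w])                # emit the code for the longest known string
            table[wc] = next_code                 # add the new string to the table
            next_code += 1
            w = bytes([byte])
    if w:
        codes.append(table[w])
    return codes

codes = lzw_encode(b"TOBEORNOTTOBEORTOBEORNOT")
print(len(codes), "codes emitted for 24 input bytes")   # repeated substrings collapse into single codes
</syntaxhighlight>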


The strongest modern lossless compressors use [[Randomized algorithm|probabilistic]] models, such as [[prediction by partial matching]]. The [[Burrows–Wheeler transform]] can also be viewed as an indirect form of statistical modelling.<ref name="mahmud2"/> In a further refinement of the direct use of [[probabilistic model]]ling, statistical estimates can be coupled to an algorithm called [[arithmetic coding]]. Arithmetic coding is a more modern coding technique that uses the mathematical calculations of a [[finite-state machine]] to produce a string of encoded bits from a series of input data symbols. It can achieve superior compression compared to other techniques such as the better-known Huffman algorithm. It uses an internal memory state to avoid the need to perform a one-to-one mapping of individual input symbols to distinct representations that use an integer number of bits, and it clears out the internal memory only after encoding the entire string of data symbols. Arithmetic coding applies especially well to adaptive data compression tasks where the statistics vary and are context-dependent, as it can be easily coupled with an adaptive model of the [[probability distribution]] of the input data. An early example of the use of arithmetic coding was in an optional (but not widely used) feature of the [[JPEG]] image coding standard.<ref name=TomLane/> It has since been applied in various other designs including [[H.263]], [[H.264/MPEG-4 AVC]] and [[HEVC]] for video coding.<ref name="HEVC"/>
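
The interval-narrowing idea behind arithmetic coding can be sketched with a static, hand-picked symbol model; the probabilities and the use of floating point below are simplifying assumptions (practical coders use integer arithmetic, renormalization and adaptive models):

<syntaxhighlight lang="python">
# Toy arithmetic coder with a fixed model; floating point limits it to short messages.
def intervals(model):
    out, lo = {}, 0.0
    for sym, p in model.items():        # assign each symbol a sub-interval of [0, 1)
        out[sym] = (lo, lo + p)
        lo += p
    return out

def encode(message, model):
    iv, lo, hi = intervals(model), 0.0, 1.0
    for sym in message:                 # successively narrow the interval
        span = hi - lo
        s_lo, s_hi = iv[sym]
        lo, hi = lo + span * s_lo, lo + span * s_hi
    return (lo + hi) / 2                # any number in the final interval identifies the message

def decode(code, length, model):
    iv, out = intervals(model), []
    for _ in range(length):
        for sym, (s_lo, s_hi) in iv.items():
            if s_lo <= code < s_hi:
                out.append(sym)
                code = (code - s_lo) / (s_hi - s_lo)   # rescale and repeat
                break
    return "".join(out)

model = {"a": 0.6, "b": 0.3, "c": 0.1}  # assumed static probabilities
x = encode("aabac", model)
print(x, decode(x, 5, model))           # a single number in [0, 1) decodes back to "aabac"
</syntaxhighlight>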

Archive software typically has the ability to adjust the "dictionary size", where a larger size demands more [[random-access memory]] during compression and decompression but achieves stronger compression, especially on repeating patterns in files' content.<ref>{{cite web| url = https://www.winrar-france.fr/winrar_instructions_for_use/source/html/HELPArcOptimal.htm| title = How to choose optimal archiving settings – WinRAR}}</ref><ref>{{cite web| url = https://sevenzip.osdn.jp/chm/cmdline/switches/method.htm| title = (Set compression Method) switch – 7zip| access-date = 2021-11-07| archive-date = 2022-04-09| archive-url = https://web.archive.org/web/20220409225619/https://sevenzip.osdn.jp/chm/cmdline/switches/method.htm| url-status = dead}}</ref>


== Lossy ==
{{Main|Lossy compression}}


[[File:Comparison of JPEG and PNG.png|thumb|Composite image showing JPEG and PNG image compression. The left side of the image is from a JPEG image, showing lossy artifacts; the right side is from a PNG image.]]
In the late 1980s, digital images became more common, and standards for lossless [[image compression]] emerged. In the early 1990s, lossy compression methods began to be widely used.<ref name="Wolfram"/> In these schemes, some loss of information is accepted, as dropping nonessential detail can save storage space. There is a corresponding [[trade-off]] between preserving information and reducing size. Lossy data compression schemes are informed by research on how people perceive the data in question. For example, the human eye is more sensitive to subtle variations in [[luminance]] than it is to variations in color. JPEG image compression works in part by rounding off nonessential bits of information.<ref name="Arcangel"/> A number of popular compression formats exploit these perceptual differences, including [[psychoacoustics]] for sound and [[psychovisual]]s for images and video.


Most forms of lossy compression are based on [[transform coding]], especially the [[discrete cosine transform]] (DCT). It was first proposed in 1972 by [[N. Ahmed|Nasir Ahmed]], who then developed a working algorithm with T. Natarajan and [[K. R. Rao]] in 1973, before introducing it in January 1974.<ref name="Ahmed">{{cite journal |last=Ahmed |first=Nasir |author-link=N. Ahmed |title=How I Came Up With the Discrete Cosine Transform |journal=[[Digital Signal Processing (journal)|Digital Signal Processing]] |date=January 1991 |volume=1 |issue=1 |pages=4–5 |doi=10.1016/1051-2004(91)90086-Z |bibcode=1991DSP.....1....4A |url=https://www.scribd.com/doc/52879771/DCT-History-How-I-Came-Up-with-the-Discrete-Cosine-Transform}}</ref><ref name="DCT"/> DCT is the most widely used lossy compression method, and is used in multimedia formats for images (such as JPEG and [[HEIF]]),<ref name="JPEG"/> [[Video compression|video]] (such as [[MPEG]], [[H.264/AVC|AVC]] and HEVC) and audio (such as [[MP3]], [[Advanced Audio Coding|AAC]] and [[Vorbis]]).
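
A rough sense of why DCT-based coding works can be had from a single smooth 8×8 block: after the transform, most of the energy sits in a few low-frequency coefficients, so coarse quantization zeroes out the rest. The block contents and the quantization step in this sketch are arbitrary illustrative choices, not values from any standard, and SciPy's DCT routines stand in for a full codec:

<syntaxhighlight lang="python">
import numpy as np
from scipy.fft import dctn, idctn   # type-II DCT, the transform family used by JPEG-style coders

# A smooth synthetic 8x8 "image block" (a diagonal brightness ramp).
block = 50.0 + 10.0 * np.add.outer(np.arange(8), np.arange(8))

coeffs = dctn(block - 128, norm="ortho")          # to the frequency domain (level-shifted)
step = 40.0                                       # single uniform quantization step (illustrative)
quantized = np.round(coeffs / step)               # most small/high-frequency coefficients become 0
restored = idctn(quantized * step, norm="ortho") + 128

print(int((quantized == 0).sum()), "of 64 coefficients are zero after quantization")
print("max pixel error:", round(float(np.abs(block - restored).max()), 2))   # small but nonzero: lossy
</syntaxhighlight>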


Lossy image compression is used in [[digital camera]]s to increase storage capacities. Similarly, [[DVD]]s, [[Blu-ray]] and [[streaming video]] use lossy [[video coding format]]s. Lossy compression is extensively used in video.


In lossy audio compression, methods of psychoacoustics are used to remove non-audible (or less audible) components of the [[audio signal]]. Compression of human speech is often performed with even more specialized techniques; [[speech coding]] is distinguished as a separate discipline from general-purpose audio compression. Speech coding is used in [[internet telephony]], for example, while audio compression is used for CD ripping and is decoded by audio players.<ref name="mahmud2"/>


== Theory ==
The theoretical basis for compression is provided by [[information theory]] and, more specifically, [[Shannon's source coding theorem]]; domain-specific theories include [[algorithmic information theory]] for lossless compression and [[rate–distortion theory]] for lossy compression. These areas of study were essentially created by [[Claude Shannon]], who published fundamental papers on the topic in the late 1940s and early 1950s. Other topics associated with compression include [[coding theory]] and [[statistical inference]].<ref name="Marak"/>


=== Machine learning ===
There is a close connection between [[machine learning]] and compression. A system that predicts the [[posterior probabilities]] of a sequence given its entire history can be used for optimal data compression (by using [[arithmetic coding]] on the output distribution). Conversely, an optimal compressor can be used for prediction (by finding the symbol that compresses best, given the previous history). This equivalence has been used as a justification for using data compression as a benchmark for "general intelligence".<ref name="Mahoney"/><ref name="Market Efficiency"/><ref name="Ben-Gal"/>
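
The "compressor as predictor" direction can be sketched with an off-the-shelf compressor: candidate continuations are ranked by how little they grow the compressed history. Here <code>zlib</code> is only a convenient stand-in, not an optimal compressor, and on short strings the ranking is coarse:

<syntaxhighlight lang="python">
import zlib

def rank_continuations(history: bytes, candidates):
    # Score each candidate by the compressed size of history + candidate; smaller = more predictable.
    cost = {c: len(zlib.compress(history + c, 9)) for c in candidates}
    return sorted(cost, key=cost.get)

history = b"the cat sat on the mat. the cat sat on the "
print(rank_continuations(history, [b"mat.", b"dog.", b"zebra."]))
# The continuation already present in the history ("mat.") compresses best, so it ranks first.
</syntaxhighlight>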


An alternative view holds that compression algorithms implicitly map strings into [[feature space vector]]s, and that compression-based similarity measures compute similarity within these feature spaces. For each compressor C(.) we define an associated vector space ℵ, such that C(.) maps an input string x to the vector norm ||~x||. An exhaustive examination of the feature spaces underlying all compression algorithms is precluded by space; instead, this view examines three representative lossless compression methods, LZW, LZ77, and PPM.<ref name="ScullyBrodley"/>
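
A widely used compression-based similarity measure of this kind is the normalized compression distance (NCD), which needs nothing beyond a stock compressor; the sketch below uses <code>zlib</code> and made-up sample strings:

<syntaxhighlight lang="python">
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 for very similar strings, near 1 for unrelated ones."""
    cx, cy, cxy = (len(zlib.compress(s, 9)) for s in (x, y, x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

a = b"the quick brown fox jumps over the lazy dog" * 4
b = b"the quick brown fox leaps over the sleepy dog" * 4
c = b"colorless green ideas sleep furiously" * 4
print(round(ncd(a, b), 3), round(ncd(a, c), 3))   # the first (similar pair) is clearly smaller
</syntaxhighlight>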


According to [[AIXI]] theory, a connection more directly explained in [[Hutter Prize]], the best possible compression of x is the smallest possible software that generates x. For example, in that model, a zip file's compressed size includes both the zip file and the unzipping software, since you cannot unzip it without both, but there may be an even smaller combined form.

Examples of AI-powered audio/video compression software include [[NVIDIA Maxine]] and AIVC.<ref>{{cite web |author1=Gary Adcock |title=What Is AI Video Compression? |url=https://massive.io/file-transfer/what-is-ai-video-compression/ |website=massive.io |access-date=6 April 2023 |date=January 5, 2023}}</ref> Examples of software that can perform AI-powered image compression include [[OpenCV]], [[TensorFlow]], [[MATLAB]]'s Image Processing Toolbox (IPT) and High-Fidelity Generative Image Compression.<ref>{{cite arXiv |last1=Mentzer |first1=Fabian |last2=Toderici |first2=George |last3=Tschannen |first3=Michael |last4=Agustsson |first4=Eirikur |title=High-Fidelity Generative Image Compression |year=2020 |class=eess.IV |eprint=2006.09965}}</ref>

In [[unsupervised machine learning]], [[k-means clustering]] can be utilized to compress data by grouping similar data points into clusters. This technique simplifies handling extensive datasets that lack predefined labels and finds widespread use in fields such as [[image compression]].<ref>{{Cite web |title=What is Unsupervised Learning? {{!}} IBM |url=https://www.ibm.com/topics/unsupervised-learning |access-date=2024-02-05 |website=www.ibm.com |date=23 September 2021 |language=en-us}}</ref>

Data compression aims to reduce the size of data files, enhancing storage efficiency and speeding up data transmission. K-means clustering, an unsupervised machine learning algorithm, is employed to partition a dataset into a specified number of clusters, k, each represented by the [[centroid]] of its points. This process condenses extensive datasets into a more compact set of representative points. Particularly beneficial in [[Image processing|image]] and [[signal processing]], k-means clustering aids in data reduction by replacing groups of data points with their centroids, thereby preserving the core information of the original data while significantly decreasing the required storage space.<ref>{{Cite web |date=2023-05-25 |title=Differentially private clustering for large-scale datasets |url=https://blog.research.google/2023/05/differentially-private-clustering-for.html |access-date=2024-03-16 |website=blog.research.google |language=en}}</ref>
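
As an illustration of the clustering idea, the sketch below quantizes the colors of an image to a small palette with a plain k-means loop; the synthetic random image and k = 16 are arbitrary choices, and a real pipeline would use a tested library implementation:

<syntaxhighlight lang="python">
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's-algorithm k-means; minimal and unoptimized."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # Assign every point to its nearest centroid, then move centroids to the cluster means.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids, labels

rng = np.random.default_rng(1)
image = rng.integers(0, 256, (64, 64, 3)).astype(float)   # stand-in for a real RGB image
pixels = image.reshape(-1, 3)

palette, labels = kmeans(pixels, k=16)
# Lossy "compressed" form: 16 palette colors plus one 4-bit index per pixel instead of 24 bits.
reconstructed = palette[labels].reshape(image.shape)
print(palette.shape, labels.max())
</syntaxhighlight>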

[[Large language model]]s (LLMs) are also capable of lossless data compression, as demonstrated by [[DeepMind]]'s research with the Chinchilla 70B model. Developed by DeepMind, Chinchilla 70B effectively compressed data, outperforming conventional methods such as [[Portable Network Graphics]] (PNG) for images and [[Free Lossless Audio Codec]] (FLAC) for audio. It achieved compression of image and audio data to 43.4% and 16.4% of their original sizes, respectively.<ref>{{Cite web |last=Edwards |first=Benj |date=2023-09-28 |title=AI language models can exceed PNG and FLAC in lossless compression, says study |url=https://arstechnica.com/information-technology/2023/09/ai-language-models-can-exceed-png-and-flac-in-lossless-compression-says-study/ |access-date=2024-03-07 |website=Ars Technica |language=en-us}}</ref>


=== Data differencing ===
[[File:Nubio Diff Screenshot3.png|thumb|[[File comparison|Comparison]] of two revisions of a file]]
{{main|Data differencing}}

Data compression can be viewed as a special case of [[data differencing]].<ref name="RFC 3284"/><ref name="Vdelta"/> Data differencing consists of producing a ''difference'' given a ''source'' and a ''target,'' with patching reproducing the ''target'' given a ''source'' and a ''difference.'' Since there is no separate source and target in data compression, one can consider data compression as data differencing with empty source data, the compressed file corresponding to a difference from nothing. This is the same as considering absolute [[entropy (information theory)|entropy]] (corresponding to data compression) as a special case of [[relative entropy]] (corresponding to data differencing) with no initial data.
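
The "compression as differencing against an empty source" view can be sketched with DEFLATE's preset-dictionary feature standing in for a real delta format such as VCDIFF or xdelta; the sample strings are made up:

<syntaxhighlight lang="python">
import zlib

def make_delta(source: bytes, target: bytes) -> bytes:
    # Encode the target relative to the source: content shared with the preset dictionary
    # is emitted as back-references instead of being stored again.
    c = zlib.compressobj(level=9, zdict=source)
    return c.compress(target) + c.flush()

def apply_delta(source: bytes, delta: bytes) -> bytes:
    d = zlib.decompressobj(zdict=source)
    return d.decompress(delta) + d.flush()

source = (b"The design of data compression schemes involves trade-offs among the degree of "
          b"compression, the amount of distortion introduced, and the resources required.")
target = source.replace(b"resources", b"computational resources")

delta = make_delta(source, target)
assert apply_delta(source, delta) == target
print(len(delta), "bytes as a difference against the source")
print(len(zlib.compress(target, 9)), "bytes compressed on its own (i.e., against an empty source)")
</syntaxhighlight>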


[[Entropy coding]] originated in the 1940s with the introduction of [[Shannon–Fano coding]],<ref name="Shannon"/> the basis for [[Huffman coding]] which was developed in 1950.<ref name="Huffman"/> [[Transform coding]] dates back to the late 1960s, with the introduction of [[fast Fourier transform]] (FFT) coding in 1968 and the [[Hadamard transform]] in 1969.<ref name="Hadamard"/>


An important image compression technique is the [[discrete cosine transform]] (DCT), a technique developed in the early 1970s.<ref name="Ahmed"/> DCT is the basis for JPEG, a [[lossy compression]] format which was introduced by the [[Joint Photographic Experts Group]] (JPEG) in 1992.<ref name="t81" /> JPEG greatly reduces the amount of data required to represent an image at the cost of a relatively small reduction in image quality and has become the most widely used [[image file format]].<ref>{{cite web |title=The JPEG image format explained |url=https://home.bt.com/tech-gadgets/photography/what-is-a-jpeg-11364206889349 |website=[[BT.com]] |publisher=[[BT Group]] |access-date=5 August 2019 |date=31 May 2018 |archive-date=5 August 2019 |archive-url=https://web.archive.org/web/20190805194553/https://home.bt.com/tech-gadgets/photography/what-is-a-jpeg-11364206889349 |url-status=dead }}</ref><ref>{{cite news |last1=Baraniuk |first1=Chris |title=Copy protections could come to JPEGs |url=https://www.bbc.co.uk/news/technology-34538705 |access-date=13 September 2019 |work=[[BBC News]] |agency=[[BBC]] |date=15 October 2015}}</ref> Its highly efficient DCT-based compression algorithm was largely responsible for the wide proliferation of [[digital image]]s and [[digital photo]]s.<ref name="Atlantic">{{cite web |title=What Is a JPEG? The Invisible Object You See Every Day |url=https://www.theatlantic.com/technology/archive/2013/09/what-is-a-jpeg-the-invisible-object-you-see-every-day/279954/ |access-date=13 September 2019 |website=[[The Atlantic]] |date=24 September 2013}}</ref>


[[Lempel–Ziv–Welch]] (LZW) is a [[lossless compression]] algorithm developed in 1984. It is used in the [[GIF]] format, introduced in 1987.<ref name="cloanto">{{cite web|url=https://mike.pub/19950127-gif-lzw|title=The GIF Controversy: A Software Developer's Perspective|date=27 January 1995 |access-date=26 May 2015}}</ref> [[DEFLATE]], a lossless compression algorithm specified in 1996, is used in the [[Portable Network Graphics]] (PNG) format.<ref name="IETF">{{cite IETF |title=DEFLATE Compressed Data Format Specification version 1.3 |rfc=1951 |section=Abstract |page=1 |author=L. Peter Deutsch |author-link=L. Peter Deutsch |date=May 1996 |publisher=[[IETF]] |access-date=2014-04-23}}</ref>


[[Wavelet compression]], the use of [[wavelet]]s in image compression, began after the development of DCT coding.<ref name="Hoffman"/> The [[JPEG 2000]] standard was introduced in 2000.<ref>{{cite book |last1=Taubman |first1=David |last2=Marcellin |first2=Michael |title=JPEG2000 Image Compression Fundamentals, Standards and Practice: Image Compression Fundamentals, Standards and Practice |date=2012 |publisher=[[Springer Science & Business Media]] |isbn=9781461507994 |url=https://books.google.com/books?id=y7HeBwAAQBAJ&pg=PA402}}</ref> In contrast to the DCT algorithm used by the original JPEG format, JPEG 2000 instead uses [[discrete wavelet transform]] (DWT) algorithms.<ref name="Unser"/><ref>{{cite web |last1=Sullivan |first1=Gary |title=General characteristics and design considerations for temporal subband video coding |publisher=[[Video Coding Experts Group]] |website=[[ITU-T]] |date=8–12 December 2003 |url=https://www.itu.int/wftp3/av-arch/video-site/0312_Wai/VCEG-U06.doc |access-date=13 September 2019}}</ref><ref>{{cite book |last1=Bovik |first1=Alan C. |title=The Essential Guide to Video Processing |date=2009 |publisher=[[Academic Press]] |isbn=9780080922508 |page=355 |url=https://books.google.com/books?id=wXmSPPB_c_0C&pg=PA355}}</ref> JPEG 2000 technology, which includes the [[Motion JPEG 2000]] extension, was selected as the [[video coding standard]] for [[digital cinema]] in 2004.<ref>{{cite book |last1=Swartz |first1=Charles S. |title=Understanding Digital Cinema: A Professional Handbook |date=2005 |publisher=[[Taylor & Francis]] |isbn=9780240806174 |page=147 |url=https://books.google.com/books?id=tYw3ehoBnjkC&pg=PA147}}</ref>


=== Audio ===
{{see also|Audio coding format|Audio codec}}
Audio data compression, not to be confused with [[dynamic range compression]], has the potential to reduce the transmission [[Bandwidth (computing)|bandwidth]] and storage requirements of audio data. [[List of codecs#Audio compression formats|Audio compression algorithms]] are implemented in [[software]] as audio [[codec]]s. In both lossy and lossless compression, [[Redundancy (information theory)|information redundancy]] is reduced, using methods such as [[Coding theory|coding]], [[Quantization (signal processing)|quantization]], DCT and [[linear prediction]] to reduce the amount of information used to represent the uncompressed data.


Lossy audio compression algorithms provide higher compression and are used in numerous audio applications including [[Vorbis]] and [[MP3]]. These algorithms almost all rely on [[psychoacoustics]] to eliminate or reduce fidelity of less audible sounds, thereby reducing the space required to store or transmit them.<ref name="mahdi53"/><ref>{{cite journal |last1=Cunningham |first1=Stuart |last2=McGregor |first2=Iain |title=Subjective Evaluation of Music Compressed with the ACER Codec Compared to AAC, MP3, and Uncompressed PCM |journal=International Journal of Digital Multimedia Broadcasting |volume=2019 |pages=1–16 |date=2019 |language=en|doi=10.1155/2019/8265301 |doi-access=free }}</ref>

Psychoacoustics recognizes that not all data in an audio stream can be perceived by the human [[auditory system]]. Most lossy compression reduces redundancy by first identifying perceptually irrelevant sounds, that is, sounds that are very hard to hear. Typical examples include high frequencies or sounds that occur at the same time as louder sounds. Those irrelevant sounds are coded with decreased accuracy or not at all.


Due to the nature of lossy algorithms, [[audio quality]] suffers a [[digital generation loss]] when a file is decompressed and recompressed. This makes lossy compression unsuitable for storing the intermediate results in professional audio engineering applications, such as sound editing and multitrack recording. However, lossy formats such as [[MP3]] are very popular with end-users as the file size is reduced to 5-20% of the original size and a megabyte can store about a minute's worth of music at adequate quality.
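
The "about a minute per megabyte" figure follows from simple bit-rate arithmetic; the 128 kbit/s rate below is just a commonly used example value, not a property of the format:

<syntaxhighlight lang="python">
# Back-of-envelope comparison of uncompressed CD audio and a typical MP3 bit rate.
cd_bits_per_second = 44100 * 16 * 2          # 44.1 kHz, 16-bit samples, stereo
mp3_bits_per_second = 128_000                # a common (not universal) MP3 bit rate

print(round(cd_bits_per_second / mp3_bits_per_second, 1), "x reduction")          # ~11x, i.e. ~9% of original size
print(round(mp3_bits_per_second * 60 / 8 / 1e6, 2), "MB per minute of MP3 audio")  # ~0.96 MB
</syntaxhighlight>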

Several proprietary lossy compression algorithms have been developed that provide higher quality audio performance by using a combination of lossless and lossy algorithms with adaptive bit rates and lower compression ratios. Examples include [[aptX]], [[LDAC (codec)|LDAC]], [[LHDC (codec)|LHDC]], [[Master Quality Authenticated#Codec description|MQA]] and [[SCL6 (codec)|SCL6]].


===== Coding methods =====
To determine what information in an audio signal is perceptually irrelevant, most lossy compression algorithms use transforms such as the [[modified discrete cosine transform]] (MDCT) to convert [[time domain]] sampled waveforms into a transform domain, typically the [[frequency domain]]. Once transformed, component frequencies can be prioritized according to how audible they are. Audibility of spectral components is assessed using the [[absolute threshold of hearing]] and the principles of [[simultaneous masking]]—the phenomenon wherein a signal is masked by another signal separated by frequency—and, in some cases, [[temporal masking]]—where a signal is masked by another signal separated by time. [[Equal-loudness contour]]s may also be used to weigh the perceptual importance of components. Models of the human ear-brain combination incorporating such effects are often called [[psychoacoustic model]]s.<ref name="faxin47"/>
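
The overall shape of such a coder can be sketched with a plain DCT standing in for the MDCT and a single fixed threshold standing in for a full psychoacoustic model; both are simplifying assumptions, and the test frame is synthetic:

<syntaxhighlight lang="python">
import numpy as np
from scipy.fft import dct, idct

fs = 44100
t = np.arange(1024) / fs                               # one analysis frame of samples
frame = np.sin(2 * np.pi * 440 * t) + 0.01 * np.sin(2 * np.pi * 9000 * t)   # loud tone + faint tone

spectrum = dct(frame, norm="ortho")                    # time domain -> frequency-like domain
threshold = 0.05                                       # crude stand-in for a hearing/masking threshold
kept = np.where(np.abs(spectrum) > threshold, spectrum, 0.0)   # discard components deemed inaudible
decoded = idct(kept, norm="ortho")                     # a real coder would quantize and entropy-code 'kept'

print(np.count_nonzero(kept), "of", len(kept), "coefficients retained")
</syntaxhighlight>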


Other types of lossy compressors, such as the [[linear predictive coding]] (LPC) used with speech, are source-based coders. LPC uses a model of the human vocal tract to analyze speech sounds and infer the parameters used by the model to produce them moment to moment. These changing parameters are transmitted or stored and used to drive another model in the decoder which reproduces the sound.
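
The source-modelling idea can be sketched as plain linear prediction: fit short-term predictor coefficients to a frame and keep only the much smaller prediction residual plus the coefficients. Real speech coders such as CELP add a vocal-tract model, excitation codebooks and quantization on top of this; the synthetic "voiced" signal below is an assumption for illustration:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
n, order = 2048, 8
t = np.arange(n)
signal = np.sin(2 * np.pi * 0.01 * t) + 0.1 * rng.standard_normal(n)   # stand-in for a speech frame

# Least-squares fit of an order-8 predictor: predict each sample from the previous 8.
X = np.column_stack([signal[order - k - 1 : n - k - 1] for k in range(order)])
y = signal[order:]
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)

residual = y - X @ coeffs
print("signal power:  ", round(float(np.mean(y ** 2)), 4))
print("residual power:", round(float(np.mean(residual ** 2)), 4))   # far smaller, so cheaper to encode
</syntaxhighlight>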

Lossy formats are often used for the distribution of streaming audio or interactive communication (such as in cell phone networks). In such applications, the data must be decompressed as the data flows, rather than after the entire data stream has been transmitted. Not all audio codecs can be used for streaming applications.<ref name="Jaiswal"/>


[[Latency (engineering)|Latency]] is introduced by the methods used to encode and decode the data. Some codecs will analyze a longer segment, called a ''frame'', of the data to optimize efficiency, and then code it in a manner that requires a larger segment of data at one time to decode. The inherent latency of the coding algorithm can be critical; for example, when there is a two-way transmission of data, such as with a telephone conversation, significant delays may seriously degrade the perceived quality.


In contrast to the speed of compression, which is proportional to the number of operations required by the algorithm, here latency refers to the number of samples that must be analyzed before a block of audio is processed. In the minimum case, latency is zero samples (e.g., if the coder/decoder simply reduces the number of bits used to quantize the signal). Time domain algorithms such as LPC also often have low latencies, hence their popularity in speech coding for telephony. In algorithms such as MP3, however, a large number of samples have to be analyzed to implement a psychoacoustic model in the frequency domain, and latency is on the order of 23&nbsp;ms.
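
The frame-based figure follows directly from the block size and the sample rate; the 1024-sample block below is an assumed value chosen only to reproduce the order of magnitude quoted above:

<syntaxhighlight lang="python">
sample_rate = 44100        # samples per second
frame_samples = 1024       # assumed analysis block size
print(round(1000 * frame_samples / sample_rate, 1), "ms of inherent latency")   # ~23 ms
</syntaxhighlight>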


===== Speech encoding =====
[[Speech encoding]] is an important category of audio data compression. The perceptual models used to estimate what aspects of speech a human ear can hear are generally somewhat different from those used for music. The range of frequencies needed to convey the sounds of a human voice is normally far narrower than that needed for music, and the sound is normally less complex. As a result, speech can be encoded at high quality using a relatively low bit rate.

If the data to be compressed is analog (such as a voltage that varies with time), quantization is employed to digitize it into numbers (normally integers). This is referred to as analog-to-digital (A/D) conversion. If the integers generated by quantization are 8 bits each, then the entire range of the analog signal is divided into 256 intervals and all the signal values within an interval are quantized to the same number. If 16-bit integers are generated, then the range of the analog signal is divided into 65,536 intervals.
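
A minimal sketch of that uniform quantization step, mapping a signal in the range [−1, 1] onto 8-bit and 16-bit integer codes (the test signal is an arbitrary sine wave):

<syntaxhighlight lang="python">
import numpy as np

def quantize(signal, bits):
    levels = 2 ** bits
    step = 2.0 / levels                                  # interval width for a [-1, 1] range
    codes = np.clip(np.floor((signal + 1.0) / step), 0, levels - 1).astype(int)
    return codes, (codes + 0.5) * step - 1.0             # integer codes and their reconstruction

t = np.linspace(0.0, 1.0, 1000)
x = np.sin(2 * np.pi * 5 * t)
for bits in (8, 16):
    _, xr = quantize(x, bits)
    # The worst-case error is about half the quantization step, so it shrinks as bits increase.
    print(bits, "bits -> worst-case error ~", float(np.abs(x - xr).max()))
</syntaxhighlight>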


Encoding speech at a relatively low bit rate is accomplished, in general, by some combination of two approaches:
* Only encoding sounds that could be made by a single human voice.
* Throwing away more of the data in the signal—keeping just enough to reconstruct an "intelligible" voice rather than the full frequency range of human [[hearing]].


The earliest algorithms used in speech encoding (and audio data compression in general) were the [[A-law algorithm]] and the [[μ-law algorithm]].
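
The μ-law curve itself is compact enough to sketch; μ = 255 is the value used by ITU-T G.711, and the companding shown here is the nonlinear mapping only, before any integer quantization:

<syntaxhighlight lang="python">
import numpy as np

MU = 255.0   # mu-law constant used by G.711

def mu_law_encode(x):
    # Expand quiet samples and compress loud ones; x is assumed to lie in [-1, 1].
    return np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)

def mu_law_decode(y):
    return np.sign(y) * ((1.0 + MU) ** np.abs(y) - 1.0) / MU

x = np.array([-0.5, -0.02, 0.0, 0.02, 0.5])
y = mu_law_encode(x)
print(np.round(y, 3))                              # quiet samples get a proportionally larger share of the range
print(bool(np.allclose(mu_law_decode(y), x)))      # True: the curve alone is invertible; loss comes from quantizing y
</syntaxhighlight>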


==== History ====
[[File:Placa-audioPC-925.jpg|right|thumb|Solidyne 922: The world's first commercial audio bit compression [[sound card]] for PC, 1990]]


Early audio research was conducted at [[Bell Labs]]. There, in 1950, [[C. Chapin Cutler]] filed the patent on [[differential pulse-code modulation]] (DPCM).<ref name="DPCM"/> In 1973, [[Adaptive DPCM]] (ADPCM) was introduced by P. Cummiskey, [[Nikil Jayant|Nikil S. Jayant]] and [[James L. Flanagan]].<ref>{{cite journal|doi=10.1002/j.1538-7305.1973.tb02007.x|title=Adaptive Quantization in Differential PCM Coding of Speech|year=1973|last1=Cummiskey|first1=P.|last2=Jayant|first2=N. S.|last3=Flanagan|first3=J. L.|journal=Bell System Technical Journal|volume=52|issue=7|pages=1105–1118}}</ref><ref>{{cite journal |last1=Cummiskey |first1=P. |last2=Jayant |first2=Nikil S. |last3=Flanagan |first3=J. L. |title=Adaptive quantization in differential PCM coding of speech |journal=The Bell System Technical Journal |date=1973 |volume=52 |issue=7 |pages=1105–1118 |doi=10.1002/j.1538-7305.1973.tb02007.x |issn=0005-8580}}</ref>


[[Perceptual coding]] was first used for [[speech coding]] compression, with [[linear predictive coding]] (LPC).<ref name="Schroeder2014">{{cite book |last1=Schroeder |first1=Manfred R. |title=Acoustics, Information, and Communication: Memorial Volume in Honor of Manfred R. Schroeder |date=2014 |publisher=Springer |isbn=9783319056609 |chapter=Bell Laboratories |page=388 |chapter-url=https://books.google.com/books?id=d9IkBAAAQBAJ&pg=PA388}}</ref> Initial concepts for LPC date back to the work of [[Fumitada Itakura]] ([[Nagoya University]]) and Shuzo Saito ([[Nippon Telegraph and Telephone]]) in 1966.<ref>{{cite journal |last1=Gray |first1=Robert M. |title=A History of Realtime Digital Speech on Packet Networks: Part II of Linear Predictive Coding and the Internet Protocol |journal=Found. Trends Signal Process. |date=2010 |volume=3 |issue=4 |pages=203–303 |doi=10.1561/2000000036 |url=https://ee.stanford.edu/~gray/lpcip.pdf |archive-url=https://web.archive.org/web/20100704113551/http://ee.stanford.edu/~gray/lpcip.pdf |archive-date=2010-07-04 |url-status=live |issn=1932-8346|doi-access=free }}</ref> During the 1970s, [[Bishnu S. Atal]] and [[Manfred R. Schroeder]] at [[Bell Labs]] developed a form of LPC called [[adaptive predictive coding]] (APC), a perceptual coding algorithm that exploited the masking properties of the human ear, followed in the early 1980s with the [[code-excited linear prediction]] (CELP) algorithm which achieved a significant [[data compression ratio|compression ratio]] for its time.<ref name="Schroeder2014"/> Perceptual coding is used by modern audio compression formats such as [[MP3]]<ref name="Schroeder2014"/> and [[Advanced Audio Codec|AAC]].


[[Discrete cosine transform]] (DCT), developed by [[N. Ahmed|Nasir Ahmed]], T. Natarajan and [[K. R. Rao]] in 1974,<ref name="DCT"/> provided the basis for the [[modified discrete cosine transform]] (MDCT) used by modern audio compression formats such as MP3,<ref name="Guckert">{{cite web |last1=Guckert |first1=John |title=The Use of FFT and MDCT in MP3 Audio Compression |url=http://www.math.utah.edu/~gustafso/s2012/2270/web-projects/Guckert-audio-compression-svd-mdct-MP3.pdf |archive-url=https://web.archive.org/web/20140124152337/http://www.math.utah.edu/~gustafso/s2012/2270/web-projects/Guckert-audio-compression-svd-mdct-MP3.pdf |archive-date=2014-01-24 |url-status=live |website=[[University of Utah]] |date=Spring 2012 |access-date=14 July 2019}}</ref> [[Dolby Digital]],<ref name="Luo">{{cite book |last1=Luo |first1=Fa-Long |title=Mobile Multimedia Broadcasting Standards: Technology and Practice |date=2008 |publisher=[[Springer Science & Business Media]] |isbn=9780387782638 |page=590 |url=https://books.google.com/books?id=l6PovWat8SMC&pg=PA590}}</ref><ref>{{cite journal |last1=Britanak |first1=V. |title=On Properties, Relations, and Simplified Implementation of Filter Banks in the Dolby Digital (Plus) AC-3 Audio Coding Standards |journal=IEEE Transactions on Audio, Speech, and Language Processing |date=2011 |volume=19 |issue=5 |pages=1231–1241 |doi=10.1109/TASL.2010.2087755|s2cid=897622 }}</ref> and AAC.<ref name=brandenburg>{{cite web|url=http://graphics.ethz.ch/teaching/mmcom12/slides/mp3_and_aac_brandenburg.pdf|title=MP3 and AAC Explained|last=Brandenburg|first=Karlheinz|year=1999|url-status=live|archive-url=https://web.archive.org/web/20170213191747/https://graphics.ethz.ch/teaching/mmcom12/slides/mp3_and_aac_brandenburg.pdf|archive-date=2017-02-13}}</ref> MDCT was proposed by J. P. Princen, A. W. Johnson and A. B. Bradley in 1987,<ref>{{cite book|doi=10.1109/ICASSP.1987.1169405|chapter=Subband/Transform coding using filter bank designs based on time domain aliasing cancellation|title=ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing|year=1987|last1=Princen|first1=J.|last2=Johnson|first2=A.|last3=Bradley|first3=A.|volume=12|pages=2161–2164|s2cid=58446992}}</ref> following earlier work by Princen and Bradley in 1986.<ref>{{cite journal|doi=10.1109/TASSP.1986.1164954|title=Analysis/Synthesis filter bank design based on time domain aliasing cancellation|year=1986|last1=Princen|first1=J.|last2=Bradley|first2=A.|journal=IEEE Transactions on Acoustics, Speech, and Signal Processing|volume=34|issue=5|pages=1153–1161}}</ref>


The world's first commercial [[broadcast automation]] audio compression system was developed by Oscar Bonello, an engineering professor at the [[University of Buenos Aires]].<ref>{{cite news|title=Ricardo Sametband, La Nación Newspaper "Historia de un pionero en audio digital" |url=https://www.lanacion.com.ar/tecnologia/la-historia-de-un-pionero-del-audio-digital-nid187775|language=es}}</ref> In 1983, using the psychoacoustic principle of the masking of critical bands first published in 1967,<ref name="Zwicker"/> he started developing a practical application based on the recently developed [[IBM PC]] computer, and the broadcast automation system was launched in 1987 under the name [[Audicom]].<ref name="Solidyne">{{cite web |title=Summary of some of Solidyne's contributions to Broadcast Engineering |url=http://www.solidynepro.com/nosotros-breve-historia/ |work=Brief History of Solidyne |publisher=Buenos Aires: Solidyne |access-date=6 March 2013 |archive-url=https://web.archive.org/web/20130308063719/http://www.solidynepro.com/indexahtmlp_Hist-ENG%2Ct.htm |archive-date=8 March 2013 }}</ref> Thirty-five years later, almost all the radio stations in the world were using this technology, manufactured by a number of companies, because the inventor declined to patent his work, preferring instead to publish it and place it in the public domain.<ref>{{cite news|language=en|title=Anuncio del Audicom, AES Journal, July-August 1992, Vol 40, # 7/8, pag 647|url=http://www.aes.org/e-lib/browse.cfm?elib=19076}}</ref>


A literature compendium for a large variety of audio coding systems was published in the IEEE's ''Journal on Selected Areas in Communications'' (''JSAC''), in February 1988. While there were some papers from before that time, this collection documented an entire variety of finished, working audio coders, nearly all of them using perceptual techniques and some kind of frequency analysis and back-end noiseless coding.<ref name="Possibilities"/> Several of these papers remarked on the difficulty of obtaining good, clean digital audio for research purposes. Most, if not all, of the authors in the ''JSAC'' edition were also active in the [[MPEG-1]] Audio committee, which created the MP3 format.


=== Video ===
{{see also|Video coding format|Video codec}}


[[Uncompressed video]] requires a very high [[Uncompressed video#Storage and Data Rates for Uncompressed Video|data rate]]. Although [[List of codecs#Lossless video compression|lossless video compression]] codecs perform at a compression factor of 5 to 12, a typical [[H.264/MPEG-4 AVC|H.264]] lossy compression video has a compression factor between 20 and 200.<ref name="MSU2007"/>


Video compression is a practical implementation of source coding in information theory. In practice, most video codecs are used alongside audio compression techniques to store the separate but complementary data streams as one combined package using so-called ''[[container format]]s''.<ref name="CSIP"/>
The two key video compression techniques used in [[video coding standards]] are the [[discrete cosine transform]] (DCT) and [[motion compensation]] (MC). Most video coding standards, such as the [[H.26x]] and [[MPEG]] formats, typically use motion-compensated DCT video coding (block motion compensation).<ref>{{cite book |last1=Chen |first1=Jie |last2=Koc |first2=Ut-Va |last3=Liu |first3=KJ Ray |title=Design of Digital Video Coding Systems: A Complete Compressed Domain Approach |date=2001 |publisher=[[CRC Press]] |isbn=9780203904183 |page=71 |url=https://books.google.com/books?id=LUzFKU3HeegC&pg=PA71}}</ref><ref name="Li">{{cite book |last1=Li |first1=Jian Ping |title=Proceedings of the International Computer Conference 2006 on Wavelet Active Media Technology and Information Processing: Chongqing, China, 29-31 August 2006 |date=2006 |publisher=[[World Scientific]] |isbn=9789812709998 |page=847 |url=https://books.google.com/books?id=FZiK3zXdK7sC&pg=PA847}}</ref>


==== Encoding theory ====
Video data may be represented as a series of still image frames. Such data usually contains abundant amounts of spatial and temporal [[redundancy (information theory)|redundancy]]. Video compression algorithms attempt to reduce redundancy and store information more compactly.


Most [[video compression formats]] and [[video codec|codecs]] exploit both spatial and temporal redundancy (e.g. through difference coding with [[motion compensation]]). Similarities can be encoded by only storing differences between e.g. temporally adjacent frames (inter-frame coding) or spatially adjacent pixels (intra-frame coding). [[Inter frame|Inter-frame]] compression (a temporal [[delta encoding]]) is one of the most powerful compression techniques: it (re)uses data from one or more earlier or later frames in a sequence to describe the current frame. [[Intra-frame coding]], on the other hand, uses only data from within the current frame, effectively being still-[[image compression]].<ref name="faxin47"/>
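
The contrast can be seen in a minimal sketch (illustrative only; real codecs operate on two-dimensional pixel blocks and combine difference coding with motion estimation): an inter-frame coder stores each frame as its element-wise difference from the previous frame, which is mostly zeros when little changes between frames.

<syntaxhighlight lang="python">
# Minimal illustration of temporal delta (inter-frame) coding: only the
# difference from the previous frame is stored, which is mostly zeros for
# static content and therefore compresses well in a later entropy stage.

def encode_inter(frames):
    previous = [0] * len(frames[0])
    residuals = []
    for frame in frames:
        residuals.append([cur - prev for cur, prev in zip(frame, previous)])
        previous = frame
    return residuals

def decode_inter(residuals):
    previous = [0] * len(residuals[0])
    frames = []
    for residual in residuals:
        previous = [prev + res for prev, res in zip(previous, residual)]
        frames.append(previous)
    return frames

frames = [[10, 10, 10, 10], [10, 10, 12, 10], [10, 10, 12, 10]]
assert decode_inter(encode_inter(frames)) == frames
</syntaxhighlight>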


The [[Video coding format#Intra-frame video coding formats|intra-frame video coding formats]] used in camcorders and video editing employ simpler compression that uses only intra-frame prediction. This simplifies video editing software, as it prevents a situation in which a compressed frame refers to data that the editor has deleted.


Usually, video compression additionally employs [[lossy compression]] techniques like [[quantization (image processing)|quantization]] that reduce aspects of the source data that are (more or less) irrelevant to the human visual perception by exploiting perceptual features of human vision. For example, small differences in color are more difficult to perceive than are changes in brightness. Compression algorithms can average a color across these similar areas in a manner similar to those used in JPEG image compression.<ref name="TomLane"/> As in all lossy compression, there is a [[trade-off]] between [[video quality]] and [[bit rate]], cost of processing the compression and decompression, and system requirements. Highly compressed video may present visible or distracting [[compression artifact|artifacts]].
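
As an illustration of averaging color over areas the eye barely distinguishes, the sketch below replaces each 2×2 block of color (chroma) samples with its average while brightness would be kept at full resolution; this is a simplified stand-in for the chroma subsampling and averaging used by real formats, not the procedure of any particular codec.

<syntaxhighlight lang="python">
# Illustrative 2x2 chroma averaging (in the spirit of 4:2:0 subsampling).
# Each 2x2 block of chroma samples is replaced by its average, quartering
# the amount of chroma data; assumes the plane has even width and height.

def subsample_chroma(chroma):
    height, width = len(chroma), len(chroma[0])
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            block = [chroma[y + dy][x + dx] for dy in (0, 1) for dx in (0, 1)]
            row.append(sum(block) / 4)
        out.append(row)
    return out

chroma = [[100, 102, 50, 52],
          [101, 103, 51, 53],
          [100, 100, 200, 202],
          [102, 102, 201, 203]]
print(subsample_chroma(chroma))  # [[101.5, 51.5], [101.0, 201.5]]
</syntaxhighlight>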


Methods other than the prevalent DCT-based transform formats, such as [[fractal compression]], [[matching pursuit]] and the use of a [[discrete wavelet transform]] (DWT), have been the subject of some research, but are typically not used in practical products. [[Wavelet compression]] is used in still-image coders and video coders without motion compensation. Interest in fractal compression seems to be waning, due to recent theoretical analysis showing a comparative lack of effectiveness of such methods.<ref name="faxin47"/>


===== Inter-frame coding =====
{{main|Inter frame}}
{{further|Motion compensation}}


In inter-frame coding, individual frames of a video sequence are compared from one frame to the next, and the [[video codec|video compression codec]] records the [[residual frame|differences]] to the reference frame. If the frame contains areas where nothing has moved, the system can simply issue a short command that copies that part of the previous frame into the next one. If sections of the frame move in a simple manner, the compressor can emit a (slightly longer) command that tells the decompressor to shift, rotate, lighten, or darken the copy. This longer command still remains much shorter than data generated by intra-frame compression. Usually, the encoder will also transmit a residue signal which describes the remaining more subtle differences to the reference imagery. Using entropy coding, these residue signals have a more compact representation than the full signal. In areas of video with more motion, the compression must encode more data to keep up with the larger number of pixels that are changing. Commonly during explosions, flames, flocks of animals, and in some panning shots, the high-frequency detail leads to quality decreases or to increases in the [[variable bitrate]].
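
The following toy coder sketches the idea in one dimension (illustrative only; real encoders search two-dimensional pixel blocks and follow the prediction with transform, quantization and entropy coding): for each block of the current frame it finds the best-matching position in the previous frame and stores only that offset, the motion vector, plus the remaining residual.

<syntaxhighlight lang="python">
# Toy 1-D motion-compensated coder: each block of the current frame is
# matched against nearby positions in the previous frame; only the best
# offset (the motion vector) and the remaining residual are stored.

BLOCK, SEARCH = 4, 2

def best_match(prev, cur_block, start):
    candidates = []
    for off in range(-SEARCH, SEARCH + 1):
        s = start + off
        if 0 <= s and s + BLOCK <= len(prev):
            ref = prev[s:s + BLOCK]
            sad = sum(abs(a - b) for a, b in zip(cur_block, ref))
            candidates.append((sad, off, ref))
    return min(candidates)  # smallest sum of absolute differences wins

def encode_frame(prev, cur):
    coded = []
    for start in range(0, len(cur), BLOCK):
        block = cur[start:start + BLOCK]
        _, off, ref = best_match(prev, block, start)
        coded.append((off, [c - r for c, r in zip(block, ref)]))
    return coded

def decode_frame(prev, coded):
    out = []
    for i, (off, residual) in enumerate(coded):
        ref = prev[i * BLOCK + off : i * BLOCK + off + BLOCK]
        out.extend(r + e for r, e in zip(ref, residual))
    return out

prev = [0, 0, 5, 9, 9, 5, 0, 0]
cur = [0, 5, 9, 9, 5, 0, 0, 0]   # same shape as prev, shifted one sample left
assert decode_frame(prev, encode_frame(prev, cur)) == cur
</syntaxhighlight>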


==== Hybrid block-based transform formats ====
==== Hybrid block-based transform formats ====
{{See|Discrete cosine transform}}
[[File:Hybrid video encoder processing stages.svg|thumb|upright=2|Processing stages of a typical video encoder]]


Many commonly used video compression methods (e.g., those in standards approved by the [[ITU-T]] or [[ISO]]) share the same basic architecture that dates back to [[H.261]] which was standardized in 1988 by the ITU-T. They mostly rely on the DCT, applied to rectangular blocks of neighboring pixels, and temporal prediction using [[motion vector]]s, as well as nowadays also an in-loop filtering step.


In the prediction stage, various [[data deduplication|deduplication]] and difference-coding techniques are applied that help decorrelate data and describe new data based on already transmitted data.


Then rectangular blocks of remaining [[pixel]] data are transformed to the frequency domain. In the main lossy processing stage, frequency domain data gets quantized in order to reduce information that is irrelevant to human visual perception.
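
A minimal sketch of these two stages (illustrative only; standardized codecs use two-dimensional, integer-approximated transforms and codec-specific quantization matrices) shows why the combination is effective: a smooth block concentrates its energy in a few low-frequency coefficients, and coarse quantization turns the rest into zeros.

<syntaxhighlight lang="python">
# Illustrative 1-D DCT-II of an 8-sample block followed by uniform
# quantization. Smooth content leaves almost all quantized coefficients
# at zero, which the subsequent entropy coder can represent very cheaply.
import math

N = 8

def dct_1d(block):
    return [sum(block[n] * math.cos(math.pi / N * (n + 0.5) * k)
                for n in range(N))
            for k in range(N)]

def quantize(coeffs, step=10.0):
    return [round(c / step) for c in coeffs]

smooth_block = [60, 61, 62, 63, 64, 65, 66, 67]      # a gentle gradient
print(quantize(dct_1d(smooth_block)))                # [51, -1, 0, 0, 0, 0, 0, 0]
</syntaxhighlight>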


In the last stage, statistical redundancy is largely eliminated by an [[entropy coder]], which often applies some form of arithmetic coding.
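
As an illustration of the gain (not a real arithmetic coder), the Shannon entropy of the quantized symbols gives the lower bound on bits per symbol that such a coder approaches; a heavily skewed, zero-dominated distribution needs only a small fraction of the bits of the raw samples.

<syntaxhighlight lang="python">
# Illustrative only: Shannon entropy of a block of quantized coefficients,
# the lower bound (in bits per symbol) that an ideal entropy coder approaches.
from collections import Counter
from math import log2

def entropy_bits_per_symbol(symbols):
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * log2(c / total) for c in counts.values())

quantized = [51, -1, 0, 0, 0, 0, 0, 0]           # zero-dominated block
print(entropy_bits_per_symbol(quantized))        # ~1.06 bits per symbol
</syntaxhighlight>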


In an additional in-loop filtering stage, various filters can be applied to the reconstructed image signal. Because these filters are also computed inside the encoding loop, they can aid compression: they can be applied to reference material before it is used in the prediction process, and they can be guided by the original signal. The most popular examples are [[deblocking filter]]s that blur out blocking artifacts from quantization discontinuities at transform block boundaries.


==== History ====
{{Main|Video coding format#History}}


In 1967, A.H. Robinson and C. Cherry proposed a [[run-length encoding]] bandwidth compression scheme for the transmission of analog television signals.<ref name="robinson">{{cite journal |author1-last=Robinson |author1-first=A. H. |author2-last=Cherry |author2-first=C. |title=Results of a prototype television bandwidth compression scheme |journal=[[Proceedings of the IEEE]] |publisher=[[IEEE]] |volume=55 |number=3 |date=1967 |pages=356–364 |doi=10.1109/PROC.1967.5493}}</ref> The DCT, which is fundamental to modern video compression,<ref name="Ghanbari"/> was introduced by [[N. Ahmed|Nasir Ahmed]], T. Natarajan and [[K. R. Rao]] in 1974.<ref name="DCT"/><ref name="patents"/>


[[H.261]], which debuted in 1988, commercially introduced the prevalent basic architecture of video compression technology.<ref name="history">{{Cite web|url=http://www.real.com/resources/digital-video-file-formats/|title=The History of Video File Formats Infographic — RealPlayer|date=22 April 2012}}</ref> It was the first [[video coding format]] based on DCT compression.<ref name="Ghanbari">{{cite book |last1=Ghanbari |first1=Mohammed |title=Standard Codecs: Image Compression to Advanced Video Coding |date=2003 |publisher=[[Institution of Engineering and Technology]] |isbn=9780852967102 |pages=1–2 |url=https://books.google.com/books?id=7XuU8T3ooOAC&pg=PA1}}</ref> H.261 was developed by a number of companies, including [[Hitachi]], [[PictureTel]], [[Nippon Telegraph and Telephone|NTT]], [[BT plc|BT]] and [[Toshiba]].<ref>{{cite web |title=Patent statement declaration registered as H261-07 |url=https://www.itu.int/net4/ipr/details_ps.aspx?sector=ITU-T&id=H261-07 |website=ITU |access-date=11 July 2019}}</ref>


The most popular [[video coding standard]]s used for codecs have been the [[MPEG]] standards. [[MPEG-1]] was developed by the [[Motion Picture Experts Group]] (MPEG) in 1991, and it was designed to compress [[VHS]]-quality video. It was succeeded in 1994 by [[MPEG-2]]/[[H.262]],<ref name="history"/> which was developed by a number of companies, primarily [[Sony]], [[Technicolor SA|Thomson]] and [[Mitsubishi Electric]].<ref name="mp2-patents">{{cite web |title=MPEG-2 Patent List |url=https://www.mpegla.com/wp-content/uploads/m2-att1.pdf |archive-url=https://web.archive.org/web/20190529164140/https://www.mpegla.com/wp-content/uploads/m2-att1.pdf |archive-date=2019-05-29 |url-status=live |website=[[MPEG LA]] |access-date=7 July 2019}}</ref> MPEG-2 became the standard video format for [[DVD]] and [[SD digital television]].<ref name="history"/> In 1999, it was followed by [[MPEG-4 Visual|MPEG-4]]/[[H.263]].<ref name="history"/> It was also developed by a number of companies, primarily Mitsubishi Electric, [[Hitachi]] and [[Panasonic]].<ref name="mp4-patents">{{cite web |title=MPEG-4 Visual - Patent List |url=https://www.mpegla.com/wp-content/uploads/m4v-att1.pdf |archive-url=https://web.archive.org/web/20190706184528/https://www.mpegla.com/wp-content/uploads/m4v-att1.pdf |archive-date=2019-07-06 |url-status=live |website=[[MPEG LA]] |access-date=6 July 2019}}</ref>


[[H.264/MPEG-4 AVC]] was developed in 2003 by a number of organizations, primarily Panasonic, [[Godo kaisha|Godo Kaisha IP Bridge]] and [[LG Electronics]].<ref name="avc-patents">{{cite web |title=AVC/H.264 {{ndash}} Patent List |url=https://www.mpegla.com/wp-content/uploads/avc-att1.pdf |website=MPEG LA |access-date=6 July 2019}}</ref> AVC commercially introduced the modern [[context-adaptive binary arithmetic coding]] (CABAC) and [[context-adaptive variable-length coding]] (CAVLC) algorithms. AVC is the main video encoding standard for [[Blu-ray Disc]]s, and is widely used by video sharing websites and streaming internet services such as [[YouTube]], [[Netflix]], [[Vimeo]], and [[iTunes Store]], web software such as [[Adobe Flash Player]] and [[Microsoft Silverlight]], and various [[HDTV]] broadcasts over terrestrial and satellite television.


===Genetics===
{{see also|Compression of Genomic Re-Sequencing Data}}
[[Compression of Genomic Re-Sequencing Data|Genetics compression algorithms]] are the latest generation of lossless algorithms that compress data (typically sequences of nucleotides) using both conventional compression algorithms and genetic algorithms adapted to the specific datatype. In 2012, a team of scientists from Johns Hopkins University published a genetic compression algorithm that does not use a reference genome for compression. HAPZIPPER was tailored for [[International HapMap Project|HapMap]] data and achieves over 20-fold compression (95% reduction in file size), providing 2- to 4-fold better compression and running much faster than the leading general-purpose compression utilities. For this, Chanda, Elhaik, and Bader introduced MAF-based encoding (MAFE), which reduces the heterogeneity of the dataset by sorting SNPs by their minor allele frequency, thus homogenizing the dataset.<ref name="HapZipper"/> Other algorithms developed in 2009 and 2013 (DNAZip and GenomeZip) have compression ratios of up to 1200-fold, allowing 6 billion basepair diploid human genomes to be stored in 2.5 megabytes (relative to a reference genome or averaged over many genomes).<ref name="genome email"/><ref name="genome contracts"/> For a benchmark of genetics/genomics data compressors, see the 2016 survey by Hosseini, Pratas and Pinho.<ref name="Morteza"/>
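
The reordering idea can be illustrated with a toy sketch (a hypothetical simplification, not the published HapZipper implementation): compute each SNP column's minor allele frequency and sort the columns by it, so that statistically similar columns sit next to each other for a downstream general-purpose compressor.

<syntaxhighlight lang="python">
# Toy illustration of MAF-based reordering: genotype columns (one per SNP,
# alleles coded 0/1) are sorted by minor allele frequency so that similar
# columns become adjacent and easier for a general compressor to model.

def minor_allele_frequency(column):
    frequency = sum(column) / len(column)
    return min(frequency, 1 - frequency)

def sort_snps_by_maf(genotypes):
    n_snps = len(genotypes[0])
    columns = [[row[j] for row in genotypes] for j in range(n_snps)]
    order = sorted(range(n_snps), key=lambda j: minor_allele_frequency(columns[j]))
    reordered = [[row[j] for j in order] for row in genotypes]
    return reordered, order

genotypes = [[0, 1, 1, 0],
             [0, 1, 0, 0],
             [1, 1, 1, 0]]
reordered, order = sort_snps_by_maf(genotypes)
print(order)   # [1, 3, 0, 2]: SNPs from lowest to highest minor allele frequency
</syntaxhighlight>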


== Outlook and currently unused potential ==


== See also ==
{{div col|colwidth=20em}}
* [[Auditory masking]]
* [[HTTP compression]]
* [[Kolmogorov complexity]]
* [[Magic compression algorithm]]
* [[Minimum description length]]
* [[Modulo-N code]]
* [[Motion coding]]
* [[Perceptual audio coder]]
* [[Range coding]]
* [[Set redundancy compression]]
* [[Sub-band coding]]
* [[Universal code (data compression)]]
{{div col end}}


== References ==
{{reflist|refs=
<ref name="Wade">{{cite book
<ref name="Wade">{{cite book
| last = Wade
| last = Wade
Line 206: Line 218:
| isbn = 978-0-521-42336-6
| isbn = 978-0-521-42336-6
| page = 34
| page = 34
| quote = The broad objective of source coding is to exploit or remove 'inefficient' redundancy in the [[Pulse-code modulation|PCM]] source and thereby achieve a reduction in the overall source rate R.
| quote = The broad objective of source coding is to exploit or remove 'inefficient' redundancy in the [[PCM]] source and thereby achieve a reduction in the overall source rate R.
}}</ref>
}}</ref>


<ref name="mahdi53">{{cite journal|last=Mahdi|first=O.A.|author2=Mohammed, M.A. |author3=Mohamed, A.J. |title=Implementing a Novel Approach an Convert Audio Compression to Text Coding via Hybrid Technique|journal=International Journal of Computer Science Issues|date=November 2012|volume=9|issue=6, No. 3|pages=53–59|url=http://ijcsi.org/papers/IJCSI-9-6-3-53-59.pdf|access-date=6 March 2013}}</ref>
<ref name="mahdi53">{{cite journal|last=Mahdi|first=O.A.|author2=Mohammed, M.A. |author3=Mohamed, A.J. |title=Implementing a Novel Approach an Convert Audio Compression to Text Coding via Hybrid Technique|journal=International Journal of Computer Science Issues|date=November 2012|volume=9|issue=6, No. 3|pages=53–59|url=http://ijcsi.org/papers/IJCSI-9-6-3-53-59.pdf |archive-url=https://web.archive.org/web/20130320103924/http://ijcsi.org/papers/IJCSI-9-6-3-53-59.pdf |archive-date=2013-03-20 |url-status=live|access-date=6 March 2013}}</ref>


<ref name="PujarKadlaskar">{{cite journal|last=Pujar|first=J.H.|author2=Kadlaskar, L.M.|title=A New Lossless Method of Image Compression and Decompression Using Huffman Coding Techniques|journal=Journal of Theoretical and Applied Information Technology|date=May 2010|volume=15|issue=1|pages=18–23|url=http://www.jatit.org/volumes/research-papers/Vol15No1/3Vol15No1.pdf}}</ref>
<ref name="PujarKadlaskar">{{cite journal|last=Pujar|first=J.H.|author2=Kadlaskar, L.M.|title=A New Lossless Method of Image Compression and Decompression Using Huffman Coding Techniques|journal=Journal of Theoretical and Applied Information Technology|date=May 2010|volume=15|issue=1|pages=18–23|url=http://www.jatit.org/volumes/research-papers/Vol15No1/3Vol15No1.pdf |archive-url=https://web.archive.org/web/20100524105217/http://www.jatit.org/volumes/research-papers/Vol15No1/3Vol15No1.pdf |archive-date=2010-05-24 |url-status=live}}</ref>


<ref name="Salomon">{{cite book |last=Salomon |first=David |title=A Concise Introduction to Data Compression |year=2008 |publisher=Springer |location=Berlin |isbn=9781848000728}}</ref>
<ref name="Salomon">{{cite book |last=Salomon |first=David |title=A Concise Introduction to Data Compression |year=2008 |publisher=Springer |location=Berlin |isbn=9781848000728}}</ref>


<ref name="Tank">{{cite book |last=Tank |first=M.K. |title=Thinkquest 2010: Proceedings of the First International Conference on Contours of Computing Technology |year=2011 |publisher=Springer |location=Berlin |pages=275–283|doi=10.1007/978-81-8489-989-4_51 |chapter=Implementation of Lempel-ZIV algorithm for lossless compression using VHDL |isbn=978-81-8489-988-7 }}</ref>
<ref name="MittalVetter">{{citation |author1=S. Mittal |author2=J. Vetter |title=A Survey Of Architectural Approaches for Data Compression in Cache and Main Memory Systems |journal=IEEE Transactions on Parallel and Distributed Systems |volume=27 |issue=5 |pages=1524–1536 |date=2015 |publisher=IEEE|doi=10.1109/TPDS.2015.2435788 |s2cid=11706516 }}</ref>


<ref name="Optimized LZW">{{cite journal|last=Navqi|first=Saud|author2=Naqvi, R. |author3=Riaz, R.A. |author4= Siddiqui, F. |title=Optimized RTL design and implementation of LZW algorithm for high bandwidth applications|journal=Electrical Review|date=April 2011|volume=2011|issue=4|pages=279–285|url=http://pe.org.pl/articles/2011/4/68.pdf |archive-url=https://web.archive.org/web/20130520105146/http://pe.org.pl/articles/2011/4/68.pdf |archive-date=2013-05-20 |url-status=live}}</ref>
<ref name="Tank">{{cite book |last=Tank |first=M.K. |title=Implementation of Limpel-Ziv algorithm for lossless compression using VHDL |work=Thinkquest 2010: Proceedings of the First International Conference on Contours of Computing Technology |year=2011 |publisher=Springer |location=Berlin |pages=275–283|doi=10.1007/978-81-8489-989-4_51 |chapter=Implementation of Lempel-ZIV algorithm for lossless compression using VHDL |isbn=978-81-8489-988-7 }}</ref>

<ref name="Optimized LZW">{{cite journal|last=Navqi|first=Saud|author2=Naqvi, R. |author3=Riaz, R.A. |author4= Siddiqui, F. |title=Optimized RTL design and implementation of LZW algorithm for high bandwidth applications|journal=Electrical Review|date=April 2011|volume=2011|issue=4|pages=279–285|url=http://pe.org.pl/articles/2011/4/68.pdf}}</ref>


<ref name="Wolfram">{{cite book|last=Wolfram|first=Stephen|title=A New Kind of Science|publisher=Wolfram Media, Inc.|year=2002|page=[https://archive.org/details/newkindofscience00wolf/page/1069 1069]|isbn=978-1-57955-008-0|url-access=registration|url=https://archive.org/details/newkindofscience00wolf/page/1069}}</ref>
<ref name="Wolfram">{{cite book|last=Wolfram|first=Stephen|title=A New Kind of Science|publisher=Wolfram Media, Inc.|year=2002|page=[https://archive.org/details/newkindofscience00wolf/page/1069 1069]|isbn=978-1-57955-008-0|url-access=registration|url=https://archive.org/details/newkindofscience00wolf/page/1069}}</ref>


<ref name="mahmud2">{{cite journal|last=Mahmud|first=Salauddin|title=An Improved Data Compression Method for General Data|journal=International Journal of Scientific & Engineering Research|date=March 2012|volume=3|issue=3|page=2|url=http://www.ijser.org/researchpaper%5CAn-Improved-Data-Compression-Method-for-General-Data.pdf|access-date=6 March 2013}}</ref>
<ref name="mahmud2">{{cite journal|last=Mahmud|first=Salauddin|title=An Improved Data Compression Method for General Data|journal=International Journal of Scientific & Engineering Research|date=March 2012|volume=3|issue=3|page=2|url=http://www.ijser.org/researchpaper%5CAn-Improved-Data-Compression-Method-for-General-Data.pdf |archive-url=https://web.archive.org/web/20131102022116/http://www.ijser.org/researchpaper%5CAn-Improved-Data-Compression-Method-for-General-Data.pdf |archive-date=2013-11-02 |url-status=live|access-date=6 March 2013}}</ref>


<ref name=TomLane>{{cite web|last=Lane|first=Tom|title=JPEG Image Compression FAQ, Part 1|url=http://www.faqs.org/faqs/jpeg-faq/part1/|work=Internet FAQ Archives|publisher=Independent JPEG Group|access-date=6 March 2013}}</ref>


<ref name="HEVC">{{cite journal|title=Overview of the High Efficiency Video Coding (HEVC) Standard |author=G. J. Sullivan |author2=J.-R. Ohm |author3=W.-J. Han |author4-link=Thomas Wiegand |author4=T. Wiegand |journal=IEEE Transactions on Circuits and Systems for Video Technology |publisher=[[IEEE]] |volume=22 |issue=12 |pages=1649–1668 |date=December 2012 |author-link=Gary Sullivan (engineer) |doi=10.1109/TCSVT.2012.2221191 |doi-access=free }}</ref>
<ref name="HEVC">{{cite journal|title=Overview of the High Efficiency Video Coding (HEVC) Standard |author=G. J. Sullivan |author2=J.-R. Ohm |author3=W.-J. Han |author4-link=Thomas Wiegand |author4=T. Wiegand |journal=IEEE Transactions on Circuits and Systems for Video Technology |publisher=[[IEEE]] |volume=22 |issue=12 |pages=1649–1668 |date=December 2012 |author-link=Gary Sullivan (engineer) |doi=10.1109/TCSVT.2012.2221191 |s2cid=64404 |doi-access= }}</ref>


<ref name="Arcangel">{{cite web|last=Arcangel|first=Cory|title=On Compression|url=http://www.coryarcangel.com/downloads/Cory-Arcangel-OnC.pdf|access-date=6 March 2013}}</ref>
<ref name="Arcangel">{{cite web|last=Arcangel|first=Cory|title=On Compression|url=http://www.coryarcangel.com/downloads/Cory-Arcangel-OnC.pdf |archive-url=https://web.archive.org/web/20130728082920/http://www.coryarcangel.com/downloads/Cory-Arcangel-OnC.pdf |archive-date=2013-07-28 |url-status=live|access-date=6 March 2013}}</ref>


<ref name="Marak">{{cite web|last=Marak|first=Laszlo|title=On image compression|url=http://www.ujoimro.com/resources/Laszlo_Marak_image_compression.pdf|publisher=University of Marne la Vallee|access-date=6 March 2013|archive-url=https://web.archive.org/web/20150528012028/http://www.ujoimro.com/resources/Laszlo_Marak_image_compression.pdf|archive-date=28 May 2015|url-status=dead|df=dmy-all}}</ref>
<ref name="Marak">{{cite web|last=Marak|first=Laszlo|title=On image compression|url=http://www.ujoimro.com/resources/Laszlo_Marak_image_compression.pdf|publisher=University of Marne la Vallee|access-date=6 March 2013|archive-url=https://web.archive.org/web/20150528012028/http://www.ujoimro.com/resources/Laszlo_Marak_image_compression.pdf|archive-date=28 May 2015|url-status=dead|df=dmy-all}}</ref>
<ref name="Mahoney">{{cite web|last=Mahoney|first=Matt|title=Rationale for a Large Text Compression Benchmark|url=http://cs.fit.edu/~mmahoney/compression/rationale.html|publisher=Florida Institute of Technology|access-date=5 March 2013}}</ref>
<ref name="Mahoney">{{cite web|last=Mahoney|first=Matt|title=Rationale for a Large Text Compression Benchmark|url=http://cs.fit.edu/~mmahoney/compression/rationale.html|publisher=Florida Institute of Technology|access-date=5 March 2013}}</ref>


<ref name="Market Efficiency">{{cite journal|author1=Shmilovici A. |author2=Kahiri Y. |author3=Ben-Gal I. |author4=Hauser S. |title= Measuring the Efficiency of the Intraday Forex Market with a Universal Data Compression Algorithm|url= http://www.eng.tau.ac.il/~bengal/28.pdf|journal= Computational Economics|volume= 33|issue= 2|pages= 131–154|year= 2009 |doi=10.1007/s10614-008-9153-3|citeseerx=10.1.1.627.3751 |s2cid=17234503 }}</ref>
<ref name="Market Efficiency">{{cite journal|author1=Shmilovici A. |author2=Kahiri Y. |author3=Ben-Gal I. |author4=Hauser S. |title= Measuring the Efficiency of the Intraday Forex Market with a Universal Data Compression Algorithm|url= http://www.eng.tau.ac.il/~bengal/28.pdf |archive-url=https://web.archive.org/web/20090709143601/http://www.eng.tau.ac.il/~bengal/28.pdf |archive-date=2009-07-09 |url-status=live|journal= Computational Economics|volume= 33|issue= 2|pages= 131–154|year= 2009 |doi=10.1007/s10614-008-9153-3|citeseerx=10.1.1.627.3751 |s2cid=17234503 }}</ref>


<ref name="Ben-Gal">{{cite journal|author= I. Ben-Gal|title= On the Use of Data Compression Measures to Analyze Robust Designs|url = http://www.eng.tau.ac.il/~bengal/Journal%20Paper.pdf|journal=IEEE Transactions on Reliability|volume= 54|issue= 3|pages= 381–388|year= 2008|doi= 10.1109/TR.2005.853280|s2cid= 9376086}}</ref>
<ref name="Ben-Gal">{{cite journal|author= I. Ben-Gal|title= On the Use of Data Compression Measures to Analyze Robust Designs|url = http://www.eng.tau.ac.il/~bengal/Journal%20Paper.pdf|journal=IEEE Transactions on Reliability|volume= 54|issue= 3|pages= 381–388|year= 2008|doi= 10.1109/TR.2005.853280|s2cid= 9376086}}</ref>


<ref name="ScullyBrodley">{{Cite journal|author1=D. Scully|author2=Carla E. Brodley|s2cid=12311412|author2-link=Carla Brodley|date=2006|title=Compression and machine learning: A new perspective on feature space vectors|journal=Data Compression Conference, 2006|page=332|doi=10.1109/DCC.2006.13|isbn=0-7695-2545-8}}</ref>
<ref name="ScullyBrodley">{{Cite book|author1=D. Scully|author2=Carla E. Brodley|title=Data Compression Conference (DCC'06) |chapter=Compression and Machine Learning: A New Perspective on Feature Space Vectors |s2cid=12311412|author2-link=Carla Brodley|date=2006|page=332|doi=10.1109/DCC.2006.13|isbn=0-7695-2545-8}}</ref>


<ref name="RFC 3284">{{cite web|last=Korn|first=D.|title=RFC 3284: The VCDIFF Generic Differencing and Compression Data Format|url=http://tools.ietf.org/html/rfc3284|publisher=Internet Engineering Task Force|access-date=5 March 2013|display-authors=etal}}</ref>
<ref name="RFC 3284">{{cite web|last=Korn|first=D.|title=RFC 3284: The VCDIFF Generic Differencing and Compression Data Format|date=July 2002 |url=http://tools.ietf.org/html/rfc3284|publisher=Internet Engineering Task Force|access-date=5 March 2013|display-authors=etal}}</ref>


<ref name="Vdelta">{{cite book| first1=D.G. | last1 = Korn | first2 = K.P. | last2=Vo |title=Vdelta: Differencing and Compression | series=Practical Reusable Unix Software | editor = B. Krishnamurthy | location=New York | publisher=John Wiley & Sons, Inc.| year = 1995}}</ref>
<ref name="Vdelta">{{cite book| first1=D.G. | last1 = Korn | first2 = K.P. | last2=Vo |title=Vdelta: Differencing and Compression | series=Practical Reusable Unix Software | editor = B. Krishnamurthy | location=New York | publisher=John Wiley & Sons, Inc.| year = 1995}}</ref>
<ref name="faxin47">{{cite book|author1=Faxin Yu |author2=Hao Luo |author3=Zheming Lu |title=Three-Dimensional Model Analysis and Processing|url=https://archive.org/details/threedimensional00yufa |url-access=limited |year=2010|publisher=Springer|location=Berlin|isbn=9783642126512|page=[https://archive.org/details/threedimensional00yufa/page/n62 47]}}</ref>
<ref name="faxin47">{{cite book|author1=Faxin Yu |author2=Hao Luo |author3=Zheming Lu |title=Three-Dimensional Model Analysis and Processing|url=https://archive.org/details/threedimensional00yufa |url-access=limited |year=2010|publisher=Springer|location=Berlin|isbn=9783642126512|page=[https://archive.org/details/threedimensional00yufa/page/n62 47]}}</ref>


<ref name="Possibilities">{{cite web|title=File Compression Possibilities|url=https://www.gadgetcouncil.com/compress-1GB-files-into-10-mb/|work=A Brief guide to compress a file in 4 different ways}}</ref>
<ref name="Possibilities">{{cite web|title=File Compression Possibilities|url=https://www.gadgetcouncil.com/compress-1GB-files-into-10-mb/|work=A Brief guide to compress a file in 4 different ways|date=17 February 2017}}</ref>

<ref name="Solidyne">{{cite web |title=Summary of some of Solidyne's contributions to Broadcast Engineering |url=http://www.solidynepro.com/indexahtmlp_Hist-ENG,t.htm |work=Brief History of Solidyne |publisher=Buenos Aires: Solidyne |access-date=6 March 2013 |url-status=dead |archive-url=https://web.archive.org/web/20130308063719/http://www.solidynepro.com/indexahtmlp_Hist-ENG%2Ct.htm |archive-date=8 March 2013 }}</ref>


<ref name="Zwicker">{{cite book|last=Zwicker|first=Eberhard|title=The Ear As A Communication Receiver|year=1967|publisher=Acoustical Society of America|location=Melville, NY|url=http://asa.aip.org/books/ear.html|display-authors=etal|access-date=2011-11-11|archive-url=https://web.archive.org/web/20000914080525/http://asa.aip.org/books/ear.html|archive-date=2000-09-14|url-status=dead}}</ref>
<ref name="Zwicker">{{cite book|last=Zwicker|first=Eberhard|title=The Ear As A Communication Receiver|year=1967|publisher=Acoustical Society of America|location=Melville, NY|url=http://asa.aip.org/books/ear.html|display-authors=etal|access-date=2011-11-11|archive-url=https://web.archive.org/web/20000914080525/http://asa.aip.org/books/ear.html|archive-date=2000-09-14|url-status=dead}}</ref>
<ref name="CSIP">{{cite web|title=Video Coding|url=http://csip.ece.gatech.edu/drupal7/?q=technical-area/video-coding|work=CSIP website|publisher=Center for Signal and Information Processing, Georgia Institute of Technology|access-date=6 March 2013|archive-url=https://web.archive.org/web/20130523194345/http://csip.ece.gatech.edu/drupal7/?q=technical-area%2Fvideo-coding|archive-date=23 May 2013|url-status=dead|df=dmy-all}}</ref>
<ref name="CSIP">{{cite web|title=Video Coding|url=http://csip.ece.gatech.edu/drupal7/?q=technical-area/video-coding|work=CSIP website|publisher=Center for Signal and Information Processing, Georgia Institute of Technology|access-date=6 March 2013|archive-url=https://web.archive.org/web/20130523194345/http://csip.ece.gatech.edu/drupal7/?q=technical-area%2Fvideo-coding|archive-date=23 May 2013|url-status=dead|df=dmy-all}}</ref>


<ref name="MSU2007">{{cite report|author= Dmitriy Vatolin |collaboration=Graphics & Media Lab Video Group|title=Lossless Video Codecs Comparison '2007|date=March 2007|publisher=Moscow State University |url=http://compression.ru/video/codec_comparison/pdf/msu_lossless_codecs_comparison_2007_eng.pdf}}</ref>
<ref name="MSU2007">{{cite report|author= Dmitriy Vatolin |collaboration=Graphics & Media Lab Video Group|title=Lossless Video Codecs Comparison '2007|date=March 2007|publisher=Moscow State University |url=http://compression.ru/video/codec_comparison/pdf/msu_lossless_codecs_comparison_2007_eng.pdf |archive-url=https://web.archive.org/web/20080515091507/http://www.compression.ru/video/codec_comparison/pdf/msu_lossless_codecs_comparison_2007_eng.pdf |archive-date=2008-05-15 |url-status=live}}</ref>


<ref name="DCT">{{cite journal |author1=Nasir Ahmed |author1-link=N. Ahmed |author2=T. Natarajan |author3=Kamisetty Ramamohan Rao |journal=IEEE Transactions on Computers|title=Discrete Cosine Transform|volume=C-23|issue=1|pages=90–93|date=January 1974 |doi=10.1109/T-C.1974.223784 |url=https://www.ic.tu-berlin.de/fileadmin/fg121/Source-Coding_WS12/selected-readings/Ahmed_et_al.__1974.pdf
<ref name="DCT">{{cite journal |author1=Nasir Ahmed |author1-link=N. Ahmed |author2=T. Natarajan |author3=Kamisetty Ramamohan Rao |journal=IEEE Transactions on Computers|title=Discrete Cosine Transform|volume=C-23|issue=1|pages=90–93|date=January 1974 |doi=10.1109/T-C.1974.223784 |s2cid=149806273 |url=https://www.ic.tu-berlin.de/fileadmin/fg121/Source-Coding_WS12/selected-readings/Ahmed_et_al.__1974.pdf |archive-url=https://web.archive.org/web/20161208075733/https://www.ic.tu-berlin.de/fileadmin/fg121/Source-Coding_WS12/selected-readings/Ahmed_et_al.__1974.pdf |archive-date=2016-12-08 |url-status=live
}}</ref>


<ref name="DPCM">{{US patent reference|inventor=C. Chapin Cutler|title=Differential Quantization of Communication Signals|number=2605361|A-Datum=1950-06-29|issue-date=1952-07-29}}</ref>
<ref name="DPCM">{{US patent reference|inventor=C. Chapin Cutler|title=Differential Quantization of Communication Signals|number=2605361|A-Datum=1950-06-29|issue-date=1952-07-29}}</ref>


<ref name="Shannon">{{cite journal|author1=Claude Elwood Shannon|editor-surname1= Alcatel-Lucent|journal=Bell System Technical Journal|title=A Mathematical Theory of Communication |volume=27 |issue=3–4 |date=1948 |pages=379–423, 623–656 |doi= 10.1002/j.1538-7305.1948.tb01338.x|hdl= 11858/00-001M-0000-002C-4314-2|url=http://www.math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf |access-date=2019-04-21
<ref name="Shannon">{{cite journal|author1=Claude Elwood Shannon|editor-surname1= Alcatel-Lucent|journal=Bell System Technical Journal|title=A Mathematical Theory of Communication |volume=27 |issue=3–4 |date=1948 |pages=379–423, 623–656 |doi= 10.1002/j.1538-7305.1948.tb01338.x|hdl= 11858/00-001M-0000-002C-4314-2|url=http://www.math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf |archive-url=https://web.archive.org/web/20110524064232/http://math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf |archive-date=2011-05-24 |url-status=live |access-date=2019-04-21
|author1-link= Claude Elwood Shannon|hdl-access=free}}</ref>


<ref name="Huffman">{{citation|surname1=[[David Albert Huffman]]|periodical=[[Proceedings of the IRE]]|title=A method for the construction of minimum-redundancy codes|volume=40|issue=9|pages=1098–1101|date=September 1952 |doi=10.1109/JRPROC.1952.273898 |url=http://compression.ru/download/articles/huff/huffman_1952_minimum-redundancy-codes.pdf
<ref name="Huffman">{{citation|surname1=[[David Albert Huffman]]|periodical=[[Proceedings of the IRE]]|title=A method for the construction of minimum-redundancy codes|volume=40|issue=9|pages=1098–1101|date=September 1952 |doi=10.1109/JRPROC.1952.273898 |url=http://compression.ru/download/articles/huff/huffman_1952_minimum-redundancy-codes.pdf |archive-url=https://web.archive.org/web/20051008115257/http://compression.ru/download/articles/huff/huffman_1952_minimum-redundancy-codes.pdf |archive-date=2005-10-08 |url-status=live
}}</ref>




<ref name="Hadamard">William K. Pratt, Julius Kane, Harry C. Andrews: "[https://ieeexplore.ieee.org/abstract/document/1448799/ Hadamard transform image coding]", in Proceedings of the IEEE 57.1 (1969): Seiten 58–68</ref>
<ref name="Hadamard">{{cite journal|doi=10.1109/PROC.1969.6869|title=Hadamard transform image coding|year=1969|last1=Pratt|first1=W.K.|last2=Kane|first2=J.|last3=Andrews|first3=H.C.|journal=Proceedings of the IEEE|volume=57|pages=58–68}}</ref>


<ref name="t81">{{cite web |title=T.81 – DIGITAL COMPRESSION AND CODING OF CONTINUOUS-TONE STILL IMAGES – REQUIREMENTS AND GUIDELINES |url=https://www.w3.org/Graphics/JPEG/itu-t81.pdf |publisher=[[CCITT]] |date=September 1992 |access-date=12 July 2019}}</ref>
<ref name="patents">{{cite conference |first=Cliff |last=Reader |publisher=Society of Photo-Optical Instrumentation Engineers |book-title=Applications of Digital Image Processing XXXIX |title=Patent landscape for royalty-free video coding |volume=9971 |pages=99711B |location=San Diego, California |date=2016-08-31 |url=https://www.youtube.com/watch?v=wi1BefrfTos |bibcode=2016SPIE.9971E..1BR |doi= 10.1117/12.2239493 |editor-last=Tescher |editor-first=Andrew G}} Lecture recording, from 3:05:10.</ref>


<ref name="patents">{{cite conference |first=Cliff |last=Reader |publisher=Society of Photo-Optical Instrumentation Engineers |book-title=Applications of Digital Image Processing XXXIX |title=Patent landscape for royalty-free video coding |series=Applications of Digital Image Processing XXXIX |volume=9971 |pages=99711B |location=San Diego, California |date=2016-08-31 |url=https://www.youtube.com/watch?v=wi1BefrfTos | archive-url=https://web.archive.org/web/20161208075738/https://www.youtube.com/watch?v=wi1BefrfTos| archive-date=2016-12-08 | url-status=dead|bibcode=2016SPIE.9971E..1BR |doi= 10.1117/12.2239493 |editor-last=Tescher |editor-first=Andrew G}} Lecture recording, from 3:05:10.</ref>
<ref name="HapZipper">{{cite journal|vauthors=Chanda P, Bader JS, Elhaik E| title=HapZipper: sharing HapMap populations just got easier|journal=Nucleic Acids Research|date=27 Jul 2012 |volume=40 |issue=20 |page=e159 |doi=10.1093/nar/gks709 |url=http://nar.oxfordjournals.org/content/40/20/e159.full-text-lowres.pdf | pmid=22844100 |pmc=3488212}}</ref>

<ref name="HapZipper">{{cite journal|vauthors=Chanda P, Bader JS, Elhaik E| title=HapZipper: sharing HapMap populations just got easier|journal=Nucleic Acids Research|date=27 Jul 2012 |volume=40 |issue=20 |page=e159 |doi=10.1093/nar/gks709 |url= | pmid=22844100 |pmc=3488212}}</ref>


<ref name="genome email">{{cite journal | journal=Bioinformatics | date=Jan 15, 2009 | volume=25 |issue=2 | pages=274–5| title=Human genomes as email attachments |vauthors=Christley S, Lu Y, Li C, Xie X| pmid=18996942 | doi=10.1093/bioinformatics/btn582| doi-access=free }}</ref>
<ref name="genome email">{{cite journal | journal=Bioinformatics | date=Jan 15, 2009 | volume=25 |issue=2 | pages=274–5| title=Human genomes as email attachments |vauthors=Christley S, Lu Y, Li C, Xie X| pmid=18996942 | doi=10.1093/bioinformatics/btn582| doi-access=free }}</ref>
<ref name="genome contracts">{{cite journal | journal= Bioinformatics |date=September 2013 | volume=29 |issue=17 | pages=2199–202 | title=The human genome contracts again |vauthors=Pavlichin DS, Weissman T, Yona G | pmid=23793748 | doi=10.1093/bioinformatics/btt362 | doi-access=free }}</ref>
<ref name="genome contracts">{{cite journal | journal= Bioinformatics |date=September 2013 | volume=29 |issue=17 | pages=2199–202 | title=The human genome contracts again |vauthors=Pavlichin DS, Weissman T, Yona G | pmid=23793748 | doi=10.1093/bioinformatics/btt362 | doi-access=free }}</ref>


<ref name="Morteza">{{cite journal|doi=10.3390/info7040056|doi-access=free|title=A Survey on Data Compression Methods for Biological Sequences|year=2016|last1=Hosseini|first1=Morteza|last2=Pratas|first2=Diogo|last3=Pinho|first3=Armando|journal=Information|volume=7|issue=4|page=56}}</ref>
<ref name="Morteza">M. Hosseini, D. Pratas, and A. Pinho. 2016. A survey on data compression methods for biological sequences. ''Information'' '''7'''(4):(2016): 56</ref>


<ref name="World Capacity">{{cite journal|author1=Hilbert, Martin|author2=López, Priscila|title=The World's Technological Capacity to Store, Communicate, and Compute Information|journal=Science|date=1 April 2011|volume=332|issue=6025|pages=60–65|doi=10.1126/science.1200970|pmid=21310967|bibcode=2011Sci...332...60H|s2cid=206531385}}</ref>
<ref name="World Capacity">{{cite journal|author1=Hilbert, Martin|author2=López, Priscila|title=The World's Technological Capacity to Store, Communicate, and Compute Information|journal=Science|date=1 April 2011|volume=332|issue=6025|pages=60–65|doi=10.1126/science.1200970|pmid=21310967|bibcode=2011Sci...332...60H|s2cid=206531385|doi-access=free}}</ref>


<ref name="Hoffman">{{cite book |last1=Hoffman |first1=Roy |title=Data Compression in Digital Systems |date=2012 |publisher=[[Springer Science & Business Media]] |isbn=9781461560319 |page=124 |url=https://books.google.com/books?id=FOfTBwAAQBAJ |quote=Basically, wavelet coding is a variant on DCT-based transform coding that reduces or eliminates some of its limitations. (...) Another advantage is that rather than working with 8 × 8 blocks of pixels, as do JPEG and other block-based DCT techniques, wavelet coding can simultaneously compress the entire image.}}</ref>
<ref name="Hoffman">{{cite book |last1=Hoffman |first1=Roy |title=Data Compression in Digital Systems |date=2012 |publisher=[[Springer Science & Business Media]] |isbn=9781461560319 |page=124 |url=https://books.google.com/books?id=FOfTBwAAQBAJ |quote=Basically, wavelet coding is a variant on DCT-based transform coding that reduces or eliminates some of its limitations. (...) Another advantage is that rather than working with 8 × 8 blocks of pixels, as do JPEG and other block-based DCT techniques, wavelet coding can simultaneously compress the entire image.}}</ref>
Line 299: Line 309:


== External links ==
* {{citation |chapter-url=http://dvd-hq.info/data_compression_3.php |title=Data Compression Basics |chapter=Part 3: Video compression}}
* {{citation |url=http://extranet.ateme.com/download.php?file=1114 |archive-url=https://web.archive.org/web/20090905092232/http://extranet.ateme.com/download.php?file=1114 |archive-date=2009-09-05 |author=Pierre Larbier |title=Using 10-bit AVC/H.264 Encoding with 4:2:2 for Broadcast Contribution |publisher=Ateme}}
* {{Webarchive|url=https://web.archive.org/web/20170830224011/http://extranet.ateme.com/download.php?file=1194 |date=2017-08-30 |title=Why does 10-bit save bandwidth (even when content is 8-bit)?}}
* {{Webarchive|url=https://web.archive.org/web/20170830224021/http://extranet.ateme.com/download.php?file=1196 |date=2017-08-30 |title=Which compression technology should be used?}}
* {{citation |url=http://media.wiley.com/product_data/excerpt/99/04705184/0470518499.pdf |archive-url=https://web.archive.org/web/20070928023157/http://media.wiley.com/product_data/excerpt/99/04705184/0470518499.pdf |archive-date=2007-09-28 |url-status=live |publisher=Wiley |title=Introduction to Compression Theory}}
* [http://tech.ebu.ch/docs/tech/tech3296.pdf EBU subjective listening tests on low-bitrate audio codecs]
* [http://techgage.com/article/audio_archiving_guide_part_1_-_music_formats/ Audio Archiving Guide: Music Formats] (Guide for helping a user pick out the right codec)
* {{webarchive |url=https://web.archive.org/web/20070928023157/http://mia.ece.uic.edu/~papers/WWW/MultimediaStandards/chapter7.pdf |archive-url=https://web.archive.org/web/20040408044944/http://mia.ece.uic.edu/~papers/WWW/MultimediaStandards/chapter7.pdf |archive-date=2004-04-08 |url-status=live |date=September 28, 2007 |title=MPEG 1&2 video compression intro (pdf format) }}
* [http://wiki.hydrogenaud.io/index.php?title=Lossless_comparison hydrogenaudio wiki comparison]
* [https://www.cs.cmu.edu/afs/cs/project/pscico-guyb/realworld/www/compression.pdf Introduction to Data Compression] by Guy E Blelloch from [[Carnegie Mellon University|CMU]]
* [https://web.archive.org/web/20081015080632/http://www.hdgreetings.com/ecard/video-1080p.aspx HD Greetings – 1080p Uncompressed source material for compression testing and research]
* [http://www.monkeysaudio.com/theory.html Explanation of lossless signal compression method used by most codecs]
* {{Webarchive|url=https://web.archive.org/web/20100315021124/http://www.videsignline.com/howto/showArticle.jhtml?articleID=185301351 |date=2010-03-15 |title=Videsignline – Intro to Video Compression}}
* [http://www.soundexpert.info/ Interactive blind listening tests of audio codecs over the internet]
* {{webarchive |url=https://web.archive.org/web/20130527124650/http://public.dhe.ibm.com/common/ssi/ecm/en/tsu12345usen/TSU12345USEN.PDF |archive-url=https://web.archive.org/web/20130527124650/http://public.dhe.ibm.com/common/ssi/ecm/en/tsu12345usen/TSU12345USEN.PDF |archive-date=2013-05-27 |url-status=live |title=Data Footprint Reduction Technology}}
* [https://web.archive.org/web/20111201183842/http://www.testvid.com/index.html TestVid – 2,000+ HD and other uncompressed source video clips for compression testing]
* [http://siliconmentor.blogspot.in/2014/12/what-is-run-length-coding-in-video.html What is Run length Coding in video compression]
* [https://www.wolframscience.com/nks/notes-10-5--history-of-data-compression/ History of Data Compression]
{{Compression Methods}}
{{Compression formats}}
{{Compression Software Implementations}}
{{data}}
{{Computer files}}


{{Authority control}}

Latest revision as of 16:13, 29 November 2024

In information theory, data compression, source coding,[1] or bit-rate reduction is the process of encoding information using fewer bits than the original representation.[2] Any particular compression is either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or less important information.[3] Typically, a device that performs data compression is referred to as an encoder, and one that performs the reversal of the process (decompression) as a decoder.

The process of reducing the size of a data file is often referred to as data compression. In the context of data transmission, it is called source coding: encoding is done at the source of the data before it is stored or transmitted.[4] Source coding should not be confused with channel coding, for error detection and correction or line coding, the means for mapping data onto a signal.

Data compression algorithms present a space–time complexity trade-off between the bytes needed to store or transmit information and the computational resources needed to perform the encoding and decoding. The design of data compression schemes involves balancing the degree of compression, the amount of distortion introduced (when using lossy data compression), and the computational resources or time required to compress and decompress the data.[5]

Lossless


Lossless data compression algorithms usually exploit statistical redundancy to represent data without losing any information, so that the process is reversible. Lossless compression is possible because most real-world data exhibits statistical redundancy. For example, an image may have areas of color that do not change over several pixels; instead of coding "red pixel, red pixel, ..." the data may be encoded as "279 red pixels". This is a basic example of run-length encoding; there are many schemes to reduce file size by eliminating redundancy.
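
A minimal sketch of this idea in Python (illustrative only; the helper names and the list-of-pairs output are chosen for clarity here, and real formats pack the runs into compact binary codes):

    def rle_encode(data):
        """Collapse runs of identical symbols into (count, symbol) pairs."""
        encoded = []
        i = 0
        while i < len(data):
            run = 1
            while i + run < len(data) and data[i + run] == data[i]:
                run += 1
            encoded.append((run, data[i]))
            i += run
        return encoded

    def rle_decode(pairs):
        """Expand (count, symbol) pairs back into the original sequence."""
        return [symbol for count, symbol in pairs for _ in range(count)]

    # "red pixel, red pixel, ..." is stored as "279 red pixels"
    pixels = ["red"] * 279 + ["blue"] * 3
    assert rle_decode(rle_encode(pixels)) == pixels   # reversible, hence lossless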

The Lempel–Ziv (LZ) compression methods are among the most popular algorithms for lossless storage.[6] DEFLATE is a variation on LZ optimized for decompression speed and compression ratio,[7] but compression can be slow. In the mid-1980s, following work by Terry Welch, the Lempel–Ziv–Welch (LZW) algorithm rapidly became the method of choice for most general-purpose compression systems. LZW is used in GIF images, programs such as PKZIP, and hardware devices such as modems.[8] LZ methods use a table-based compression model where table entries are substituted for repeated strings of data. For most LZ methods, this table is generated dynamically from earlier data in the input. The table itself is often Huffman encoded. Grammar-based codes can compress highly repetitive input extremely effectively, for instance, a collection of biological sequences from the same or closely related species, a huge versioned document collection, or an internet archive. Their basic task is to construct a context-free grammar that derives a single string; practical grammar compression algorithms include Sequitur and Re-Pair.
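
The table-based approach can be sketched as a simplified LZW encoder (a toy illustration; production implementations also manage variable code widths, table resets and an entropy-coding back end):

    def lzw_encode(data: bytes):
        """Emit one code for the longest already-seen string, then grow the table."""
        table = {bytes([i]): i for i in range(256)}   # start with all single bytes
        next_code = 256
        w = b""
        codes = []
        for byte in data:
            wc = w + bytes([byte])
            if wc in table:
                w = wc                                # keep extending the match
            else:
                codes.append(table[w])
                table[wc] = next_code                 # table built from earlier input
                next_code += 1
                w = bytes([byte])
        if w:
            codes.append(table[w])
        return codes

    print(lzw_encode(b"TOBEORNOTTOBEORTOBEORNOT"))    # repeated substrings become single codes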

The strongest modern lossless compressors use probabilistic models, such as prediction by partial matching. The Burrows–Wheeler transform can also be viewed as an indirect form of statistical modelling.[9] In a further refinement of the direct use of probabilistic modelling, statistical estimates can be coupled to an algorithm called arithmetic coding. Arithmetic coding is a more modern coding technique that uses the mathematical calculations of a finite-state machine to produce a string of encoded bits from a series of input data symbols. It can achieve superior compression compared to other techniques such as the better-known Huffman algorithm. It uses an internal memory state to avoid the need to perform a one-to-one mapping of individual input symbols to distinct representations that use an integer number of bits, and it clears out the internal memory only after encoding the entire string of data symbols. Arithmetic coding applies especially well to adaptive data compression tasks where the statistics vary and are context-dependent, as it can be easily coupled with an adaptive model of the probability distribution of the input data. An early example of the use of arithmetic coding was in an optional (but not widely used) feature of the JPEG image coding standard.[10] It has since been applied in various other designs including H.263, H.264/MPEG-4 AVC and HEVC for video coding.[11]
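
The interval-narrowing idea behind arithmetic coding can be demonstrated with a floating-point sketch (adequate only for short messages; practical coders use integer arithmetic with renormalization to avoid running out of precision, and the symbol model below is an arbitrary example):

    def arithmetic_encode(message, model):
        """Shrink [low, high) by each symbol's probability sub-interval in turn."""
        cumulative, running = {}, 0.0
        for symbol, p in model.items():
            cumulative[symbol] = (running, running + p)
            running += p
        low, high = 0.0, 1.0
        for symbol in message:
            width = high - low
            lo_frac, hi_frac = cumulative[symbol]
            low, high = low + width * lo_frac, low + width * hi_frac
        return (low + high) / 2   # any number inside the final interval identifies the message

    tag = arithmetic_encode("aab!", {"a": 0.6, "b": 0.3, "!": 0.1})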

Archive software typically has the ability to adjust the "dictionary size", where a larger size demands more random-access memory during compression and decompression but achieves stronger compression, especially on repeating patterns in the files' content.[12][13]

Lossy

Composite image showing JPG and PNG image compression. Left side of the image is from a JPEG image, showing lossy artefacts; the right side is from a PNG image.

In the late 1980s, digital images became more common, and standards for lossless image compression emerged. In the early 1990s, lossy compression methods began to be widely used.[14] In these schemes, some loss of information is accepted because dropping nonessential detail can save storage space. There is a corresponding trade-off between preserving information and reducing size. Lossy data compression schemes are designed based on research into how people perceive the data in question. For example, the human eye is more sensitive to subtle variations in luminance than to variations in color. JPEG image compression works in part by rounding off nonessential bits of information.[15] A number of popular compression formats exploit these perceptual differences, including psychoacoustics for sound and psychovisuals for images and video.

Most forms of lossy compression are based on transform coding, especially the discrete cosine transform (DCT). It was first proposed in 1972 by Nasir Ahmed, who then developed a working algorithm with T. Natarajan and K. R. Rao in 1973, before introducing it in January 1974.[16][17] DCT is the most widely used lossy compression method, and is used in multimedia formats for images (such as JPEG and HEIF),[18] video (such as MPEG, AVC and HEVC) and audio (such as MP3, AAC and Vorbis).

Lossy image compression is used in digital cameras to increase storage capacity. Similarly, DVDs, Blu-ray and streaming video use lossy video coding formats. Lossy compression is extensively used in video.

In lossy audio compression, methods of psychoacoustics are used to remove non-audible (or less audible) components of the audio signal. Compression of human speech is often performed with even more specialized techniques; speech coding is distinguished as a separate discipline from general-purpose audio compression. Speech coding is used in internet telephony, for example, while audio compression is used for CD ripping and is decoded by audio players.[9]

Lossy compression can cause generation loss.

Theory


The theoretical basis for compression is provided by information theory and, more specifically, Shannon's source coding theorem; domain-specific theories include algorithmic information theory for lossless compression and rate–distortion theory for lossy compression. These areas of study were essentially created by Claude Shannon, who published fundamental papers on the topic in the late 1940s and early 1950s. Other topics associated with compression include coding theory and statistical inference.[19]

Machine learning


There is a close connection between machine learning and compression. A system that predicts the posterior probabilities of a sequence given its entire history can be used for optimal data compression (by using arithmetic coding on the output distribution). Conversely, an optimal compressor can be used for prediction (by finding the symbol that compresses best, given the previous history). This equivalence has been used as a justification for using data compression as a benchmark for "general intelligence".[20][21][22]
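
This equivalence can be made concrete by scoring a predictive model in bits: the ideal compressed length of a sequence under the model is the sum of -log2 of the probabilities it assigned to the symbols that actually occurred, a size that arithmetic coding can approach. A sketch with a deliberately trivial toy model (the model is an assumption for illustration, not a real compressor):

    import math

    def ideal_code_length_bits(sequence, predict):
        """Each symbol costs -log2 P(symbol | history) bits under the model."""
        total = 0.0
        for i, symbol in enumerate(sequence):
            p = predict(sequence[:i])[symbol]   # model's probability for the true symbol
            total += -math.log2(p)
        return total

    # toy model: ignores history and always predicts 0 with probability 0.9
    model = lambda history: {0: 0.9, 1: 0.1}
    print(ideal_code_length_bits([0, 0, 1, 0, 0], model))   # about 3.9 bits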

In an alternative view, compression algorithms implicitly map strings into implicit feature space vectors, and compression-based similarity measures compute similarity within these feature spaces. For each compressor C(.) an associated vector space ℵ is defined, such that C(.) maps an input string x to the vector norm ||~x||. An exhaustive examination of the feature spaces underlying all compression algorithms is precluded by space; instead, the analysis examines three representative lossless compression methods, LZW, LZ77, and PPM.[23]
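
A widely used similarity measure built on this principle is the normalized compression distance; the following rough sketch uses zlib as the stand-in compressor (an illustration of the general idea, not the specific construction of the cited study):

    import zlib

    def ncd(x: bytes, y: bytes) -> float:
        """Normalized compression distance: small when x and y share structure."""
        cx, cy = len(zlib.compress(x)), len(zlib.compress(y))
        cxy = len(zlib.compress(x + y))
        return (cxy - min(cx, cy)) / max(cx, cy)

    a = b"the quick brown fox jumps over the lazy dog " * 20
    b = a.replace(b"fox", b"cat")
    c = bytes(range(256)) * 4
    print(ncd(a, b), ncd(a, c))   # related inputs score lower than unrelated ones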

According to AIXI theory, a connection more directly explained in Hutter Prize, the best possible compression of x is the smallest possible software that generates x. For example, in that model, a zip file's compressed size includes both the zip file and the unzipping software, since it cannot be decompressed without both, but there may be an even smaller combined form.

Examples of AI-powered audio/video compression software include NVIDIA Maxine and AIVC.[24] Examples of software that can perform AI-powered image compression include OpenCV, TensorFlow, MATLAB's Image Processing Toolbox (IPT) and High-Fidelity Generative Image Compression.[25]

In unsupervised machine learning, k-means clustering can be utilized to compress data by grouping similar data points into clusters. This technique simplifies handling extensive datasets that lack predefined labels and finds widespread use in fields such as image compression.[26]

Data compression aims to reduce the size of data files, enhancing storage efficiency and speeding up data transmission. K-means clustering, an unsupervised machine learning algorithm, is employed to partition a dataset into a specified number of clusters, k, each represented by the centroid of its points. This process condenses extensive datasets into a more compact set of representative points. Particularly beneficial in image and signal processing, k-means clustering aids in data reduction by replacing groups of data points with their centroids, thereby preserving the core information of the original data while significantly decreasing the required storage space.[27]
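
A minimal sketch of this use of k-means as color quantization, where an image is reduced to k palette colors plus one small index per pixel (illustrative NumPy code; the random pixel array stands in for real image data, and an optimized library implementation would normally be used):

    import numpy as np

    def kmeans_palette(pixels, k=16, iters=10, seed=0):
        """Cluster pixel colors; return k palette colors and a label per pixel."""
        rng = np.random.default_rng(seed)
        palette = pixels[rng.choice(len(pixels), size=k, replace=False)].copy()
        for _ in range(iters):
            # assign every pixel to its nearest palette color
            dist = np.linalg.norm(pixels[:, None, :] - palette[None, :, :], axis=2)
            labels = dist.argmin(axis=1)
            # move each palette color to the centroid of its assigned pixels
            for j in range(k):
                members = pixels[labels == j]
                if len(members):
                    palette[j] = members.mean(axis=0)
        return palette, labels

    pixels = np.random.rand(20000, 3)            # stand-in for an image's RGB samples
    palette, labels = kmeans_palette(pixels)     # 16 colors plus a 4-bit index per pixel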

Large language models (LLMs) are also capable of lossless data compression, as demonstrated by DeepMind's research with the Chinchilla 70B model, which effectively compressed data, outperforming conventional methods such as Portable Network Graphics (PNG) for images and Free Lossless Audio Codec (FLAC) for audio. It achieved compression of image and audio data to 43.4% and 16.4% of their original sizes, respectively.[28]

Data differencing

Comparison of two revisions of a file

Data compression can be viewed as a special case of data differencing.[29][30] Data differencing consists of producing a difference given a source and a target, with patching reproducing the target given a source and a difference. Since there is no separate source and target in data compression, one can consider data compression as data differencing with empty source data, the compressed file corresponding to a difference from nothing. This is the same as considering absolute entropy (corresponding to data compression) as a special case of relative entropy (corresponding to data differencing) with no initial data.
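
This relationship can be illustrated with the preset-dictionary feature of DEFLATE as exposed by Python's zlib module, which codes a target relative to a source; passing no source reduces it to ordinary compression (a sketch that assumes source and target fit comfortably inside DEFLATE's 32 KB window):

    import os
    import zlib

    def encode_diff(target: bytes, source: bytes = b"") -> bytes:
        """With a source this acts like differencing; with none, plain compression."""
        comp = zlib.compressobj(zdict=source) if source else zlib.compressobj()
        return comp.compress(target) + comp.flush()

    def apply_patch(diff: bytes, source: bytes = b"") -> bytes:
        decomp = zlib.decompressobj(zdict=source) if source else zlib.decompressobj()
        return decomp.decompress(diff) + decomp.flush()

    old = os.urandom(4000)                                  # stand-in for an earlier file revision
    new = old[:2000] + b"a short edited passage" + old[2000:]
    print(len(encode_diff(new)), len(encode_diff(new, source=old)))   # relative coding is far smaller here
    assert apply_patch(encode_diff(new, source=old), source=old) == new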

The term differential compression is used to emphasize the data differencing connection.

Uses


Image


Entropy coding originated in the 1940s with the introduction of Shannon–Fano coding,[31] the basis for Huffman coding which was developed in 1950.[32] Transform coding dates back to the late 1960s, with the introduction of fast Fourier transform (FFT) coding in 1968 and the Hadamard transform in 1969.[33]

An important image compression technique is the discrete cosine transform (DCT), a technique developed in the early 1970s.[16] DCT is the basis for JPEG, a lossy compression format which was introduced by the Joint Photographic Experts Group (JPEG) in 1992.[34] JPEG greatly reduces the amount of data required to represent an image at the cost of a relatively small reduction in image quality and has become the most widely used image file format.[35][36] Its highly efficient DCT-based compression algorithm was largely responsible for the wide proliferation of digital images and digital photos.[37]

Lempel–Ziv–Welch (LZW) is a lossless compression algorithm developed in 1984. It is used in the GIF format, introduced in 1987.[38] DEFLATE, a lossless compression algorithm specified in 1996, is used in the Portable Network Graphics (PNG) format.[39]
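
DEFLATE is available directly in Python's standard zlib module, which implements the same algorithm used inside PNG and ZIP; a quick lossless round trip on redundant data:

    import zlib

    raw = b"red pixel, " * 1000                  # highly redundant input
    packed = zlib.compress(raw, level=9)         # DEFLATE at maximum effort
    assert zlib.decompress(packed) == raw        # bit-exact reconstruction
    print(len(raw), "->", len(packed), "bytes")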

Wavelet compression, the use of wavelets in image compression, began after the development of DCT coding.[40] The JPEG 2000 standard was introduced in 2000.[41] In contrast to the DCT algorithm used by the original JPEG format, JPEG 2000 instead uses discrete wavelet transform (DWT) algorithms.[42][43][44] JPEG 2000 technology, which includes the Motion JPEG 2000 extension, was selected as the video coding standard for digital cinema in 2004.[45]

Audio


Audio data compression, not to be confused with dynamic range compression, has the potential to reduce the transmission bandwidth and storage requirements of audio data. Audio compression algorithms are implemented in software as audio codecs. In both lossy and lossless compression, information redundancy is reduced, using methods such as coding, quantization, DCT and linear prediction to reduce the amount of information used to represent the uncompressed data.

Lossy audio compression algorithms provide higher compression and are used in numerous audio applications and formats, including Vorbis and MP3. These algorithms almost all rely on psychoacoustics to eliminate or reduce the fidelity of less audible sounds, thereby reducing the space required to store or transmit them.[2][46]

The acceptable trade-off between loss of audio quality and transmission or storage size depends upon the application. For example, one 640 MB compact disc (CD) holds approximately one hour of uncompressed high fidelity music, less than 2 hours of music compressed losslessly, or 7 hours of music compressed in the MP3 format at a medium bit rate. A digital sound recorder can typically store around 200 hours of clearly intelligible speech in 640 MB.[47]

Lossless audio compression produces a representation of digital data that can be decoded to an exact digital duplicate of the original. Compression ratios are around 50–60% of the original size,[48] which is similar to those for generic lossless data compression. Lossless codecs use curve fitting or linear prediction as a basis for estimating the signal. Parameters describing the estimation and the difference between the estimation and the actual signal are coded separately.[49]
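
A minimal sketch of the predict-and-code-the-residual idea, using a first-order predictor on toy sample values (real codecs such as FLAC fit higher-order predictors per block and then entropy-code the residuals, for example with Rice coding):

    def to_residuals(samples):
        """Predict each sample from the previous one; keep only the error."""
        return [samples[0]] + [samples[i] - samples[i - 1] for i in range(1, len(samples))]

    def from_residuals(residuals):
        """Undo the prediction exactly, keeping the scheme lossless."""
        samples = [residuals[0]]
        for r in residuals[1:]:
            samples.append(samples[-1] + r)
        return samples

    pcm = [100, 103, 105, 104, 101, 96, 90]
    assert from_residuals(to_residuals(pcm)) == pcm
    # residuals [100, 3, 2, -1, -3, -5, -6] are small and cheap to entropy-code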

A number of lossless audio compression formats exist. See list of lossless codecs for a listing. Some formats are associated with a distinct system, such as Direct Stream Transfer, used in Super Audio CD and Meridian Lossless Packing, used in DVD-Audio, Dolby TrueHD, Blu-ray and HD DVD.

Some audio file formats feature a combination of a lossy format and a lossless correction; this allows stripping the correction to easily obtain a lossy file. Such formats include MPEG-4 SLS (Scalable to Lossless), WavPack, and OptimFROG DualStream.

When audio files are to be processed, either by further compression or for editing, it is desirable to work from an unchanged original (uncompressed or losslessly compressed). Processing of a lossily compressed file for some purpose usually produces a final result inferior to the creation of the same compressed file from an uncompressed original. In addition to sound editing or mixing, lossless audio compression is often used for archival storage, or as master copies.

Lossy audio compression

Comparison of spectrograms of audio in an uncompressed format and several lossy formats. The lossy spectrograms show bandlimiting of higher frequencies, a common technique associated with lossy audio compression.

Lossy audio compression is used in a wide range of applications. In addition to standalone audio-only applications of file playback in MP3 players or computers, digitally compressed audio streams are used in most video DVDs, digital television, streaming media on the Internet, satellite and cable radio, and increasingly in terrestrial radio broadcasts. Lossy compression typically achieves far greater compression than lossless compression, by discarding less-critical data based on psychoacoustic optimizations.[50]

Psychoacoustics recognizes that not all data in an audio stream can be perceived by the human auditory system. Most lossy compression reduces redundancy by first identifying perceptually irrelevant sounds, that is, sounds that are very hard to hear. Typical examples include high frequencies or sounds that occur at the same time as louder sounds. Those irrelevant sounds are coded with decreased accuracy or not at all.

Due to the nature of lossy algorithms, audio quality suffers a digital generation loss when a file is decompressed and recompressed. This makes lossy compression unsuitable for storing the intermediate results in professional audio engineering applications, such as sound editing and multitrack recording. However, lossy formats such as MP3 are very popular with end-users as the file size is reduced to 5-20% of the original size and a megabyte can store about a minute's worth of music at adequate quality.

Several proprietary lossy compression algorithms have been developed that provide higher quality audio performance by using a combination of lossless and lossy algorithms with adaptive bit rates and lower compression ratios. Examples include aptX, LDAC, LHDC, MQA and SCL6.

Coding methods

To determine what information in an audio signal is perceptually irrelevant, most lossy compression algorithms use transforms such as the modified discrete cosine transform (MDCT) to convert time domain sampled waveforms into a transform domain, typically the frequency domain. Once transformed, component frequencies can be prioritized according to how audible they are. Audibility of spectral components is assessed using the absolute threshold of hearing and the principles of simultaneous masking—the phenomenon wherein a signal is masked by another signal separated by frequency—and, in some cases, temporal masking—where a signal is masked by another signal separated by time. Equal-loudness contours may also be used to weigh the perceptual importance of components. Models of the human ear-brain combination incorporating such effects are often called psychoacoustic models.[51]

Other types of lossy compressors, such as the linear predictive coding (LPC) used with speech, are source-based coders. LPC uses a model of the human vocal tract to analyze speech sounds and infer the parameters used by the model to produce them moment to moment. These changing parameters are transmitted or stored and used to drive another model in the decoder which reproduces the sound.

Lossy formats are often used for the distribution of streaming audio or interactive communication (such as in cell phone networks). In such applications, the data must be decompressed as the data flows, rather than after the entire data stream has been transmitted. Not all audio codecs can be used for streaming applications.[50]

Latency is introduced by the methods used to encode and decode the data. Some codecs will analyze a longer segment, called a frame, of the data to optimize efficiency, and then code it in a manner that requires a larger segment of data at one time to decode. The inherent latency of the coding algorithm can be critical; for example, when there is a two-way transmission of data, such as with a telephone conversation, significant delays may seriously degrade the perceived quality.

In contrast to the speed of compression, which is proportional to the number of operations required by the algorithm, here latency refers to the number of samples that must be analyzed before a block of audio is processed. In the minimum case, latency is zero samples (e.g., if the coder/decoder simply reduces the number of bits used to quantize the signal). Time domain algorithms such as LPC also often have low latencies, hence their popularity in speech coding for telephony. In algorithms such as MP3, however, a large number of samples have to be analyzed to implement a psychoacoustic model in the frequency domain, and latency is on the order of 23 ms.
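
As a rough illustration, the delay contributed by buffering one analysis frame follows directly from the frame length and the sample rate (look-ahead and filter-bank overlap in real codecs add further delay):

    def frame_latency_ms(frame_samples, sample_rate_hz):
        """Delay from having to collect one full frame before coding it."""
        return 1000.0 * frame_samples / sample_rate_hz

    print(frame_latency_ms(1152, 44100))   # one MP3 frame: about 26 ms of audio
    print(frame_latency_ms(160, 8000))     # a 20 ms telephony speech frame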

Speech encoding

Speech encoding is an important category of audio data compression. The perceptual models used to estimate what aspects of speech a human ear can hear are generally somewhat different from those used for music. The range of frequencies needed to convey the sounds of a human voice is normally far narrower than that needed for music, and the sound is normally less complex. As a result, speech can be encoded at high quality using a relatively low bit rate.

This is accomplished, in general, by some combination of two approaches:

  • Only encoding sounds that could be made by a single human voice.
  • Throwing away more of the data in the signal—keeping just enough to reconstruct an "intelligible" voice rather than the full frequency range of human hearing.

The earliest algorithms used in speech encoding (and audio data compression in general) were the A-law algorithm and the μ-law algorithm.
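
The μ-law curve can be sketched in its continuous form as follows (the G.711 standard actually specifies a piecewise-linear 8-bit approximation of this curve; the sample value and scaling here are illustrative):

    import math

    MU = 255   # the value used for 8-bit μ-law telephony

    def mu_compress(x):
        """Map a sample in [-1, 1] so that quiet sounds keep more resolution."""
        return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

    def mu_expand(y):
        """Inverse of the companding curve."""
        return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

    sample = 0.02                               # a quiet input sample
    code = round(mu_compress(sample) * 127)     # quantize to roughly 8 bits
    print(code, mu_expand(code / 127))          # close to 0.02 despite coarse quantization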

History

Solidyne 922: The world's first commercial audio bit compression sound card for PC, 1990

Early audio research was conducted at Bell Labs. There, in 1950, C. Chapin Cutler filed the patent on differential pulse-code modulation (DPCM).[52] In 1973, Adaptive DPCM (ADPCM) was introduced by P. Cummiskey, Nikil S. Jayant and James L. Flanagan.[53][54]

Perceptual coding was first used for speech coding compression, with linear predictive coding (LPC).[55] Initial concepts for LPC date back to the work of Fumitada Itakura (Nagoya University) and Shuzo Saito (Nippon Telegraph and Telephone) in 1966.[56] During the 1970s, Bishnu S. Atal and Manfred R. Schroeder at Bell Labs developed a form of LPC called adaptive predictive coding (APC), a perceptual coding algorithm that exploited the masking properties of the human ear, followed in the early 1980s with the code-excited linear prediction (CELP) algorithm which achieved a significant compression ratio for its time.[55] Perceptual coding is used by modern audio compression formats such as MP3[55] and AAC.

Discrete cosine transform (DCT), developed by Nasir Ahmed, T. Natarajan and K. R. Rao in 1974,[17] provided the basis for the modified discrete cosine transform (MDCT) used by modern audio compression formats such as MP3,[57] Dolby Digital,[58][59] and AAC.[60] MDCT was proposed by J. P. Princen, A. W. Johnson and A. B. Bradley in 1987,[61] following earlier work by Princen and Bradley in 1986.[62]

The world's first commercial broadcast automation audio compression system was developed by Oscar Bonello, an engineering professor at the University of Buenos Aires.[63] In 1983, using the psychoacoustic principle of the masking of critical bands first published in 1967,[64] he started developing a practical application based on the recently developed IBM PC computer, and the broadcast automation system was launched in 1987 under the name Audicom.[65] Thirty-five years later, almost all the radio stations in the world were using this technology, manufactured by a number of companies, because the inventor declined to patent his work, preferring instead to declare it public domain and publish it.[66]

A literature compendium for a large variety of audio coding systems was published in the IEEE's Journal on Selected Areas in Communications (JSAC), in February 1988. While there were some papers from before that time, this collection documented an entire variety of finished, working audio coders, nearly all of them using perceptual techniques and some kind of frequency analysis and back-end noiseless coding.[67]

Video


Uncompressed video requires a very high data rate. Although lossless video compression codecs perform at a compression factor of 5 to 12, a typical H.264 lossy compression video has a compression factor between 20 and 200.[68]

The two key video compression techniques used in video coding standards are the DCT and motion compensation (MC). Most video coding standards, such as the H.26x and MPEG formats, typically use motion-compensated DCT video coding (block motion compensation).[69][70]

Most video codecs are used alongside audio compression techniques to store the separate but complementary data streams as one combined package using so-called container formats.[71]

Encoding theory


Video data may be represented as a series of still image frames. Such data usually contains abundant amounts of spatial and temporal redundancy. Video compression algorithms attempt to reduce redundancy and store information more compactly.

Most video compression formats and codecs exploit both spatial and temporal redundancy (e.g. through difference coding with motion compensation). Similarities can be encoded by only storing differences between e.g. temporally adjacent frames (inter-frame coding) or spatially adjacent pixels (intra-frame coding). Inter-frame compression (a temporal delta encoding) (re)uses data from one or more earlier or later frames in a sequence to describe the current frame. Intra-frame coding, on the other hand, uses only data from within the current frame, effectively being still-image compression.[51]

The intra-frame video coding formats used in camcorders and video editing employ simpler compression that uses only intra-frame prediction. This simplifies video editing software, as it prevents a situation in which a compressed frame refers to data that the editor has deleted.

Usually, video compression additionally employs lossy compression techniques like quantization that reduce aspects of the source data that are (more or less) irrelevant to human visual perception by exploiting perceptual features of human vision. For example, small differences in color are more difficult to perceive than changes in brightness. Compression algorithms can average a color across these similar areas in a manner similar to that used in JPEG image compression.[10] As in all lossy compression, there is a trade-off between video quality and bit rate, cost of processing the compression and decompression, and system requirements. Highly compressed video may present visible or distracting artifacts.

Methods other than the prevalent DCT-based transform formats, such as fractal compression, matching pursuit and the use of a discrete wavelet transform (DWT), have been the subject of some research, but are typically not used in practical products. Wavelet compression is used in still-image coders and video coders without motion compensation. Interest in fractal compression seems to be waning, due to recent theoretical analysis showing a comparative lack of effectiveness of such methods.[51]

Inter-frame coding

In inter-frame coding, individual frames of a video sequence are compared from one frame to the next, and the video compression codec records the differences to the reference frame. If the frame contains areas where nothing has moved, the system can simply issue a short command that copies that part of the previous frame into the next one. If sections of the frame move in a simple manner, the compressor can emit a (slightly longer) command that tells the decompressor to shift, rotate, lighten, or darken the copy. This longer command still remains much shorter than data generated by intra-frame compression. Usually, the encoder will also transmit a residue signal which describes the remaining more subtle differences to the reference imagery. Using entropy coding, these residue signals have a more compact representation than the full signal. In areas of video with more motion, the compression must encode more data to keep up with the larger number of pixels that are changing. Commonly during explosions, flames, flocks of animals, and in some panning shots, the high-frequency detail leads to quality decreases or to increases in the variable bitrate.
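
A toy version of the block-matching search at the heart of motion compensation, using an exhaustive search over a small window on 8-bit grayscale frames (real encoders use fast search strategies, sub-pixel precision and rate-distortion criteria; the shifted random frame is a stand-in for actual video):

    import numpy as np

    def best_motion_vector(prev, block, top, left, search=4):
        """Find the offset into the previous frame that best predicts this block
        (minimum sum of absolute differences); the residual is coded separately."""
        h, w = block.shape
        best = (0, 0, float("inf"))
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                y, x = top + dy, left + dx
                if 0 <= y <= prev.shape[0] - h and 0 <= x <= prev.shape[1] - w:
                    sad = np.abs(prev[y:y+h, x:x+w].astype(int) - block.astype(int)).sum()
                    if sad < best[2]:
                        best = (dy, dx, sad)
        return best

    prev = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
    curr = np.roll(prev, 2, axis=1)                              # whole frame shifted right by 2 pixels
    print(best_motion_vector(prev, curr[16:24, 16:24], 16, 16))  # recovers motion vector (0, -2), zero residual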

Hybrid block-based transform formats

Processing stages of a typical video encoder

Many commonly used video compression methods (e.g., those in standards approved by the ITU-T or ISO) share the same basic architecture, which dates back to H.261, standardized in 1988 by the ITU-T. They mostly rely on the DCT, applied to rectangular blocks of neighboring pixels, and on temporal prediction using motion vectors, as well as, nowadays, an in-loop filtering step.

In the prediction stage, various deduplication and difference-coding techniques are applied that help decorrelate data and describe new data based on already transmitted data.

Then rectangular blocks of remaining pixel data are transformed to the frequency domain. In the main lossy processing stage, frequency-domain data is quantized in order to reduce information that is irrelevant to human visual perception.
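
The quantization step itself is simple; a sketch of uniform quantization of transform coefficients (actual standards derive per-frequency step sizes from a quantization matrix scaled by a quality setting, and the coefficient values below are made up for illustration):

    def quantize(coefficients, step):
        """The lossy step: keep only coarse multiples of the step size."""
        return [round(c / step) for c in coefficients]

    def dequantize(levels, step):
        return [level * step for level in levels]

    dct_block = [312.4, -48.7, 21.9, 6.2, -3.8, 1.1, 0.6, -0.2]
    levels = quantize(dct_block, step=10)    # [31, -5, 2, 1, 0, 0, 0, 0]
    approx = dequantize(levels, step=10)     # small coefficients vanish entirely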

In the last stage, statistical redundancy is largely eliminated by an entropy coder, which often applies some form of arithmetic coding.

In an additional in-loop filtering stage, various filters can be applied to the reconstructed image signal. By computing these filters inside the encoding loop, they can help compression because they can be applied to reference material before it is used in the prediction process, and they can be guided using the original signal. The most popular example is the deblocking filter, which blurs out blocking artifacts from quantization discontinuities at transform block boundaries.

History


In 1967, A.H. Robinson and C. Cherry proposed a run-length encoding bandwidth compression scheme for the transmission of analog television signals.[72] The DCT, which is fundamental to modern video compression,[73] was introduced by Nasir Ahmed, T. Natarajan and K. R. Rao in 1974.[17][74]

H.261, which debuted in 1988, commercially introduced the prevalent basic architecture of video compression technology.[75] It was the first video coding format based on DCT compression.[73] H.261 was developed by a number of companies, including Hitachi, PictureTel, NTT, BT and Toshiba.[76]

The most popular video coding standards used for codecs have been the MPEG standards. MPEG-1 was developed by the Moving Picture Experts Group (MPEG) in 1991, and it was designed to compress VHS-quality video. It was succeeded in 1994 by MPEG-2/H.262,[75] which was developed by a number of companies, primarily Sony, Thomson and Mitsubishi Electric.[77] MPEG-2 became the standard video format for DVD and SD digital television.[75] In 1999, it was followed by MPEG-4/H.263.[75] It was also developed by a number of companies, primarily Mitsubishi Electric, Hitachi and Panasonic.[78]

H.264/MPEG-4 AVC was developed in 2003 by a number of organizations, primarily Panasonic, Godo Kaisha IP Bridge and LG Electronics.[79] AVC commercially introduced the modern context-adaptive binary arithmetic coding (CABAC) and context-adaptive variable-length coding (CAVLC) algorithms. AVC is the main video encoding standard for Blu-ray Discs, and is widely used by video sharing websites and streaming internet services such as YouTube, Netflix, Vimeo, and iTunes Store, web software such as Adobe Flash Player and Microsoft Silverlight, and various HDTV broadcasts over terrestrial and satellite television.

Genetics


Genetics compression algorithms are the latest generation of lossless algorithms that compress data (typically sequences of nucleotides) using both conventional compression algorithms and algorithms adapted to the specific data type. In 2012, a team of scientists from Johns Hopkins University published a genetic compression algorithm that does not use a reference genome for compression. HapZipper was tailored for HapMap data and achieves over 20-fold compression (95% reduction in file size), providing 2- to 4-fold better compression while being less computationally intensive than the leading general-purpose compression utilities. For this, Chanda, Elhaik, and Bader introduced MAF-based encoding (MAFE), which reduces the heterogeneity of the dataset by sorting SNPs by their minor allele frequency, thus homogenizing the dataset.[80] Other algorithms developed in 2009 and 2013 (DNAZip and GenomeZip) have compression ratios of up to 1200-fold, allowing 6 billion basepair diploid human genomes to be stored in 2.5 megabytes (relative to a reference genome or averaged over many genomes).[81][82] For a benchmark of genetics/genomics data compressors, see the 2016 survey by Hosseini, Pratas and Pinho.[83]

Outlook and currently unused potential


It is estimated that the total amount of data that is stored on the world's storage devices could be further compressed with existing compression algorithms by a remaining average factor of 4.5:1.[84] It is estimated that the combined technological capacity of the world to store information provided 1,300 exabytes of hardware digits in 2007, but when the corresponding content is optimally compressed, this only represents 295 exabytes of Shannon information.[85]

References

  1. ^ Wade, Graham (1994). Signal coding and processing (2 ed.). Cambridge University Press. p. 34. ISBN 978-0-521-42336-6. Retrieved 2011-12-22. The broad objective of source coding is to exploit or remove 'inefficient' redundancy in the PCM source and thereby achieve a reduction in the overall source rate R.
  2. ^ a b Mahdi, O.A.; Mohammed, M.A.; Mohamed, A.J. (November 2012). "Implementing a Novel Approach an Convert Audio Compression to Text Coding via Hybrid Technique" (PDF). International Journal of Computer Science Issues. 9 (6, No. 3): 53–59. Archived (PDF) from the original on 2013-03-20. Retrieved 6 March 2013.
  3. ^ Pujar, J.H.; Kadlaskar, L.M. (May 2010). "A New Lossless Method of Image Compression and Decompression Using Huffman Coding Techniques" (PDF). Journal of Theoretical and Applied Information Technology. 15 (1): 18–23. Archived (PDF) from the original on 2010-05-24.
  4. ^ Salomon, David (2008). A Concise Introduction to Data Compression. Berlin: Springer. ISBN 9781848000728.
  5. ^ Tank, M.K. (2011). "Implementation of Lempel-ZIV algorithm for lossless compression using VHDL". Thinkquest 2010: Proceedings of the First International Conference on Contours of Computing Technology. Berlin: Springer. pp. 275–283. doi:10.1007/978-81-8489-989-4_51. ISBN 978-81-8489-988-7.
  6. ^ Navqi, Saud; Naqvi, R.; Riaz, R.A.; Siddiqui, F. (April 2011). "Optimized RTL design and implementation of LZW algorithm for high bandwidth applications" (PDF). Electrical Review. 2011 (4): 279–285. Archived (PDF) from the original on 2013-05-20.
  7. ^ Document Management - Portable document format - Part 1: PDF1.7 (1st ed.). Adobe Systems Incorporated. July 1, 2008.{{cite book}}: CS1 maint: date and year (link)
  8. ^ Stephen, Wolfram (2002). New Kind of Science. Champaign, IL. p. 1069. ISBN 1-57955-008-8.{{cite book}}: CS1 maint: location missing publisher (link)
  9. ^ a b Mahmud, Salauddin (March 2012). "An Improved Data Compression Method for General Data" (PDF). International Journal of Scientific & Engineering Research. 3 (3): 2. Archived (PDF) from the original on 2013-11-02. Retrieved 6 March 2013.
  10. ^ a b Lane, Tom. "JPEG Image Compression FAQ, Part 1". Internet FAQ Archives. Independent JPEG Group. Retrieved 6 March 2013.
  11. ^ G. J. Sullivan; J.-R. Ohm; W.-J. Han; T. Wiegand (December 2012). "Overview of the High Efficiency Video Coding (HEVC) Standard". IEEE Transactions on Circuits and Systems for Video Technology. 22 (12). IEEE: 1649–1668. doi:10.1109/TCSVT.2012.2221191. S2CID 64404.
  12. ^ "How to choose optimal archiving settings – WinRAR".
  13. ^ "(Set compression Method) switch – 7zip". Archived from the original on 2022-04-09. Retrieved 2021-11-07.
  14. ^ Wolfram, Stephen (2002). A New Kind of Science. Wolfram Media, Inc. p. 1069. ISBN 978-1-57955-008-0.
  15. ^ Arcangel, Cory. "On Compression" (PDF). Archived (PDF) from the original on 2013-07-28. Retrieved 6 March 2013.
  16. ^ a b Ahmed, Nasir (January 1991). "How I Came Up With the Discrete Cosine Transform". Digital Signal Processing. 1 (1): 4–5. Bibcode:1991DSP.....1....4A. doi:10.1016/1051-2004(91)90086-Z.
  17. ^ a b c Nasir Ahmed; T. Natarajan; Kamisetty Ramamohan Rao (January 1974). "Discrete Cosine Transform" (PDF). IEEE Transactions on Computers. C-23 (1): 90–93. doi:10.1109/T-C.1974.223784. S2CID 149806273. Archived (PDF) from the original on 2016-12-08.
  18. ^ CCITT Study Group VIII und die Joint Photographic Experts Group (JPEG) von ISO/IEC Joint Technical Committee 1/Subcommittee 29/Working Group 10 (1993), "Annex D – Arithmetic coding", Recommendation T.81: Digital Compression and Coding of Continuous-tone Still images – Requirements and guidelines (PDF), pp. 54 ff, retrieved 2009-11-07{{citation}}: CS1 maint: numeric names: authors list (link)
  19. ^ Marak, Laszlo. "On image compression" (PDF). University of Marne la Vallee. Archived from the original (PDF) on 28 May 2015. Retrieved 6 March 2013.
  20. ^ Mahoney, Matt. "Rationale for a Large Text Compression Benchmark". Florida Institute of Technology. Retrieved 5 March 2013.
  21. ^ Shmilovici A.; Kahiri Y.; Ben-Gal I.; Hauser S. (2009). "Measuring the Efficiency of the Intraday Forex Market with a Universal Data Compression Algorithm" (PDF). Computational Economics. 33 (2): 131–154. CiteSeerX 10.1.1.627.3751. doi:10.1007/s10614-008-9153-3. S2CID 17234503. Archived (PDF) from the original on 2009-07-09.
  22. ^ I. Ben-Gal (2008). "On the Use of Data Compression Measures to Analyze Robust Designs" (PDF). IEEE Transactions on Reliability. 54 (3): 381–388. doi:10.1109/TR.2005.853280. S2CID 9376086.
  23. ^ D. Scully; Carla E. Brodley (2006). "Compression and Machine Learning: A New Perspective on Feature Space Vectors". Data Compression Conference (DCC'06). p. 332. doi:10.1109/DCC.2006.13. ISBN 0-7695-2545-8. S2CID 12311412.
  24. ^ Gary Adcock (January 5, 2023). "What Is AI Video Compression?". massive.io. Retrieved 6 April 2023.
  25. ^ Mentzer, Fabian; Toderici, George; Tschannen, Michael; Agustsson, Eirikur (2020). "High-Fidelity Generative Image Compression". arXiv:2006.09965 [eess.IV].
  26. ^ "What is Unsupervised Learning? | IBM". www.ibm.com. 23 September 2021. Retrieved 2024-02-05.
  27. ^ "Differentially private clustering for large-scale datasets". blog.research.google. 2023-05-25. Retrieved 2024-03-16.
  28. ^ Edwards, Benj (2023-09-28). "AI language models can exceed PNG and FLAC in lossless compression, says study". Ars Technica. Retrieved 2024-03-07.
  29. ^ Korn, D.; et al. (July 2002). "RFC 3284: The VCDIFF Generic Differencing and Compression Data Format". Internet Engineering Task Force. Retrieved 5 March 2013.
  30. ^ Korn, D.G.; Vo, K.P. (1995). B. Krishnamurthy (ed.). Vdelta: Differencing and Compression. Practical Reusable Unix Software. New York: John Wiley & Sons, Inc.
  31. ^ Claude Elwood Shannon (1948). Alcatel-Lucent (ed.). "A Mathematical Theory of Communication" (PDF). Bell System Technical Journal. 27 (3–4): 379–423, 623–656. doi:10.1002/j.1538-7305.1948.tb01338.x. hdl:11858/00-001M-0000-002C-4314-2. Archived (PDF) from the original on 2011-05-24. Retrieved 2019-04-21.
  32. ^ David Albert Huffman (September 1952), "A method for the construction of minimum-redundancy codes" (PDF), Proceedings of the IRE, vol. 40, no. 9, pp. 1098–1101, doi:10.1109/JRPROC.1952.273898, archived (PDF) from the original on 2005-10-08
  33. ^ Pratt, W.K.; Kane, J.; Andrews, H.C. (1969). "Hadamard transform image coding". Proceedings of the IEEE. 57: 58–68. doi:10.1109/PROC.1969.6869.
  34. ^ "T.81 – DIGITAL COMPRESSION AND CODING OF CONTINUOUS-TONE STILL IMAGES – REQUIREMENTS AND GUIDELINES" (PDF). CCITT. September 1992. Retrieved 12 July 2019.
  35. ^ "The JPEG image format explained". BT.com. BT Group. 31 May 2018. Archived from the original on 5 August 2019. Retrieved 5 August 2019.
  36. ^ Baraniuk, Chris (15 October 2015). "Copy protections could come to JPEGs". BBC News. BBC. Retrieved 13 September 2019.
  37. ^ "What Is a JPEG? The Invisible Object You See Every Day". The Atlantic. 24 September 2013. Retrieved 13 September 2019.
  38. ^ "The GIF Controversy: A Software Developer's Perspective". 27 January 1995. Retrieved 26 May 2015.
  39. ^ L. Peter Deutsch (May 1996). DEFLATE Compressed Data Format Specification version 1.3. IETF. p. 1. sec. Abstract. doi:10.17487/RFC1951. RFC 1951. Retrieved 2014-04-23.
  40. ^ Hoffman, Roy (2012). Data Compression in Digital Systems. Springer Science & Business Media. p. 124. ISBN 9781461560319. Basically, wavelet coding is a variant on DCT-based transform coding that reduces or eliminates some of its limitations. (...) Another advantage is that rather than working with 8 × 8 blocks of pixels, as do JPEG and other block-based DCT techniques, wavelet coding can simultaneously compress the entire image.
  41. ^ Taubman, David; Marcellin, Michael (2012). JPEG2000 Image Compression Fundamentals, Standards and Practice: Image Compression Fundamentals, Standards and Practice. Springer Science & Business Media. ISBN 9781461507994.
  42. ^ Unser, M.; Blu, T. (2003). "Mathematical properties of the JPEG2000 wavelet filters". IEEE Transactions on Image Processing. 12 (9): 1080–1090. Bibcode:2003ITIP...12.1080U. doi:10.1109/TIP.2003.812329. PMID 18237979. S2CID 2765169.
  43. ^ Sullivan, Gary (8–12 December 2003). "General characteristics and design considerations for temporal subband video coding". ITU-T. Video Coding Experts Group. Retrieved 13 September 2019.
  44. ^ Bovik, Alan C. (2009). The Essential Guide to Video Processing. Academic Press. p. 355. ISBN 9780080922508.
  45. ^ Swartz, Charles S. (2005). Understanding Digital Cinema: A Professional Handbook. Taylor & Francis. p. 147. ISBN 9780240806174.
  46. ^ Cunningham, Stuart; McGregor, Iain (2019). "Subjective Evaluation of Music Compressed with the ACER Codec Compared to AAC, MP3, and Uncompressed PCM". International Journal of Digital Multimedia Broadcasting. 2019: 1–16. doi:10.1155/2019/8265301.
  47. ^ The Olympus WS-120 digital speech recorder, according to its manual, can store about 178 hours of speech-quality audio in .WMA format in 500 MB of flash memory.
  48. ^ Coalson, Josh. "FLAC Comparison". Retrieved 2020-08-23.
  49. ^ "Format overview". Retrieved 2020-08-23.
  50. ^ a b Jaiswal, R.C. (2009). Audio-Video Engineering. Pune, Maharashtra: Nirali Prakashan. p. 3.41. ISBN 9788190639675.
  51. ^ a b c Faxin Yu; Hao Luo; Zheming Lu (2010). Three-Dimensional Model Analysis and Processing. Berlin: Springer. p. 47. ISBN 9783642126512.
  52. ^ US patent 2605361, C. Chapin Cutler, "Differential Quantization of Communication Signals", issued 1952-07-29 
  53. ^ Cummiskey, P.; Jayant, N. S.; Flanagan, J. L. (1973). "Adaptive Quantization in Differential PCM Coding of Speech". Bell System Technical Journal. 52 (7): 1105–1118. doi:10.1002/j.1538-7305.1973.tb02007.x.
  54. ^ Cummiskey, P.; Jayant, Nikil S.; Flanagan, J. L. (1973). "Adaptive quantization in differential PCM coding of speech". The Bell System Technical Journal. 52 (7): 1105–1118. doi:10.1002/j.1538-7305.1973.tb02007.x. ISSN 0005-8580.
  55. ^ a b c Schroeder, Manfred R. (2014). "Bell Laboratories". Acoustics, Information, and Communication: Memorial Volume in Honor of Manfred R. Schroeder. Springer. p. 388. ISBN 9783319056609.
  56. ^ Gray, Robert M. (2010). "A History of Realtime Digital Speech on Packet Networks: Part II of Linear Predictive Coding and the Internet Protocol" (PDF). Found. Trends Signal Process. 3 (4): 203–303. doi:10.1561/2000000036. ISSN 1932-8346. Archived (PDF) from the original on 2010-07-04.
  57. ^ Guckert, John (Spring 2012). "The Use of FFT and MDCT in MP3 Audio Compression" (PDF). University of Utah. Archived (PDF) from the original on 2014-01-24. Retrieved 14 July 2019.
  58. ^ Luo, Fa-Long (2008). Mobile Multimedia Broadcasting Standards: Technology and Practice. Springer Science & Business Media. p. 590. ISBN 9780387782638.
  59. ^ Britanak, V. (2011). "On Properties, Relations, and Simplified Implementation of Filter Banks in the Dolby Digital (Plus) AC-3 Audio Coding Standards". IEEE Transactions on Audio, Speech, and Language Processing. 19 (5): 1231–1241. doi:10.1109/TASL.2010.2087755. S2CID 897622.
  60. ^ Brandenburg, Karlheinz (1999). "MP3 and AAC Explained" (PDF). Archived (PDF) from the original on 2017-02-13.
  61. ^ Princen, J.; Johnson, A.; Bradley, A. (1987). "Subband/Transform coding using filter bank designs based on time domain aliasing cancellation". ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing. Vol. 12. pp. 2161–2164. doi:10.1109/ICASSP.1987.1169405. S2CID 58446992.
  62. ^ Princen, J.; Bradley, A. (1986). "Analysis/Synthesis filter bank design based on time domain aliasing cancellation". IEEE Transactions on Acoustics, Speech, and Signal Processing. 34 (5): 1153–1161. doi:10.1109/TASSP.1986.1164954.
  63. ^ "Ricardo Sametband, La Nación Newspaper "Historia de un pionero en audio digital"" (in Spanish).
  64. ^ Zwicker, Eberhard; et al. (1967). The Ear As A Communication Receiver. Melville, NY: Acoustical Society of America. Archived from the original on 2000-09-14. Retrieved 2011-11-11.
  65. ^ "Summary of some of Solidyne's contributions to Broadcast Engineering". Brief History of Solidyne. Buenos Aires: Solidyne. Archived from the original on 8 March 2013. Retrieved 6 March 2013.
  66. ^ "Anuncio del Audicom, AES Journal, July-August 1992, Vol 40, # 7/8, pag 647".
  67. ^ "File Compression Possibilities". A Brief guide to compress a file in 4 different ways. 17 February 2017.
  68. ^ Dmitriy Vatolin; et al. (Graphics & Media Lab Video Group) (March 2007). Lossless Video Codecs Comparison '2007 (PDF) (Report). Moscow State University. Archived (PDF) from the original on 2008-05-15.
  69. ^ Chen, Jie; Koc, Ut-Va; Liu, KJ Ray (2001). Design of Digital Video Coding Systems: A Complete Compressed Domain Approach. CRC Press. p. 71. ISBN 9780203904183.
  70. ^ Li, Jian Ping (2006). Proceedings of the International Computer Conference 2006 on Wavelet Active Media Technology and Information Processing: Chongqing, China, 29-31 August 2006. World Scientific. p. 847. ISBN 9789812709998.
  71. ^ "Video Coding". CSIP website. Center for Signal and Information Processing, Georgia Institute of Technology. Archived from the original on 23 May 2013. Retrieved 6 March 2013.
  72. ^ Robinson, A. H.; Cherry, C. (1967). "Results of a prototype television bandwidth compression scheme". Proceedings of the IEEE. 55 (3). IEEE: 356–364. doi:10.1109/PROC.1967.5493.
  73. ^ a b Ghanbari, Mohammed (2003). Standard Codecs: Image Compression to Advanced Video Coding. Institution of Engineering and Technology. pp. 1–2. ISBN 9780852967102.
  74. ^ Reader, Cliff (2016-08-31). "Patent landscape for royalty-free video coding". In Tescher, Andrew G (ed.). Applications of Digital Image Processing XXXIX. Applications of Digital Image Processing XXXIX. Vol. 9971. San Diego, California: Society of Photo-Optical Instrumentation Engineers. pp. 99711B. Bibcode:2016SPIE.9971E..1BR. doi:10.1117/12.2239493. Archived from the original on 2016-12-08. Lecture recording, from 3:05:10.
  75. ^ a b c d "The History of Video File Formats Infographic — RealPlayer". 22 April 2012.
  76. ^ "Patent statement declaration registered as H261-07". ITU. Retrieved 11 July 2019.
  77. ^ "MPEG-2 Patent List" (PDF). MPEG LA. Archived (PDF) from the original on 2019-05-29. Retrieved 7 July 2019.
  78. ^ "MPEG-4 Visual - Patent List" (PDF). MPEG LA. Archived (PDF) from the original on 2019-07-06. Retrieved 6 July 2019.
  79. ^ "AVC/H.264 – Patent List" (PDF). MPEG LA. Retrieved 6 July 2019.
  80. ^ Chanda P, Bader JS, Elhaik E (27 Jul 2012). "HapZipper: sharing HapMap populations just got easier". Nucleic Acids Research. 40 (20): e159. doi:10.1093/nar/gks709. PMC 3488212. PMID 22844100.
  81. ^ Christley S, Lu Y, Li C, Xie X (Jan 15, 2009). "Human genomes as email attachments". Bioinformatics. 25 (2): 274–5. doi:10.1093/bioinformatics/btn582. PMID 18996942.
  82. ^ Pavlichin DS, Weissman T, Yona G (September 2013). "The human genome contracts again". Bioinformatics. 29 (17): 2199–202. doi:10.1093/bioinformatics/btt362. PMID 23793748.
  83. ^ Hosseini, Morteza; Pratas, Diogo; Pinho, Armando (2016). "A Survey on Data Compression Methods for Biological Sequences". Information. 7 (4): 56. doi:10.3390/info7040056.
  84. ^ "Data Compression via Logic Synthesis" (PDF).
  85. ^ Hilbert, Martin; López, Priscila (1 April 2011). "The World's Technological Capacity to Store, Communicate, and Compute Information". Science. 332 (6025): 60–65. Bibcode:2011Sci...332...60H. doi:10.1126/science.1200970. PMID 21310967. S2CID 206531385.