Comparison of analog and digital recording

This comparison of analog and digital recording covers the two ways in which sound is recorded and stored. Sound waves consist of continuous variations in air pressure. Representations of these signals can be recorded using either analog or digital techniques.

An analog recording is one where a property or characteristic of a physical recording medium is made to vary in a manner analogous to the variations in air pressure of the original sound. Generally, the air pressure variations are first converted (by a transducer such as a microphone) into an electrical analog signal in which either the instantaneous voltage or current is directly proportional to the instantaneous air pressure (or is a function of the pressure). The variations of the electrical signal in turn are converted to variations in the recording medium by a recording machine such as a tape recorder or record cutter—the variable property of the medium is modulated by the signal. Examples of properties that are modified are the magnetization of magnetic tape or the deviation (or displacement) of the groove of a gramophone disc from a smooth, flat spiral track.

A digital recording is produced by converting the physical properties of the original sound into a sequence of numbers, which can then be stored and read back for reproduction. Almost always, the sound is transduced (as by a microphone) to an analog signal in the same way as for analog recording, and then the analog signal is digitized, or converted to a digital signal, through an analog-to-digital converter (an electronic device) either integrated into the digital audio recorder or separate and connected between the recorder and the analog source. An electrical digital signal has variations in voltage and/or current which represent discrete numbers instead of being continuously related, as a mathematical function, to the air pressure variations of sound. There are two chief distinctions between an analog and a digital signal. The first is that the analog signal is continuous in time, meaning that it varies smoothly over time no matter how short a time period is considered, whereas the digital signal is discrete in time, meaning it has distinct parts that follow one after another with definite, unambiguous division points (called signal transitions) between them. The second is that the analog signal is continuous in amplitude, whereas the digital signal represents only a finite set of discrete numerical values.

Each numerical value measured at a single instant in time for a single signal is called a sample; samples are measured at a regular periodic rate to record a signal. The accuracy of the conversion process depends on the sampling rate (how often the sound is sampled and a related numerical value is recorded) and the sampling depth, also called the quantization depth (how much information each sample contains, which can also be described as the maximum numerical size of each sampled value). However, unlike analog recording in which the quality of playback depends critically on the "fidelity" or accuracy of the medium and of the playback device, the physical medium storing digital samples may somewhat distort the encoded information without degrading the quality of playback so long as the original sequence of numbers can be recovered.

Main differences

It is a subject of debate whether analog audio is superior to digital audio or vice versa. The question is highly dependent on the quality of the systems (analog or digital) under review, and other factors which are not necessarily related to sound quality. Arguments for analog systems include the absence of fundamental error mechanisms which are present in digital audio systems, including aliasing, quantization noise, [1] and the absolute limitation of dynamic range. Advocates of digital point to the high levels of performance possible with digital audio, including excellent linearity in the audible band and low levels of noise and distortion (Sony Europe 2001).

Accurate, high-quality sound reproduction is possible with both analog and digital systems. Excellent, expensive analog systems may outperform digital systems, and vice versa; in theory any system of either type may be surpassed by a better, more elaborate and costly system of the other type, but in general it tends to be less expensive to achieve any given standard of technical signal quality with a digital system, except when the standard is very low. One of the most limiting aspects of analog technology is the sensitivity of analog media to minor physical degradation; however, when the degradation is more pronounced, analog systems usually perform better, often still producing recognizable sound, while digital systems will usually fail completely, unable to play back anything from the medium. (See digital cliff.) The principal advantages of digital systems are very uniform source fidelity, low media duplication (and playback) costs, and direct use of the digital 'signal' in today's popular portable storage and playback devices. Analog recordings, by comparison, require relatively bulky, high-quality playback equipment to capture the signal from the media as accurately as digital equipment does.

Early in the development of the Compact Disc, engineers realized that the perfection of the spiral of bits was critical to playback fidelity. A scratch the width of a human hair (100 micrometres) could corrupt several dozen bits, resulting at best in a pop and at worst in a loss of synchronization of the clock and data, giving a long segment of noise until resynchronization. This was addressed by encoding the digital stream with a multi-tiered error-correction coding scheme which reduces CD capacity by about 20%, but makes it tolerant to hundreds of surface imperfections across the disc without loss of signal. In essence, "error correction" can be thought of as "using the mathematically encoded backup copies of the data that was corrupted." Not only does the CD use redundant data, but it also mixes up the bits in a predetermined way (see CIRC) so that a small flaw on the disc will affect fewer consecutive bits of the decoded signal and allow for more effective error correction using the available backup information.
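The effect of mixing up the data before writing it can be illustrated with a simple block interleaver. The sketch below is not the CIRC scheme used on CDs; the function names, the 4 × 6 block size and the burst position are all assumptions chosen for illustration. It only shows how a burst of consecutive damaged symbols on the medium ends up as isolated single errors after de-interleaving, which is the situation in which an error-correcting code works best.

```python
def interleave(data, rows, cols):
    """Fill a rows x cols block row by row, then read it out column by column."""
    assert len(data) == rows * cols
    return [data[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(data, rows, cols):
    """Inverse of interleave(): put every symbol back in its original position."""
    assert len(data) == rows * cols
    out = [None] * (rows * cols)
    i = 0
    for c in range(cols):
        for r in range(rows):
            out[r * cols + c] = data[i]
            i += 1
    return out


ROWS, COLS = 4, 6                      # hypothetical block size
original = list(range(ROWS * COLS))    # stand-in for 24 data symbols

on_disc = interleave(original, ROWS, COLS)
for i in range(8, 12):                 # a "scratch" wipes 4 consecutive symbols
    on_disc[i] = None

recovered = deinterleave(on_disc, ROWS, COLS)
damaged = [i for i, s in enumerate(recovered) if s is None]
print(damaged)  # [2, 8, 14, 20]
```

The four damaged symbols, adjacent on the medium, are six positions apart in the recovered stream, so no two consecutive symbols are lost.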

Error correction allows digital formats to tolerate quite a bit more media deterioration than analog formats. That is not to say poorly produced digital media are immune to data loss. Laser rot was most troublesome for the Laserdisc format, but it also occurs in some pressed commercial CDs, and in both cases is caused by inadequate disc manufacture. (Note that Laserdisc, despite using a laser optical system that has become commonly associated with digital disc formats, is an old analog format, except for its optional digital audio tracks; the video image portion of the content is always analog.) There can occasionally be difficulties related to the use of consumer recordable/rewritable compact discs. This may be due to poor-quality CD recorder drives, low-quality discs, or incorrect storage, as the information-bearing dye layer of most CD-recordable discs is at least slightly sensitive to UV light and will be slowly bleached out if exposed to any amount of it. Most digital recordings rely at least to some extent on computational encoding and decoding and so may become completely unplayable if not enough consecutive good data is available for the decoder to synchronize to the digital data stream, whereas any intact fragment of any size of an analog recording is usually playable.

Unlike analog duplication, digital copies are usually exact replicas, which can be duplicated indefinitely without degradation, unless imposed DRM restrictions apply or mastering errors occur. Digital systems often have the ability for the same medium to be used with arbitrarily high or low quality encoding methods and number of channels or other content, unlike practically all analog systems which have mechanically pre-fixed speeds and channels. Most higher-end analog recording systems offer a few selectable recording speeds, but digital systems tend to offer much finer variation in the rate of media usage.

Digital systems also have several practical advantages that are unrelated to sound quality. Digital systems that are computer-based make editing much easier through rapid random access, seeking, and scanning for non-linear editing. Most digital systems also allow non-audio data to be encoded into the digital stream, such as information about the artist, track titles, etc., which is often convenient. (However, it is technically possible, and not difficult, to implement analog systems with integrated digital metadata channels. In fact, it is possible to record digital metadata onto one track of an analog multitrack recording using any home computer from the 1980s, such as a Commodore 64, that can record data on cassettes.)

Noise and distortion

In the process of recording, storing and playing back the original analog sound wave (in the form of an electronic signal), it is unavoidable that some signal degradation will occur. This degradation is in the form of linear errors (consistent changes of the amplitude or phase within a specified passband) and non-linear errors (noise and distortion). Noise is unrelated in time to the original signal content, while distortion is in some way related in time to the original signal content.

Digital fundamentals

A digital recorder first requires the input of an analog signal; this signal may come directly from a microphone pre-amp, but any analog audio signal can be converted. Measurements of the signal intensity are then made at regular intervals (sampling) by the analog-to-digital converter. At each sampling point, the signal must be assigned a specific intensity from a set range of values (quantization). By doing this, the original sound wave can be described using only numbers, that is, as digital information. Each sample can be given an ordinal number which signifies the point in relative time that it represents, and the magnitude of the sample is an analog of pressure at the microphone (Watkinson 1994) (or, for an artificial sound signal, the pressure that would be at the microphone to correspond to that sound).

Once the original signal has been converted into numbers (usually binary numbers, 1's and 0's, called 'bits'), further additions of noise and distortion, provided they are not great enough to cause digital errors, can be rejected at every stage of processing; this is what is referred to as the regenerative nature of digital signals. Digital errors, called bit errors in binary digital systems, are events of noise and/or distortion which cause one number (or bit) to appear more like another number than like the number it started out as. As long as a digital symbol appears closer to being what it began as than to anything else, it can be regenerated. When raw digital errors cannot be avoided, error correction coding can allow some of them to be detected and fixed. Error correction, essential when transferring digital audio over noisy channels, helps to eliminate bit errors by comparing extra data against the main data to detect limited numbers of digital errors, figure out which digital symbols (numbers) were changed, and change them back.

When playing back a digital recording, the digital information is converted back into a continuous, analog signal by a digital-to-analog converter. This electronic signal is then amplified and converted back into a sound wave by a loudspeaker (just as would be done with the analog signal produced by an analog machine from an analog recording).
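The sampling and quantization steps described above can be sketched in a few lines. This is a minimal illustration and not a model of any real converter: the function name, the deliberately low 8 kHz sample rate and the 4-bit depth are assumptions chosen to keep the numbers readable, and the input is taken to be normalized to the range −1 to +1.

```python
import math


def sample_and_quantize(signal, sample_rate, duration, bits):
    """Measure signal(t) at regular intervals and round to the nearest integer code.

    `signal` is a function of time returning a value in [-1.0, 1.0].
    Codes are signed integers in the range -(2**(bits-1) - 1) .. 2**(bits-1) - 1.
    """
    max_code = 2 ** (bits - 1) - 1
    count = int(round(sample_rate * duration))
    codes = []
    for i in range(count):
        t = i / sample_rate                   # sampling: regular instants in time
        x = max(-1.0, min(1.0, signal(t)))    # keep the input within full scale
        codes.append(round(x * max_code))     # quantization: nearest allowed level
    return codes


def tone_1khz(t):
    """A 1 kHz test tone at full scale."""
    return math.sin(2 * math.pi * 1000 * t)


# One cycle of the tone, sampled at 8 kHz with 4-bit resolution.
codes = sample_and_quantize(tone_1khz, sample_rate=8000, duration=0.001, bits=4)
print(codes)  # [0, 5, 7, 5, 0, -5, -7, -5]
```

Each list entry is one sample; its position in the list plays the role of the ordinal number that fixes its place in time.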

Noise performance

For electronic audio signals, sources of noise include unavoidable mechanical, electrical and thermal noise in the recording and playback cycle, arising from mechanical transducers (microphones, loudspeakers), amplifiers, recording equipment, the mastering process, reproduction equipment, and so on. Whether an audio signal is, at some stage, converted into a digital form will affect how much effective noise is added, due to the partial immunity to noise that the digital regenerative effect provides. The actual process of digital conversion will always add some noise, however small in intensity; the bulk of this in a high-quality system is quantization noise, which cannot be avoided even in theory, but some will also be electrical, thermal and other noise from the analog-to-digital converter device.

The amount of noise that a piece of audio equipment adds to the original signal can be quantified. Mathematically, this can be expressed by means of the signal-to-noise ratio (SNR). Sometimes the maximum possible dynamic range of the system is quoted instead. In a digital system, the number of quantization levels, in binary systems determined by and typically stated in terms of the number of bits, will have a bearing on the level of noise and distortion added to that signal. The 16-bit digital system of Red Book audio CD has 2¹⁶ = 65,536 possible signal amplitudes, theoretically allowing for an SNR of 98 dB (Sony Europe 2001) and dynamic range of 96 dB.
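The two figures quoted for 16 bits can be reproduced from the standard textbook expressions for an ideal quantizer, which are not spelled out in the sources cited here: the dynamic range is the ratio of the number of levels to one level, 20·log10(2^N), and the SNR for a full-scale sine wave against uniformly distributed quantization noise is that value plus 10·log10(1.5), about 1.76 dB. The function names below are illustrative.

```python
import math


def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in dB (about 6.02 dB per bit)."""
    return 20 * math.log10(2 ** bits)


def ideal_sine_snr_db(bits):
    """Full-scale sine wave versus ideal quantization noise (6.02 * bits + 1.76 dB)."""
    return dynamic_range_db(bits) + 10 * math.log10(1.5)


print(round(dynamic_range_db(16), 1))   # 96.3
print(round(ideal_sine_snr_db(16), 1))  # 98.1
```

These are theoretical ceilings for an ideal converter; practical converters add their own analog noise and fall somewhat short of them.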

In order to meet the theoretical maximum performance of a 16-bit digital system, for a 0.5 V peak-to-peak input line signal, a PCM (pulse code modulation) quantizer would require an equivalent minimum input sensitivity of just 7.629 microvolts. For an analog recorder, this is equivalent to a 15.3 ppm sensitivity for the whole recording system and medium. With digital systems, the quality of reproduction depends on the analog-to-digital and digital-to-analog conversion steps, and does not depend on the quality of the recording medium, provided it is adequate to retain the digital values without an excessive error rate (exceeding the capacity of any error correction mechanisms used in the system).
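The sensitivity figures above follow from dividing the input range by the number of quantization levels. A short check, with illustrative function and variable names:

```python
def step_size_volts(peak_to_peak_volts, bits):
    """Voltage spanned by one quantization level."""
    return peak_to_peak_volts / 2 ** bits


step = step_size_volts(0.5, 16)
relative_ppm = 1e6 / 2 ** 16          # one level as parts per million of full range

print(round(step * 1e6, 3))           # 7.629 (microvolts)
print(round(relative_ppm, 1))         # 15.3 (ppm)
```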

Typically anything below 14 bits can lead to perceptibly reduced sound quality, with 80 dB of SNR considered as an informal "minimum" for Hi-Fi audio. However, it is uncommon to find digital media specified for less than 14 bits, except for older 12-bit PCM Camcorder audio (or DAT in long-play, 32 kHz mode) and the output from older or lower-cost computer software, sound cards/circuitry, consoles and games (typically 8 bit as a minimum and standard, though trick sample output methods for generally non-PCM hardware [e.g. FM synthesis cards including the Adlib card] gave SNR performances closer to that of an ideal "6" or "4" bit PCM digital converter).

Each additional quantization bit theoretically adds about 6 dB in possible dynamic range, e.g. 24 × 6 = 144 dB for 24-bit quantization, 126 dB for 21-bit, and 120 dB for 20-bit.

Analog systems

Consumer analog cassette tapes may have a dynamic range of 60 to 70 dB. Analog FM broadcasts rarely have a dynamic range exceeding 50 dB. The dynamic range of a direct-cut vinyl record may surpass 70 dB. Analog studio master tapes using Dolby-A noise reduction can have a dynamic range of around 80 dB.

Rumble

"Rumble" is a form of noise characteristic of poor or worn turntables. Because of imperfections in the bearings of turntables, the platter tends to have a slight amount of motion besides the desired rotation—the turntable surface also moves up-and-down and side-to-side slightly. This additional motion is added to the desired signal as noise, usually of very low frequencies, creating a "rumbling" sound during quiet passages. Very inexpensive turntables sometimes used ball bearings which are very likely to generate audible amounts of rumble. More expensive turntables tend to use massive sleeve bearings which are much less likely to generate offensive amounts of rumble. Increased turntable mass also tends to lead to reduced rumble. A good turntable should have rumble at least 60 dB below the specified output level from the pick-up (Driscoll 1980:79-82).

Wow and flutter

Wow and flutter are the result of imperfections in the mechanical performance of analog devices. Wow and flutter are most noticeable on signals which contain pure tones. As an example, 0.22% (rms) wow may be detectable by listeners with piano music, but this increases to 0.56% with jazz music. For LP records, the quality of the turntable will have a large effect on the level of wow and flutter. A good turntable will have wow and flutter values of less than 0.05%, which is the speed variation compared to the mean value (Driscoll 1980). Wow and flutter can also be present in the recording, as a result of the imperfect operation of the recorder, but in commercial published recordings the inbuilt wow and flutter is usually very low, much lower than the wow and flutter generated by the playback equipment.

Frequency response

The frequency response of audio CD is sufficiently wide to cover the entire normal audible range, which roughly extends from 20 Hz to 20 kHz. (Hearing varies among individuals: some can hear frequencies slightly beyond these limits, while older people or those with hearing damage may hear less of the high frequencies.) Commercial and industrial digital recorders record higher frequencies, while consumer systems inferior to the CD record a more restricted frequency range. Analog audio is unrestricted in its possible frequency response, but the limitations of the particular analog format will provide a cap.

For digital systems, the maximum audio frequency response is "hardcoded" by the sampling frequency. The choice of sample rate used in a digital system is based on the Nyquist-Shannon sampling theorem. This states that a sampled signal can be reproduced exactly as long as it is sampled at a frequency greater than twice the bandwidth of the signal. Therefore a sampling rate just above 40 kHz would be enough to capture all the information contained in a signal having frequency bandwidth up to 20 kHz. The difficulty arises in removing all the signal content above 20 kHz, and unless this is done, aliasing of these higher frequencies may occur. The result then is that these higher, inaudible frequencies alias to frequencies which are in the audible range, producing a kind of distortion. To prevent aliasing, it is not necessary to design a brick-wall anti-aliasing filter, that is, a filter which perfectly removes all frequency content above (or below) a certain cutoff frequency. (It is impossible to build a filter with a perfectly square cutoff characteristic, as the filter would have an impulse response which is a sinc function and so is not causal.) Instead, a sample rate is usually chosen which is above the theoretical requirement. This is called oversampling, and allows a less severe (and less expensive) anti-aliasing filter to be used.
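Where an unfiltered tone ends up after sampling can be worked out by folding its frequency about multiples of the sample rate. The sketch below uses the CD rate of 44.1 kHz and a hypothetical 25 kHz input tone; the function name is illustrative. It also checks numerically that the two tones yield the same sample values, which is why they cannot be told apart once sampled.

```python
import math


def alias_frequency(f, fs):
    """Frequency (Hz) at which a tone of f Hz appears after sampling at fs Hz."""
    f_mod = f % fs
    return min(f_mod, fs - f_mod)


FS = 44100
print(alias_frequency(25000, FS))  # 19100: an inaudible tone lands in the audible band
print(alias_frequency(10000, FS))  # 10000: below fs/2, so unchanged

# The 25 kHz tone and its 19.1 kHz alias produce identical samples.
same_samples = all(
    math.isclose(
        math.cos(2 * math.pi * 25000 * n / FS),
        math.cos(2 * math.pi * 19100 * n / FS),
        abs_tol=1e-9,
    )
    for n in range(10)
)
print(same_samples)  # True
```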

High quality open-reel tape frequency response can extend from 10 Hz to well above 20 kHz. The linearity of the response may be indicated by providing information on the level of the response relative to a reference frequency. For example, a system component may have a response given as 20 Hz to 20 kHz +/- 3 dB relative to 1 kHz. Some analog tape manufacturers specify frequency responses up to 20 kHz, but these measurements may have been made at low signal levels (Driscoll 1980). High-quality metal-particle compact cassettes may have a response extending up to 14 kHz at full (0 dB) recording level (Stark 1989). At lower levels, cassettes typically are limited at the upper end to around 17 kHz for the best machines, due to the nature of the tape media and the tape speed chosen by Philips for the format (which was originally designed for dictation.)

The frequency response for a conventional LP player might be 20 Hz - 20 kHz +/- 3 dB. Unlike the audio CD, vinyl records (and cassettes) do not require a cut-off in response above 20 kHz. The low frequency response of vinyl records is restricted by rumble noise (described above). The high frequency response of vinyl depends on the record itself and on the cartridge. CD4 records contained frequencies up to 50 kHz, while some high-end turntable cartridges have frequency responses of 120 kHz while having flat frequency response over the audible band (e.g. 20 Hz to 15 kHz +/-0.3 dB).^[1] In addition, frequencies of up to 122 kHz have been experimentally cut on LP records.^[2]

In comparison, the CD system offers a frequency response of 20 Hz – 20 kHz ±0.5 dB, with a superior dynamic range over the entire audible frequency spectrum (Sony Europe 2001).

With vinyl records, there will be some loss in fidelity on each playing of the disc. This is due to the wear of the stylus in contact with the record surface. A good quality stylus, matched with a correctly set up pick-up arm, should cause minimal surface wear. Magnetic tapes, both analog and digital, wear from friction between the tape and the heads, guides, and other parts of the tape transport as the tape slides over them. The brown residue deposited on swabs during cleaning of a tape machine's tape path is actually particles of magnetic coating shed from tapes. Tapes can also suffer creasing, stretching, and frilling of the edges of the plastic tape base, particularly from low-quality or out-of-alignment tape decks. When a CD is played, there is no physical contact involved, and the data is read optically using a laser beam. Therefore no such media deterioration takes place, and the CD will, with proper care, sound exactly the same every time it is played (discounting aging of the player and CD itself); however, this is a benefit of the optical system, not of digital recording, and the Laserdisc format enjoys the same non-contact benefit with analog optical signals. Recordable CDs slowly degrade with time, a process called disc rot, even if they are stored properly and never played.^[3]

Analog advantages

It can be argued that analog formats retain some inherent advantages over digital formats. The relevance of these advantages depends on the quality of specific digital or analog equipment. The advantages of analog systems are summarised below:

Absence of aliasing distortion
Absence of quantization noise
Behaviour in overload conditions

Aliasing

Unlike digital audio systems, analog systems do not require filters for bandlimiting. These filters act to prevent aliasing distortions in digital equipment. Early digital systems may have suffered from a number of signal degradations related to the use of analog anti-aliasing filters, e.g., time dispersion, nonlinear distortion, temperature dependence of filters etc. (Hawksford 1991:8).

Jitter

One aspect that may prevent the performance of practical digital systems from meeting their theoretical performance is jitter. This is the name given to variations in the time spacing of the discrete samples within the stream of samples that make up a (decoded) digital signal. This can be due to timing inaccuracies of the digital clock. Ideally a digital clock should produce a timing pulse at exactly regular intervals. Other sources of jitter within digital electronic circuits are data-induced jitter, where one part of the digital stream affects a subsequent part as it flows through the system, and power supply induced jitter, where ripple on the DC power supply output rails causes irregularities in the timing of signals in circuits powered from those rails.

The accuracy of a digital system is dependent on the sampled values, known as quantised values, which exist in the amplitude realm, but it is also dependent on the timing regularity of the discrete values which exist in the temporal realm. This dependency on accuracy of discrete values in the temporal realm is inherent to digital recording and playback and has no analog equivalent, though analog systems have their own temporal distortion effects (pitch error and wow-and-flutter).

Periodic jitter produces modulation noise and can be thought of as being the equivalent of analog flutter (Rumsey & Watkinson 1995). Random jitter alters the noise floor of the digital system. The sensitivity of the converter to jitter depends on the design of the converter. It has been shown that a random jitter of 5 ns (nanoseconds) may be significant for 16 bit digital systems (Rumsey & Watkinson 1995). For a more detailed description of jitter theory, refer to Dunn (2003).
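A rough sense of why nanosecond-level jitter can matter for a 16-bit system comes from a widely used approximation that is not taken from the sources cited here: for a full-scale sine wave of frequency f sampled with random clock jitter of rms value σ, the jitter-limited signal-to-noise ratio is about −20·log10(2π·f·σ) dB. The sketch below applies it under those assumptions; the function names and the choice of test frequencies are illustrative.

```python
import math


def jitter_limited_snr_db(freq_hz, jitter_rms_s):
    """Approximate SNR ceiling set by random sampling-clock jitter for a full-scale sine."""
    return -20 * math.log10(2 * math.pi * freq_hz * jitter_rms_s)


def max_jitter_for_snr(snr_db, freq_hz):
    """RMS jitter (seconds) at which the jitter-limited SNR equals snr_db."""
    return 10 ** (-snr_db / 20) / (2 * math.pi * freq_hz)


# 5 ns of random jitter on a 17 kHz tone.
print(round(jitter_limited_snr_db(17e3, 5e-9), 1))   # 65.4

# Jitter at which a 20 kHz tone would just reach the ideal 16-bit SNR of 98 dB.
print(max_jitter_for_snr(98, 20e3))                   # roughly 1e-10 s (100 ps)
```

Under this approximation, 5 ns of random jitter limits a high-frequency tone to well below the 98 dB that a 16-bit system could otherwise offer, although the effect shrinks in proportion to signal frequency and level, so real programme material is affected far less than a full-scale test tone.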

Jitter can degrade sound quality in digital audio systems. In 1998, Benjamin and Gannon researched the audibility of jitter using listening tests (Dunn 2003:34). They found that the lowest level of jitter to be audible was around 10 ns (rms). This was on a 17 kHz sine wave test signal. With music, no listeners found jitter audible at levels lower than 20 ns. A paper by Ashihara et al. (2005) attempted to determine the detection thresholds for random jitter in music signals. Their method involved ABX listening tests. When discussing their results, the authors of the paper commented that:

'So far, actual jitter in consumer products seems to be too small to be detected at least for reproduction of music signals. It is not clear, however, if detection thresholds obtained in the present study would really represent the limit of auditory resolution or it would be limited by resolution of equipment. Distortions due to very small jitter may be smaller than distortions due to non-linear characteristics of loudspeakers. Ashihara and Kiryu [8] evaluated linearity of loudspeaker and headphones. According to their observation, headphones seem to be more preferable to produce sufficient sound pressure at the ear drums with smaller distortions than loudspeakers.' [2]

On the Internet-based hi-fi website TNT Audio, Pozzoli (2005) describes some audible effects of jitter. His assessment appears to run contrary to the earlier papers mentioned:

'In my personal experience, and I would dare say in common understanding, there is a huge difference between the sound of low and high jitter systems. When the jitter amount is very high, as in very low cost CD players (2ns), the result is somewhat similar to wow and flutter, the well known problem that affected typically compact cassettes (and in a far less evident way turntables) and was caused by the non perfectly constant speed of the tape: the effect is similar, but here the variations have a far higher frequency and for this reasons are less easy to perceive but equally annoying. Very often in these cases the rhythmic message, the pace of the most complicated musical plots is partially or completely lost, music is dull, scarcely involving and apparently meaningless, it does not make any sense. Apart for harshness, the typical "digital" sound, in a word... In lower amounts, the effect above is difficult to perceive, but jitter is still able to cause problems: reduction of the soundstage width and/or depth, lack of focus, sometimes a veil on the music. These effects are however far more difficult to trace back to jitter, as can be caused by many other factors.' [3]

Quantization noise

Analog systems do not have discrete digital levels in which the signal is encoded. Consequently, the original signal can be preserved to an accuracy limited only by the intrinsic noise-floor and maximum signal level of the media and the playback equipment, i.e., the dynamic range of the system. With digital systems, noise added due to quantization into discrete levels can be more audibly disturbing than the noise-floor in analog systems. This form of distortion, sometimes called granular or quantization distortion, has been pointed to as a fault of some digital systems and recordings (Knee & Hawksford 1995, Stuart n.d.:6). Knee & Hawksford (1995:3) drew attention to the deficiencies in some early digital recordings, where the digital release was said to be inferior to the analog version. The quantization noise level is directly determined by the number of bits of quantization resolution, decreasing exponentially with it (or linearly in dB units), and with an adequate number of true bits of quantization, random noise from other sources will dominate and completely mask the quantization noise.

Overload conditions and dynamic range

There are some differences in the behaviour of analog and digital systems when high level signals are present, where there is the possibility that such signals could push the system into overload. With high level signals, analog magnetic tape approaches saturation, and high frequency response drops in proportion to low frequency response. While undesirable, the audible effect of this can be reasonably unobjectionable (Elsea 1996). In contrast, digital PCM recorders show non-benign behaviour in overload (Dunn 2003:65); samples that exceed the peak quantization level are simply truncated, clipping the waveform squarely, which introduces distortion in the form of large quantities of higher-frequency harmonics. The 'softness' of analog tape clipping allows a usable dynamic range that can exceed that of some PCM digital recorders. (PCM, or pulse code modulation, is the coding scheme used in Compact Disc, DAT, PC sound cards, and many studio recording systems.)
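The difference between square clipping and a gentler overload characteristic can be illustrated numerically. In the sketch below a tanh curve stands in for 'soft' saturation; this is purely an assumption for illustration and not a model of magnetic tape. The same overdriven sine wave is passed through both characteristics, and the energy in the 9th to 31st harmonics is compared using a directly computed Fourier sum. All names and the drive level are illustrative.

```python
import math

N = 512        # samples in exactly one cycle of the test tone
DRIVE = 2.0    # input peak is twice the full-scale level of 1.0


def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic of one cycle, from a direct Fourier sum."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n


def high_harmonic_energy(samples, lowest=9, highest=31):
    """Sum of squared harmonic amplitudes from `lowest` to `highest` inclusive."""
    return sum(harmonic_amplitude(samples, k) ** 2 for k in range(lowest, highest + 1))


sine = [DRIVE * math.sin(2 * math.pi * i / N) for i in range(N)]
hard_clipped = [max(-1.0, min(1.0, x)) for x in sine]   # squared-off peaks
soft_clipped = [math.tanh(x) for x in sine]             # assumed smooth saturation

hard_energy = high_harmonic_energy(hard_clipped)
soft_energy = high_harmonic_energy(soft_clipped)
print(hard_energy > soft_energy)  # True
```

The squared-off waveform has sharp corners, so its harmonics fall away slowly with frequency, whereas the smooth curve concentrates its distortion in the lowest few harmonics.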

Counter-arguments

Aliasing distortion

The mentioned disadvantages of digital audio systems have been the subject of discussion. With regard to aliasing distortion, Hawksford (1991:18) highlighted the advantages of digital converters which operate at higher than the Nyquist rate (i.e., oversampling converters). Using an oversampling design and a modulation scheme called sigma-delta modulation (SDM), analog anti-aliasing filters can effectively be replaced by a digital filter. This approach has several advantages. The digital filter can be made to have a near-ideal transfer function, with low in-band ripple, and no aging or thermal drift.

Quantization

It is possible to make quantization noise more audibly benign by applying dither. To do this, a noise-like signal is added to the original signal before quantization. Dither makes the digital system behave as if it has an analog noise-floor. Optimal use of dither (triangular probability density function dither in PCM systems) has the effect of making the rms quantization error independent of signal level (Dunn 2003:143), and allows signal information to be retained below the least significant bit of the digital system (Stuart n.d.:3).
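How dither lets information below one quantization step survive can be shown with a constant input of 0.3 of a step. In this minimal sketch, triangular-PDF dither spanning ±1 step is generated as the difference of two uniform random numbers; the input level, sample count and seed are arbitrary choices for illustration.

```python
import random


def quantize(x):
    """Ideal quantizer with a step size of 1 (one least significant bit)."""
    return round(x)


def tpdf_dither(rng):
    """Triangular-PDF dither spanning -1 to +1 step: difference of two uniform values."""
    return rng.random() - rng.random()


rng = random.Random(0)   # fixed seed so the run is repeatable
level = 0.3              # a signal smaller than one quantization step
count = 20000

undithered_mean = sum(quantize(level) for _ in range(count)) / count
dithered_mean = sum(quantize(level + tpdf_dither(rng)) for _ in range(count)) / count

print(undithered_mean)   # 0.0: the signal vanishes entirely
print(dithered_mean)     # close to 0.3: the level survives in the average
```

Without dither the input is always rounded to zero and is lost; with dither each individual sample is noisier, but the average of the output tracks the true input, which is the sense in which the system behaves as if it had an analog noise-floor.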

Overload conditions and dynamic range

In principle, PCM digital systems have the lowest level of nonlinear distortion at full signal amplitude. The opposite is usually true of analog systems, where distortion tends to increase at high signal levels. A study by Manson (1980) considered the requirements of a digital audio system for high quality broadcasting. It concluded that a 16 bit system would be sufficient, but noted the small reserve the system provided in ordinary operating conditions. For this reason, it was suggested that a fast-acting signal limiter or 'soft clipper' be used to prevent the system from becoming overloaded (Manson 1980:8).

With many recordings, high level distortions at signal peaks may be audibly masked by the original signal, thus large amounts of distortion may be acceptable at peak signal levels. The difference between analog and digital systems is the form of high-level signal error. Some early analog-to-digital converters displayed non-benign behaviour when in overload, where the overloading signals were 'wrapped' from positive to negative full-scale. Modern converter designs based on sigma-delta modulation may become unstable in overload conditions. It is usually a design goal of digital systems to limit high-level signals to prevent overload (Dunn 2003:65). To prevent overload, a modern digital system may compress input signals so that digital full-scale cannot be reached (Jones et al. 2003:4).
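The two overload behaviours mentioned above, clipping at full scale and 'wrapping' from positive to negative full-scale, can be compared for 16-bit sample values. The function names are illustrative, and the wrap shown is plain two's-complement overflow, used here as an assumed stand-in for the converter behaviour described.

```python
FULL_SCALE_MAX = 32767     # largest 16-bit signed sample value
FULL_SCALE_MIN = -32768    # smallest 16-bit signed sample value


def clip16(x):
    """Hold an out-of-range value at the nearest full-scale limit."""
    return max(FULL_SCALE_MIN, min(FULL_SCALE_MAX, x))


def wrap16(x):
    """Two's-complement overflow: out-of-range values wrap to the opposite polarity."""
    return (x + 32768) % 65536 - 32768


overload = 33000           # slightly above positive full scale
print(clip16(overload))    # 32767
print(wrap16(overload))    # -32536
```

A small overshoot costs only a flattened peak when clipped, but when wrapped it produces a jump of almost the entire signal range, which is why wrapping is by far the less benign of the two.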

The dynamic range of digital audio systems can exceed that of analog audio systems. Typically, a 16-bit analog-to-digital converter may have a dynamic range of between 90 and 95 dB (Metzler 2005:132), whereas the signal-to-noise ratio (roughly the equivalent of dynamic range, noting the absence of quantization noise but presence of tape hiss) of a professional reel-to-reel 1/4 inch tape recorder would be between 60 and 70 dB at the recorder's rated output (Metzler 2005:111).

The benefits of using digital recorders with greater than 16 bit accuracy can be applied to the 16 bits of audio CD. Stuart (n.d.:3) stresses that with the correct dither, the resolution of a digital system is infinite, and that it is possible, for example, to resolve sounds at -110 dB (below digital full-scale) in a well-designed 16 bit channel.

Sound quality

Subjective evaluation

Subjective evaluation attempts to measure how well an audio component performs according to the human ear. The most common form of subjective test is a listening test, where the audio component is simply used in the context for which it was designed. This test is popular with hi-fi reviewers, where the component is used for a length of time by the reviewer who then will describe the performance in subjective terms. Common descriptions include whether the component has a 'bright' or 'dull' sound, or how well the component manages to present a 'spatial image'.

Another type of subjective test is done under more controlled conditions and attempts to remove possible bias from listening tests. These sorts of tests are done with the component hidden from the listener, and are called blind tests. To prevent possible bias from the person running the test, the blind test may be done so that this person is also unaware of the component under test. This type of test is called a double-blind test. This sort of test is often used to evaluate the performance of digital audio codecs.

Critics of double-blind tests argue that they do not allow the listener to feel fully relaxed when evaluating the system component, and that the listener therefore cannot judge differences between components as well as in sighted (non-blind) tests. Those who employ the double-blind testing method may try to reduce listener stress by allowing a certain amount of time for listener training (Borwick et al. 1994:481-488).

Early digital recordings

Early digital audio machines had disappointing results, with digital converters introducing errors that the ear could detect (Watkinson 1994). Record companies released their first LPs based on digital audio masters in the late 1970s. CDs became available in the early 1980s. At this time analog sound reproduction was a mature technology.

There was a mixed critical response to early digital recordings released on CD. Compared with the vinyl record, the CD was noticed to be far more revealing of the acoustics and ambient background noise of the recording environment (Greenfield et al. 1986). For this reason, recording techniques developed for analog disc, e.g., microphone placement, needed to be adapted to suit the new digital format (Greenfield et al. 1986).

Some analog recordings were remastered for digital formats. Analog recordings made in natural concert hall acoustics tended to benefit from remastering (Greenfield et al. 1990). The remastering process was occasionally criticised for being poorly handled. When the original analog recording was fairly bright, remastering sometimes resulted in an unnatural treble emphasis (Greenfield et al. 1990).

Higher sampling rates

CD quality audio is sampled at 44.1 kHz (Nyquist frequency = 22.05 kHz) and at 16 bits. Sampling the waveform at higher frequencies and allowing for a greater number of bits per sample allows noise and distortion to be reduced further. DAT can store audio at up to 48 kHz, while DVD-Audio can be 96 or 192 kHz and up to 24 bits resolution. With any of these sampling rates, signal information is captured above what is generally considered to be the human hearing range.
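The relationship between sampling rate and Nyquist frequency is a simple halving, as the short sketch below shows for the formats mentioned. The 20 kHz figure used for the nominal upper limit of hearing is an assumption for illustration, and the dictionary labels are informal.

```python
HEARING_LIMIT_HZ = 20_000  # assumed nominal upper limit of human hearing


def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency that can be represented at the given sampling rate."""
    return sample_rate_hz / 2


formats = {
    "CD": 44_100,
    "DAT (maximum)": 48_000,
    "DVD-Audio": 96_000,
    "DVD-Audio (maximum)": 192_000,
}
for name, rate in formats.items():
    margin = nyquist_hz(rate) - HEARING_LIMIT_HZ
    print(f"{name}: Nyquist {nyquist_hz(rate) / 1000:g} kHz, "
          f"{margin / 1000:g} kHz above the assumed hearing limit")
```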

Work done in 1980 by Muraoka et al. (J. Audio Eng. Soc., Vol. 29, pp. 2–9) showed that music signals with frequency components above 20 kHz were only distinguished from those without by a few of the 176 test subjects (Kaoru & Shogo 2001). Later papers, however, by a number of different authors, have led to a greater discussion of the value of recording frequencies above 20 kHz. Such research led some to the belief that capturing these ultrasonic sounds could have some audible benefit. Audible differences were reported between recordings with and without ultrasonic responses. Dunn (1998) examined the performance of digital converters to see if these differences in performance could be explained [4]. He did this by examining the band-limiting filters used in converters and looking for the artifacts they introduce.

A perceptual study by Nishiguchi et al. (2004) concluded that "no significant difference was found between sounds with and without very high frequency components among the sound stimuli and the subjects... however, [Nishiguchi et al.] can still neither confirm nor deny the possibility that some subjects could discriminate between musical sounds with and without very high frequency components." [4]

Super Audio CD and DVD Audio

The Super Audio CD (SACD) format was created by Sony and Philips, who were also the developers of the earlier standard audio CD format. SACD uses Direct Stream Digital, which works quite differently from the PCM format discussed in this article. Instead of using a greater number of bits and attempting to record a signal's precise amplitude for every sample cycle, a Direct Stream Digital recorder encodes the signal as a stream of one-bit pulses of fixed amplitude whose density varies with the signal (pulse-density modulation). The competing DVD-Audio format uses standard, linear PCM at variable sampling rates and bit depths, which at the very least match and usually greatly surpass those of standard CD audio (16 bits, 44.1 kHz).

A Direct Stream Digital (DSD) recorder uses sigma-delta modulation. Originally DSD recorders operated at 64 times the CD sampling rate (44.1 kHz), that is, at about 2.8 MHz. The output from a DSD recorder alternates between levels representing 'on' and 'off' states, and is a binary signal (called a bitstream). The long-term average of this signal is proportional to the original signal. In principle, the retention of the bitstream in DSD allows the SACD player to use a basic (one bit) DAC design which incorporates a low-order analog filter.
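The statement that the long-term average of the bitstream follows the input can be demonstrated with a simple modulator. The following is a minimal sketch assuming a first-order sigma-delta loop with a constant input; practical DSD converters use higher-order designs, and the function name is illustrative.

```python
def sigma_delta_first_order(samples):
    """First-order sigma-delta modulator returning a list of +1/-1 output bits."""
    integrator = 0.0
    feedback = 0.0  # previous output bit, fed back and subtracted from the input
    bits = []
    for x in samples:
        integrator += x - feedback
        bit = 1.0 if integrator >= 0 else -1.0
        bits.append(bit)
        feedback = bit
    return bits


dsd_rate_hz = 64 * 44_100
print(dsd_rate_hz)  # 2822400, i.e. about 2.8 MHz

level = 0.25  # constant input, as a fraction of full-scale
bits = sigma_delta_first_order([level] * 6400)
print(sum(bits) / len(bits))  # close to 0.25: the bit density tracks the input
```

Each output is only ever +1 or -1, yet averaging the stream (which is what a low-pass filter does) recovers the input level.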

There are fundamental distortion mechanisms present in the conventional implementation of DSD (Hawksford 2001). These distortion mechanisms can be alleviated to some degree by using digital converters with a multibit design. Historically, state-of-the-art ADCs were based around sigma-delta modulation designs. Oversampling converters are frequently used in linear PCM formats, where the ADC output is subject to bandlimiting and dithering (Hawksford 1995). Many modern converters use oversampling and a multibit design.

In the popular Hi-Fi press, it has been suggested that linear PCM "creates [a] stress reaction in people", and that DSD "is the only digital recording system that does not [...] have these effects" (Hawksford 2001). A double-blind subjective test between high resolution linear PCM (DVD-Audio) and DSD did not reveal a statistically significant difference [5]. Listeners involved in this test noted their great difficulty in hearing any difference between the two formats.
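Whether a listening-test result is statistically significant is commonly judged against what guessing alone would produce. The sketch below assumes a forced-choice test in which each trial has a 50% chance of being answered correctly by guessing, and computes a one-sided binomial probability. The trial counts are hypothetical and are not taken from the study mentioned above.

```python
from math import comb


def p_value_at_least(correct: int, trials: int) -> float:
    """Probability of scoring `correct` or more out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


print(round(p_value_at_least(12, 16), 4))  # 0.0384: below the usual 0.05 threshold
print(round(p_value_at_least(9, 16), 4))   # 0.4018: consistent with guessing
```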

Analog warmth

Some audio enthusiasts prefer the sound of vinyl records over that of a CD. Harry Pearson, founder and editor of The Absolute Sound journal, says that "LPs are decisively more musical. CDs drain the soul from music. The emotional involvement disappears" [6]. Dub producer Adrian Sherwood has similar feelings about the analog cassette tape, which he prefers because of its warm sound [7].

Those who favour the digital format point to the results of blind tests, which demonstrate the high performance possible with digital recorders [8], [9]. The assertion is that the 'analog sound' is more a product of analog format inaccuracies than anything else. One of the first and largest supporters of digital audio was the classical conductor Herbert von Karajan, who said that digital recording was "definitely superior to any other form of recording we know". He also pioneered the unsuccessful Digital Compact Cassette and conducted the first recording ever to be commercially released on CD: Richard Strauss's Eine Alpensinfonie.

Was it ever entirely analog or digital?

Complicating the discussion is that recording professionals often mix and match analog and digital techniques in the process of producing a recording. Analog signals can be subjected to digital signal processing or effects, and inversely digital signals are converted back to analog in equipment that can include analog steps such as vacuum tube amplification.

For modern recordings, the controversy between analog recording and digital recording is becoming moot. No matter what format the user uses, the recording was probably digital at several stages in its life. In the case of video recordings it is moot for one other reason: whether the format is analog or digital, digital signal processing is likely to have been used at some stage, such as digital timebase correction on playback.

An additional complication arises when comparing human perception of analog and digital audio, in that the human ear itself is an analog-digital hybrid. The human hearing mechanism begins with the tympanic membrane transferring vibrational motion through the middle ear's mechanical system of three bones (malleus, incus and stapes) into the cochlea, where hair-like nerve cells convert the vibrational stimulus into nerve impulses. Auditory nerve impulses are best described as "clicks", which result when synapses release neurotransmitting chemicals. The auditory information thus entering the brain is, strictly speaking, digital in nature. The brain then processes the incoming information and produces an impression of the original analog input to the ear canal.

It is also worth noting two issues that affect the perception of sound playback. The first is the dynamic range of the human ear, which for practical and hearing-safety reasons might be regarded as 120 decibels, from a barely audible sound received by the ear in an otherwise silent environment to the threshold of pain or the onset of damage to the ear's delicate mechanism. The other critical issue is manifestly more complex: the presence and nature of background noise in any listening environment. Background noise reduces the useful dynamic range of hearing in ways that depend on the nature of the noise in the listening environment: its spectral content, its coherence or periodicity, angular aspects such as the localization of noise sources relative to the playback system's sources, and so on.

Hybrid systems

While the words analog audio usually imply that the sound is described using a continuous time, continuous amplitudes approach in both the media and the reproduction/recording systems, and the words digital audio imply a discrete time, discrete amplitudes approach, there are methods of encoding audio that fall somewhere between the two, e.g. continuous time, discrete levels and discrete time, continuous levels.

While not as common as "pure analog" or "pure digital" methods, these situations do occur in practice. Indeed, all analog systems show discrete (quantized) behaviour at the microscopic scale [10], and e.g. asynchronously operated class-D amplifiers even consciously incorporate continuous time, discrete amplitude designs. Continuous amplitude, discrete time systems have also been used in many early analog-to-digital converters, in the form of sample-and-hold circuits. The boundary is further blurred by digital systems which statistically aim at analog-like behavior, most often by utilizing stochastic dithering and noise shaping techniques. While vinyl records and common compact cassettes are analog media and use quasi-linear physical encoding methods (e.g. spiral groove depth, tape magnetic field strength) without noticeable quantization or aliasing, there are analog non-linear systems that exhibit effects similar to those encountered on digital ones, such as aliasing and "hard" dynamic floors (e.g. frequency modulated hi-fi audio on videotapes, PWM encoded signals).

Although these "hybrid" techniques are usually more common in telecommunications systems than in consumer audio, their existence alone blurs the line between certain digital and analog systems, at least with regard to some of their alleged advantages or disadvantages.

References

[1] Technics EPC-100CMK4
[2] http://www.positive-feedback.com/Issue2/mastering.htm
[3] http://www.clir.org/pubs/reports/pub121/pub121.pdf
[4] http://www.nhk.or.jp/strl/publica/labnote/lab486.html

Ashihara, K. et al. (2005). "Detection threshold for distortions due to jitter on digital audio", Acoustical Science and Technology, Vol. 26 (2005), No. 1 pp. 50–54.
Blech, D. & Yang, M. (2004). "Perceptual Discrimination of Digital Coding Formats", Audio Engineering Society Convention Paper 6086, May 2004.
Croll, M. (1970). "Pulse Code Modulation for High Quality Sound Distribution: Quantizing Distortion at Very Low Signal Levels", Research Department Report No. 1970/18, BBC.
Driscoll, R. (1980). Practical Hi-Fi Sound, 'Analogue and digital' (pages 61–64); 'The pick-up, arm and turntable' (pages 79–82). Hamlyn. ISBN 0 600 34627 7.
Dunn, J. (1998). "The benefits of 96 kHz sampling rate formats for those who cannot hear above 20 kHz", Preprint 4734, presented at the 104th AES Convention, May 1998.
Dunn, J. (2003). "Measurement Techniques for Digital Audio", Audio Precision Application Note #5, Audio Precision, Inc. USA. Retrieved March 9, 2008.
Elsea, P. (1996). "Analog Recording of Sound". Electronic Music Studios at the University of California, Santa Cruz. Retrieved 9 March 2008.
Ely, S. (1978). "Idle-channel noise in p.c.m. sound-signal systems". BBC Research Department, Engineering Division.
Greenfield, E. et al. (1986). The Penguin Guide to Compact Discs, Cassettes and LPs. Edited by Ivan March. Penguin Books, England.
Greenfield, E. et al. (1990). The Penguin Guide to Compact Discs. Edited by Ivan March. Preface, viii-ix. Penguin Books, England. ISBN 0 14 046887 0.
Hawksford, M. (1991). "Introduction to Digital Audio", Images of Audio, Proceedings of the 10th International AES Conference, London, September 1991. Retrieved March 9, 2008.
Hawksford, M. (1995). "Bitstream versus PCM debate for high-density compact disc", ARA/Meridian web page, November 1995.
Hawksford, M. (2001). "SDM versus LPCM: The Debate Continues", 110th AES Convention, paper 5397.
Hicks, C. (1995). "The Application of Dither and Noise-Shaping to Nyquist-Rate Digital Audio: an Introduction", Communications and Signal Processing Group, Cambridge University Engineering Department, United Kingdom.
Jones, W. et al. (2003). "Testing Challenges in Personal Computer Audio Devices". Paper presented at the 114th AES Convention. Audio Precision, Inc., USA. Retrieved March 9, 2008.
Kaoru, A. & Shogo, K. (2001). "Detection threshold for tones above 22 kHz", Audio Engineering Society Convention Paper 5401. Presented at the 110th Convention, 2001.
Knee, A. & Hawksford, M. (1995). "Evaluation of Digital Systems and Digital Recording Using Real Time Audio Data". Paper for the 98th AES Convention, February 1995, preprint 4003 (M-2).
Lesurf, J. "Analog or Digital?", The Scots Guide to Electronics. Retrieved October 2007.
Libbey, T. "Digital versus analog: digital music on CD reigns as the industry standard", Omni, February 1995.
Lipshitz, S. "The Digital Challenge: A Report", The BAS Speaker, Aug-Sept 1984.
Lipshitz, S. (2005). "The Rise of Digital Audio: The Good, the Bad, and the Ugly". Abstract of Heyser Memorial Lecture given by Prof. Stanley Lipshitz at the 118th AES Convention.
Liversidge, A. "Analog versus digital: has vinyl been wrongly dethroned by the music industry?", Omni, February 1995.
Manson, W. (1980). "Digital Sound: studio signal coding resolution for broadcasting". BBC Research Department, Engineering Division.
Nishiguchi, T. et al. (2004). "Perceptual Discrimination between Musical Sounds with and without Very High Frequency Components", NHK Laboratories Note No. 486, NHK (Japan Broadcasting Corporation).
Pozzoli, G. "DIGITabilis: crash course on digital audio interfaces. Part 1.4 - Enemy Interception. Effects of Jitter in Audio", "TNT-Audio - online HiFi review", 2005.
Pohlmann, K. (2005). Principles of Digital Audio 5th edn, McGraw-Hill Comp.
Rathmell, J. et al. (1997). "TDFD-based Measurement of Analog-to-Digital Converter Nonlinearity", Journal of the Audio Engineering Society, Volume 45, Number 10, pp. 832–840; October 1997.
Rumsey, F. & Watkinson, J. (1995). The Digital Interface Handbook, 2nd edition. Sections 2.5 and 6. Pages 37 and 154-160. Focal Press.
Sony Europe (2001). Digital Audio Technology 4th edn, edited by J. Maes & M. Vercammen. Focal Press.
Stark, C. (1989). Encyclopædia Britannica, 15th edition, Volume 27, Macropaedia article 'Sound', section: 'High-fidelity concepts and systems', page 625.
Stuart, J. (n.d.). "Coding High Quality Digital Audio". Meridian Audio Ltd, UK. Retrieved 9 March 2008. This article is substantially the same as Stuart's 2004 JAES article "Coding for High-Resolution Audio Systems", Journal of the Audio Engineering Society, Volume 52 Issue 3 pp. 117–144; March 2004.
Watkinson, J. (1994). An Introduction to Digital Audio. Section 1.2 'What is digital audio?', page 3; Section 2.1 'What can we hear?', page 26. Focal Press. ISBN 0 240 51378 9.
