[3.6] What are the audio details?
DVD comes in two home-entertainment flavors: DVD-Video
and DVD-Audio. Each supports high-definition multichannel audio,
but DVD-Audio includes higher-quality PCM audio.
[3.6.1] Details of DVD-Audio and SACD
LPCM is mandatory in DVD-Audio discs, with up to
6 channels at sample rates of 48/96/192 kHz (also 44.1/88.2/176.4
kHz) and sample sizes of 16/20/24 bits. This allows theoretical
frequency response of up to 96 kHz and dynamic range of up to 144
dB. Multichannel PCM is downmixable by the player, although at 192
and 176.4 kHz only two channels are available. Sampling rates and
sizes can vary for different channels by using a predefined set
of groups. The maximum data rate is 9.6 Mbps.
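The figures above follow from simple arithmetic. Here is a rough sketch in Python. The function names are made up for illustration, and the 6.02 dB-per-bit rule is the usual approximation rather than anything taken from the DVD-Audio spec. It also shows why 6 channels of 96-kHz/24-bit audio do not fit under 9.6 Mbps without packing.

```python
# Back-of-the-envelope figures for DVD-Audio LPCM (illustrative only).
MAX_RATE_BPS = 9_600_000  # maximum data rate quoted above (9.6 Mbps)


def nyquist_khz(sample_rate_khz):
    """Highest representable frequency is half the sample rate."""
    return sample_rate_khz / 2


def dynamic_range_db(bits):
    """Rule of thumb: about 6.02 dB per bit of sample size."""
    return 6.02 * bits


def raw_rate_bps(channels, sample_rate_khz, bits):
    """Uncompressed LPCM rate, ignoring any container overhead."""
    return round(channels * sample_rate_khz * 1000 * bits)


def fits_uncompressed(channels, sample_rate_khz, bits):
    """True if the raw LPCM rate is within the 9.6 Mbps limit."""
    return raw_rate_bps(channels, sample_rate_khz, bits) <= MAX_RATE_BPS


print(nyquist_khz(192), "kHz")
print(raw_rate_bps(6, 96, 24), "bps for 6 channels at 96 kHz / 24 bits")
print(fits_uncompressed(6, 96, 24), fits_uncompressed(2, 192, 24))
```

Two channels at 192 kHz/24 bits squeeze in under the limit, which is consistent with only two channels being available at the highest sampling rates.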
The DVD Forum's Working Group for audio (WG4) decided
to include lossless compression, and on August 5, 1998 approved
Meridian's MLP (Meridian Lossless
Packing) scheme, licensed by Dolby. MLP removes redundancy from
the signal to achieve a compression ratio of about 2:1 while allowing
the PCM signal to be completely recreated by the MLP decoder that's
required in all DVD-Audio players. MLP allows playing times of about
74 to 135 minutes of 6-channel 96-kHz/24-bit audio on a single layer
(compared to 45 minutes without packing). Two-channel 192-kHz/24-bit
playing times are about 120 to 140 minutes (compared to 67 minutes
without packing).
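The unpacked playing times can be roughly reproduced from the raw bit rate. This sketch assumes a nominal single-layer capacity of 4.7 billion bytes and ignores all formatting overhead, so treat the results as estimates. The packing_ratio parameter is a hypothetical stand-in for whatever MLP achieves on a given recording.

```python
SINGLE_LAYER_BYTES = 4.7e9  # assumed nominal capacity of one layer


def playing_minutes(channels, sample_rate_hz, bits, packing_ratio=1.0):
    """Estimated minutes of audio on one layer.

    packing_ratio is the assumed lossless compression ratio
    (1.0 means no packing, 2.0 means about 2:1).
    """
    rate_bps = channels * sample_rate_hz * bits / packing_ratio
    return SINGLE_LAYER_BYTES * 8 / rate_bps / 60


print(round(playing_minutes(6, 96_000, 24)), "min, 6 ch 96/24 unpacked")
print(round(playing_minutes(6, 96_000, 24, 2.0)), "min, 6 ch 96/24 at 2:1")
print(round(playing_minutes(2, 192_000, 24)), "min, 2 ch 192/24 unpacked")
```

The range of packed playing times quoted above reflects the fact that the actual ratio depends on the material.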
Other audio formats of DVD-Video (Dolby Digital,
MPEG audio, and DTS, described below) are optional on DVD-Audio
discs, although Dolby Digital is required for audio content that
has associated video. A subset of DVD-Video features (no angles,
no seamless branching, etc.) is allowed. Most DVD-Audio players
are also "universal" players that play DVD-Video discs as well.
DVD-Audio includes specialized downmixing features
for PCM channels. Unlike DVD-Video, where the decoder determines
how to mix from 6 channels down to 2, DVD-Audio includes coefficient
tables to control mixdown and avoid volume buildup from channel
aggregation. Up to 16 tables can be defined by each Audio Title
Set (album), and each track can be identified with a table. Coefficients
range from 0 dB to 60 dB. This feature goes by the horribly contrived
name of SMART (system-managed audio resource technique). (Dolby
Digital, supported in both DVD-Audio and DVD-Video, also includes
downmixing information that can be set at encode time.)
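To give a feel for how a coefficient table might drive a mixdown, here is a minimal sketch. It assumes the coefficients are attenuations in dB applied per channel for each of the two outputs. The table layout, the channel names, and the values in EXAMPLE_TABLE are all invented for illustration and are not taken from the DVD-Audio spec.

```python
def db_to_gain(attenuation_db):
    """Convert an attenuation in dB (0 = unchanged) to a linear gain."""
    return 10 ** (-attenuation_db / 20)


def downmix_frame(samples, table):
    """Mix one multichannel sample frame down to (left, right).

    samples: dict of channel name -> sample value
    table:   dict of channel name -> (atten_db_to_left, atten_db_to_right),
             where None means "not routed to that output"
    """
    left = right = 0.0
    for name, value in samples.items():
        to_left, to_right = table[name]
        if to_left is not None:
            left += value * db_to_gain(to_left)
        if to_right is not None:
            right += value * db_to_gain(to_right)
    return left, right


# Hypothetical table: the values are made up, not from any real disc.
EXAMPLE_TABLE = {
    "L": (0.0, None), "R": (None, 0.0),
    "C": (3.0, 3.0),
    "Ls": (6.0, None), "Rs": (None, 6.0),
    "LFE": (60.0, 60.0),
}

frame = {"L": 0.2, "R": 0.1, "C": 0.5, "Ls": 0.0, "Rs": 0.0, "LFE": 0.3}
print(downmix_frame(frame, EXAMPLE_TABLE))
```

Because the disc producer picks the attenuations, the summed output can be kept from clipping when several loud channels land on the same output.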
DVD-Audio can provide up to 99 still images per
track (at typical compression levels about 20 images fit into the
2 MB buffer in the player), with a set of limited transitions (cut
in/out, fade in/out, dissolve, and wipe). Unlike DVD-Video, the
user can move at will through the slides without interrupting the
audio as it plays: this is called a browsable slideshow. On-screen
displays can be used for synchronized lyrics and navigation menus.
A special simplified navigation mode can be used on players without
a video display.
Sony and Philips are promoting SACD, a competing
DVD-based format using Direct Stream Digital (DSD) encoding with
sampling rates of 2.8224 MHz. DSD is based on the pulse-density
modulation (PDM) technique that uses single bits to represent the
incremental rise or fall of the audio waveform. This supposedly
improves quality by removing the brick wall filters required for
PCM encoding. It also makes downsampling more accurate and efficient.
DSD provides a frequency response from DC to over 100 kHz with a
dynamic range of over 120 dB. DSD includes a lossless encoding technique
that produces approximately 2:1 data reduction by predicting each
sample and then run-length encoding the error signal. The maximum
data rate is 2.8 Mbps.
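For the curious, the idea behind pulse-density modulation can be sketched with a first-order delta-sigma modulator, which is one simple way to produce a 1-bit stream whose density of ones tracks the input level. This is a toy and not the DSD encoder itself (real encoders use much more elaborate noise shaping). The function names are hypothetical. The 64 x 44.1 kHz relation for the DSD sampling rate is a commonly cited one and matches the 2.8224 MHz figure above.

```python
DSD_RATE_HZ = 64 * 44_100  # 2,822,400 one-bit samples per second per channel


def pdm_encode(samples):
    """First-order delta-sigma modulator: inputs in [-1, 1] -> list of 0/1 bits."""
    bits = []
    integrator = 0.0
    feedback = 0.0
    for x in samples:
        # Accumulate the error between the input and the last output level.
        integrator += x - feedback
        bit = 1 if integrator >= 0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


def pulse_density(bits):
    """Fraction of ones in the stream."""
    return sum(bits) / len(bits)


# A steady input of 0.5 should give a stream that is about 75% ones,
# since density is roughly (level + 1) / 2.
print(pulse_density(pdm_encode([0.5] * 4000)))
```

At one bit per sample, the stream rate per channel equals the sampling rate, which is where the roughly 2.8 Mbps figure comes from.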
SACD includes a physical watermarking feature,
pit signal processing (PSP), which modulates the width of pits on
the disc to store a digital watermark (data is stored in the pit
length). The optical pickup must contain additional circuitry to
read the PSP watermark, which is then compared to information on
the disc to make sure it's legitimate. Because of the requirement
for specialized watermark detection circuitry, protected SACD discs
are not playable in standard DVD-ROM drives.
SACD includes text and still graphics, but no video.
Sony says the format is aimed at audiophiles and is not intended
to replace the audio CD format.
See 1.12 for more general info on DVD-Audio and SACD.
[3.6.2] Audio details of DVD-Video
The following details are for audio tracks in DVD-Video.
Some DVD manufacturers such as Pioneer are developing audio-only
players using the DVD-Video format. Some DVD-Video discs contain
mostly audio with only still pictures.
A DVD-Video disc can have up to 8 audio tracks
(streams) associated with each video track (or each video angle).
Each audio track can be in one of three formats:
- Dolby Digital (AC-3): 1 to 5.1 channels
- MPEG-2 audio: 1 to 5.1 or 7.1 channels
- PCM: 1 to 8 channels.
Two additional optional formats are provided: DTS
and SDDS. Both require the appropriate decoders and are not supported
by all players.
The ".1" refers to a low-frequency effects (LFE)
channel that connects to a subwoofer. This channel carries an emphasized
bass audio signal.
Linear PCM is uncompressed
(lossless) digital audio, the same format used on CDs and most studio
masters. It can be sampled at 48 or 96 kHz with 16, 20, or 24 bits/sample.
(Audio CD is limited to 44.1 kHz at 16 bits.) There can be from
1 to 8 channels. The maximum bit rate is 6.144 Mbps, which limits
sample rates and bit sizes when there are 5 or more channels. It's
generally felt that the 120 dB dynamic range of 20 bits combined
with a frequency response of around 22,000 Hz from 48 kHz sampling
is adequate for high-fidelity sound reproduction. However, additional
bits and higher sampling rates are useful in audiophile applications,
studio work, noise shaping, advanced digital processing, and three-dimensional
sound field reproduction. DVD players are required to support all
the variations of LPCM, but many subsample 96 kHz down to 48 kHz,
and some may not use all 20 or 24 bits. The signal provided on the
digital output for external digital-to-analog converters may be
limited to less than 96 kHz and less than 24 bits.
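The way the 6.144 Mbps ceiling limits channel counts can be worked out directly. This sketch simply divides the limit by the per-channel rate and ignores any stream overhead. The function names are illustrative.

```python
MAX_LPCM_BPS = 6_144_000  # maximum LPCM bit rate quoted above


def lpcm_rate_bps(channels, sample_rate_hz, bits):
    """Raw LPCM bit rate for a given channel count, rate, and sample size."""
    return channels * sample_rate_hz * bits


def max_channels(sample_rate_hz, bits):
    """Largest channel count (capped at 8) that fits under the limit."""
    return min(8, MAX_LPCM_BPS // (sample_rate_hz * bits))


for rate in (48_000, 96_000):
    for bits in (16, 20, 24):
        print(f"{rate // 1000} kHz / {bits} bits: up to "
              f"{max_channels(rate, bits)} channels")
```

Only 48 kHz at 16 bits leaves room for all 8 channels, and with 5 channels the 96 kHz rates no longer fit at any sample size.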
Dolby Digital is multi-channel
digital audio, using lossy AC-3 coding technology from PCM source
with a sample rate of 48 kHz at up to 24 bits. The bitrate is 64
kbps to 448 kbps, with 384 or 448 being the normal rate for 5.1
channels and 192 being the typical rate for stereo (with or without
surround encoding). (Most Dolby Digital decoders support up to 640
kbps, so non-standard discs with 640 kbps tracks play on many players.)
The channel combinations are (front/surround): 1/0, 1+1/0 (dual
mono), 2/0, 3/0, 2/1, 3/1, 2/2, and 3/2. The LFE channel is optional
with all 8 combinations. For details see ATSC document A/52 <http://www.atsc.org/>. Dolby Digital is the
format used for audio tracks on almost all DVDs.
MPEG audio is multi-channel
digital audio, using lossy compression from original PCM format
with sample rate of 48 kHz at 16 or 20 bits. Both MPEG-1 and MPEG-2
formats are supported. The variable bit rate is 32 kbps to 912 kbps,
with 384 being the normal average rate. MPEG-1 is limited to 384
kbps. Channel combinations are (front/surround): 1/0, 2/0, 2/1,
2/2, 3/0, 3/1, 3/2, and 5/2. The LFE channel is optional with all
combinations. The 7.1 channel format adds left-center and right-center
channels, but is rare for home use. MPEG-2 surround channels are
in an extension stream matrixed onto the MPEG-1 stereo channels,
which makes MPEG-2 audio backwards compatible with MPEG-1 hardware
(an MPEG-1 system will only see the two stereo channels). MPEG Layer
3 (MP3) and MPEG-2 AAC (also known as NBC or unmatrix) are not supported
by the DVD-Video standard. MPEG audio is not used much on DVDs,
although some inexpensive DVD recording software programs use MPEG
audio, even on NTSC discs, which goes against the DVD standard and
is not supported by all NTSC players.
DTS (Digital Theater Systems)
Digital Surround is an optional multi-channel digital audio format,
using lossy compression from PCM at 48 kHz at up to 24 bits. The
data rate is from 64 kbps to 1536 kbps, with typical rates of 754.5
and 1509.25 kbps for 5.1 channels and 377 or 754 kbps for 2 channels. (The
DTS Coherent Acoustics format supports up to 4096 kbps variable
data rate for lossless compression, but this isn't supported by
DVD. DVD also does not allow DTS sampling rates other than 48 kHz.)
Channel combinations are (front/surround): 1/0, 2/0, 3/0, 2/1, 2/2,
3/2. The LFE channel is optional with all combinations. DTS-ES supports
6.1 channels in two ways: 1) a Dolby Surround EX compatible matrixed
rear center channel, 2) a discrete 7th channel. DTS also has a 7.1-channel
mode (8 discrete channels), but no DVDs have used it yet. The 7-channel
and 8-channel modes require a new decoder. The DVD standard includes
an audio stream format reserved for DTS, but many older players
ignore it. The DTS format used on DVDs is different from the one
used in theaters (Audio Processing Technology's apt-X, an ADPCM
coder, not a psychoacoustic coder). All DVD players can play DTS
audio CDs, since the standard PCM stream holds the DTS code. See
1.32 for general DTS information. For more info visit <http://www.dtstech.com/> and read Adam Barratt's
article.
SDDS (Sony Dynamic Digital
Sound) is an optional multi-channel (5.1 or 7.1) digital audio format,
compressed from PCM at 48 kHz. The data rate can go up to 1280 kbps.
SDDS is a theatrical film soundtrack format based on the ATRAC compression
format that is also used by Minidisc. Sony has not announced any
plans to support SDDS on DVD.
THX (Tomlinson Holman Experiment)
is not an audio format. It's a certification and quality control
program that applies to sound systems and acoustics in theaters,
home equipment, and digital mastering processes. The LucasFilm THX
Digital Mastering program uses a patented process to track video
quality through the multiple video generations needed to make a
final format disc or tape. It also covers setup of video monitors to
ensure that the filmmaker is seeing a precise rendition of what is on
tape before approval of the master, and other steps along the way. THX-certified
"4.0" amplifiers enhance Dolby Pro Logic in the following ways:
a crossover that sends bass from front channels to subwoofer; re-equalization
on front channels (to compensate for high-frequency boost in theater
mix designed for speakers behind the screen); timbre matching on
rear channels; decorrelation of rear channels; a bass curve that
emphasizes low frequencies. THX-certified "5.1" amplifiers enhance
Dolby Digital and improve on 4.0 in the following ways: rear speakers
are full range, so the crossover sends bass from both front and
rear to the subwoofer; decorrelation is turned on automatically
when rear channels have the same audio, but not during split-surround
effects, which don't need to be decorrelated. More info at Home THX Program
Overview.
Discs containing 525/60 (NTSC) video must use PCM
or Dolby Digital on at least one track. Discs containing 625/50
(PAL/SECAM) video must use PCM or MPEG audio or Dolby Digital on
at least one track. Additional tracks may be in any format. A few
first-generation players, such as those made by Matsushita, can't
output MPEG-2 audio to external decoders.
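The mandatory-format rule is easy to express as a check. This is just a sketch of the rule as stated above. The format labels and the function name are made up for illustration.

```python
# Formats that satisfy the "at least one track" rule for each video system.
REQUIRED = {
    "525/60": {"PCM", "Dolby Digital"},
    "625/50": {"PCM", "MPEG", "Dolby Digital"},
}


def has_mandatory_track(video_system, track_formats):
    """True if at least one track uses a format required for that system."""
    return any(fmt in REQUIRED[video_system] for fmt in track_formats)


print(has_mandatory_track("525/60", ["MPEG", "DTS"]))   # MPEG-only NTSC disc
print(has_mandatory_track("625/50", ["MPEG", "DTS"]))   # same tracks on PAL
```

A DTS-only disc fails the check for either video system, which is why DTS always shows up alongside another format on compliant discs.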
The original DVD-Video spec required either MPEG
audio or PCM on 625/50 (PAL) discs. There was a brief scuffle led
by Philips when early discs came out with only two-channel MPEG
and multichannel Dolby Digital, but the DVD Forum clarified in May
of 1997 that only stereo MPEG audio was mandatory for 625/50 discs.
In December 1997 the lack of MPEG-2 encoders (and decoders) was
a big enough problem that the spec was revised to allow Dolby Digital
audio tracks to be used on 625/50 discs without MPEG audio tracks.
Because of the 4% speedup from 24 fps film to 25
fps PAL display, the audio must be adjusted to match before it is
encoded. Unless the audio is digitally processed to shift the pitch
back to normal it will be slightly high (about 0.7 of a semitone).
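The size of the speedup and the resulting pitch shift can be estimated from the frame-rate ratio. This sketch assumes the audio is simply played faster along with the picture, with no pitch correction, and uses equal-tempered semitones (12 per octave).

```python
import math


def speedup_percent(film_fps=24.0, video_fps=25.0):
    """How much faster the material runs, in percent."""
    return (video_fps / film_fps - 1) * 100


def pitch_shift_semitones(film_fps=24.0, video_fps=25.0):
    """Pitch rise in semitones when audio is sped up with the picture."""
    return 12 * math.log2(video_fps / film_fps)


print(f"{speedup_percent():.2f}% faster, "
      f"{pitch_shift_semitones():.2f} semitones higher")
```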
For stereo output (analog or digital), all players
have a built-in 2-channel Dolby Digital decoder that downmixes
from 5.1 channels (if present on the disc) to Dolby Surround stereo.
That is, 5 channels are phase matrixed into 2 channels to
be decoded to 4 channels by a Dolby Pro Logic processor or 5 channels
by a Pro Logic II processor. PAL players also have an MPEG or MPEG-2
audio decoder. Both Dolby Digital and MPEG-2 support 2-channel Dolby
Surround as the source in cases where the disc producer can't or
doesn't want to remix the original onto discrete channels. This
means that a DVD labeled as having Dolby Digital sound may only
use the L/R channels for surround or "plain" stereo. Even movies
with old monophonic soundtracks may use Dolby Digital with only
1 or 2 channels. Some players can optionally downmix to non-surround
stereo. If surround audio is important to you, you will hear significantly
better results from multichannel discs if you have a Dolby Digital
system.
The new Dolby Digital Surround EX format (DD-EX),
which adds a rear center channel, is compatible with DVD discs and
players, and with existing Dolby Digital decoders. The new DTS-ES
Matrix format, which likewise adds a rear center channel, works
with existing DTS decoders and with DTS-compatible DVD players.
However, for full use of either new format you need a new decoder
to extract the rear center channel, which is phase matrixed into
the two standard rear channels in the same way Dolby Surround is
matrixed into standard stereo channels. Without a new decoder you'll
get the same 5.1-channel audio you get now. Because the additional
rear channel isn't a full-bandwidth discrete channel, it's appropriate
to call the new formats "5.2-channel" digital surround. There is
also DTS-ES Discrete, which adds a full-bandwidth discrete rear
center channel in an extension stream that is used by DTS-ES Discrete
decoders but ignored by older DTS decoders. DTS-ES decoders include
DTS Neo:6, which is not an encoding format but a matrix decoding
process that provides 5 or 6 channels.
The Dolby Digital downmix process does not usually
include the LFE channel and may compress the dynamic range in order
to improve dialog audibility and keep the sound from becoming "muddy"
on average home audio systems. This can result in reduced sound
quality on high-end audio systems. The downmix is auditioned when
the disc is prepared, and if the result is not acceptable the audio
may be tweaked or a separate L/R Dolby Surround track may be added.
Experience has shown that minor tweaking is sometimes required to
make the dialog more audible within the limited dynamic range of
a home stereo system. Some disc producers include a separately mixed
stereo track rather than fiddle with the surround mix.
The Dolby Digital dynamic range compression
(DRC) feature, often called midnight mode, reduces the difference
between loud and soft sounds so that you can turn the volume down
to avoid disturbing others yet still hear the detail of quiet passages.
Some players have the option to turn off DRC.
Dolby Digital also includes a feature called dialog
normalization (DN), which should more accurately be called volume
standardization. DN is designed to keep the sound level the same
when switching between different sources. This will become more
important as additional Dolby Digital sources (digital satellite,
DTV, etc.) become common. Each Dolby Digital track contains loudness
information so that the receiver can automatically adjust the volume,
turning it down, for example, on a loud commercial. (Of course the
commercial makers can cheat and set an artificially low DN level,
causing your receiver to turn up the volume during the commercial.)
Turning DN on or off on your receiver has no effect on dynamic range
or sound quality; its effect is no different than turning the volume
control up or down.
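Here is a rough sketch of how a decoder might turn the loudness value into a volume adjustment. It assumes the scheme as commonly described for AC-3: the track declares its average dialog level as a value between -1 and -31 dB relative to full scale, and the decoder attenuates so that dialog lands at -31. Treat those numbers as assumptions and check A/52 for the real definition.

```python
REFERENCE_DB = -31  # assumed target dialog level (dB relative to full scale)


def dn_attenuation_db(declared_dialog_db):
    """Attenuation (in dB) applied for a track that declares its dialog
    level as declared_dialog_db, which must be between -31 and -1."""
    if not -31 <= declared_dialog_db <= -1:
        raise ValueError("dialog level must be between -31 and -1 dB")
    return declared_dialog_db - REFERENCE_DB


print(dn_attenuation_db(-27))  # a typical movie: turned down a little
print(dn_attenuation_db(-15))  # an honestly labeled loud commercial
print(dn_attenuation_db(-31))  # a loud commercial that "cheats"
```

The cheat works because the decoder only knows what the track declares: a loud commercial labeled as very quiet gets no attenuation at all.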
All five DVD-Video audio formats support karaoke
mode, which has two channels for stereo (L and R) plus an optional
guide melody channel (M) and two optional vocal channels (V1 and
V2).
A DVD-5 with only one surround stereo audio stream
(at 192 kbps) can hold roughly 55 hours of audio. A DVD-18 can hold
roughly 200 hours.
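These figures are simple capacity-over-bitrate arithmetic. The sketch below assumes nominal capacities of 4.7 billion bytes for DVD-5 and 17.08 billion bytes for DVD-18 and ignores all formatting overhead, so it lands in the same neighborhood as the quoted numbers rather than matching them exactly.

```python
DVD5_BYTES = 4.7e9     # assumed nominal capacity
DVD18_BYTES = 17.08e9  # assumed nominal capacity


def hours_of_audio(capacity_bytes, bitrate_bps):
    """Hours of audio at a constant bit rate, ignoring overhead."""
    return capacity_bytes * 8 / bitrate_bps / 3600


print(f"DVD-5:  {hours_of_audio(DVD5_BYTES, 192_000):.1f} hours")
print(f"DVD-18: {hours_of_audio(DVD18_BYTES, 192_000):.1f} hours")
```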
For more information about multichannel surround
sound, see Bobby Owsinski's FAQ at <www.surroundassociates.com/fqmain.html>.
[3.6.3] Can you explain this Dolby Digital,
Dolby Surround, Dolby Pro Logic, DTS stuff in plain English?
Almost every DVD contains audio in the Dolby
Digital (AC-3) format. DTS is an optional audio format
that can be added to a disc in addition to Dolby Digital audio.
Dolby Digital and DTS can store mono, stereo, and multichannel audio
(usually 5.1 channels).
Every DVD player in the world has an internal Dolby
Digital decoder. The built-in 2-channel decoder turns Dolby Digital
into stereo audio, which can be fed to almost any type of audio
equipment (receiver, TV, boombox, etc.) as a standard analog stereo
signal using a pair of stereo audio cables or as a digital PCM audio
signal using a coax or optical cable. See 3.2 for more information.
A standard audio mixing technique, called Dolby
Surround, "piggybacks" a rear channel and a center channel onto
a 2-channel signal. A Dolby Surround signal can be played on any
stereo system (or even a mono system), in which case the rear- and
center-channel sounds remain mixed in with the left and right channels.
When a Dolby Surround signal is played on a multichannel audio system
that knows how to handle it, the extra channels are extracted to
feed center speakers and rear speakers. The original technique of
decoding Dolby Surround, called simply Dolby Surround, extracts
only the rear channel. The improved decoding technique, Dolby
Pro Logic, also extracts the center channel. A brand new decoding
technology, Dolby Pro Logic II, extracts both the center
channel and the rear channel and also processes the signals to create
more of a 3D audio environment. Dolby Surround is independent of
the storage or transmission format. In other words, a 2-channel
Dolby Surround signal can be analog audio, broadcast TV audio, digital
PCM audio, Dolby Digital, DTS, MP3, audio on a VHS tape, etc.
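The "piggyback" idea can be illustrated with a very simplified matrix. In this sketch the center is added equally to both channels and the surround is added with opposite signs, so a decoder can recover them from the sum and the difference. Real Dolby Surround encoding also applies phase shifts and filtering to the surround channel, and Pro Logic adds active steering, none of which is modeled here. The function names and the 0.707 gain are assumptions for illustration.

```python
import math

G = 1 / math.sqrt(2)  # about -3 dB


def encode(left, center, right, surround):
    """Fold four channels into two (Lt, Rt).

    Simplification: the surround is just added with opposite signs,
    with no phase shifting or filtering.
    """
    lt = left + G * center - G * surround
    rt = right + G * center + G * surround
    return lt, rt


def passive_decode(lt, rt):
    """Recover approximate center and surround from sum and difference."""
    center = G * (lt + rt)
    surround = G * (rt - lt)
    return lt, center, rt, surround


lt, rt = encode(0.0, 1.0, 0.0, 0.0)   # dialog only, in the center
print(passive_decode(lt, rt))
lt, rt = encode(1.0, 0.0, 0.0, 0.0)   # sound only in the left channel
print(passive_decode(lt, rt))
```

Center-only and surround-only sounds come back cleanly, but a left-only sound leaks into the decoded center and surround. That leakage is the kind of crosstalk that the improved decoders are designed to reduce.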
Unlike Dolby Surround, Dolby Digital encodes each
channel independently. Dolby Digital can carry up to 5 channels
(left, center, right, left surround, right surround) plus an omnidirectional
low-frequency channel. The built-in, 2-channel Dolby Digital decoder
in every DVD player handles multichannel audio by downmixing
it to two channels using Dolby Surround (see 3.6.2). This allows the analog stereo outputs to be connected
to just about anything, including TVs and receivers with Dolby Pro
Logic capability. Most DVD players also output the downmixed 2-channel
Dolby Surround signal in digital PCM format, which can be connected
to a digital audio receiver, most of which do Dolby Pro Logic decoding.
Most DVD players also output the "raw" Dolby Digital
signal for connection to a receiver with a built-in Dolby Digital
decoder. Some DVD players have built-in multichannel decoders to
provide 6 (or 7) analog audio outputs to feed a receiver or amplifier
with multichannel analog inputs. See 3.1 for more info.
DTS is handled differently. Many DVD players have
a DTS Digital Out feature (also called DTS pass-through),
which sends the raw DTS signal to an external receiver with a DTS
decoder. A few players have a built-in 2-channel DTS decoder that
downmixes to Dolby Surround, just like a 2-channel Dolby Digital
decoder. Some players have a built-in multichannel DTS decoder with
6 (or 7) analog outputs. Some DVD players don't recognize DTS tracks
at all (see 1.32).
If you have a POS (plain old stereo), a Dolby Surround
receiver, or a Dolby Pro Logic receiver, you don't need anything
special in the DVD player. Any model will connect to your system.
If you have a Dolby Digital receiver, then you need a player with
Dolby Digital out (all but the cheapest players have this). If your
receiver can also do DTS, you should get a player with DTS Digital
Out. The only reason to get a player with 6-channel Dolby Digital
or DTS decoder output is if you want to use multichannel analog connections
to the receiver (see the component analog section of 3.2).
[3.6.4] Why is the audio level from my
DVD player so low?
Many people complain that the audio level from
DVD players is too low. In truth the audio level is too high on
everything else. Movie soundtracks are extremely dynamic, ranging
from near silence to intense explosions. In order to support an
increased dynamic range and hit peaks (near the 2V RMS limit) without
distortion, the average sound volume must be lower. This is why
the line level from DVD players is lower than from almost all other
sources. So far, the level has been much more consistent between
discs than it is on CDs and LDs. If the change in volume when switching between DVD
and other audio sources is annoying, you may be able to adjust the
output signal level on some players or the input signal level on
some receivers, but other than that, there's not much you can do.
[3.6.5] Why is the dialog hard to hear?
Dialog (people speaking) is usually mixed into
the center channel, with music, effects, and ambience mixed into
other channels. If your audio system isn't hooked up correctly or
doesn't work properly, the center channel might not be properly
reproduced. If you have a system with only two speakers, make sure
it is connected to the stereo outputs, not the multichannel outputs
(see 3.2).
In some cases the movie sound was not mixed well
in the studio, making the dialog hard to hear. In this case there's
not much you can do other than curse the sound engineer who thought
sound effects were more important than understanding what people
are saying.
Try turning on dynamic range compression (see 3.6.2) or check the disc to see if there is a separate 2-channel
soundtrack mix.