Sound
Digital Delay For Application in Surround Sound
1.0 INTRODUCTION
1.1 MULTI-CHANNEL REPRODUCTION
1.1.1 Hafler Surround Sound
David Hafler [2] first proposed a system
that would both encode and decode information on existing two channel
stereo whilst still being compatible with normal recording and playback
techniques.
If information is encoded onto each channel with equal amplitude but
opposite phase it can be decoded by simply taking the difference between
the two. Similarly if the information is encoded with equal amplitude
and the same phase in both channels, then it can be decoded by summing
them both.
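The sum-and-difference idea above can be sketched in a few lines of Python. This is only an illustration: the function names, the sample values and the factor of one half (used here purely so that the decoded values match the originals) are assumptions, not part of Hafler's proposal.

```python
def encode(in_phase, anti_phase):
    """Mix two signals into a stereo pair.

    The in-phase signal goes equally to both channels; the anti-phase
    signal goes to the left as-is and to the right inverted.
    """
    left = [p + a for p, a in zip(in_phase, anti_phase)]
    right = [p - a for p, a in zip(in_phase, anti_phase)]
    return left, right


def decode(left, right):
    """Recover the in-phase part by summing, the anti-phase part by differencing."""
    summed = [(l + r) / 2 for l, r in zip(left, right)]
    difference = [(l - r) / 2 for l, r in zip(left, right)]
    return summed, difference


# Hypothetical sample values chosen for illustration only.
in_phase = [0.5, -0.25, 0.125]
anti_phase = [0.25, 0.25, -0.5]

left, right = encode(in_phase, anti_phase)
recovered_in_phase, recovered_anti_phase = decode(left, right)
```

With these values the decoded lists equal the two original signals, which is the compatibility property the text describes: an ordinary stereo player simply reproduces `left` and `right` unchanged.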
For a recording made using coincident-microphone techniques, a signal
can be derived from the difference between the loudspeaker outputs on
reproduction which corresponds to the output of a sideways-facing
figure-of-eight microphone.
This means that the illusion of sound sources can be extended beyond
the space between the main loudspeakers if this signal is replayed through
extra loudspeakers placed to the sides of the listener [3].
This method works particularly well on sources such as live classical
music recordings which are recorded in the natural surroundings of concert
halls. Sound energy from the orchestra will be reflected all around the
hall and eventually what arrives back at the microphone will be
out-of-phase information.
One extra speaker will suffice (if it is placed to the rear of the listener),
but if two are used, with one of the two wired in anti-phase, then a wider
image can be created.
For the ideal situation these extra speakers should be placed to the
rear of the listener, further apart laterally than the two main speakers
and slightly higher.
If the rear speakers' signal is delayed with respect to the front main
speakers, then an effective shift can be obtained in the rear signal, giving
greater depth to the overall sound experience.
As an approximation, sound travels about one third of a metre every
millisecond, and so a delay of twenty milliseconds will create a rearward
image shift of approximately seven metres.
The information contained in the signal for the rear speakers will in
general be lower in level than that of the main speakers and external
processing will be required to alter this, subject to the listening environment
and the listener's taste.
The frequency content of the signal will be predominantly in the mid-range,
typically 200Hz to 6kHz, and so it is not necessary to have speakers capable
of reproducing the full audio bandwidth. In fact, filtering out the higher
frequencies may be advantageous, as this would ensure
that important directional information appears to come only from the front.
Otherwise the result may appear unnatural and disturbing to the listener.
1.1.2 Quadraphonics
The first commercial attempt to go beyond stereo reproduction was in
the early to mid 1970's when the concept of quadraphonic sound was unleashed
on the general public.
Quadraphonic sound never really caught on and there were a number of
reasons why this was so.
The first problem was that there were a number of different systems
on the market which were largely incompatible with each other. The main
three were CBS's SQ, Sansui's QS and JVC's CD-4. The first two were
matrixed systems, and involved processing four-channel sound information
into two channels. The last, CD-4, involved having two basebands
of stereo and modulating two more onto a high-frequency carrier [4].
All of the systems suffered from technical difficulties: SQ and QS because
the correct way of handling the matrix operations was far more complicated
than it appeared, and CD-4 because of the need to record and recover very
high frequencies from a vinyl disc [3].
Besides the incompatibility problems there was also the problem
of physically placing four speakers in a listening environment and still
being able to sit in a prime listening position. It is said that American
houses, unlike their British counterparts, tend not to have door entrances
in the corners of rooms; in British homes the placement of these extra
speakers always seemed to end up causing problems [3].
1.1.3 Ambisonics
The early 1970's saw much work done by Peter Fellgett and especially
Michael Gerzon [5] who through the NRDC (National
Research Development Corporation) [6] developed
the necessary mathematical theory which allowed a rational system of multichannel
surround sound. This system is known as AMBISONICS. It is able to reproduce
the directionality of indirect reverberant sounds, as well as direct sources.
All rights to ambisonics are now owned by Nimbus Records Ltd
[7] giving them the rights to all patents
for ambisonics, except those for microphones. Nimbus have been producing
CDs in ambisonics for a number of years. Live concerts are recorded
on a soundfield microphone, or equivalent, positioned approximately 3
metres behind the conductor. Such a microphone can pick
up sound from all directions. It has a forward-facing figure-of-eight
response with a gain of +1.414 for sound straight ahead, zero at the sides,
and -1.414 behind (the output of which is termed X). A similar
sideways-facing response provides output Y, and an omnidirectional response
provides W. If height information is also to be acquired, a further,
vertically oriented figure-of-eight response is used (Z).
For derived responses including height, the term periphonic is used [5].
Essentially only three channels are needed for 360° surround sound
in the horizontal plane.
Four different formats exist in the encoding and decoding of ambisonic
information [5], [6] & [8].
A - FORMAT: Synthesis of signals representing both the sound wave and
direction (e.g. in a four channel system Lf = left front, Rf = right front,
Lb = Left back, and Rb = right back).
B - FORMAT: This is the matrixed information from the A format, or the
output of a soundfield mic.
X = 1/2(-Lb + Lf + Rf - Rb)
W = 1/2(Lb + Lf + Rf + Rb)
Y = 1/2(Lb + Lf - Rf - Rb)
Z = 1/2(-Lb + Lf - Rf + Rb) (if height information is required)
C - FORMAT: An encoded version of the B format for consumer use. Predominantly
used is the UNIVERSAL H.J. (U.H.J) system which utilises two signals that
correspond to the left and right of conventional stereo, but contain all
the information for full surround sound extraction.
D - FORMAT: The C format decoding technique.
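The B-format equations listed above translate directly into code. The following is a minimal sketch, assuming each A-format signal is represented by a single sample value; the function name `a_to_b` and the test inputs are hypothetical and chosen only to show how the signs behave.

```python
def a_to_b(lf, rf, lb, rb):
    """Matrix four A-format signals (Lf, Rf, Lb, Rb) into B-format (W, X, Y, Z).

    The coefficients are copied term by term from the equations in the text.
    """
    w = 0.5 * (lb + lf + rf + rb)    # omnidirectional component
    x = 0.5 * (-lb + lf + rf - rb)   # front minus back
    y = 0.5 * (lb + lf - rf - rb)    # left minus right
    z = 0.5 * (-lb + lf - rf + rb)   # only needed when height is required
    return w, x, y, z


# Equal signal in both front channels: all of it appears in W and X.
front_only = a_to_b(lf=1.0, rf=1.0, lb=0.0, rb=0.0)

# Equal signal in both left channels: all of it appears in W and Y.
left_only = a_to_b(lf=1.0, rf=0.0, lb=1.0, rb=0.0)
```

A source at the front therefore produces no Y output and a source on the left produces no X output, which matches the front-facing and side-facing figure-of-eight responses described for the soundfield microphone.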
The encoding and decoding processes use both amplitude and phase matrices
to take account of the fact that the brain interprets sources differently
at different frequencies.
At low frequencies the perceived intensity is the sum of pressures, whereas
at higher frequencies it is the sum of energies [9].
1.1.4 Film Surround Sound
One of the first films to attempt all-around sound was Walt Disney's
1940 film "Fantasia", which had discrete six-channel sound [10],
as did the Cinerama series of movies in the 1950's [11].
The main pioneer of cinema surround sound is Dolby, the company
which first commercially exploited the idea when Dolby Stereo
was introduced in 1975.
Dolby Surround Sound Encoding
When a film is encoded with surround sound information the centre channel
is attenuated by 3dB. It is then combined in equal proportions with the
unaltered left and right channel information. The surround signal
is also attenuated by 3dB, then band-limited to between 100Hz and 7kHz
before passing to a modified Dolby B noise reduction encoder. It then passes
through a +90° wideband phase-shift network before being added to the
left-plus-centre signal, and through a -90° phase shifter before being
added to the right-plus-centre signal. The two composite signals are then
recorded onto the film [12], [13].
FIG 1.0 Dolby Surround Sound Encoding
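The matrixing part of the encoding described above can be illustrated with steady-state phasors. This is a simplified sketch and not Dolby's implementation: each channel is assumed to be a single complex amplitude at one frequency, multiplication by `1j` or `-1j` stands in for the +90° and -90° phase-shift networks, and the band-limiting and modified Dolby B stages are left out. All names are hypothetical.

```python
ATTEN_3DB = 10 ** (-3 / 20)  # 3 dB attenuation as a linear gain, about 0.708


def encode_phasors(left, centre, right, surround):
    """Combine four channel phasors into the two composite signals.

    Centre and surround are attenuated by 3 dB. The surround is added
    with a +90 degree shift on the left and a -90 degree shift on the right.
    """
    c = ATTEN_3DB * centre
    s = ATTEN_3DB * surround
    left_total = left + c + 1j * s
    right_total = right + c - 1j * s
    return left_total, right_total


# Surround only: the two composite signals are equal and opposite.
lt_s, rt_s = encode_phasors(0.0, 0.0, 0.0, 1.0)

# Centre only: the two composite signals are identical.
lt_c, rt_c = encode_phasors(0.0, 1.0, 0.0, 0.0)
```

In this sketch the surround signal cancels in the sum of the two composite signals and the centre signal cancels in their difference, which is why a simple sum-and-difference matrix can separate them again on playback.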
Dolby Surround Sound Decoding
At this stage it is important to differentiate between the different
types of Dolby sound. Leaving aside the noise reduction circuits used
in optical and magnetic sound reproduction, Dolby sound can be divided
into two main sections:
1) Dolby Surround - This deals with home cinema entertainment
systems, for which the equipment is made by licensed consumer electronic
manufacturers [14]. It is most closely related
to the SQ matrix of the 1970's quadraphonic systems.
Dolby license two types of
products for home cinema entertainment. One is the DOLBY SURROUND DECODER
which is a passive device and simply derives the surround information
from the two channel stereo input. It also has a time delay, bandwidth
limitations and noise reduction.
The other is the DOLBY SURROUND PRO LOGIC DECODER which is an active device
and uses a variable matrix with additional enhancement to the directional
information to improve separation between all channels. Unlike the passive
decoder it can run with a centre speaker for locking dialogue to the screen
more precisely [13], [15].
2) Dolby Stereo - This concerns the cinema side of surround sound,
of which there are four main systems. The newest is Dolby Stereo
SR D, which incorporates a six-channel digital optical soundtrack in
addition to a four-channel SR analogue track on the same 35mm prints [16].

FIG 1.1 Dolby Film Sound Classifications
The decoding principle is essentially the same for cinema and home cinema
systems. After encountering input buffers the two signals are fed to a
matrix which will derive the left and right channels, surround and centre
if required. The surround signal first encounters a delay. The next stage
is a 7kHz low-pass filter and then a modified Dolby B noise reduction
circuit. The latter two circuits serve to quieten delay-line and optical
soundtrack noise and to reduce sibilant bleed-through of centre-encoded
dialogue due to relative phase and amplitude errors in the original two
signals. These are most likely to occur from slight adjustment and positioning
errors of the two recording light valves and the two playback photocells.
For cinema surround sound an additional high-pass filter set to about
100Hz is applied to protect the surround speakers, which of necessity
are much smaller than the front speakers [10],
[17].
The time delay is principally there to exploit the Haas effect. If the
surround speakers are delayed with respect to the front speakers then the
sound will appear to come from the front. The delay time is typically in
the region of 20 to 50 milliseconds, depending on the environment. This
may also help to reinforce certain sounds, because early reflections
arriving within this period of time aid intelligibility.
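In a digital system the delay times quoted above become a whole number of samples. The sketch below shows the conversion and a simple block delay; the 44.1kHz sample rate is an assumption made for the example, and the function names are hypothetical.

```python
def delay_in_samples(delay_ms, sample_rate_hz):
    """Convert a delay in milliseconds to the nearest whole number of samples."""
    return round(delay_ms * sample_rate_hz / 1000)


def delay_block(samples, n):
    """Delay a block by n samples, padding the start with silence.

    The output keeps the input length, so the last n samples are dropped.
    """
    if n <= 0:
        return list(samples)
    return ([0.0] * n + list(samples))[:len(samples)]


SAMPLE_RATE_HZ = 44100  # assumed rate, not taken from the text

shortest = delay_in_samples(20, SAMPLE_RATE_HZ)  # 882 samples
longest = delay_in_samples(50, SAMPLE_RATE_HZ)   # 2205 samples
delayed = delay_block([1.0, 2.0, 3.0, 4.0], 2)
```

At this assumed rate the 20 to 50 millisecond range corresponds to between 882 and 2205 stored samples per channel, which gives a feel for the amount of memory a delay of this kind needs.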
THX
THX is a decoding system which relies on material that has been
encoded with Dolby surround sound [11], [12].
There are two types: i) LUCASFILM THX and ii) HOME THX.
LucasFilm THX affects the presentation of movies at several points.
For the home, a THX-certified processor starts with Dolby Pro-Logic and
adds:
* Surround channel decorrelation - a digital pitch shift is used to make
the (mono) surround signal "different" in the left and right
surround channels.
* Re-equalization of the front channels, to make the movie mix seem less
"bright" in the home.
* "Timbre Matching" - an equalization applied to the surround
channel to make effects sound consistent when panned between front and
surround speakers.
THX specifies front speakers with a reduced vertical dispersion (to minimize
ceiling reflections) and two side-mounted surround speakers configured
for dipole radiation.
THX also recommends equalization for the L-C-R channels. A THX equalizer
will have 1/3-octave bands from 80 to 800 Hz, implemented as "interpolating
constant-Q" circuits, and parametric equalization above 1000Hz and
for the subwoofer channel.
"THX" is a LucasFilm trademark for several things, two of
which are directly related to home surround:
1. "THX Theatre" - THX is a certification process. Theatres
bearing the logo are periodically tested to ensure that they meet LucasFilm
standards for audio environment and playback of surround-encoded film.
2. "Re-recorded in a THX theatre" - THX logos on films and recordings
indicate that the final Dolby MP-compatible mixdown was done with the
recording console and engineer located in an actual THX-certified theatre.
This is intended to ensure that the film audio will play back in a consistent
and predictable manner in all THX theatres (and in homes equipped with
THX certified components).
3. THX crossover - LucasFilm lists recommended audio components for THX
theatres. They also make a crossover, bearing the THX brand, which is
only used in actual motion picture theatres.
4. Home THX - LucasFilm has a testing and certification process for home
audio equipment. Those models which are submitted by the maker, and pass
the tests, may exhibit the branding. THX branded equipment provides the
promise of effective home theatre, but can still sound poor if improperly
set up and calibrated. Some THX-branded equipment includes dealer installation
and adjustment. For amplifiers, THX merely provides an assurance of high
quality.
5. THX certified surround decoders, equalizers, main speakers and surround
speakers, on the other hand, must provide specific THX required functions,
as well as high general quality [11].
1.2 STEREO IMAGE WIDENING TECHNIQUES
1.2.1 Holophonics
A technique used in the early 1980's was a system called holophonics.
It was principally a binaural recording process using a dummy head. The
effects were, by their very nature, only at their best on headphones.
Two notable albums used this technique, which was credited
on the albums to the pioneer Zuccarelli Labs Ltd [18], [19].
The system had favourable reviews, but never really caught on in the commercial
sense. This is probably because of the requirement to use headphones in
order to obtain the best effects.
Addendum - 2nd April 2000
On the 29th March 2000 and 30th March 2000 I received emails from
Hugo Zuccarelli.
Mr Zuccarelli wishes it to be known that '...holophonics is not
a binaural system and for that reason speaker playback is not a
problem...'.
Mr Zuccarelli has an official home page for Holophonics at www.holophonics.com.
He also cites several sources of information for those who wish
to learn more about the technology:
1.2.2 Q Sound and Roland Sound Space (R.S.S.)
These are both stereo widening techniques that rely on complex phase
transformations that give the illusion that a source is beyond the extremes
of the left or right speakers.
Because of the complex phase
transformations the B.B.C. discovered that material encoded with Q Sound
would give very poor mono-compatibility on their F.M. transmissions and
so it was effectively banned from radio broadcasts [20].
1.3 AUTHOR CONTRIBUTION
From previous experimentation it was discovered that by using the Hafler
technique for deriving surround sound a new listening experience could
be obtained from ordinary stereo sources. What was needed was a dedicated
processor in order to achieve this.
The major obstacle to overcome in creating a dedicated processor to
carry out this operation is the requirement for a delay. Although
there are a number of semiconductor chips for sale which can
delay audio signals (known as bucket-brigade chips), they tend to:
i) be rather noisy,
ii) have a restricted bandwidth,
iii) have a small dynamic range,
iv) create only a relatively small delay time.
Despite these limitations popular electronics magazines have published
designs for such surround sound systems using this technique to generate
a delay [21].
For these reasons the decision was made to create the delay digitally.
This had the following advantages:
i) the full audio bandwidth could be catered for.
ii) the signal to noise ratio would be a lot higher - provided sufficient
bit resolution was used.
iii) the system would be more versatile.
The delay time could be made theoretically as large or as small as desired
without any degradation of the signal, and once the signal was in the
digital domain it would be easy to carry out other transformations on
the signal if so desired.
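A digital delay of this kind is commonly built as a circular buffer in memory: each new sample overwrites the oldest one, which is read out just before it is replaced. The following is a minimal sketch of that general technique, not the circuit described in this report; the class name and the three-sample length are hypothetical.

```python
class DelayLine:
    """Fixed-length delay implemented as a circular buffer."""

    def __init__(self, length):
        if length < 1:
            raise ValueError("delay length must be at least one sample")
        self.buffer = [0.0] * length  # starts out as silence
        self.index = 0                # position of the oldest sample

    def process(self, sample):
        """Store one new sample and return the one written `length` calls ago."""
        delayed = self.buffer[self.index]
        self.buffer[self.index] = sample
        self.index = (self.index + 1) % len(self.buffer)
        return delayed


line = DelayLine(3)
output = [line.process(x) for x in [1.0, 2.0, 3.0, 4.0, 5.0]]
```

Because the samples are stored as numbers and read back unchanged, a longer delay only needs a longer buffer. That is the sense in which the delay time can be extended without degrading the signal.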
Once again popular electronics magazines have used this approach [22],
but the main disadvantages of the design referenced are that it uses only
eight-bit resolution and can incorporate only two fixed delay times.
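To see why eight-bit resolution is a limitation, the textbook signal-to-quantisation-noise figure for an ideal converter can be worked out. The sketch below assumes an ideal uniform quantiser driven by a full-scale sine wave (the familiar "about 6dB per bit" rule); it is not a measurement of the referenced design, and the twelve and sixteen bit word lengths are included only for comparison.

```python
import math


def ideal_snr_db(bits):
    """Theoretical SNR in dB of an ideal uniform quantiser with a full-scale sine input."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


# Compare a few word lengths; only the eight-bit case comes from the text.
for bits in (8, 12, 16):
    print(f"{bits:2d} bits: {ideal_snr_db(bits):5.1f} dB")
```

Under these assumptions eight bits give roughly 50dB and sixteen bits roughly 98dB, so each extra bit of resolution buys about 6dB of signal-to-noise ratio.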
Pleasing surround sound effects can be generated from some sources and
types of music not specifically encoded for this purpose. A large amount
of recorded music is subject to signal processing before being mixed down
onto the final medium. This can be anything from adding artificial reverberation
to the use of Q Sound (section 1.2.2).
Each process changes the overall sound and quite often out of phase information
is generated on the recording. This can later be taken from the recording
and used for the surround sound channels.
It may depend upon the type of music as to which would benefit the most
from this effect. Experimentation with different types may yield unexpectedly
pleasant results.
Similarly live recordings
of music may recreate the ambience of the hall or audience upon reproduction.
The system, which I have constructed before and have chosen to build again
now, has its origins in the Hafler (section 1.1.1)
and Dolby (section 1.1.4) designs.
FIG 1.2 outlines this.
FIG 1.2 Author's Surround Sound Processor Design