Producing a great narrative audio track involves a lot more than just good luck. Pre-production planning, proper microphone and recording technique, and of course, a well-written script are all equally important to producing a good narrative or talking-head sound track. What is often ignored is proper voice processing during editing and final production.
How Much Processing Is Enough?
Good question! Processing is mainly a subjective thing. There are no absolute rights or wrongs, only degrees of correctness. Much is dependent on the total “character” of the A/V production and the final environment in which it will be heard and seen. For instance, listen to a Rick Dees radio broadcast. This is a good example of an extremely processed voice. Now, imagine this type of processing used for a Civil War documentary. Doesn’t work at all. That’s because each type of production has certain audience preconceptions about what a voice should sound like. Shows like Wrestle-Mania and Lifestyles of the Rich and Famous use this extreme processing almost as a cliché.
This “IN-YOUR-FACE” sound is due mostly to heavy compression/limiting and makes the talent sound like they’re shouting, even at low volume. (Please don’t confuse level compression of an analog signal with data bit reduction as applied to a digital sound file or data stream.) On the other hand, a narration for a science or art program ideally should have no noticeable artifacts at all. The narrator should sound like they’re sitting in your living room. Of course, you can make them better than possible in real life. By eliminating mispronunciations, throat clearing, and stutters, a good voice can be raised to greatness. Now, before the audio purists get all upset about “unnatural sound” and the videographers say “so what,” let me explain my attitude. Narration can tell the story all by itself if properly done. Radio’s been doing it for decades. When coupled with good visuals, the production becomes more than the sum of its parts. But to be worthwhile, two very important processing rules must be followed.
#1: You’ve got to hear all the important elements, and
#2: You shouldn’t hear any distracting artifacts.
These two rules are often at odds with each other. And, as in the Civil War/Lifestyles example, they’re not always applied in equal proportions. The balance will swing back and forth, and as long as you understand the consequences of each effect or processing box, you won’t get into trouble.
TANSTAAFL (There Ain’t No Such Thing As A Free Lunch)
Nothing is free in the processing world. Everything you do has its consequences. The most important effect to understand is that most enhancements will boost not only the desirable audio, but any problems such as talent breaths, pops, bad mike technique, room imbalances and tape hiss. You need to start with a clean track to be able to boost the desirable elements without adding extra grunge.
Step #1: Clean It Up!
If you’ve been following this column you already know that all major audio processing should be done at once, and as late in the editing chain as feasible. That means you should have nice, clean, unprocessed audio to start with. The only field processing I allow is individual channel limiting to prevent tape (or digital) saturation. Avoid equalization, enhancers, digital delay, reverb, and any other effects recorded in the raw tracks. It’s relatively easy to add these effects later, but if done in the original recording, you’re stuck with them forever.
Now, evaluate the raw narrative track in as quiet an environment as possible. Bring your own headphones to any video suite that doesn’t have the decks in a separate machine room or closet. Turn off the video monitor and LISTEN! Are the tracks really clean or are there recording problems that need to be fixed? Listen for things like wind noise, crickets, keys jingling, traffic noise, air conditioners, etc. If you do have a background noise that’s constant throughout all the takes, it won’t be as noticeable in final production as, say, an air conditioner fan that’s on in one scene, only to be absent on the cutaway shot. If the sound is pretty good but some room tone is evident, you can remove the echo “hang time” by using a downward expander or a single-ended hiss gate such as a Hush II processor. I use this first in my processing chain so it has as much signal dynamics as possible to work with.
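For readers working in software rather than with a hardware box, here is a minimal sketch of what a downward expander does to signal level. It is not a model of the Hush II or any other specific unit: the function name, the -50 dB and -30 dB thresholds, the 2:1 ratio, and the example levels are all assumptions chosen for illustration, and only the static (no attack or release) behavior is shown.

```python
def expander_gain_db(level_db, threshold_db=-50.0, ratio=2.0):
    """Return the gain in dB (zero or negative) for a given input level.

    Hypothetical helper: hard-knee downward expander, static curve only
    (no attack or release smoothing).
    """
    if level_db >= threshold_db:
        return 0.0  # at or above threshold the signal passes unchanged
    # Below threshold, each 1 dB drop at the input becomes `ratio` dB at the output.
    return (level_db - threshold_db) * (ratio - 1.0)


if __name__ == "__main__":
    # Illustrative levels only: speech, a soft leading syllable, room tone, hiss.
    examples = [("speech", -15.0), ("soft syllable", -38.0),
                ("room tone", -55.0), ("hiss", -65.0)]
    for name, level in examples:
        low = expander_gain_db(level, threshold_db=-50.0)
        high = expander_gain_db(level, threshold_db=-30.0)
        print(f"{name:14s} {level:6.1f} dB   gain @ -50: {low:6.1f}   gain @ -30: {high:6.1f}")
```

With the lower threshold only the room tone and hiss are pushed down. With the higher threshold the soft syllable is attenuated too, which is the point at which a real unit starts to chop the beginnings of words.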
By adjusting the threshold control, you can get anything from simple “hiss” reduction all the way to truncating the beginning syllables. Each recording will need a different setting, so keep this box close to your mixing position. If you have other problems, like the whine of a motor, you can equalize that part of the spectrum, but this generally requires a very specialized, narrow-band parametric equalizer. The built-in EQ on your board doesn’t stand a chance, but I’ve had some success with 1/3 octave graphic equalizers in that circumstance. Now listen carefully for “breaths” in the sound track. Some narrators record as breathless, but a good many that use close miking sound like a hurricane between phrases. If your track is an off-camera narrative, the best approach is to use your digital editor to break the sound file into separate zones at each breath (figure 2).
It’s pretty easy to see the space between phrases and that beginning “gasp” that always leads in the next sentence. Once each phrase is defined as a separate zone, you simply slide the zones together in the edit decision list. Since you didn’t select any of the breathing parts of the sound file, the breaths are simply left out. Unfortunately, a track without natural pauses between phrases will probably result in a frantic, get-down-in-the-mud pace, so make up a 3-frame (about 100 millisecond) zone of dead air or room tone to insert where the breaths came out. You can then play with more or less length of pause for proper effect. Just a few extra milliseconds of pause in the right place can transform ragged-sounding speech into a thing of beauty. Remember, timing is everything!
If your track is an on-camera shot, then the best thing is to mute or attenuate the track during the breaths. If you’re using a Mackie 1604 with OTTO track automation or any other automated board, it’s a pretty simple thing to mute or change fader position on the audio track at each breath point. A MIDI track can be slaved to run from the SMPTE Time Code in chase mode. You can’t slip the track timing, but by reducing or eliminating the heavy breathing, you’ll get a much cleaner track that will tolerate heavier levels of audio compression without sounding like your talent’s ready to take his or her last breath. You probably don’t want to entirely mute the sound track during the breaths for on-camera audio, since some breathing sounds are visually anticipated by the viewer.
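The zone-sliding approach for off-camera narration can be sketched in a few lines. This is only an illustration, not how any particular digital editor works: the function names are made up, the zones are assumed to be (start, end) sample positions you marked by eye, the frame rate is assumed to be 30 frames per second (which is what makes 3 frames come out to 100 milliseconds), and the pause is plain digital silence where a real edit would more likely use room tone.

```python
FRAME_RATE = 30  # assumed video frame rate, frames per second


def frames_to_samples(frames, sample_rate, frame_rate=FRAME_RATE):
    """Convert a length in video frames to a length in audio samples."""
    return round(frames * sample_rate / frame_rate)


def join_zones(samples, zones, pause_frames=3, sample_rate=48000):
    """Concatenate the marked phrase zones, leaving the breaths out.

    samples: sequence of audio sample values
    zones:   list of (start, end) sample indices, one per phrase, in order
    A short pause of silence is inserted where each breath was removed.
    """
    pause = [0.0] * frames_to_samples(pause_frames, sample_rate)
    out = []
    for i, (start, end) in enumerate(zones):
        if i > 0:
            out.extend(pause)  # stand-in for dead air or room tone
        out.extend(samples[start:end])
    return out


if __name__ == "__main__":
    sr = 48000
    print("3 frames =", frames_to_samples(3, sr), "samples,",
          1000 * frames_to_samples(3, sr) / sr, "ms")
    track = [0.1] * sr                       # one second of placeholder audio
    phrases = [(0, 20000), (30000, 48000)]   # breath assumed at 20000..30000
    edited = join_zones(track, phrases, pause_frames=3, sample_rate=sr)
    print("original:", len(track), "samples; edited:", len(edited), "samples")
```

Changing `pause_frames` is the software equivalent of playing with more or less pause length until the pacing feels right.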
Step #2: Compress It
Now that you’ve got a nice clean track, you can use a compressor/limiter to get everything to a nice even level. A compressor is simply a gain-riding device that can be set to various threshold levels and compression ratios. Some units have adjustable attack and release times, but if you don’t have the time to play with a bunch of knobs, you’ll be better off with a simple but great-sounding unit like a dbx or UREI. I’ve still not heard a digital compressor/limiter that I really like, so until then a dbx 160X with OverEasy compression does my processing. Use about a 10:1 ratio for normal voice processing, and infinity limiting for a really BIG SOUND. Again, it’s a matter of taste. You can drive the compressor anywhere from 6 dB to 20 dB into gain reduction without too many artifacts if you’ve cleaned all the breaths as in step #1. If you haven’t done your homework, then that 20 dB of compression will raise the breaths and room tone by the same number of decibels, making an ugly-sounding mess.
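The arithmetic behind that warning is easy to check. The sketch below is a bare-bones hard-knee compressor curve, not a model of the dbx 160X or its soft-knee behavior; the function names, the -20 dB threshold, the 20 dB of make-up gain, and the signal levels are all assumptions picked to make the numbers round.

```python
def gain_reduction_db(level_db, threshold_db, ratio):
    """Gain reduction in dB (zero or negative) for a hard-knee compressor."""
    if level_db <= threshold_db:
        return 0.0  # below threshold nothing is turned down
    over = level_db - threshold_db
    # `over` dB above threshold comes out as over/ratio dB above threshold.
    return over / ratio - over


def output_level_db(level_db, threshold_db, ratio, makeup_db=0.0):
    """Output level after compression plus make-up gain."""
    return level_db + gain_reduction_db(level_db, threshold_db, ratio) + makeup_db


if __name__ == "__main__":
    threshold, makeup = -20.0, 20.0  # assumed settings
    for ratio in (10.0, float("inf")):
        print(f"ratio {ratio}:1")
        for name, level in [("voice peak", 0.0), ("breath", -30.0),
                            ("room tone", -50.0)]:
            out = output_level_db(level, threshold, ratio, makeup)
            print(f"  {name:10s} in {level:6.1f} dB -> out {out:6.1f} dB")
```

The loud peaks get pulled down and then restored by the make-up gain, while the breaths and room tone sit below threshold and simply ride up by the full make-up amount.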
Step #3: Equalize It
Finally, you can add some tizz and boom. I generally boost 8 kHz (kilohertz) by 6 to 12 decibels, roll off everything below 30 hertz, and apply about 3 dB of boost around 80 hertz for male voices. Voice tracks from DAT don’t seem to need as much boost, maybe because there’s no appreciable high-frequency droop as normally found in analog tape decks. Female voices get a little less 8 kHz boost, and no bass boost unless we’re looking for that sexy “in your ear” kinda voice. I don’t like to boost the 3 kHz “presence” band since most microphones do that anyway, and it can get very shrill rather quickly. A good parametric is just the ticket for this part of the chain. Since you can adjust both the center frequency and the width of the boost or cut, you can literally tune in the sound you like.
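As a rough digital stand-in for one band of a hardware parametric, here is a sketch of a peaking filter using the widely circulated “Audio EQ Cookbook” biquad formulas. The function names, the 48 kHz sample rate, and the Q of 1.0 (the width control) are assumptions, and the 30 hertz roll-off is left out. The code only computes coefficients and checks the resulting boost; it does not process audio.

```python
import cmath
import math


def peaking_eq_coeffs(center_hz, gain_db, q=1.0, sample_rate=48000):
    """Biquad coefficients (b, a) for a peaking boost or cut at center_hz."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)  # larger q gives a narrower band
    b = (1.0 + alpha * a_lin, -2.0 * math.cos(w0), 1.0 - alpha * a_lin)
    a = (1.0 + alpha / a_lin, -2.0 * math.cos(w0), 1.0 - alpha / a_lin)
    return b, a


def response_db(b, a, freq_hz, sample_rate=48000):
    """Magnitude response of the biquad at freq_hz, in dB."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate)  # z^-1
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))


if __name__ == "__main__":
    treble = peaking_eq_coeffs(8000.0, 6.0)  # the 8 kHz boost
    bass = peaking_eq_coeffs(80.0, 3.0)      # the 80 hertz boost for male voices
    for freq in (80.0, 1000.0, 3000.0, 8000.0, 16000.0):
        total = response_db(*treble, freq) + response_db(*bass, freq)
        print(f"{freq:8.0f} Hz: {total:+.2f} dB")
```

Raising `q` narrows each boost, which is the software version of tuning the width knob until you like what you hear.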
Step #4: Final Processing
Not shown in the diagram, but certainly possible, is a final stage of compression or enhancement. Sometimes a last stage of compression can be used to really get the levels up, especially for noisy venues like AM radio or video kiosks at the mall. Another useful item is a Barcus Berry Enhancer (BBE). This is essentially a smart multi-band compressor that dips the mid frequencies whenever a high frequency comes along. This gives you more apparent highs without actually boosting them. You avoid tape saturation and can record at higher levels. BBEs work very well, but aren’t a substitute for steps 1, 2 and 3. If you leave in breaths and other room noises, they’ll boost it all.
There you have it: the basics of voice processing. There are no magic “enhance” buttons on your audio gear. It’s up to you to evaluate the audio situation and pick the proper techniques to get the desired results. Remember that most of this is dependent on developing an “ear”. So listen to great narrative tracks. Feel the timing of the voice, and evaluate narration heard on the radio and television. Now listen to your own productions. Do the voice levels change constantly, forcing the final listener to juggle the volume control on his or her television? Does the voice stand out above the music track, or do you have to struggle to make out the words? Does the track project the desired image, be it authority, energy, sexiness, intelligence, excitement, patience, etc.? If it does, then you’ve got a winner. If not, then it’s time to try some different processing techniques. (Remember, this tape will self-destruct in 10 seconds. Good luck, Mr. Phelps.)
Copyright Mike Sokol 1995 – All Rights Reserved