Well, I’ve finally done it! After much mucking about, I’ve posted the first YouTube video on the Sawmill Recordings channel. I’ve taken careful note of my process and the time required, and worked out that I could conceivably make one or two of these a month. Believe you me, I am very happy to have started on this leg of my career. What I thought I’d cover today is something that many first-time YouTubers face when getting started: audio issues. I see so many YouTubers with bad audio because their strong suit is elsewhere, whether that is animation, video editing, or overall video presentation. The process for achieving good audio in videos is not that well known, and I realize that I’m in the opposite position to most because I’m an audio engineer and an amateur video maker. I’m going to attempt to make the process as simple as possible for a layman to understand (barring super basic stuff like how to plug in a microphone and how to set up your gain structure, as that’s not rocket science and you can figure that out).
It’s no secret that good audio quality has a massive effect on the perceived production value of your videos, and fortunately it is relatively easy to achieve with a few basic tools. Now, as with a lot in audio, the more money you throw at a problem the more likely it is to go away, but in this case the barrier to entry is pretty low and the fancy stuff that can cost thousands is a luxury and not a necessity. All of this stuff can scale to any price point, from a small budget to literally tens of thousands, so pick your battle.
Camera: iPhone 11 (always make sure to orient it horizontally).
Microphone: Neumann TLM102. (Other popular choices are the SM7B, MD421, Focusrite CM25, RE20, Røde PodMic, Oktava MC012, and the Røde NT2. You want it to be relatively robust in construction and as low in sensitivity as possible while still retaining detail. I went with the TLM102 because of its small size and quality of output. While it is more sensitive than other options, it is also further away, and the tradeoff worked great.)
Audio Interface: Focusrite 18i20. (This is my large format interface. There are a multitude of options you can get that aren’t as overpowered. I would recommend getting your hands on the Audient iD4 or, if you’re on a budget, the Focusrite Solo.)
DAW: Pro Tools, but you could use just about anything. Reaper is probably the best budget option.
Post-processing: Waves X-Noise, Waves API 2500. (I would’ve used an outboard compressor on the way in if I had one available, but it’s a pain in the ass to set up when not rack mounted and it isn’t a requirement at all. You can get by just fine with a plugin as long as you know what you’re doing.)
So how do we actually go about this? Well, let me lay it out. I set up the iPhone on a camera tripod with enough space around my face and found a spot that looked particularly nice. I then attached the microphone to a reasonably long stand, hoisted it up over the iPhone, and positioned it so that the microphone pointed at my face while staying out of frame. The mic should be about 7” to a foot away from your face. The hardest part of this for me was banishing the shadow cast by the mic stand, which makes the job of a boom mic operator look totally legit. Thankfully the lights in my live room are articulated, so I can point them anywhere I need them to be, within reason. This doesn’t quite eliminate the problem but it definitely makes the situation a lot more workable than it would be otherwise.
Now that it’s all set up, you may notice a problem: it’s very difficult to synchronize the video with the audio you’re recording to your computer. This was a problem that plagued me for a long time before a friend showed me the workaround that Hollywood has used for as long as there have been modern film sets. Once both the camera and the audio are rolling, you clap in such a way that your camera sees your hands come together. When you go to edit the video, you simply line up the spike in the waveform with the moment your hands come together and bingo! In sync. Hollywood has boards that display the film’s title, act, scene, and time code and that clap in front of the camera for just this purpose. The real challenge comes when you have to edit all of it together without upsetting the audio’s position within your editing software, but this can be worked out with practice.
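If you’d rather not eyeball the waveform, the clap can also be found with a few lines of Python. This is only a rough sketch: it assumes the clap is the loudest thing in the recording, that you already have the audio as a plain list of sample values, and that you’ve read the clap’s time off the video yourself. The function names and the numbers in the example are made up for illustration.

```python
def find_clap_index(samples):
    """Return the index of the loudest sample, assumed here to be the clap."""
    return max(range(len(samples)), key=lambda i: abs(samples[i]))


def audio_offset_seconds(samples, sample_rate, clap_time_in_video):
    """How far to slide the audio so its clap lands on the video's clap.

    A positive result means the audio has to start later on the timeline.
    """
    clap_time_in_audio = find_clap_index(samples) / sample_rate
    return clap_time_in_video - clap_time_in_audio


# Made-up example: 3 seconds of silence with a clap exactly 1 second in,
# while the hands meet 2.5 seconds into the video.
sample_rate = 48000
recording = [0.0] * (3 * sample_rate)
recording[sample_rate] = 0.9

offset = audio_offset_seconds(recording, sample_rate, clap_time_in_video=2.5)
print(offset)  # 1.5
```

In practice you would still check the result by eye and ear, since a cough or a door slam louder than the clap would fool this.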
Once you’ve delivered your lines, shot your B-roll, and done all the voiceover you need, you can start processing your dialogue audio, and fortunately that’s relatively simple compared to making sure there’s no shadow from the mic stand. The main problems you’ll run into are reflections and the noise floor. If you haven’t checked how reflective your shooting environment is before you start filming, you’ll be in for a rather unpleasant surprise when it comes time to process. The best way to combat this is to take great care in picking the locations for any shots where you absolutely can’t use voiceover, and to change the character of the environment with acoustic panels or gobos if you can.
As to the noise floor, it’s a naturally occurring quiet noise that will show up in your mic, and it comes from a number of different sources like thermal noise, atmospheric noise, residual electric noise, and the self-noise of the mic itself. There are plenty of plugins that remove it, but I enjoy the X-Noise plugin from Waves, as it’s simple to use, effective at its job, and easily tweakable for a number of different situations. You can even teach it the specific sound of the noise floor in your room and it will cancel it out pretty darn well. It can even help cut down on the reflected sound if you manage to catch the tail end of it right. If you don’t want a plugin to solve the problem and don’t have too many problems with reflections, you can take a sample of the noise and play it with its polarity inverted (reverse phase) at the same volume as it appears in your mic, and that will cancel it out. This works best when the noise is steady and the sample is lined up exactly with the recording. I’ve also used this trick with air conditioning and heating noise and it’s a fairly reliable method of solving those problems. I then slam the dialogue pretty hard with a compressor, EQ it only if absolutely necessary, and then you’re pretty good to go.
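To see why the inverted-noise trick works, and why lining things up matters, here is a small Python sketch. It is not how X-Noise works; it only illustrates cancellation by inversion. The 60 Hz hum, the 220 Hz stand-in for a voice, and all the function names are assumptions made for the example.

```python
import math


def tone(num_samples, sample_rate, freq_hz, amplitude, start=0):
    """A steady sine wave, standing in for hum (or, crudely, a voice)."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * (start + n) / sample_rate)
            for n in range(num_samples)]


def invert(samples):
    """Flip the polarity of every sample."""
    return [-x for x in samples]


def mix(a, b):
    """Sum two tracks sample by sample."""
    return [x + y for x, y in zip(a, b)]


def peak(samples):
    return max(abs(x) for x in samples)


sample_rate = 48000
noise = tone(4800, sample_rate, freq_hz=60.0, amplitude=0.05)
voice = tone(4800, sample_rate, freq_hz=220.0, amplitude=0.3)
recording = mix(voice, noise)

# Noise sample lined up exactly: the hum cancels and the voice is left.
cleaned = mix(recording, invert(noise))
residual_aligned = peak(mix(cleaned, invert(voice)))

# Same noise sample, but slipped 100 samples late: a lot of hum survives.
late_noise = tone(4800, sample_rate, freq_hz=60.0, amplitude=0.05, start=100)
smeared = mix(recording, invert(late_noise))
residual_late = peak(mix(smeared, invert(voice)))

print(residual_aligned, residual_late)
```

The first number comes out as good as zero, while the second is a sizeable chunk of the original hum, which is why this approach suits steady noise like hum or an air conditioner better than random hiss.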
(Compressors, for those who don’t know, are not used to make things louder. They decrease the difference between loud sounds and quiet sounds. When you dial one in, set your threshold so that the gain reduction needles are moving, set your attack time medium slow and your release time fast so that you can’t hear the compressor engage and disengage, then select your ratio (4:1 or 6:1 is best for this application). Finally, turn the makeup gain dial until it sounds as loud as when you started. It helps to engage and disengage the plugin to check this. All these terms are defined at the end of this post.)
As a final note: even though you should be using the same compressor and compressor settings across all your dialogue, it’s still beneficial to balance your volumes in the editing software so nothing is too loud or too quiet. Doing so in your DAW is quite counterproductive, as you don’t have as much perspective on the whole video as you will in your editing software, especially once you add mastered music and cut out all of those flubbed lines and pauses in the edit.
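Your editor’s meters and clip gain do this for you, but if it helps to see what “balancing” means in numbers, here is a hedged Python sketch. It assumes clips are plain lists of samples between -1 and 1 and uses simple RMS level as a stand-in for loudness, which is cruder than what your ears or a proper loudness meter will tell you. The function names are made up for the example.

```python
import math


def rms_dbfs(samples):
    """Average (RMS) level of a clip in dB relative to full scale."""
    mean_square = sum(x * x for x in samples) / len(samples)
    return 10.0 * math.log10(max(mean_square, 1e-20))  # floor avoids log(0)


def gain_to_match_db(samples, target_dbfs):
    """How many dB to turn a clip up (or down, if negative) to hit the target."""
    return target_dbfs - rms_dbfs(samples)


def apply_gain_db(samples, gain_db):
    factor = 10.0 ** (gain_db / 20.0)
    return [x * factor for x in samples]


# Made-up clips: one delivered at a normal level, one mumbled.
normal_line = [0.5, -0.5] * 100
quiet_line = [0.1, -0.1] * 100

target = rms_dbfs(normal_line)
gain = gain_to_match_db(quiet_line, target)
balanced_line = apply_gain_db(quiet_line, gain)

print(round(target, 2), round(gain, 2))  # about -6.02 and 13.98
```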
I hope this has helped! If you did find this helpful please head over to the Sawmill Recordings YouTube channel and like, comment and subscribe and share the video around. Also if you have any ideas for future videos let me know!
Link to the video: https://www.youtube.com/watch?v=vtmkdPWvmNQ&t=39s
Threshold: The point at which the compressor starts working. If a part of a signal crosses the threshold the compressor will turn it down.
Ratio: How much a signal that crosses the threshold is turned down, i.e. at a ratio of X:1, for every X dB the input goes over the threshold the output only goes 1 dB over it (X is normally 2, 3, 4, 6, 8, or 10+).
Attack: The speed at which the compressor turns the signal down.
Release: The speed at which the compressor returns to normal.
Makeup Gain: When you compress a signal it will sound quieter than when you started. The makeup gain is used to return the compressed signal to the original volume and no more.
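For the curious, the five terms above can be put together in a bare-bones compressor written in Python. Treat it as a sketch for understanding the controls rather than a model of the API 2500 or any real plugin: the sample-by-sample level detection, the one-pole smoothing used for attack and release, and every name and default value here are assumptions chosen to keep the example short.

```python
import math


def gain_reduction_db(level_db, threshold_db, ratio):
    """Threshold and ratio: how many dB to turn down a signal at level_db."""
    over = level_db - threshold_db
    if over <= 0:
        return 0.0  # below the threshold the compressor does nothing
    return over - over / ratio  # only 1/ratio of the overshoot is let through


def compress(samples, sample_rate, threshold_db=-20.0, ratio=4.0,
             attack_ms=30.0, release_ms=100.0, makeup_db=0.0):
    """Apply the gain reduction, eased in (attack) and out (release)."""
    attack_coeff = math.exp(-1.0 / (sample_rate * attack_ms / 1000.0))
    release_coeff = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    reduction = 0.0  # gain reduction currently applied, in dB
    out = []
    for x in samples:
        level_db = 20.0 * math.log10(max(abs(x), 1e-9))
        wanted = gain_reduction_db(level_db, threshold_db, ratio)
        # Turning down uses the attack speed, letting go uses the release speed.
        coeff = attack_coeff if wanted > reduction else release_coeff
        reduction = coeff * reduction + (1.0 - coeff) * wanted
        # Makeup gain is added back on top of whatever was turned down.
        out.append(x * 10.0 ** ((makeup_db - reduction) / 20.0))
    return out


# A signal 10 dB over a -20 dB threshold at 4:1 is turned down by 7.5 dB,
# so it ends up 2.5 dB over the threshold instead of 10.
print(gain_reduction_db(-10.0, -20.0, 4.0))  # 7.5

# A sustained full-scale signal (made-up test data at a low sample rate).
loud = compress([1.0] * 1000, sample_rate=1000)
quiet = compress([0.01] * 1000, sample_rate=1000)
```

With these settings the loud signal settles about 15 dB down once the attack has finished, and the quiet one passes through untouched because it never crosses the threshold.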