Advanced Podcast studio setup with Skype and Audio Hijack Pro


The diagram below shows the (software) elements of the setup.

Figure 1. Diagram of an advanced software studio setup.

Audio Hijack Pro is used extensively in this setup to route audio, add effects and record the final result. This setup assumes that all audio is mixed down to a single (stereo or mono) audio track that is recorded in Audio Hijack Pro.

NOTE: Some of the advanced settings of Audio Hijack Pro described in this setup are only valid for Audio Hijack Pro version 2.5 or higher, which is currently still in Beta. Some details might have changed in the released version.

Figure 2 shows the flow diagram of the setup with audio chat and extra effects chains.

Figure 2. Flow diagram of an advanced setup.

A delay (latency) is added to the audio signal at each step of the process. In Figure 2, the labels delay 1 through delay 4 mark the four delays between microphone and headphones. Tips on minimizing the total latency are given in the section "Reducing latency in the setup" below.

Hijack sessions

Three hijack sessions are required to make this setup work; see, for example, the sessions in Figure 4. (The AA 0# prefix in each session title keeps the sessions sorted together, which is handy if you have more sessions defined.)

Figure 4. Hijack sessions for Skype recording.

Hijack Session AA 01 Mic -> SF2 captures the microphone input and routes it to the Soundflower (2ch) module. Any effect that you want to apply to the microphone signal can be added to the session's effects section.

• Set the input of the AA 01 Mic -> SF2 session to the audio device that your microphone is connected to.

• Set the output of the AA 01 Mic -> SF2 session to Soundflower (2ch).

Session AA 02 SF2 -> SF16 routes the mix of audio from Soundflower (2ch) to the Soundflower (16ch) module. See the (transport) box in Figure 2.

• Set the input of the AA 02 SF2 -> SF16 session to Soundflower (2ch).

• Set the output of the AA 02 SF2 -> SF16 session to Soundflower (16ch).

If you want to capture all audio played on the system for your recording, set the audio output in System Preferences to Soundflower (2ch).

Note that you can add more hijack sessions to the mix if you set their output to Soundflower (2ch). This can be used to add effects to audio coming from a specific application. In a podcast you could use an effect to increase the gain on a clip you're playing, or add a level meter to monitor its levels visually.

An alternative to an individual hijack session is the Application Mixer effect. You can add this effect to the AA 02 SF2 -> SF16 session and hijack an individual application's output with it. This effect has the advantage of offering a cross-fader that lets you easily fade the recorded sound between the rest of the mix and the application's audio. (Refer to the Audio Hijack Pro Manual on how to use this effect.)

AA 03 SF16 - Mon. - REC is the session that captures the audio from Soundflower (16ch) and sends it to your headphones for monitoring. Recording this session records the podcast into one file. You could leave this session out if you use another program to record the final mix. (Alternatively, use an external recorder to record the headphone signal, with the Soundflowerbed utility or Line-In sending that signal to the headphone output.)

• Set the input of the AA 03 SF16 - Mon. - REC session to Soundflower (16ch).

• Set the output of the AA 03 SF16 - Mon. - REC session to the audio device that your headphones are connected to.

When you are ready to start recording the podcast, click the Record button of the AA 03 SF16 - Mon. - REC session.
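
The chain formed by the three sessions can be checked on paper before you touch any settings. The following Python sketch is only an illustration: it does not talk to Audio Hijack Pro, and the names Session, trace, MIC and HEADPHONES are hypothetical placeholders for your own devices. It follows each session's output to the next session's input and confirms that the microphone signal ends up at the headphones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    name: str    # hijack session title
    source: str  # device set as the session input
    target: str  # device set as the session output


# Placeholders: substitute the real device names from your own system.
MIC = "Microphone device"
HEADPHONES = "Headphone device"

SESSIONS = [
    Session("AA 01 Mic -> SF2", MIC, "Soundflower (2ch)"),
    Session("AA 02 SF2 -> SF16", "Soundflower (2ch)", "Soundflower (16ch)"),
    Session("AA 03 SF16 - Mon. - REC", "Soundflower (16ch)", HEADPHONES),
]


def trace(sessions, start):
    """Return the devices visited when following the sessions from start."""
    by_source = {s.source: s for s in sessions}
    path = [start]
    while path[-1] in by_source:
        nxt = by_source[path[-1]].target
        if nxt in path:
            # A device feeding back into itself would cause an audio loop.
            raise ValueError("routing loop at " + nxt)
        path.append(nxt)
    return path


if __name__ == "__main__":
    print(" -> ".join(trace(SESSIONS, MIC)))
```

If a session's input or output is set to the wrong device, the traced path stops short of the headphone device, which is a quick way to spot a broken link in the chain.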

Tip: Use Apple Lossless encoding for recording. This encoding type offers the best compromise between CPU usage and hard disk activity. Afterwards you can convert the recording to MP3 with your desired settings.

Local chat application

The goal is to record an audio chat in which you capture everything that happens on your side of the chat plus the voice of the person on the other side. This is the same mix that you hear in your headphones. The person on the other side has to hear that same mix, except for their own voice!

Figure 6. Skype's Audio settings.

In Skype's preferences, set the Audio output to Soundflower (16ch). As you can see in Figure 2, this mixes the audio from the other Skype call participant(s) with the final mix of sounds on your side of the call.

Set Audio input to Soundflower (2ch). It is only this mix of local audio that goes into the Skype application to be heard by your guest(s).

Leaving Gain control off produces the best result on the other end of the chat. Echo cancellation is usually unnecessary and uses a lot of CPU power, so it's best left off as well.

Remote Guest

If everything is configured correctly for the local chat application, any guest participating in the podcast will hear everything you say and play (everything that is being recorded) except for their own voice!
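
The "everything except their own voice" behaviour follows directly from the routing, and a small model makes that visible. The Python sketch below is an illustration only: the node names are informal labels for the boxes described in this article, "system audio" assumes you set the system output to Soundflower (2ch), and the helper functions reachable and sources_heard_at are hypothetical. It lists which sources arrive at each listening point.

```python
from collections import deque

# Directed edges: audio flows from the key to every node in its list.
ROUTES = {
    "your microphone": ["Soundflower (2ch)"],
    "system audio": ["Soundflower (2ch)"],  # only if system output is SF2
    "Soundflower (2ch)": ["Soundflower (16ch)", "Skype input"],
    "Skype input": ["guest's headphones"],
    "guest's voice": ["Skype output"],
    "Skype output": ["Soundflower (16ch)"],
    "Soundflower (16ch)": ["your headphones and recording"],
}

SOURCES = ["your microphone", "system audio", "guest's voice"]


def reachable(graph, start):
    """Return every node that audio starting at start can arrive at."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def sources_heard_at(graph, sink, sources):
    """Return the sources whose audio reaches the given sink."""
    return [s for s in sources if sink in reachable(graph, s)]


if __name__ == "__main__":
    for sink in ("your headphones and recording", "guest's headphones"):
        print(sink, "<-", sources_heard_at(ROUTES, sink, SOURCES))
```

Because Skype's input is taken from Soundflower (2ch) and the guest's voice only enters the mix at Soundflower (16ch), the guest's voice never travels back into Skype.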

You can control the level of the audio coming from Skype for your recording by hijacking the Skype audio instead of sending its audio straight to Soundflower (16ch) through the Skype preferences. Make sure to set the Skype hijack session's target device to Soundflower (16ch) in the Advanced... settings under Audio Source. Then adjust the Master Gain in the Effects panel of the Skype hijack session.

An alternative to an individual Skype hijack session is the Application Mixer effect. You can add this effect to the AA 02 SF2 -> SF16 session and hijack Skype's output with it.

Reducing latency in the setup

Your microphone signal goes through several processes using this setup. Each time the signal is processed or passed through in digital form, a delay is introduced. Delays introduced in this setup are indicated in Figure 2.

Audio Hijack Pro version 2.5 or higher lets you tweak the latency between audio devices in hijack sessions. For each hijack session you can adjust the buffering, as it is called, under the Advanced... settings in the Input pane.

Audio Hijack Pro does a decent job of keeping the buffer size optimal with the default setting. You may try to tweak the individual settings to find the smallest possible buffer sizes. Smaller buffer sizes mean less delay (latency). However, a buffer that is too small for the computer to keep up with causes distorted sound. Therefore, tweak and test the buffer settings one by one. If you set all buffers to the bare minimum before you start testing, you have no way of knowing which buffer(s) cause the problem.
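
To get a feel for why smaller buffers help, it is useful to turn buffer sizes into milliseconds. The Python sketch below uses the general relation that a buffer of N sample frames at a given sample rate holds N divided by the sample rate seconds of audio. The 44.1 kHz sample rate, the buffer sizes and the function names are assumptions chosen for illustration; they are not values read from Audio Hijack Pro, which may express its buffering setting in a different unit.

```python
SAMPLE_RATE_HZ = 44_100  # assumed sample rate; use your device's actual rate


def buffer_latency_ms(frames, sample_rate_hz=SAMPLE_RATE_HZ):
    """Delay in milliseconds contributed by one buffer of the given size."""
    return frames / sample_rate_hz * 1000.0


def total_latency_ms(buffers, sample_rate_hz=SAMPLE_RATE_HZ):
    """Sum the delay of every buffer in the chain."""
    return sum(buffer_latency_ms(f, sample_rate_hz) for f in buffers.values())


if __name__ == "__main__":
    # Hypothetical buffer sizes (in sample frames) for the four delays.
    buffers = {"delay 1": 512, "delay 2": 512, "delay 3": 512, "delay 4": 512}
    for label, frames in buffers.items():
        print(f"{label}: {buffer_latency_ms(frames):.1f} ms")
    print(f"total: {total_latency_ms(buffers):.1f} ms")
```

The delays add up along the chain, so halving one buffer only removes that buffer's share of the total. That is another reason to adjust and test the sessions one at a time.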

Keeping the number of effects low can also be beneficial to the total latency. It is also better for total performance since each effect requires processor power.

2005-05-16 Update: For a zero latency solution, see: Zero Latency Studio Setup.