Dynamics Processing

Har-Bal dynamics processing is based on the envelope information HB3 obtains during analysis: the time series constructed from the peak and average power levels of each 50ms frame it analyses when a track is first opened. Those time series are what you see in the timeline control above the graph in the Har-Bal UI. The green is the average and the yellow is the peak. The same time series are also used to obtain the peak and average histograms in histogram view.

Now if you don't know what a histogram is, it is a graph representing the likelihood (probability) of the x-value (horizontal axis) occurring. In our case the horizontal axis is level in dB and it is binned in 1dB wide bins (hence the city skyline appearance of the graph). If you take any one of those tall rectangles and look at the dB value below it and the percent value corresponding to its height, you interpret it as follows. Let's say the column I'm looking at is centered on -5dB and its height is 7%. That is telling you that for 7% of the time the level is between -5.5dB and -4.5dB (the bin is 1dB wide and centered on -5dB). Now, that could be either the peak level or the average level depending on whether you are looking at the yellow or the green trace.

If you happen to move the gain slider you'll note that the histogram marches up and down the horizontal axis by the amount of gain you apply. Now that you know what the histogram is you should see why that should be: a level that was in the range from -5.5dB to -4.5dB will, after a 2dB gain, be in the range from -5.5 + 2dB to -4.5 + 2dB, that is, from -3.5dB to -2.5dB. However, you might be wondering why the 0dB bin gets taller and taller when you increase the gain. Well, if you think about it, the maximum level we can produce is 0dB. If something was in the -3dB bin and we apply a 6dB gain it can't go into the +3dB bin because there isn't one. What actually happens is the limiter cuts in and forces it into the 0dB bin. Thus, the higher you push the gain the more the topmost bin in the histogram fills up as the values from other bins are forced into it.
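To make the marching and the pile-up in the top bin concrete, here is a minimal Python sketch. It is not Har-Bal's code: the function name and the peak levels are made up for illustration, the limiter is stood in for by a simple cap at 0dB, and levels are rounded to the nearest whole dB to pick their bin.

```python
import math
from collections import Counter

def histogram_after_gain(levels_db, gain_db):
    """Percentage of frames per 1dB bin after applying gain, capped at 0dB."""
    # min(0.0, ...) stands in for the limiter: nothing can end up above 0dB.
    shifted = [min(0.0, v + gain_db) for v in levels_db]
    # Bins are centred on whole dB values, so round to the nearest whole dB.
    counts = Counter(math.floor(v + 0.5) for v in shifted)
    return {b: 100 * n / len(shifted) for b, n in counts.items()}

peaks = [-9.0, -6.0, -3.0, -1.0]  # made-up peak levels, one per frame
print(histogram_after_gain(peaks, 0))  # four bins of 25% each
print(histogram_after_gain(peaks, 2))  # every bin moves up by 2dB
print(histogram_after_gain(peaks, 6))  # three of the four frames land in the 0dB bin
```

With 6dB of gain the frames that started at -6dB, -3dB and -1dB all end up in the 0dB bin, which is exactly the growing topmost column you see when you push the gain slider.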

This is a very useful thing in mastering because you can immediately see how over-zealous your limiting is. As an example, take any current generation master of popular music and look at the histogram of the peak level and you will almost certainly see a massive 0dB bin and little left in the other bins. That is what over-limiting does to the dynamics of your track. It squashes them into oblivion.

Back to dynamics processing:

Well, the dynamics processing in Har-Bal is based entirely on the average level envelope that you see in the timeline. You create a transfer function to map the input average level to a new output average level and Har-Bal uses it to figure out what gain to apply dynamically, prior to the limiter, at any point in time. Unlike conventional dynamics processing, it doesn't use attack-decay envelope extraction because it already has the envelope from the analysis. As such, the gain change that tracks the music is in sync with and symmetrical about the content. By that I mean with a normal compressor you would typically have a fast attack time and a slow release, which means that the gain drops quickly at the start of a transient and recovers slowly at the tail. In Har-Bal the dynamics processing is effectively equal attack and release.

Now you'll probably think that makes no sense and will sound awful. In reality it doesn't, and here is the reason why. In traditional compression we principally have fast attack and slow release because a compressor cannot predict the future. It doesn't know in advance when to pull down the level, and if it pulls it down slowly a very big transient will push through before it has a chance to bring it down. We have a slow release time because fast gain change leads to a lot of distortion. Because Har-Bal knows the entire history of the track, it doesn't need fast attack: it knows when to pull the level down. The net result is that the dynamics processing in Har-Bal works more like the volume riding technique of recording engineers of years gone by, who adjusted the level down while the music was being recorded because they knew when the loud parts were about to arrive. This type of dynamics processing introduces very little distortion and, as such, sounds quite different to what you might be used to. I've heard it described as sounding more like modest limiting than compression.

That's enough of the theory behind it. How do you actually use it? Well, as hinted at before, you design an input-output transfer function using the dynamics node editing tool in the histogram view. Why the histogram view? Because the histogram view shows you a summary of the dynamics content and, to stick with a theme in Har-Bal, it will show you the effect your dynamics processing has on that histogram.

The process works like this. A transfer characteristic is made up of nodes. You create a node by clicking the dynamics node tool anywhere on the histogram graph. The node displays as two circles connected by lines, one filled and one not. Think of the unfilled one as an 'O' for output. It represents the level you are mapping the input to. The filled circle represents the input level. So a typical compression scheme would work like this. You look at the average level histogram (green trace) and look at the highest level it has (i.e. where the hill comes down to the flat on the right hand side). At that point you put a node with input and output equal. Why? Because we want the top of the histogram to stay put and just squash up the bottom (quieter) parts. It's saying map the input level to the same output level. For argument's sake, let's say it is -8dB.

Now look at the other side of the hill. Where the left hand slope comes down to the flat we want to map that level to a higher level (compressing it), so we press down the left mouse button (that sets the input level), then drag it up to the level we want to map it to and release the mouse button (that is the output level). For example, it might be an input of -40dB and an output of -27dB. You'll note that after releasing the mouse Har-Bal gives you an updated histogram showing the effect of the scheme on it and also the updated average track level figure of merit. You'll also note that if you happen to render those changes and re-analyze the result, the prediction is pretty accurate.
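If it helps to see the two-node scheme as numbers, here is a minimal Python sketch of a transfer function built from (input, output) nodes. It is an illustration only and not how Har-Bal is implemented: the function names are made up, and I am assuming straight-line interpolation in dB between nodes and a constant gain beyond the outermost nodes, neither of which is stated above.

```python
def output_level(input_db, nodes):
    """Map an input average level to an output level through (input, output) nodes."""
    nodes = sorted(nodes)
    # Outside the outermost nodes, keep the gain of the nearest node (assumption).
    if input_db <= nodes[0][0]:
        return input_db + (nodes[0][1] - nodes[0][0])
    if input_db >= nodes[-1][0]:
        return input_db + (nodes[-1][1] - nodes[-1][0])
    # Between two nodes, interpolate linearly in dB (assumption).
    for (x0, y0), (x1, y1) in zip(nodes, nodes[1:]):
        if x0 <= input_db <= x1:
            return y0 + (y1 - y0) * (input_db - x0) / (x1 - x0)

def gain_db(input_db, nodes):
    """Gain to apply ahead of the limiter for a frame with this average level."""
    return output_level(input_db, nodes) - input_db

# The example scheme: -40dB is lifted to -27dB, -8dB stays where it is.
nodes = [(-40.0, -27.0), (-8.0, -8.0)]
for level in (-40.0, -24.0, -8.0):
    print(level, output_level(level, nodes), gain_db(level, nodes))
```

Under those assumptions a frame averaging -24dB, halfway between the two nodes, comes out at -17.5dB, so it gets 6.5dB of gain, while the loudest frames at -8dB are left alone.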

That is basically all there is to using it. I'm sure it will seem strange at first until you understand the concepts but once you've gotten the hang of it I think you will find it a natural and informative way of dealing with dynamics.

A couple of observations you can make. When compressing, the width of the histogram is reduced but that squeezing makes the peak higher. That is because the area under the curve is constant and sums to 100% (i.e. everything). You can think of it as a lump of plasticine you use to model a hill. If you squeeze the sides together it all pushes out the top because the volume has to go somewhere. Similarly, if you stretch it, it falls down, which is dynamic range expansion.

Another thing to note is that this processing can be used for more than just compression. Another useful thing to use it for is noise gating. What I would typically do for that application is to provide a split at the end of the track where it fades away and apply dynamics processing to that split to gate the noise. The setting of the nodes is essentially exactly the same as with compression except you now map the input to a lower level on the left hand side to make the noise quieter. If you have additional dynamics processing applied to the overall track the two mappings are combined.

It all makes for a powerful and flexible architecture for managing the dynamic range of your track.

A histogram shows the frequency (as in how often something occurs) of an event. It comes from probability and statistics theory.

For a pure statistics explanation, let's say I have a bag of ping-pong balls with numbers written on them. The numbers are in the range of 1 to 10. Some of the numbers are the same. Let's say you empty them out and you see these numbers: 1,3,4,5,3,3,1,4,6,7,9,9,9,9,10,7.

A histogram showing the frequency of the numbers from 1 through to 10 is constructed by counting how often each number occurs. So in the above case we have,

Number 1,2,3,4,5,6,7,8,9,10

Frequency 2,0,3,2,1,1,2,0,4,1

That is, there are 2 1's, no 2's, 3 3's, 2 4's, 1 5's and so on and so on.

A normalized histogram usually specifies things in percentages so to convert into a percentage we divide by the total number of balls in the set and multiply by 100. The frequency numbers therefore become,

100 * (2 / 16), 100 * (0 / 16), 100 * (3 / 16), ....

because there are 16 balls in the set. That is what a histogram is. You may note that for normalized histograms the sum of all the frequencies is always 100% because that is the set of all balls in the bag.
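The ping-pong ball example is small enough to check by machine. Here is a short Python sketch of the counting and normalizing steps (the variable names are just for illustration):

```python
from collections import Counter

balls = [1, 3, 4, 5, 3, 3, 1, 4, 6, 7, 9, 9, 9, 9, 10, 7]

counts = Counter(balls)
# Frequency of every number from 1 to 10, including the ones that never occur.
frequency = [counts[n] for n in range(1, 11)]
# Normalize: divide by the number of balls and multiply by 100.
percent = [100 * f / len(balls) for f in frequency]

print(frequency)     # [2, 0, 3, 2, 1, 1, 2, 0, 4, 1]
print(percent[:3])   # [12.5, 0.0, 18.75]
print(sum(percent))  # 100.0
```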

Now in the context of Har-Bal's histogram, the balls are the sample points on the timeline (which are spaced at a nominal 50ms interval) and the number on each ball is the average value for the average histogram and the peak value for the peak histogram. The main difference from the above example is that the values on the timeline are real (i.e. fractional) numbers, so how do you group them? Well, we group them in "bins" of 1dB width. What does that mean? Take a small subset of average values:

-0.3dB, -0.4dB, -0.7dB, -1.6dB, -2.4dB....

The bins are centered on exact dB values of 0dB,-1dB,-2dB,-3dB.... The boundary from one bin to the next is the mid-point between centers. The top bin 0dB is special because it doesn't have an upper boundary.

0dB bin: lower boundary -0.5dB, no upper boundary

-1dB bin: lower boundary -1.5dB, upper boundary -0.5dB

-2dB bin: lower boundary -2.5dB, upper boundary -1.5dB

-3dB bin: lower boundary -3.5dB, upper boundary -2.5dB

.

.

.

Going back to the sequence of average values, all we do is figure out which bin each number fits into and when we find that bin we add 1 to it, because this corresponds to a count of 1 value fitting in that bin. -0.3dB fits in the 0dB bin, so does -0.4dB, -0.7dB fits into the -1dB bin, -1.6dB fits into the -2dB bin, and -2.4dB fits into the -2dB bin and so on and so on. After counting all the values in the bins they get normalized by converting to percentages (100 times the bin count / total number of timeline samples).
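The binning rule can be written down in a few lines of Python. This is a sketch rather than Har-Bal's actual code: the function names are made up, and I am assuming a value sitting exactly on a boundary (say -0.5dB) goes into the higher bin, which the description above doesn't settle.

```python
import math
from collections import Counter

def bin_centre(level_db):
    """Return the centre (in dB) of the 1dB wide bin a level falls into."""
    # Round to the nearest whole dB; the top (0dB) bin has no upper boundary.
    return min(0, math.floor(level_db + 0.5))

def level_histogram(levels_db):
    """Normalized histogram: percent of timeline samples in each bin."""
    counts = Counter(bin_centre(v) for v in levels_db)
    total = len(levels_db)
    return {centre: 100 * n / total for centre, n in counts.items()}

levels = [-0.3, -0.4, -0.7, -1.6, -2.4]
print([bin_centre(v) for v in levels])  # [0, 0, -1, -2, -2]
print(level_histogram(levels))          # {0: 40.0, -1: 20.0, -2: 40.0}
```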

A histogram, like an average spectrum, is much more useful for judging the dynamics of a track because it presents them in a static, single image summary. Show me a histogram of the timeline and I can tell you immediately whether it is high or low dynamic range, what the limits of the dynamic range are, whether it has bi-modal behavior (as in loud parts and quiet parts), whether it has been over-limited and so on and so on. It is all there to see in the histogram.

If you find it hard to believe just open up a track you know to have high dynamic range and a track you know to have low dynamic range. Compare the histograms. Can you see the difference? The pattern is obvious and immediately revealing.

The histogram also contains within it the total average level. If you give me the histogram I can calculate the track average from the data it presents. It makes a great deal of mathematical sense to summarize dynamics with histograms.
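As a rough illustration of that last point, here is one way the calculation could go in Python. I am assuming the track average is the mean power expressed in dB and that every value in a bin sits at the bin centre, so the result is only an estimate. The function name and the example histogram are made up, and Har-Bal's own figure may be computed differently.

```python
import math

def average_level_db(histogram):
    """Estimate the average level from {bin centre in dB: percent of time}."""
    # Convert each bin centre to linear power and weight it by the
    # fraction of time spent in that bin (assumes a power average).
    mean_power = sum(pct / 100 * 10 ** (centre / 10)
                     for centre, pct in histogram.items())
    return 10 * math.log10(mean_power)

# Made-up example: half the time at -10dB, half the time at -20dB.
hist = {-10: 50.0, -20: 50.0}
print(round(average_level_db(hist), 1))  # -12.6
```

Note the result sits much closer to -10dB than to -20dB, because the loud half of the track carries most of the power.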