1. Introduction
A number of handbooks introducing us to the world of sound state that sound has three main characteristics: time, frequency and intensity. My intention here is to introduce these terms as deeply as necessary and then to proceed to the more sophisticated topic of time-frequency (TF) analysis and its interpretation in the context of transient-evoked otoacoustic emissions (TEOAEs).
Time and frequency are closely connected terms. In a mathematical or engineering context, the information carried by a sound signal can be represented in one of two equivalent forms: in the time domain or in the frequency domain. Although the time-domain representation is the natural form, one can transform the information to the frequency domain and back to the time domain without any loss of information. This transformation considers only the mathematical properties of the information in the signal. For very simple periodic signals, the transition to the frequency domain allows a better inspection of the signal than the natural time representation. In this case, the spectrum of the signal, being its frequency-domain representation, helps in evaluating the properties of the acoustic signal. On the other hand, if a source emits acoustic signals of variable frequency, or if there are transient segments in the acoustic signal, the frequency representation has only a formal and technical value.
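As a small illustration of the lossless round trip between the two domains, here is a minimal sketch in Python using a plain discrete Fourier transform. The test signal, its length and the helper names dft and idft are assumptions chosen for this example only; they do not come from any OAE software.

```python
import cmath
import math

def dft(x):
    """Discrete Fourier transform: time domain -> frequency domain."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse transform: frequency domain -> time domain."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

n = 32
# Hypothetical test signal: 3 cycles of a sine plus a weaker 5-cycle cosine.
signal = [math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.cos(2 * math.pi * 5 * t / n)
          for t in range(n)]

spectrum = dft(signal)
restored = [v.real for v in idft(spectrum)]

# The round-trip error stays at the level of floating-point rounding.
max_error = max(abs(a - b) for a, b in zip(signal, restored))

# The magnitudes are concentrated in bins 3 and 5 (and their mirror bins 29 and 27).
magnitudes = [abs(v) for v in spectrum]
```

For this simple periodic signal the spectrum is easier to read than the waveform: only two pairs of bins are non-zero, and their heights follow the amplitudes of the two components.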
2. Properties of a simple periodic acoustic signal
Let us go back to the fundamental features of sound, that is, time, intensity and frequency. I would like to focus only on the practical aspects of these phenomena, from the "engineering" point of view. This approach is particularly difficult if we try to describe time. A lot has been written about the term "time" in its philosophical and physical aspects (e.g. the theory of relativity); however, these descriptions and subtleties have little practical meaning for otoacoustic emission signals. It is more practical to follow a mathematical approach and say that time is a primitive term. For us, the most important feature of time is that it "runs" continuously.
As time "runs" we can observe the changes around us. For otoacoustic emissions, our observations consist of recording the acoustic pressure changes produced by the activity of the outer hair cells in the inner ear. As the movement of the cells has two phases, expansion and contraction, the acoustic pressure changes its sign, being alternately positive and negative. Note that in this example we are referring to the instantaneous values of the otoacoustic emission signals, as they are commonly presented in the "Waveform" panels of most commercial software packages that acquire OAE responses.
The instantaneous value carries information about both the momentary phase of the pressure and the intensity of the otoacoustic emission. We are going to separate these two aspects and focus only on the intensity. In evaluating the intensity, the terms "signal envelope" and "signal amplitude" are very helpful. In order to define these terms, we need to start with an easy case from the class of simple periodic signals. For these signals, neither the envelope nor the amplitude changes in time. The envelope consists of two straight parallel lines limiting all possible values of the signal. The upper line links all maximum values (peaks) of the waveform and the lower line links the minima. Both lines are straight for simple signals because nothing changes in the signal except the phase of vibration. Amplitude is defined as one half of the difference between the upper and lower line levels of the envelope. For a large class of simple periodic signals, the upper line is the mirror reflection of the lower line, as both are placed symmetrically around the time axis. Otoacoustic emission signals roughly satisfy this condition. In such cases, we can equivalently define the amplitude as the distance of the upper envelope line from the baseline.
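The definition of amplitude for a simple periodic signal can be checked numerically. The following is only a sketch; the sampling rate, the tone frequency and the amplitude are arbitrary values assumed for the example.

```python
import math

fs = 1000          # sampling rate in Hz (assumed)
freq = 50          # tone frequency in Hz (assumed)
true_amp = 0.8     # amplitude in arbitrary pressure units (assumed)

# One second of a steady tone; with fs / freq = 20 samples per period the
# samples fall exactly on the maxima and minima of the sine.
x = [true_amp * math.sin(2 * math.pi * freq * n / fs) for n in range(fs)]

upper = max(x)                     # level of the upper envelope line
lower = min(x)                     # level of the lower envelope line
amplitude = (upper - lower) / 2    # half the distance between the two lines
offset = (upper + lower) / 2       # zero when the envelope is symmetric about the baseline
```

Because the envelope is symmetric here, the amplitude equals the level of the upper line, which is the second form of the definition given above.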
3. More complex acoustic signals
The above definition of the envelope is consistent only for less complicated waveforms. If one tries to apply these definitions to a wider, general class of signals, one will meet many difficulties. One of the problems appears when many oscillations occur within one period of the signal. Examples of this class of signals include glottal vibrations or the simpler arterial pressure waveform with its dicrotic notch. These are examples of quasi-periodic signals, where many positive and negative peaks exist in the segment covering one period. Which peak is important for the envelope calculation? The answer is suggested by the statement of the problem itself. If we know the period of the oscillations, we can take one maximum within each period for the upper line of the envelope and one minimum for the lower line. In this concept, it is the two extreme deviations, positive and negative, that matter.
The above method provides single points from particular periods as data for the signal's envelope. At this point certain questions arise: What about the rest of the continuous time scale? Are the values for the rest of the time scale important, and how can they be evaluated? In order to find the answers to these questions, let us recall that our intention is to get information about the sound intensity. The points which we are estimating, from the sequence of peaks of acoustic pressure, belong to a small subset of intensity samples. The full evaluation of the signal's envelope requires special processing methods.
Through an evaluation of the sound intensity we aim to reconstruct the energetic characteristics of the vibrational source. Take a pendulum as an example: its instantaneous angle of swing changes, yet throughout these variations there is a continuous mutual exchange between the two forms of energy. In the lowest position, the pendulum ball has the largest speed and maximal kinetic energy. In contrast, when the ball reaches its highest position the kinetic energy is equal to zero, because all the energy is stored as potential energy in the gravitational field. The key property of this phenomenon is that the balance, being the sum of both forms of energy, is preserved, although the position of the ball changes in a continuous oscillation. The total energy stored in the vibrating object is a good measure of the intensity of the oscillations.
The analogy with hair cells is direct, but in this context the oscillations have an active nature. This means that the vibrations are stimulated and energy is continuously provided to the cells. Here, the energy balance calculated over a longer time period shows that the provided energy compensates for any friction losses. We believe that the energy emitted from the hair cell vibrations is approximately proportional to the current energy of the cells. The observed otoacoustic emission differs from the emitted one because of transformations occurring on the way between the source (cochlea) and the external ear canal. Again, we believe that these transformations are passive and that they modify the OAE signal only slightly. If so, by investigating the energy of the "emitted wave" we can get access to information about the current intensity of the vibration of the source. In the ideal pendulum, the energy (intensity) of the oscillations is constant and a single measurement provides all the characteristics of the phenomenon. The energy of vibrations of the OAE source (or sources) depends on many factors and changes in time. These changes are the key object of our experimental interest. The amplitude of vibrations, as we defined it above, directly reflects the instantaneous power of the signal, that is, the intensity of the OAE. The reconstruction of the amplitude variations in time is an important task, leading indirectly to knowledge about the intensities of the source sound vibrations.
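The rule of one maximum and one minimum per period can be sketched as follows. The synthetic signal (a fundamental with a third harmonic, growing from period to period) and the assumption that the period is known exactly are illustrative choices, not a model of any physiological waveform.

```python
import math

period = 40          # samples per period (assumed to be known)
n_periods = 5

def sample(n):
    # Two components produce several local peaks inside each period; the
    # gain grows from one period to the next, so the signal is only
    # quasi-periodic.
    gain = 1.0 + 0.1 * (n // period)
    phase = 2 * math.pi * n / period
    return gain * (math.sin(phase) + 0.6 * math.sin(3 * phase))

x = [sample(n) for n in range(period * n_periods)]

upper_points = []    # (sample index, value) of the maximum in each period
lower_points = []    # (sample index, value) of the minimum in each period
for p in range(n_periods):
    start = p * period
    segment = x[start:start + period]
    hi = max(segment)
    lo = min(segment)
    upper_points.append((start + segment.index(hi), hi))
    lower_points.append((start + segment.index(lo), lo))
```

Each period contributes exactly one point to the upper line and one to the lower line, however many local peaks it contains; the smaller peaks inside the period are ignored.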
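The energy balance of the pendulum can be verified with a few lines of Python. This is a sketch under the small-angle approximation; the mass, length and swing amplitude are arbitrary values assumed for the example.

```python
import math

m, length, g = 0.1, 1.0, 9.81     # kg, m, m/s^2 (illustrative values)
theta0 = 0.05                     # swing amplitude in radians (small angle)
omega = math.sqrt(g / length)     # angular frequency of small oscillations

def energies(t):
    """Return (kinetic, potential) energy of the pendulum at time t."""
    theta = theta0 * math.cos(omega * t)                # instantaneous angle
    theta_dot = -theta0 * omega * math.sin(omega * t)   # angular velocity
    kinetic = 0.5 * m * (length * theta_dot) ** 2
    potential = 0.5 * m * g * length * theta ** 2       # small-angle approximation
    return kinetic, potential

period = 2 * math.pi / omega
times = [period * k / 100 for k in range(101)]
totals = [sum(energies(t)) for t in times]
expected_total = 0.5 * m * g * length * theta0 ** 2
```

The two forms of energy change continuously, but every entry of totals equals expected_total up to rounding, which is why a single number describes the intensity of the ideal pendulum.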
4. Signal processing methods
Considering the statements above, we can say that the information carried by the sequence of peak values is important but fragmentary. In the paragraphs below, we are going to review methods that fill the gaps between the peaks and so create the envelope and the amplitude. The most popular and traditional method is called peak rectification. The method derives from an early radio broadcasting technique, in which the signals were coded using amplitude modulation. The radio transmitter modulated a carrier of constant frequency by changing the intensity of the emitted radio waves. The task of the receiver was to reconstruct the amplitude and in this way to get information about the intensity. The analogy with our task is very close: the hair cells are the transmitters and our probe microphones are the receivers. Because the envelope is symmetrical, it is enough to take only the positive part of the waveform (rectification) and connect the peaks of the resulting signal. To perform such a connection, the device uses storage elements (capacitors) to hold the value of the last peak.
The estimation of the instantaneous power offers a more faithful reflection of the amplitude. The values of the signal are squared (square-law rectification), then the components that are a side product of the highest rate of changes are removed by low-pass filtering. The resulting waveform is transformed by taking the square root. The signal obtained in this way can then be scaled to give a measure of the intensity.
The first method, the peak rectifier, is similar to the manual determination of the amplitude which one can perform on paper by connecting successive peaks with a straight line or a slowly decaying line. This method does not give a smooth waveform and its accuracy is not high. The second method, using squaring and low-pass filtering, gives a smoother but not ideal representation. The dilemma is how to select the cutoff frequency of the low-pass filter. If we "cut" too little of the high-frequency components (the cutoff is too high), we will get false ripples. If we "cut" too much (the cutoff is too low), we will get inertial effects and the result will be slower than the real changes of intensity.
Another method of estimating the amplitude is the reconstruction of the analytical signal using Hilbert filtration. This sentence introduces two new terms, Hilbert filtration and analytical signal. In order to explain them we need to go back to the pendulum analogy. Recall that in the pendulum there are two forms of energy: the kinetic, related to movement, and the potential, related to position. Sound also has two forms of energy, related to sound velocity and to pressure, but a microphone records only one of these quantities. In order to evaluate the energy of the acoustic signal one needs to know both forms of energy. One solution would be to use a second type of microphone; however, such an approach is not very efficient. Instead of conducting complicated measurements, we can use a mathematical transformation to get the dual form of the signal. This transformation is called Hilbert filtration. The pair of signals, consisting of the real measured waveform and the waveform obtained from the filtration, can be used to create a complex signal called the analytical form. The amplitude of the analytical signal is calculated as the square root of the sum of the instantaneous powers of both components. The amplitude estimated with this method provides correct information about the source of the signal, but only when the signal is simple. It has been shown mathematically that the envelope determined by the amplitude of the analytical signal passes through all peaks of the real component. So it satisfies the assumption from which we started: it passes through the amplitude peaks. The advantage of this method is that there is no question of how to cut the high-frequency oscillations, because the method has no inertia.
However, the Hilbert filtration has limitations. If the signal has many peaks within one period, the envelope will automatically pass through all such peaks, and in such cases the method will give a lot of false ripples. The method is therefore limited to simple signals.
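A software imitation of the peak rectifier might look like the sketch below. The function name peak_detector, the per-sample decay factor standing in for the capacitor discharge, and the amplitude-modulated test tone are all assumptions made for illustration; a real receiver circuit is analogue.

```python
import math

def peak_detector(x, decay):
    """Half-wave rectify x, then hold each peak with a slow exponential decay.

    decay is the fraction of the held value kept per sample (0 < decay < 1);
    it plays the role of the capacitor discharging between peaks.
    """
    held = 0.0
    out = []
    for v in x:
        rectified = max(v, 0.0)               # keep only the positive half
        held = max(rectified, held * decay)   # charge at once, discharge slowly
        out.append(held)
    return out

fs, carrier = 8000, 1000     # sampling rate and carrier in Hz (assumed)
# Amplitude-modulated tone: a slow 20 Hz envelope on a 1 kHz carrier.
true_env = [1.0 + 0.5 * math.sin(2 * math.pi * 20 * n / fs) for n in range(fs // 10)]
x = [e * math.sin(2 * math.pi * carrier * n / fs) for n, e in enumerate(true_env)]

estimate = peak_detector(x, decay=0.99)
```

With these values the estimate coincides with the true envelope at every positive carrier peak and sags between the peaks, which produces the sawtooth-like, not very smooth line typical of this method.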
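The three steps (squaring, low-pass filtering, square root) can be sketched as below. A moving average is used here as a stand-in for the low-pass filter, and the factor of 2 under the root rescales the result from the RMS value of a sine to its amplitude; both choices, as well as the test tone, are assumptions of this example.

```python
import math

def moving_average(x, width):
    """Crude low-pass filter: mean over the last `width` samples."""
    return [sum(x[max(0, i - width + 1):i + 1]) / min(i + 1, width)
            for i in range(len(x))]

def power_envelope(x, width):
    """Square the signal, smooth it, take the root, and rescale to amplitude."""
    squared = [v * v for v in x]
    smoothed = moving_average(squared, width)
    return [math.sqrt(2.0 * p) for p in smoothed]

fs, carrier, amp = 8000, 1000, 0.8     # assumed values
x = [amp * math.sin(2 * math.pi * carrier * n / fs) for n in range(400)]

# Averaging over exactly one carrier period (8 samples) removes the
# double-frequency component that squaring creates.
envelope = power_envelope(x, width=fs // carrier)
```

Once the averaging window is full, the estimate is constant and equal to the amplitude of the tone, without the sag between peaks seen with a peak detector.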
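The dilemma of choosing the filter can be illustrated with a tone whose amplitude jumps suddenly. In this sketch the length of a moving-average window stands in for the cutoff frequency (a short window corresponds to a high cutoff, a long window to a low one); the window lengths, the test signal and the helper names are assumptions for the example.

```python
import math

def moving_average(x, width):
    """Crude low-pass filter: mean over the last `width` samples."""
    return [sum(x[max(0, i - width + 1):i + 1]) / min(i + 1, width)
            for i in range(len(x))]

def power_envelope(x, width):
    """Square, smooth, take the root, and rescale a sine's RMS to its amplitude."""
    smoothed = moving_average([v * v for v in x], width)
    return [math.sqrt(2.0 * p) for p in smoothed]

fs, carrier, step_at = 8000, 1000, 400     # assumed values; 8 samples per period
# Tone whose amplitude jumps from 1 to 2 at sample `step_at`.
x = [(1.0 if n < step_at else 2.0) * math.sin(2 * math.pi * carrier * n / fs)
     for n in range(800)]

def ripple(env):
    """Peak-to-peak variation while the true amplitude is constant at 1."""
    steady = env[100:step_at]
    return max(steady) - min(steady)

def settle(env):
    """Samples needed after the step to reach 95% of the new amplitude."""
    return next(i for i in range(step_at, len(env)) if env[i] >= 1.9) - step_at

short_env = power_envelope(x, width=3)     # weak smoothing (high cutoff)
long_env = power_envelope(x, width=80)     # strong smoothing (low cutoff)
```

Comparing ripple(short_env) with ripple(long_env), and settle(short_env) with settle(long_env), shows both sides of the trade-off: the short window follows the jump almost immediately but ripples while the amplitude is constant, whereas the long window is smooth but needs many carrier periods to reach the new level.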
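The construction of the analytical signal can be sketched with a discrete Fourier transform: the negative-frequency half of the spectrum is removed and the positive half is doubled. The helper names and the test signal are assumptions for this example, and the slow direct transform is used only to keep the sketch self-contained; in practice a library routine such as scipy.signal.hilbert would normally be used.

```python
import cmath
import math

def dft(x, sign):
    """Direct Fourier sum; sign=-1 is the forward transform, +1 the unscaled inverse."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def analytic_signal(x):
    """Return the complex analytical signal of a real sequence of even length."""
    n = len(x)
    spectrum = dft(x, -1)
    # Keep DC and Nyquist, double the positive frequencies, zero the negative ones.
    weights = [1.0] + [2.0] * (n // 2 - 1) + [1.0] + [0.0] * (n // 2 - 1)
    filtered = [w * s for w, s in zip(weights, spectrum)]
    return [v / n for v in dft(filtered, +1)]

n = 64
# Simple test signal: carrier of 8 cycles with a slowly varying amplitude.
true_env = [1.0 + 0.5 * math.cos(2 * math.pi * t / n) for t in range(n)]
x = [e * math.cos(2 * math.pi * 8 * t / n) for t, e in enumerate(true_env)]

z = analytic_signal(x)
# Amplitude: square root of the sum of squares of both components.
envelope = [math.sqrt(v.real ** 2 + v.imag ** 2) for v in z]
max_dev = max(abs(a - b) for a, b in zip(envelope, true_env))
```

The real part of z is the measured waveform and the imaginary part is its Hilbert-filtered counterpart. For this simple signal the envelope reproduces the true amplitude at every sample, not only at the peaks, and no cutoff frequency had to be chosen.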
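The false ripples can be seen on a perfectly periodic signal of constant intensity that has several peaks per period. In this sketch the analytical signal is written in closed form for a sum of sine components instead of being computed by a numerical Hilbert filter; the two harmonics and their weights are assumptions for the example.

```python
import cmath
import math

n = 200                          # samples per period (assumed)
# Fundamental plus a third harmonic: several peaks inside each period.
harmonics = {1: 1.0, 3: 0.6}     # harmonic number -> amplitude (illustrative)

def real_part(phase):
    return sum(a * math.sin(k * phase) for k, a in harmonics.items())

def analytic(phase):
    # The analytical counterpart of a*sin(k*phase) is -1j*a*exp(1j*k*phase).
    return sum(-1j * a * cmath.exp(1j * k * phase) for k, a in harmonics.items())

phases = [2 * math.pi * i / n for i in range(n)]
x = [real_part(p) for p in phases]
envelope = [abs(analytic(p)) for p in phases]

ripple = max(envelope) - min(envelope)
```

Although nothing changes in this signal from one period to the next, the envelope swings between 0.4 and 1.6 within every period, so its variation says nothing about changes in the intensity of the source.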