Contents of this section: Introduction

Distortion product otoacoustic emissions (DPOAEs) are responses generated when the cochlea is stimulated simultaneously by two pure tones whose frequency ratio is between 1.1 and 1.3.

Recent studies on the generation mechanism of DPOAEs have underlined the presence of two important components in the DPOAE response, one generated by an intermodulation "distortion" and one generated by a "reflection".

The prevalence of DPOAEs is 100% in normal adult ears. Responses from the left and right ears are often correlated (that is, they are very similar). Among normal-hearing subjects, women have higher-amplitude DPOAEs than men. Aging affects DPOAE responses by lowering the DPOAE amplitude and narrowing the DPOAE response spectrum (i.e., responses at higher frequencies gradually diminish). DPOAEs can also be recorded from animal species used in clinical research, such as lizards, mice, rats, guinea pigs, chinchillas, chickens, dogs, and monkeys.


  • The pure tones which stimulate the cochlea are called primaries. They are designated F1 and F2, and their corresponding levels are designated L1 and L2. The lower tone is usually F1 and the higher tone F2.

  • In order to generate the intermodulation DPOAE component, the primaries should have frequencies which are close to one another. The ratio F2 / F1 is called the frequency ratio (FR). The choice of FR affects the amplitude of the DPOAEs at the different tested frequencies.

  • Due to intermodulation, the cochlea generates a long series of components which are not present in the input stimuli. These components are called distortion products. The most prominent, and the one most used in clinical practice, is the cubic difference distortion product, denoted 2F1 - F2.

  • The DPOAE protocols employed in clinical practice are divided into two categories. Protocols using primaries with equal intensities (L1 = L2) are called symmetrical, for example 70-70 dB SPL. Protocols using unequal primary intensities (L1 > L2) are called asymmetrical, for example 65-55 dB SPL. The latter are better at identifying cases of hearing impairment and are used in most screening programs.

    When asymmetrical DPOAE protocols are used, the intermodulation components are generated close to the F2 primary tone. Therefore the DPOAE information is referenced to F2. When symmetrical protocols are used, the DPOAE information is referenced to the geometric mean, which is defined as the square root of F1 * F2.

  • There are two ways to present the DPOAE information. In the DP-gram modality, we measure the 2F1 - F2 amplitudes at various F2 frequencies with the stimulus intensities held fixed, for example L1 = 65 dB SPL and L2 = 55 dB SPL. In the Input-Output (IO) modality, we measure the 2F1 - F2 amplitude at a fixed F2 frequency while varying the primary stimulus levels.

    DP-gram response from a Sprague-Dawley rat under ketamine anesthesia.
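The frequency relationships listed above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names are hypothetical, the F2 values in the usage note are arbitrary, and the default ratio of 1.22 is the value the editorial below cites as approximately optimal in humans.

```python
import math


def dpoae_frequencies(f2_hz, ratio=1.22):
    """Derive F1, the cubic distortion product 2F1 - F2, and the geometric
    mean of the primaries from F2 and the frequency ratio FR = F2 / F1."""
    f1_hz = f2_hz / ratio
    return {
        "f1": f1_hz,
        "f2": f2_hz,
        "dp": 2 * f1_hz - f2_hz,                      # 2F1 - F2
        "geometric_mean": math.sqrt(f1_hz * f2_hz),   # sqrt(F1 * F2)
    }


def dp_gram_plan(f2_list_hz, ratio=1.22, l1_db=65, l2_db=55):
    """Build a DP-gram style test plan: fixed levels, varying F2.
    The default levels follow the 65-55 dB SPL asymmetrical example."""
    plan = []
    for f2_hz in f2_list_hz:
        row = dpoae_frequencies(f2_hz, ratio)
        row["l1_db"] = l1_db
        row["l2_db"] = l2_db
        plan.append(row)
    return plan
```

For instance, `dp_gram_plan([1000, 2000, 4000])` returns one row per F2 value, each holding the F1 and 2F1 - F2 frequencies that a DP-gram measurement at that F2 would involve. An IO series would instead hold F2 fixed and step `l1_db` and `l2_db`.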

DPOAE generation mechanisms:

       For the moment, we will consider the cochlea simply as a black box and the ear-canal signal as representing the output of this system. Into this black box, two pure tones are applied which, traditionally, are referred to as the f1 and f2 primaries (f1<f2). If the cochlea acts in a linear manner, then we would expect that the output frequencies would be the same as the input frequencies. In other words, the function relating the input to the output signal is a straight line representing a linear function. However, if the function relating the input of the two sinusoids to the output is not a straight line, that is, the input/output (I/O) function is nonlinear, then new frequencies will be generated at the output. I/O functions that are typically used to represent the basilar membrane (BM) response are described in Fig 3 in Fahey et al (2000). One of these I/O plots (Fig 3b) is highly similar in shape to the hair cell receptor voltage versus stereocilia displacement function measured earlier by Hudspeth and Corey (1977) and Russell et al (1986). These types of nonlinear I/O functions obtained from various cochlear structures are relevant to the discussion of physical mechanism(s) within the cochlea that are capable of generating DPOAEs. If such functions exhibit both even- and odd-order symmetry, then all the DPOAEs that can be found in the ear-canal signal will be observed. Thus, combinations of the primaries that result in even-order DPOAEs, such as the simple difference tone, f2-f1, and many odd-order DPOAEs, the largest and most commonly studied one being the 2f1-f2 frequency, will be measured. Other DPOAEs often seen are the lower odd-order sideband 3f1-2f2 and the upper odd-order sideband DPOAE at the 2f2-f1 frequency.
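The black-box argument can be illustrated numerically. The following is a minimal sketch, not a cochlear model: it passes two tones through a straight-line I/O function and through a toy polynomial nonlinearity with one even-order and one odd-order term, then measures the output at the distortion-product frequencies. The sample rate, the tone frequencies (1000 and 1220 Hz), and the coefficients `a` and `b` are all assumptions chosen for illustration.

```python
import cmath
import math

FS = 16000  # sample rate in Hz (assumed for this sketch)
N = 16000   # one second of samples, so integer-Hz tones complete whole cycles


def two_tone(f1_hz, f2_hz):
    """Sum of two unit-amplitude cosines standing in for the f1 and f2 primaries."""
    return [
        math.cos(2 * math.pi * f1_hz * n / FS) + math.cos(2 * math.pi * f2_hz * n / FS)
        for n in range(N)
    ]


def linear_io(x):
    """Straight-line I/O function: output frequencies equal input frequencies."""
    return x


def nonlinear_io(x, a=0.2, b=0.1):
    """Toy nonlinearity (hypothetical coefficients): the x**2 term is
    even-order, the x**3 term is odd-order."""
    return x + a * x ** 2 + b * x ** 3


def amplitude_at(signal, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    acc = sum(
        s * cmath.exp(-2j * math.pi * freq_hz * n / FS) for n, s in enumerate(signal)
    )
    return 2 * abs(acc) / len(signal)


def distortion_levels(io_function, f1_hz=1000, f2_hz=1220):
    """Output amplitude at three distortion-product frequencies."""
    output = [io_function(x) for x in two_tone(f1_hz, f2_hz)]
    return {
        "f2-f1": amplitude_at(output, f2_hz - f1_hz),
        "2f1-f2": amplitude_at(output, 2 * f1_hz - f2_hz),
        "2f2-f1": amplitude_at(output, 2 * f2_hz - f1_hz),
    }
```

Calling `distortion_levels(linear_io)` gives amplitudes that are zero to within rounding error, while `distortion_levels(nonlinear_io)` gives clearly nonzero values at all three frequencies. Setting `a=0` in the toy function removes the even-order f2-f1 component while leaving the odd-order 2f1-f2 and 2f2-f1 components in place.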

        When the f1 and f2 primaries are presented to the ear canal, the first constraints that must be placed upon DPOAE generation can be appreciated from observations of the underlying BM mechanics [for a recent excellent review, see Robles and Ruggero (2001)]. Presentation of a pure tone to the ear canal results in the well-known traveling wave of displacement on the BM, which peaks at its characteristic frequency (CF) place and then rapidly dies out at more apical points that are lower in frequency. This displacement pattern defines the place on the BM where DPOAEs must be generated. That is, the only place where f1 and f2 can mix in the nonlinearity (often assumed to be based in the OHCs--see below) is in the tail of the BM displacement of the f1 primary. If f2 is placed at a much higher frequency, then, because of the steep apical cutoff of BM displacement, f2 cannot substantially interact with f1. Consequently, on theoretical grounds, DPOAEs must be produced at, or near to, the f2 place, where the two primaries can physically interact on the BM.

        This theoretical prediction is borne out by findings from suppression studies in which a third tone (f3) is used to interfere with DPOAE generation. By sweeping f3 in level and frequency, suppression tuning curves (STCs) can be produced, with their tips typically tuned near the f2 place for the 2f1-f2 DPOAE (eg, Brown & Kemp 1984; Martin et al 1998a). Much of this requirement also accounts for the much studied f2/f1 ratio effect, in which DPOAE levels decrease on either side of an optimum ratio value. In humans, this ideal f2/f1 ratio is approximately 1.22, and DPOAEs are largest at this ideal separation of the two primary tones. Some of this ratio effect, as the primary f1 and f2 tones come closer together, may be due to mutual suppression or interaction of multiple DPOAEs (Stover et al 1996a). It has also been proposed that this phenomenon can be explained by a second-filter effect (Brown et al 1992).

        When DPOAEs are produced in the cochlea, they can be seen on the BM, and they propagate just as if they were external tones introduced into the ear canal (Robles & Ruggero, 2001). Because the 2f1-f2 is lower in frequency than the f2 place where it is generated, this combination tone will not be perceived if someone is deaf at this lower frequency. Such an outcome occurs because the 2f1-f2 DPOAE travels to its characteristic place, where it then acts like an external tone.

        Basilar-membrane mechanics also explain why DPOAEs are more effectively produced at lower primary-tone levels when the level of f2, ie, L2, is lower than the level of f1, ie, L1. This is the familiar unequal-level primary tones protocol, typically 65/55 dB SPL, that is almost universally advocated in the clinical literature (Stover et al 1996b) for obtaining DPOAEs in humans. The rationale for lowering L2 is to equate the amplitudes of the vibration of the traveling waves representing the two primaries, where they interact on the BM. Because the BM response is highly compressive at the CF, assumed to be f2 for DPOAEs, and linear at the off-CF frequency of f1, lowering the level of f2, where it is 'amplified' at low stimulus levels, helps to equate the two stimuli, where they interact at the f2 place [see Fig 4 in Kummer et al (2000) for a superb explanation of this phenomenon]. As primary-tone levels become higher, this L1-L2 difference is no longer needed to equate the two stimuli, a point often not appreciated in the clinical literature (Whitehead et al 1995).

        In short, DPOAEs are produced when the primary tones interact on the BM to stimulate nonlinear elements in the cochlea. There is now very convincing evidence that the OHCs are the site of this nonlinearity (Brownell 1990). Specifically, it has been proposed that OHC electromotility, first described by Brownell et al (1985), is the source of the 'cochlear amplifier'. That is, it is assumed that the OHC electromotility-based cochlear amplifier is responsible for the compressive BM response at CF, and the associated sharpness of nerve-fiber tuning seen in physiologically healthy preparations, but absent in damaged or dead animals (Robles & Ruggero 2001), along with the nonlinearity responsible for producing DPOAEs. However, other sources have been proposed for the cochlear amplifier, including stereocilia motility (Martin et al 2000). Ultimately, it will probably be discovered that DPOAEs originate from a variety of nonlinear sources, besides OHC electromotility, that participate in the OHC-transduction process, including opening and closing of transduction channels (Patuzzi 1998), nonlinearities in stereocilia-bundle motion (Jaramillo et al 1993), and asymmetries in stereocilia stiffness (Khanna & Hao 1993).

        Related to the question of how DPOAEs are generated is the issue of where DPOAEs originate with respect to a point(s) along the cochlear partition. As discussed above, it is generally assumed that DPOAEs come from the f2 place. However, once created, DPOAEs also propagate as traveling waves along the BM. Consequently, it is possible for a propagated DPOAE to stimulate the DPOAE place, ie, the 2f1-f2 frequency place, where other OAEs can be further produced by the mechanism of linear-coherent reflection (eg, Heitmann et al 1998; Kalluri & Shera 2001). These two sources (ie, the DPOAE generated at the f2 place and the emissions reflected from the 2f1-f2 DPOAE place) then mix to form the final ear-canal signal.
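The mixing of the two sources can be pictured as the addition of two phasors at the same frequency. The sketch below assumes each component is fully described by a level in dB and a phase in cycles; the function name and all numerical values are hypothetical, and nothing here models how the cochlea sets those levels or phases.

```python
import cmath
import math


def mix_components(level1_db, phase1_cycles, level2_db, phase2_cycles):
    """Add two same-frequency components given as (level in dB, phase in cycles).

    Returns the level in dB and the phase in cycles of their sum.
    """
    def phasor(level_db, phase_cycles):
        # Convert dB to linear amplitude and cycles to radians.
        return cmath.rect(10 ** (level_db / 20), 2 * math.pi * phase_cycles)

    total = phasor(level1_db, phase1_cycles) + phasor(level2_db, phase2_cycles)
    magnitude = abs(total)
    level_db = 20 * math.log10(magnitude) if magnitude > 0 else float("-inf")
    return level_db, cmath.phase(total) / (2 * math.pi)
```

With a hypothetical distortion component at 10 dB and a reflection component at 0 dB, `mix_components(10, 0.0, 0, 0.0)` yields a level above 10 dB because the two add in phase, whereas `mix_components(10, 0.0, 0, 0.5)` yields a level below 10 dB because they partly cancel. The same arithmetic applies to any two components that share a frequency, such as the 2f1-f2 emission and a difference tone arising from 2f1 and f2.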

        Evidence also exists for basal DPOAE sources that may also contribute to the final DPOAE signal. These basal sources are revealed as secondary regions of suppression or enhancement above f2 during the collection of the STCs mentioned above. Such regions of suppression/enhancement are observed at frequencies that are more than an octave above f2 (Martin et al 1999; Mills 2000), where it is unreasonable for the f3, due to the steep apical cutoff of the traveling wave, to affect the f2 place. One possible explanation for these phenomena is that a harmonic of f1 (ie, 2f1) interacts with f2 to produce a simple difference-tone DPOAE. This emission will always have the same frequency as the 2f1-f2, so, depending upon the phase of the difference tone, either suppression or enhancement could result (Fahey et al 2000). Another possibility is that f3 acts as a catalyst to produce difference-tone DPOAEs by more complicated routes that can then interact with the 2f1-f2 DPOAE. Evidence for both possibilities seems to be present in the data.

        Another difficult-to-explain finding is the observation that the upper sideband 2f2-f1 DPOAE appears to originate from its characteristic place on the BM (Martin et al 1998b). As discussed above, this finding contrasts with the notion that all DPOAEs must be generated at the f2 place, where the two traveling waves representing f1 and f2 optimally interact. One possibility is that the 2f2-f1 observed in the ear canal comes largely from a difference-tone DPOAE based upon the interaction of a harmonic of f2 (ie, 2f2) and f1, which, of course, will be at the 2f2-f1 frequency.

        A final issue that must be discussed regarding DPOAEs is the notion that there are 'active' versus 'passive' DPOAEs. This conceptualization originated from earlier studies like Norton and Rubel (1990) and Whitehead et al (1992a,b). In these investigations, administration of loop diuretics, such as ethacrynic acid or furosemide, eliminated low-level DPOAEs, while DPOAEs evoked by high-level tones remained relatively unaffected [see Fig 3 in Whitehead et al (1992)]. Results like these led to the notion that DPOAEs evoked by high-level tones were not relevant to cochlear function, and many clinical studies focused on low-level primaries in the 55- to 65-dB SPL range. However, early studies in humans (Lonsbury-Martin et al 1990) clearly indicate that 75/75 dB SPL equilevel primaries can accurately track the pattern of hearing loss in individuals with impaired hearing. More recently, studies in mice with age-related hearing loss (Jimenez et al 1999) indicate that all levels of primaries accurately follow the progressive degeneration of high-frequency OHCs observed in these animals. Similarly, a brief exposure to damaging levels of noise will affect not only low-level DPOAEs but high-level DPOAEs as well (Howard et al 2001). Thus, more recent thinking assumes that there are not two sources of DPOAEs, that is, a low-level 'active' one along with a high-level 'passive' source. Rather, low-level DPOAEs are based upon a functional cochlear amplifier, whereas high-level DPOAEs arise when stimulation is sufficient to move the BM without amplification, in turn stimulating the remaining nonlinear elements to evoke DPOAEs.

Readers who wish to learn more about the cochlear mechanisms responsible for DPOAE generation may consult the following editorial:

  • DPOAE Generation Mechanisms by Glen K Martin Ph.D. (USA, 2002)
