
Proceedings of Meetings on Acoustics

A simple model of the Erhu soundbox

--Manuscript Draft--

Manuscript Number: POMA-D-18-00289

Full Title: A simple model of the Erhu soundbox

Article Type: ASA Meeting Paper

Corresponding Author: Christopher Waltham

University of British Columbia
Vancouver, BC CANADA

Order of Authors: Christopher Waltham

Abstract: The erhu is a common, bowed, two-string instrument from China. Unlike the Western
violin, it has received scant attention from the musical acoustics community, in spite of
having comparable versatility and power. This paper outlines the structure of the erhu
and a vibro-acoustic model suggested nearly 30 years ago in the Chinese literature.
The model is developed and supported by admittance and acoustic pressure data. The
evidence is that the radiation spectrum of the erhu is dominated by pairs of coupled
membrane-cavity resonances that resemble the formants of the human voice. The
power of the instrument arises from the large admittance of the membrane, and also
the fact that half the membrane-cavity modes are radiation-efficient breathing modes.

Section/Category: Musical Acoustics

Additional Information:

JASA Volume and Issue of the published Meeting Abstract: Vol. 144, Issue 3 (2018)

Volume 35

176th Meeting of Acoustical Society of America

2018 Acoustics Week in Canada
Victoria, Canada
5-9 Nov 2018

Musical Acoustics: Paper 2aMU8

A simple model of the Erhu soundbox

Christopher Waltham
Physics & Astronomy, University of British Columbia, Vancouver, BC, V6T 1Z1, CANADA;


Published by the Acoustical Society of America

© 2018 Acoustical Society of America.

Proceedings of Meetings on Acoustics, Vol. 35, 035002 (2018) Page 1
C. Waltham A simple model of the Erhu soundbox

The erhu is a two-string bowed instrument whose soundbox is a rigid wooden cylinder open at one end and
capped at the other with python skin that is excited by string vibrations transmitted through a small central bridge.
In the Chinese system of organology, the erhu is classified as a silk instrument 1, because that is the traditional
string material. Since the 1950s, strings have been made of steel. In Hornbostel and Sachs’ system, the erhu is
labeled as a “spike fiddle”. 2 The er (二) character in the name means ‘two’, referring to the two strings; the hu (胡)
character refers to the Hu people, suggesting an origin among the northern nomadic tribes. The erhu has
been an important instrument in Chinese music since at least the late eleventh century 1, and in modern Chinese
orchestras it occupies a similar role to that taken by the violin in Western orchestras.
While the violin has been under intense scrutiny by physicists and acousticians for at least two centuries,
the erhu has received scant attention. In spite of searches by Chinese-speaking students, this author is aware
of only three papers, published in 1991 and 1992, two of which are in Chinese. 3,4,5 The authors of these papers set
out to explain the action of the erhu soundbox in terms of an analytical model, but the results were limited by
the technology available at the time and few comparisons with data are shown. However, what is clear is the
tantalizing possibility of a simple model giving a fairly complete description of the sound radiation from the
instrument over a wide frequency range (to a few kHz). A comparably simple model of the guitar only covers
the first two or three significant modes below 250 Hz. 6 No simple model exists for the Western violin, owing
to the complex structure of the soundbox, although impressive advances have recently been made using
sophisticated fluid-structure coupling finite-element software. 7
This paper will describe the structure of the erhu and outline Chen et al.’s model of the soundbox.
Admittance and acoustic pressure data are presented that support the model, showing that the main features of
the instrument’s spectrum are coupled membrane-cavity modes. Finally the sound of the erhu will be compared
to that of a Western violin, and the similarities and differences discussed in light of the soundbox model results.

The load-bearing part of the erhu is a wooden neck, which holds the tuning pegs close to one end and the
wooden soundbox, pierced by the neck, at the other (Fig.1). The wood can be padouk, ebony, or sandalwood.
The front of the soundbox is covered with a pre-tensioned python skin; the rear of the soundbox is open and
often decorated with lattice-work.1 The strings transmit vibrations to the python skin via a bridge. The vibrating
length is defined by the distance between the bridge and a multiple loop of thread that pulls the strings toward
the neck and presses them against the bridge, holding it onto the python skin. The bottom ends of the strings are
fixed on the underside of the soundbox base. The erhu is traditionally played in a seated position with the
soundbox placed on the performer’s left thigh. The fingers of the left hand stop the strings, and the right hand
holds the bow with an underhand grip. 8
The standard erhu is 80 cm in length in total. The face of the soundbox can be hexagonal, octagonal or
circular and is about 90 mm in diameter, with a length of 130 mm. The base of the bridge is roughly circular,
with a diameter of around 14 mm. The strings are 5 mm apart at the bridge and the hairs of the bow are threaded
between them. The strings are generally tuned to D4 and A4.

Figure 1. An erhu and bow (separated).




The work of Chen et al.3,4,5 presents a circuit analog of the soundbox of the erhu and related jinghu. The
authors assume Bessel function mode shapes for the membrane and make extensive calculations of the radiation
impedance of these modes. However, the data presented are mostly sound spectra with some bridge velocities,
not of the quality that is now readily obtainable (2018) and apparently not normalized to the driving force. The
data are not compared directly with model results.
In 2018 Pfeifle 9 gave a talk at an Acoustical Society of America meeting on the organological and acoustical
similarities of the erhu and violin.


Consider the erhu soundbox as a rigid circular tube (Fig.2), length $L$ and radius $a$, with one end open and
the other closed by a light uniform membrane of tension $T$ (N/m) and area density $\sigma$. In the center of the
membrane is a small bridge with mass $m_b$, held in place by two taut steel strings. The two strings lie
approximately in the plane of the membrane, with a small change in angle as they pass over the bridge. The
distortion of the membrane, due to the downward force of the strings, is small (of the order of 1 mm). The
instrument used for measurements had the following dimensions: the length of the box, $L$ = 131 mm; its
cross-sectional shape was octagonal with maximum width 88 mm.

Figure 2. Model of erhu soundbox. The effective radiating area of the membrane, $S_m$, is generally much smaller
than its physical area, $S$.

In this present work, the model is of the soundbox alone; the strings are not considered.


Analytical forms for the modal shapes of a mass-loaded octagonal membrane are not available. As a starting
point the mode shapes $\varphi(r)$ are taken to be the same as those of an unloaded uniform membrane (i.e. in vacuo).
Initially, only the radially symmetric modes are considered (as in refs. 3,4,5); to first order these will be the only
ones excited via a central bridge (Eqn.1). Later, the model is adjusted to account for real measured mode shapes.
The modal shape for the nth mode is given by the following equation.


$$\varphi_n = C_n J_0(k_n r), \qquad k_n a = 2.405,\ 5.520,\ 8.654,\ 11.792, \ldots \tag{1}$$

The normalization constants $C_n$ for the Bessel functions $J_0$ are chosen such that $\varphi_n(0) = 1$. The
admittance $Y(\mathbf{r}_1, \mathbf{r}_2)$ of the membrane of mass $M$, driven at point $\mathbf{r}_1$ and measured at point $\mathbf{r}_2$, has units of
m/s per N (i.e. s/kg) and is given by:

$$Y(\mathbf{r}_1, \mathbf{r}_2) = \sum_{n=1}^{N} \frac{j\omega\,\varphi_n(\mathbf{r}_1)\,\varphi_n(\mathbf{r}_2)}{M\left(\omega_n^2 - \omega^2 + j\omega_n\omega/Q_n\right)} \tag{2}$$

Therefore the driving point admittance at the center (denoted by the subscript “0”), is given by:

$$Y_{b0}(\omega) = \sum_{n=1}^{N} \frac{j\omega}{M\left(\omega_n^2 - \omega^2 + j\omega_n\omega/Q_n\right)} = \sum_{n=1}^{N} Y_{b0,n}(\omega) \tag{3}$$

Here, $Y_{b0,n}(\omega)$ is a convenient way of writing the contribution of a single mode $n$.
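As a rough numerical check on the modal sum for the driving-point admittance, the following minimal sketch evaluates it in Python. It assumes the damping term has the usual form $j\omega_n\omega/Q_n$, takes the in-vacuo mode frequencies and quality factors from Table 1, and uses a membrane mass of 1 gram; the function names and the choice of mass are illustrative assumptions, not part of the original model code.

```python
import math

def modal_admittance(f, f_n, q_n, mass):
    """Contribution of one radial mode to the centre driving-point admittance (s/kg).

    Assumes a damping term of the form j*w_n*w/Q_n in the denominator.
    """
    w = 2.0 * math.pi * f
    w_n = 2.0 * math.pi * f_n
    return 1j * w / (mass * (w_n**2 - w**2 + 1j * w_n * w / q_n))

def driving_point_admittance(f, modes, mass):
    """Sum of the modal contributions at frequency f (Hz)."""
    return sum(modal_admittance(f, f_n, q_n, mass) for f_n, q_n in modes)

# (frequency in Hz, Q) pairs for the in-vacuo membrane modes, taken from Table 1.
MODES = [(580.0, 30.0), (1400.0, 10.0), (2600.0, 10.0), (4800.0, 5.0)]
MASS = 1.0e-3  # kg; assumed membrane mass of about 1 gram

# On resonance a single mode reduces to the real value Q_n / (M * w_n).
peak = modal_admittance(580.0, 580.0, 30.0, MASS)
total = driving_point_admittance(580.0, MODES, MASS)
print(f"single-mode peak: {abs(peak):.2f} s/kg, four-mode sum: {abs(total):.2f} s/kg")
```

With these inputs a single mode peaks at $Q_n/(M\omega_n)$, which is of the order of a few s/kg. Air loading, the bridge mass and radiation are ignored here, so the numbers are only indicative.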


To couple the membrane to the air in the soundbox, the specific volume flow $U$, in m³/(s∙Pa), must be
calculated. It can be shown that the flow is essentially uniform across radial sections of the soundbox so long as
the distance from the membrane is more than ~10% of the length of the box.

$$U = \sum_{n=1}^{N} Y_{b0,n} \left( \int \varphi_n \, dS \right)^2 ; \qquad \int \varphi_n \, dS = S_{m,n} \tag{4}$$

The numerical coefficients allow the effective membrane areas $S_{m,n}$ to be calculated for each radial mode
$n$; eventually these values will be replaced with those obtained from mode shape measurements. When
considering the volume flow in relation to force on the bridge, there is an additional sign term needed which
accounts for the fact that pushing the bridge one way for even-numbered modes causes a flow in the opposite
direction.
As the erhu membrane is excited by the strings via a central bridge, it can be assumed that the modes that
are primarily excited will be radially-symmetric ones. However, the bridge is several mm wide with neither
string passing over the exact center, and thus modes with nodal diameters will not be absent. The excitation of
such modes depends on a rocking motion of the bridge which makes analysis more complicated and puts the
matter outside the scope of this study.
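For the idealized Bessel shapes of Eqn. 1, the effective area of each radial mode can be written in closed form: integrating $J_0(k_n r)$ over a disc of radius $a$ gives $S_{m,n}/S = 2J_1(k_n a)/(k_n a)$, with $C_n = 1$ because $J_0(0) = 1$. The sketch below evaluates this ratio for the first four modes. The power-series routine `bessel_j1` is a hypothetical helper written only to keep the example self-contained, and a circular membrane is assumed in place of the octagonal one.

```python
import math

def bessel_j1(x, terms=40):
    """Bessel function J1(x) from its power series (adequate for x up to about 12)."""
    half = x / 2.0
    term = half  # m = 0 term of the series
    total = term
    for m in range(1, terms):
        # Ratio of successive series terms is -(x/2)^2 / (m (m + 1)).
        term *= -half * half / (m * (m + 1))
        total += term
    return total

def area_factor(kn_a):
    """Effective area over physical area, S_mn / S, for a J0 mode shape with phi(0) = 1."""
    return 2.0 * bessel_j1(kn_a) / kn_a

KN_A = (2.405, 5.520, 8.654, 11.792)  # zeros of J0 quoted in Eqn. 1
factors = [area_factor(x) for x in KN_A]
for n, s in enumerate(factors, start=1):
    print(f"mode (0,{n}): S_m/S = {s:+.3f}")
```

The ratios alternate in sign, which is the sign term for even-numbered modes described above, and even the lowest mode has an effective area well under half the physical area.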
The principal modes in the cavity will be largely longitudinal in nature. The first transverse mode for a pipe
with a diameter of 88 mm is expected to be at 3.9 kHz, the upper end of the current region of interest. Thus, if the
acoustic pressure $p(x)$ and volume flow $U(x)$ inside the erhu cavity are assumed to be associated with a plane
wave when more than 2 cm from the membrane, they can be written in terms of the longitudinal position $x$ and
the reflection coefficient of the hole, $r$:

$$p(x) = p(0)\left(e^{-jkx} + r e^{jkx}\right); \qquad U(x) = p(0)\left(e^{-jkx} - r e^{jkx}\right)/(\rho c) \tag{5}$$



The open end of the erhu soundbox is neither flanged nor the end of an infinite pipe (although 𝑘𝑘𝑘𝑘 > 1
above 500 Hz), so without complicated boundary-element calculations, approximations must be made. For the
end of an infinite pipe of radius $a$ and area $S$, Levine and Schwinger’s result 10 can be approximated as:

$$Z_{rad} \approx Z_0\left(\frac{(ka)^2}{4} + 0.61\,jka\right); \qquad Z_0 = \rho c/S \tag{6}$$

For the erhu, $S$ can be replaced by $S_m$ for the membrane and $S_h$ for the hole, with $a$ scaled accordingly.
Another approach is to start with a velocity map of the membrane and propagate the wavefront 11,12,
calculating the acoustic pressures at the surface. For a uniform velocity field, the result is essentially the same
as the analytically-derived flanged pipe radiation impedance. In any case, the effect of radiation impedance is
sufficiently small that the two approaches give very similar results.


Initially, consider the membrane to be a rigid piston of area $S_m$ and mean radius $a_m$, with a
frequency-dependent force $F(\omega)$ applied normally to the bridge. The flow at the membrane $U(0)$ can now be expressed in
terms of the force and the total impedance of the membrane $Z_m$ (which is measured in Pa per m³/s).

$$U(0) = \left(\frac{F}{S_m} - p(0)\right)\frac{1}{Z_m} \tag{7}$$

$$Z_m = \frac{1}{S_m^2\, Y_{b0}} + Z_{rad}(k a_m) + Z_0\left(\frac{1 - r}{1 + r}\right) \tag{8}$$

The reflection coefficient $r$ can be found from the radiation impedance of the hole:

$$r = e^{-2jkL}\,\frac{1 - \rho c/\left(S_h Z_{rad}(k a_h)\right)}{1 + \rho c/\left(S_h Z_{rad}(k a_h)\right)} \tag{9}$$
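The radiation impedance of Eqn. 6 and the reflection coefficient of Eqn. 9 are straightforward to evaluate numerically. The sketch below takes the resistive part of the unflanged-pipe impedance to be $(ka)^2/4$, the usual low-frequency limit, and treats the hole as a circle of the same radius as the box (44 mm); both are assumptions of this example, as are the air constants and the function names.

```python
import cmath
import math

RHO = 1.2   # kg/m^3, assumed air density
C = 343.0   # m/s, assumed speed of sound

def z_rad_unflanged(f, radius):
    """Low-frequency radiation impedance of an unflanged pipe end (Pa per m^3/s)."""
    ka = 2.0 * math.pi * f / C * radius
    z0 = RHO * C / (math.pi * radius**2)
    return z0 * (ka**2 / 4.0 + 0.61j * ka)

def reflection_coefficient(f, length, hole_radius):
    """Reflection coefficient of the open end, referred back to the membrane (f > 0)."""
    k = 2.0 * math.pi * f / C
    s_h = math.pi * hole_radius**2
    ratio = RHO * C / (s_h * z_rad_unflanged(f, hole_radius))
    return cmath.exp(-2j * k * length) * (1.0 - ratio) / (1.0 + ratio)

LENGTH = 0.131       # m, box length from the text
HOLE_RADIUS = 0.044  # m, assumed: half the 88 mm maximum width

r_low = reflection_coefficient(100.0, LENGTH, HOLE_RADIUS)
r_high = reflection_coefficient(2000.0, LENGTH, HOLE_RADIUS)
print(f"|r| at 100 Hz: {abs(r_low):.3f}, at 2 kHz: {abs(r_high):.3f}")
```

At low frequency the magnitude of $r$ is close to one, as expected for an open end that radiates weakly, and it falls as $ka$ grows; the low-frequency form of Eqn. 6 becomes unreliable once $ka$ exceeds about one.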

Now the pressure inside the cavity (Eqn. 5) can be found in terms of $p(0)$, which follows from the equations above:

$$p(0) = \frac{F}{S_m}\left(\frac{\rho c}{\rho c + S_m Z_m + r(\rho c - S_m Z_m)}\right) \tag{10}$$


Two erhus, labeled AT and CW, were used for the measurements in this work. Erhu AT was used for most
of the model comparisons because, unlike erhu CW, its back had a sufficiently open lattice to allow access to a
small microphone.
A new erhu bridge was fashioned from light paulownia wood with a small hole in it just big enough to hold
a 0.2 gram Endevco 25B accelerometer. The combination had the same mass as a regular erhu bridge made of
hardwoods like ebony or walnut. The new bridge was attached to the center of the membrane with a thin layer
of mounting wax. For the purposes of the measurements reported here, the strings were not attached.
The bridge was struck with a small Isotron PCB 086E80 instrumented impact hammer. In principle it was
now simple to obtain the driving point admittance at the bridge and deflection shape maps of the membrane, but
in practice, the small mass of the membrane and bridge made getting a “clean hit” quite tricky. The impact force
spectrum fell rapidly above 3 kHz, with subsequent loss of signal quality.
To measure the spatial pressure distributions at the hole, a planar 30-microphone array (6 mm Panasonic
WM-61D electrets) with a 30 mm pitch was employed. 12 In addition, measurements at the hole center were
made by a less invasive single WM-61D. Radiativity measurements were made in an anechoic chamber, with
the instrument inside a 92 cm radius circular array of 30 microphones. Within this array, the instrument could
be rotated, allowing radiation measurements over the full 4π steradians. 12



The driving point admittance at the center of the bridge of a typical erhu is shown in Fig.3. The most striking
feature of the admittance, compared to a comparable measurement on a violin, is the magnitude. Peak values are
generally a few s/kg, compared to values around 0.1 s/kg for a violin 13. Even allowing for the small radiating
area, the erhu is not a quiet instrument.

Figure 3. Driving point admittance in the center of the bridge of an erhu.

Also notable is an almost formant-like structure in the frequency response, with split or broad peaks around
500 Hz, 1300 Hz, 2700 Hz, and also at higher frequencies that remain visible even as the data quality tails off.


The word “formant” seems appropriate as the soundbox is of comparable length to that of the vocal tract.
Anticipating the analysis below, it is suggestive that both the erhu and human vocal tract (~17 cm long 14) have
features akin to cavity resonances in an open-closed tube.
The largest feature is a split mode, centered between 500 and 600 Hz. The model accounts for this feature
in terms of the lowest radial mode of the membrane coupled to the lowest cavity mode of the soundbox. In
anticipation of confirmation by the model, doublet members are labeled (0,1)- and (0,1)+. This pattern is followed
at higher frequencies; the next doublet being (0,2)-, (0,2)+ etc. The numbers in brackets indicate the membrane
mode (# nodal diameters, # nodal radii) and the +/- indicates the relative value of the frequency due to the
different phase relationships with the cavity air. Hence, the power of the erhu, small as it is, may be understood
not only in terms of the large membrane admittance, but also the fact that half the coupled membrane-cavity
modes will be radiation-efficient breathing modes.
The (0,1)+ mode appears to be further split into three, although on other occasions this same instrument has
been observed to have a simple (0,1)-/(0,1)+ doublet, the membrane tension being sensitive to changes in
humidity and temperature. An open-closed tube with the same dimensions would have the lowest cavity mode
at 550 Hz. While it is not easy to determine the tension and mass of the python skin (which cannot be expected
to be particularly uniform in surface density), one can vary the tension by attaching and tuning the strings of the
erhu. By measuring the frequencies of the split mode (0,1)-/(0,1)+ for two erhus both strung and unstrung, and
comparing values with the model one can observe classical reactively-coupled resonance behavior (Fig.4). If the
lowest mode of the membrane without air loading ($f_{vac}$) has the same frequency as the lowest mode of the cavity, the two
soundbox modes are (a) closest together and (b) of equal strength. Unfortunately, $f_{vac}$ is not a useful measure
for a luthier, who likely has only experience as a guide.
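Two of the statements in this paragraph lend themselves to a quick numerical illustration. The first part of the sketch below estimates the lowest open-closed tube mode from the box length with an unflanged end correction of $0.61a$. The second part uses a generic two-oscillator model of reactive coupling to show the avoided crossing as the in-vacuo membrane frequency is swept through the cavity frequency. The two-oscillator formula is a textbook stand-in, not the soundbox model of this paper, and the coupling value, the radius of 44 mm and the speed of sound are all assumed for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed
LENGTH = 0.131          # m, box length from the text
RADIUS = 0.044          # m, assumed: half the 88 mm maximum width

def open_closed_fundamental(length, radius, c=SPEED_OF_SOUND):
    """Quarter-wave resonance of an open-closed tube with a 0.61a end correction."""
    return c / (4.0 * (length + 0.61 * radius))

def coupled_pair(f_vac, f_cav, f_coupling):
    """Lower and upper frequencies of two reactively coupled oscillators (generic model)."""
    mean_sq = 0.5 * (f_vac**2 + f_cav**2)
    half_diff_sq = 0.5 * (f_vac**2 - f_cav**2)
    root = math.sqrt(half_diff_sq**2 + f_coupling**4)
    return math.sqrt(mean_sq - root), math.sqrt(mean_sq + root)

f_cav = open_closed_fundamental(LENGTH, RADIUS)
F_COUPLING = 250.0  # Hz, hypothetical coupling strength, not fitted to any data

def squared_gap(f_vac):
    """Separation of the doublet measured in squared frequency."""
    lower, upper = coupled_pair(f_vac, f_cav, F_COUPLING)
    return upper**2 - lower**2

# Sweep the in-vacuo membrane frequency in 1 Hz steps and find the closest approach.
best_f_vac = min(range(300, 901), key=squared_gap)
print(f"cavity mode: {f_cav:.0f} Hz, closest approach at f_vac = {best_f_vac} Hz")
```

With these assumed values the cavity estimate comes out a little under 550 Hz, and the doublet separation (in squared frequency) is smallest when the membrane is tuned to the cavity.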

Figure 4. Lowest two modes for two erhus (labeled "AT" and "CW") strung and unstrung. The left plot shows
the frequencies; the right plot shows the peak admittances.


To calculate the radiativity of each membrane mode, and the coupling to the air in the cavity, the deflection
shapes were measured (Fig.5) and compared with the initial assumption that they would be similar to Bessel
functions. Thus the Bessel-calculated values of the effective area $S_m$ (which is the physical area scaled by the
mean admittance divided by the central driving point admittance) can be modified to reflect reality.


Figure 5. Deflection shapes for the lowest two admittance peaks.

Clearly the two shapes in Fig.5 are not the same. At the lower frequency the cavity air is moving with the
membrane. At the higher frequency the cavity air is opposing the membrane, and, having greater influence
away from the center, is suppressing the deflection there and leaving only the center mobile (and then very
Similar mode-splitting occurs between higher radial membrane modes and higher longitudinal cavity
modes, e.g. around 1300-1700 Hz and 2600-2800 Hz. Interestingly, the observed frequency ratios of the
sequence of mode-pairs lie between those expected from a closed-open cavity sequence of 1 : 3 : 5 etc. and
those of a Bessel function sequence 1 : 2.3 : 3.6 etc. Perhaps this is because the mass of air in the cavity is
comparable with the mass of membrane and bridge (both about a gram), and so neither dominates the coupling.
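The figures quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below computes the two ideal frequency sequences and the masses of the cavity air and of the membrane. It assumes a circular cross-section of radius 44 mm, an air density of 1.2 kg/m³, and the membrane surface density of 0.18 kg/m² adopted later in the paper; the variable names are illustrative.

```python
import math

# Ideal frequency ratios for the two limiting cases.
BESSEL_ZEROS = (2.405, 5.520, 8.654, 11.792)          # zeros of J0 (Eqn. 1)
bessel_ratios = [z / BESSEL_ZEROS[0] for z in BESSEL_ZEROS]
tube_ratios = [2 * n - 1 for n in range(1, 5)]        # closed-open tube: 1 : 3 : 5 : 7

# Masses of the cavity air and of the membrane.
RHO_AIR = 1.2    # kg/m^3, assumed
SIGMA = 0.18     # kg/m^2, surface density value used later in the paper
RADIUS = 0.044   # m, assumed: half the 88 mm maximum width
LENGTH = 0.131   # m, box length from the text

area = math.pi * RADIUS**2
air_mass = RHO_AIR * area * LENGTH
membrane_mass = SIGMA * area

print("Bessel ratios:", [round(r, 2) for r in bessel_ratios])
print("tube ratios:  ", tube_ratios)
print(f"air mass: {air_mass * 1e3:.2f} g, membrane mass: {membrane_mass * 1e3:.2f} g")
```

Under these assumptions both masses come out close to one gram, in line with the statement that neither the air nor the membrane dominates the coupling.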
High membrane modes give rise to deflection shapes (Fig.6) that are both closer to classical Bessel shapes
and more similar to each other than the members of the lowest doublet.

Figure 6. Deflection shapes for members of the (0,2), (0,3) and (0,4) doublets.


Calculations of wavefront propagation down the length of the cavity indicate that transverse variations in
pressure caused by membrane deflection shapes damp out well before the end of the cavity. Measurements with
the planar microphone array also show that the acoustic pressure over the surface of the hole is consistent with
a uniform exit velocity, at least for frequencies below 3 kHz (Fig.7). The first transverse mode is seen at 3944 Hz.


Figure 7. Acoustic pressure maps of open end of an erhu at 466 Hz (left), 2429 Hz (center) and 3944 Hz (right).
The distributions below 3 kHz are consistent with a uniform velocity profile at the exit plane. The 3944 Hz map
shows the first transverse mode in the cavity. The colors represent magnitude and phase relative to the force on
the bridge.


Bridge admittance $Y_b$ and hole pressure $p_h$ are compared with a model that only includes radial membrane
modes. The surface density of the python skin was unknown but taken to be 0.18 kg/m², a value taken from Chen
et al. 4, i.e. a total membrane mass of about 1 gram. The mass of the paulownia bridge plus accelerometer was
0.6 grams. The only adjustable parameters in the model were the vacuum mode frequencies of the membrane $f_{1,2,3,4}$ and
their quality factors $Q_{1-4}$ (Table 1). The effective membrane areas $S_m$ were considered to be a function of
frequency and taken from deflection shape data, fitted with a fifth-order polynomial in frequency (Fig.8). This is
probably the only way of dealing with any non-uniformity in the membrane or the pre-tensioning that occurs in
making the instrument.
Table 1. Model input parameters

Membrane mode   Frequency (Hz)   Q
(0,1)           580              30
(0,2)           1400             10
(0,3)           2600             10
(0,4)           4800             5

Figure 8. Magnitudes of area factors Sm/S for shapes measured from the erhu membrane compared with Bessel
function shapes. The dotted line shows the polynomial fit to the data that was used in the model.


Figure 9. Bridge admittance and acoustic pressure at the center of the hole (normalized to force on bridge)
compared with model with radial membrane modes.

Data and model results are shown in Fig.9. Structures seen in the data and not the model are largely due to
non-radial modes of the membrane excited by transverse rocking of the bridge. For example, the feature at 1250
Hz has been identified as the (1,1) mode of the membrane which interferes with the lower member of the (0,2)
doublet. Just how far one can go in fitting the data more accurately by including these non-radial modes remains
to be seen, as does the usefulness of such an exercise.
One potentially interesting effect not yet included in the model is the non-linear response of the membrane;
this can be heard as a distinct chirp by simply tapping the bridge.


To quantify the similarities and differences between the sound of the erhu and the violin, sound clips were
prepared of the well-known “Yellow River Ballad” by Xian Xinghai and played on both instruments (only using
the A and D strings of the violin). The spectrograms and time-averaged spectra are shown in Fig.10. The
envelopes of the time-averaged spectra both peak at about 600 Hz. Below that, the erhu spectrum falls more
rapidly, owing to the fact that its lowest radiating mode (0,1)- lies around 450 Hz, whereas the violin’s A0 mode is
at ~280 Hz. Above 1000 Hz the violin has a broad excess (the bridge-island hill) above the 12 dB/octave line,
and the erhu has structure here too, corresponding to the “formants” noted above (note: this was a different
instrument from that used to produce Fig.9). A casual listener can easily distinguish the two instruments playing
lower notes (< 1000 Hz), but less easily for higher notes, except by relying on stylistic features like the more
pronounced vibrato that is possible on the erhu.


Figure 10. Spectrograms and spectra of the Yellow River ballad played on a violin and an erhu. The dashed lines
on the lower plot indicate a fall-off of 12 dB/octave; the vertical scales are not normalized, and have been offset
for visibility.

To get an idea of the absolute acoustic power of the two instruments, their mean radiativities were
measured with a calibrated impact hammer and a 30-microphone circular array. The results in Fig.11 show that
the erhu dominates the violin between 400 and 1500 Hz, but on either side of this range, the violin is more
powerful. Violin power below 400 Hz is due to the A0 mode, and above 1500 Hz, it is due to the bridge-island
hill. Assuming the bow-string behavior is similar for the two instruments (recall that the erhu strings are made
of simple steel wire, while the violin D and A strings are wrapped metal on a synthetic or steel core), the erhu’s
power can be explained by the large bridge admittance and effective membrane radiating area at the lower frequencies.


Figure 11. Radiativities (i.e. angle-averaged sound pressure level at one meter radius in Pa/N of force on the
bridge) for the violin and erhu.

The erhu is deserving of more attention from the acoustical community than it has so far received. It does
seem to be one string instrument whose main acoustic features over a broad range of frequency (up to ~ 4 kHz)
may be understood in terms of a relatively simple mathematical model. The model predicts that the first
membrane mode should correspond closely to the first cavity mode of the soundbox. Its power derives from the
large admittance of the bridge-membrane combination, and from the fact that half of all the prominent modes
are breathing modes.
This author finds the erhu to be a perfect instrument with which to conclude an undergraduate “Physics of
Music” course, because analyzing it draws together so many elements previously introduced in the course: string
instruments, plucked and bowed, percussion (membranophones), wind instruments, and the human voice.

The author thanks Andrea Wong, University of British Columbia (UBC) undergraduate student and erhu
player, for the loan of her instrument, the sound clips used in this analysis, and especially for the initial spark of
interest provoked by her essay on the erhu, written for UBC’s Physics of Music course.8 Another undergraduate,
Laura Kim, took some of the membrane mode-shape data. Thanks also to Alan Thrasher, UBC ethnomusicologist
emeritus, for the loan of an instrument and illuminating discussions on Chinese musical culture. UBC graduate
student Yang Lan was very helpful in providing translations for Chinese language journal articles.


1 J. Stock, “A historical account of the Chinese two-stringed fiddle erhu”. Galpin Society Journal, 46, 83-113 (1993).
2 E. M. von Hornbostel and C. Sachs, “Classification of Musical Instruments: translated from the original German by A. Baines
and K. P. Wachsmann”, Galpin Society Journal XIV 3-29 (1961).
3 T. Chen, X. Cai and M. Zheng, “Acoustical properties of Erhu”, Acta Acustica (China) 16, 73-76 (1991). In Chinese.
4 T. Chen, M. Zheng and X. Cai, “Acoustics of Chinese bowed string instruments Jinghu and Erhu”, Chinese J. Acoust. 10, 289-295

5 T. Chen, M. Zheng and X. Cai, “Resonators of Jinghu and Erhu”, Acta Acustica (China) 17, 73-76 (1992). In Chinese.
6 O. Christensen and B. Vistisen, “Simple model for low frequency guitar function,” J. Acoust. Soc. Am., 68, 758–766 (1980).
7 C. Gough, “Violin acoustics”, Acoustics Today, 12, 22-30 (2016).
8 A. Wong, “The erhu”,, retrieved 2018-11-23.
9 F. Pfeifle, “Organologic and acoustic similarities of the European violin and the Chinese erhu”, J. Acoust. Soc. Am. 140, 3143

10 H. Levine and J. Schwinger, “On the radiation of sound from an unflanged pipe”, Phys. Rev. 73, 383-406 (1948).
11 E. G. Williams, J. D. Maynard, and E. Skudrzyk, “Sound source reconstruction using a microphone array,” J. Acoust. Soc. Am. 68, 340–344 (1980).

12 C. E. Waltham, “Calibration for impact measurements of qin hole velocities”, Proceedings of ISMA 2017 conference, 60-63,
13 J. Woodhouse and R. S. Langley, “Interpreting the input admittance of violins and guitars”, Acta. Acust. 98, 611-628 (2012).
14 J. Sundberg, “The acoustics of the singing voice”, Sci. Am., March 1977, 82-89.
