Professional Documents
Culture Documents
Eng 6 Audio Signals: Bevan Baas, Andre Knoesen
Eng 6 Audio Signals: Bevan Baas, Andre Knoesen
Audio Signals
0.6
clear;
fs = 44100; 0.4
len_time = 3;
volume = 0.5; 0.2
t = 1 : (len_time * fs);
len_t = length(t); 0
freq = 440;
w = sin(t/fs*2*pi*freq); -0.2
figure(1); -0.4
plot(w);
-0.6
axis([0 1000 -1.1 1.1]); xlabel('time');
-0.8
sound(w*volume, fs);
-1
3
0 100 200 300 400 500 600 700 800 900 1000
time
Sound
• To emphasize the point that the signal is sampled only at discrete
points in time, use the stem plot
• stem() has the same basic syntax as plot()
0.6
clear;
fs = 44100; 0.4
len_time = 3;
volume = 0.5; 0.2
t = 1 : (len_time * fs);
len_t = length(t); 0
freq = 440;
w = sin(t/fs*2*pi*freq); -0.2
figure(1); -0.4
stem(w);
-0.6
axis([0 70 -1.1 1.1]); xlabel('time');
-0.8
sound(w*volume, fs);
-1
4
0 10 20 30 40 50 60 70
time
Sound Analysis
• Many signal properties are best analyzed, modified, or
synthesized in the “frequency domain.” Ex: piano
• x_freq = fft(x_time) Fast Fourier Transform.
Transforms a time-domain signal into the frequency
domain
• The right edge of the plot is the sampling frequency
• In general, the output of the FFT is complex so we often
look at the magnitude of the complex numbers, abs()
x 10
4
clear;
fs = 44100; 5
len_time = 3;
volume = 0.5;
t = 1 : (len_time * fs); 4
len_t = length(t);
freq = 440;
w = sin(t/fs*2*pi*freq); 3
f1 = fft(w);
2
f2 = f1(1:(len_t/2)); % look at left half only
figure(1); plot(abs(f1));
figure(2); plot(abs(f2));
1
%axis([0 1000 -1.1 1.1]);
xlabel('frequency');
0
5
0 2 4 6 8 10 12 14
fsx 10
4
Sound Analysis
• The maximum value of the FFT is the sampling
frequency, fs, so we normally plot only the left
half which shows the signals that can be
analyzed by this sampling frequency (remember
the Nyquist Frequency?)
4
x 10
7
clear;
fs = 44100; 5
len_time = 3;
volume = 0.5;
t = 1 : (len_time * fs); 4
len_t = length(t);
freq = 440;
w = sin(t/fs*2*pi*freq); 3
f1 = fft(w);
2
f2 = f1(1:(len_t/2)); % look at left half only
figure(1); plot(abs(f1));
figure(2); plot(abs(f2));
1
%axis([0 1000 -1.1 1.1]);
xlabel('frequency');
fs/2
0
6
0 1 2 3 4 5 6 7
frequency 4
x 10
Sound Analysis
• x_time = ifft(x_freq)
Inverse Fast Fourier Transform
transforms a freq-domain signal into the
time domain
• In general, the output of the IFFT is
complex so we need to be prepared to
deal with a vector of complex numbers in
the time domain
7
Sound With Harmonics
• Pure tones sound sterile. Richer tones are made by adding in
harmonic frequencies whose frequencies are related by some factor
to the root or fundamental frequency
• When two sin waves are added, the sound changes if the phase
(offset in time) of one of the components is changed
• Ex: double frequency = octave higher
clear;
3
fs = 44100;
len_time = 3;
volume = 0.5; 2
figure(1);
plot(w); axis([0 len_t -3 3]); xlabel('time'); -2
figure(2);
plot(w(1:(freq*5)));
-3
8
sound(w*volume, fs); 0 500 1000 1500 2000 2500
time
Sound With Harmonics
• Taken over a very long sequence, the envelope
of such signals can have interesting patterns
• Ex: 3 seconds @ 44,100 Hz = 132,300 samples
-1
-2
-3
9
0 2 4 6 8 10 12
time 4
x 10
Sound Processing: Example #1
t = linspace(0,2,44100*2+1); % note: this is 44,1000 samples per second,
y = sin(440*2*pi*t)
p=audioplayer(y,44100); % note that we specify 44,100 samples per second
play(p)
10
Music in Matlab
• We can import sound into MATLAB using the wavread( ) function, passing it
just the file name of a wave file: wavread(‘filename')
• Better yet, explicitly ask for the sample frequency too:
[y, fs] = wavread(‘filename’);
• It reads in a vector of all of the y values, the amplitudes. (You don't need the
time values to sample at since they're evenly spaced, you just need to know
how many y points,or SAMPLES, per second were taken).
• wavread() requires a `.wav' file. Wav files are uncompressed sound files,
whereas mp3s are compressed; you can convert between the two using
free software.
0.5
clear; 0.4
-0.5
0 5 10 11 15
Sample number 5
x 10
Lady Gaga, © 2009 Interscope Records
Playing Music
• Play a waveform on your computer with the sound()
command:
sound(waveform, fs)
• clear playsnd
Stops playing the waveform
• junk = input(' <Press return>');
Handy way to pause execution
0.5
0.4
clear; 0.3
len_a = length(a);
fprintf('Sample rate fs = %f samples/sec\n', fs); 0
fprintf('Number of samples = %i\n', length(a));
fprintf('Seconds of music = %f\n', len_a/fs); -0.1
-0.5
0 5 10 12 15
Sample number 5
Lady Gaga, © 2009 Interscope Records x 10
Tricks with Sound Processing
“….. you wanna know what you get when you play a country song backwards? You get
your house back. You get your dog back. You get your best friend Jack back. You get
your truck back. You get your hair back. Ya get your first and second wives back. Your
front porch swing. Your pretty little thing……..” Rascal Flatts
Using vector and matrix functions, we can do quite a bit of sound processing.
First, let's think about how to play a sound file backwards. Quite simply, we just
need to reverse the vector that stores the samples. Noting that wavread gives us
a column vector, we use the flipud function:
% Play backwards
[y f] = wavread('gaga.wav');
p=audioplayer(flipud(y),f)
play(p)
13
Tricks with Sound Processing:
Speed-up
We can also speed up a sound file:
•One way to play a file at double speed is to trick MATLAB into using twice as
many samples per second as it should, effectively crunching all of the data into
half of the time (and thus double the frequency):
% Double Speed, Option 1
[y f] = wavread('gaga.wav');
p=audioplayer(y,f*2);
play(p)
•Similarly, we could use only half of the samples:
% Double Speed, Option 2
[y f] = wavread('gaga.wav');
p=audioplayer(y(1:2:end),f);
play(p)
14
Sound Processing: Fade-out
We can also create fade‐outs very easily:
•To create a `linear' fade, we could multiply the first sample by 1, the
second sample by a little less than one, the middle sample by 0.5, the
second to last sample by just over 0, and the final sample by 0.
•In other words, we'll multiply all of the samples by a vector that looks
something like [1 .999 .998 ... .002 .001 0].
•Since we know that these numbers should be evenly spaced, begin at 1,
and end at 0, we use linspace:
% Fade out
[y f] = wavread('gaga.wav');
y2 = y(1:(5*f)); % Take the first 5 seconds
p=audioplayer(y2.*linspace(1,0,length(y2))',f);
play(p)
15
Echo
• Echo is a repeated waveform added to the
original at a later point in time
• a_echo = [gap_zeros ; a] + [a ; gap_zeros];
echo1 = fs;
gap = zeros(echo1,1);
a2 = [a ; gap] + [gap ; a]; % basic 1 echo
a2 = [a;gap;gap] + [gap;a;gap] + [gap;gap;a]; % basic 2 echo
a2 = [a;gap;gap] + [gap;a;gap]/3 + [gap;gap;a]/9; % faded 2 echo
16
Auto Scaling
• Best audio performance if signal is scaled to
largest magnitude
• Often called automatic gain control
• MaxMagnitude = max(max(a), min(a));
MaxMagnitude = max(max(a), max(-a));
• ScalingFactor = MaxAllowable / MaxMagnitude;
• AudioScaled = Audio * ScalingFactor;
17
Analysis Over Time and Frequency
• In most signals, the frequency content changes over
time
• Plots of frequency vs. time are called spectrograms
• These are extremely useful in many fields such as voice
recognition
• Three things are displayed: time, frequency, intensity
1
0.9
0.8
0.7
0.6
specgram(a);
Frequency
0.5
spectrogram();
0.4
0.3
0.2
0.1
0 18
1 2 3 4 5 6 7
Time 5
x 10
Analysis Over Time and Frequency
19
http://en.wikipedia.org/wiki/Spectrogram
Analysis Over Time and Frequency
• Another common display method is to show
signal intensity on the vertical axis and use a
movie to show changes over time
400
350
400
300
350
250
400 300
200
350 250
150
300 200
100
250 150
50
200 100
0
0 100 200 300 400 500 600 700 800 900 1000
150 50
100 0
0 100 200 300 400 500 600 700 800 900 1000
50
0
20
0 100 200 300 400 500 600 700 800 900 1000
Digital Filtering
• Filtering is often cited as the most common
signal processing function
• Digital filtering can do all the frequency filtering
you are already familiar with:
– Treble
– Bass
– Bass boost
• It can also do many more:
– Notch filter. Remove a range of frequencies such as
noise or a singer’s voice.
– Bandpass filter. Extract a range of frequencies.
21
Digital Filtering
• We will focus on audio signal filtering but always
keep in mind the exact same techniques work on
any signal
– Earthquake waves
– Vibrations in a bridge
– Cardiac EKG
22
Digital Filtering
• Convolution is a very fundamental operation in signal
processing; maybe right after adds, subtracts, and
multiplication
• The most common type of digital filters (FIRs) are
implemented with convolution
• The calculation can be envisioned as a dot product while
the two sequences slide past each other
– When the two sequences do not fully overlap, “pad” either
sequence with zeros as needed—padding zeros does not
change the result
• Example:
x = [1 2 1]; h = [2 3];
out = conv(x,h)
= [2 7 8 3]
23
Digital Filtering
• FIR digital filters are implemented with the
convolution of two sequences:
– The data (think of the music)
– The filter coefficients
• The length of the filter’s output will be the length
of the data + length of the filter coefficients - 1
• The characteristics of the filter are entirely
determined by the coefficient values
24
Digital Filtering
• These coefficients produce a “low-pass”
filter which allows only low frequency
signals to pass through
– 51 coefficients in this case
0.4
0.3
0.2
0.1
-0.1
25
5 10 15 20 25 30 35 40 45 50
Digital Filtering
• This is the “frequency response” of this low-pass filter
– The right edge of the plot is fs/2, which is 44,100/2 = 22,050 Hz if
we are operating common music such as from CDs or MP3s
– The vertical scale is normally drawn on a log scale where every
20 db is a factor of 10 in signal size
• Attenuation or Gain (dB) = 20 * log10(Amplitude1/Amplitude2)
– This low-pass filter passes frequencies below 22,050 * 0.4 =
8820 Hz with no attenuation. On the other hand, signals above
22,050 * 0.5 = 11,025 Hz are passed with at least approximately
50 dB of attenuation—which is a factor of 10^(50/20) = 316.
0
Magnitude (dB)
-20
-40
-60
28
freqz()
• Plotting the response of a filter is very
easy
• freqz(coeffs);
axis([0 1 -70 10]);
• freqz() returns two subfigures in one
figure. Ignore the lower plot for ENG 6
(it is called the “phase response”)
29
Example
• Remove all signals that are below 950 Hz
– High-pass filter
– fs = 44,100 Hz
– fs/2 = 22,050 Hz
– firpm parameter: f=0 0 Hz
f=? 950 Hz
f=? 1000 Hz (reasonable?)
f=1 22,050 Hz
– Corresponding magnitudes
0
0
1
1
– N = Number of coefficients: try a few!
– coeffs = firpm(50,[0 - - 1], [0 0 1 1]);
stem(coeffs);
axis([- - - -]);
30