Download as pdf or txt
Download as pdf or txt
You are on page 1of 22

PROGRAMMING WITH MATLAB

PROBABILITY AND STATISTICS

ANKARA UNIVERSITY
DEPARTMENT OF ELECTRICAL AND ELECTRONICS ENGINEERING
STATISTICS

 In general, a data set of k values is written as: 𝑥 = {𝑥1, 𝑥2, 𝑥3, … 𝑥𝑘}
 In MATLAB, it is represented by a row vector x.
 Statistics are useful to identify certain properties of this data.
 For example, a set of exam grades: {27, 55, 63, 72, 72, 77, 79, 83, 95, 97, 100}
 The average value (74.5455), the middle value (77), or the most frequently occurring value
(72) to make various assessments over the exam grades.
 In addition, it is useful to know how the data values in a data set are spread.

2
STATISTICS
 MATLAB has many built-in functions for statistics.
 For example, to find the minimum or maximum value in a data set:
>> x = [13 11 -5 3 39 17 7 27 19 23];
>> min(x)
ans = -5
>> max(x)
ans = 39
These functions also return the index of the largest or smallest value:
>> [maxVal, maxInds] = max(x)
maxVal = 39 3

maxInds = 5
STATISTICS

 In the case of matrices, these functions operate column-wise by default:


Example:
>> A = [1 11 5; 7 3 23; 17 19 13]
A = 1 11 5 7 3 23 17 19 13
>> [minVal, minInds] = min(A)
minVal = 1 3 5
minInds = 1 2 1

4
STATISTICS

MATLAB
 Mean → Average mean(x)
 Mode → The most frequently occurring value mode(x)
median(x)
 Median → Middle value

5
STATISTICS
 The calculation of a mean would be programmed by adding elements of a vector together using a loop, and then
dividing this sum by the number of elements:
>> x = [13 11 -5 3 39 17 7 27 19 23];
>> sumVal = 0;
for i=1:length(x)
sumVal = sumVal + x(i);
end
>> meanVal = sumVal/length(x)
meanVal = 15.4000
or we simply use the mean function:
>> mean(x) 6

ans = 15.4000
STATISTICS
 Mode: the most frequent value in a data set
 In MATLAB: m=mode(x) for vector x computes m as the sample mode, or most frequently occurring
value in x.
 For a matrix x, m is a row vector containing the mode of each column.
>> x = [27, 55, 63, 72, 72, 77, 79, 83, 95, 97, 100];
>> mode(x)
ans = 72
If no value is seen more often than any other, the smallest value will be the mode of that vector.
>> x = [1 3 5 7];
>> mode(x) 7

ans = 1
STATISTICS
 Median: The median is defined only for a data set that has been sorted first.
 In MATLAB: For vectors, median(x) is the median value of the elements in x.
 For matrices, median(x) is a row vector containing the median value of each.
>> x = [27, 55, 63, 72, 72, 77, 79, 83, 95, 97, 100];
>> median(x)
ans = 77
If the vector is not already sorted, the median function will automatically sort and give the correct result.
For the following vector, the mean of 63 and 72 in the middle will be returned.
>> x = [27, 55, 63, 72, 72, 77];
>> median(x)
8

ans = 67.5000
STATISTICS

 Sorting: Sorting in ascending or descending order.


 In MATLAB: For vectors, sort(x) sorts the elements of x in ascending order.
 For matrices, sort(x) sorts each column of x in ascending order.
 Sort order can be specified.
>> x = [17, 1, 7, 11, 3, 13, 19];
>> sort(x)
ans = 1 3 7 11 13 17 19
>> sort(x, 'descend’)
ans = 19 17 13 11 7 3 1
9
HISTOGRAM

 The way the data are spread around the mean can be described by a histogram
plot.
 A histogram is a plot of the frequency of occurrence of data values versus the
values themselves. It is a bar plot of the number of data values that occur within
each range, with the bar centered in the middle of the range.
 To plot a histogram, you must group the data into subranges, called bins.
 The shape of the histogram can be affected by the width of the bins.
10
HISTOGRAM
 MATLAB also provides the hist
command to generate a histogram.
 This command has several forms. Its basic
form is hist(y), where y is a vector
containing the data. This form aggregates
the data into 10 bins evenly spaced between
the minimum and maximum values in y.
 The second form is hist(y,n), where n
is a user-specified scalar indicating the
number of bins.
 The third form is hist(y,x), where x is
a user-specified vector that determines the
location of the bin centers; the bin widths 11

are the distances between the centers.


HISTOGRAM (EXAMPLE)

12
HISTOGRAM (EXAMPLE)

% Thread breaking strength data for 20 tests.


y = [92,94,93,96,93,94,95,96,91,93,...
95,95,95,92,93,94,91,94,92,93];
% The six possible outcomes are ... 91,92,93,94,95,96.
x = [91:96];
hist(y,x),axis([90 97 0 6]),...
ylabel(’Absolute Frequency’),...
xlabel(’Thread Strength (N)’),...
title(’Absolute Frequency Histogram for 20 Tests’)

13
HISTOGRAM

 The absolute frequency is the number of times a particular outcome occurs.


 For example, in 20 tests these data show that a 95 occurred 4 times.
 The absolute frequency is 4, and its relative frequency is 4/20, or 20 percent of the
time.

14
HISTOGRAM FUNCTIONS

15
CORRELATION
 Correlation refers to the statistical relationship between the two entities.
 It measures the extent to which two variables are linearly related.
For example,
 The height and weight of a person are related.
 Taller people tend to be heavier than shorter people.
 The correlation coefficient is a statistical measure of the strength of a linear relationship
between two variables. It is ranged from -1 to 1.
 A correlation coefficient of -1 describes a perfect negative, or inverse relationship.
 A coefficient of 1 shows a perfect positive correlation or a direct relationship.
16
 A correlation coefficient of 0 means there is no linear relationship.
RANDOM NUMBER GENERATION

• Random numbers are useful for a variety of purposes, such as


generating data encryption keys, simulating and modeling
processes involving randomness, and selecting random samples
from larger data sets.

• If a program is being written and the data is not yet available, it is


useful to test the program first by initializing the data to random
numbers.
17
RANDOM NUMBER GENERATION

 MATLAB can generate random numbers that are uniformly distributed or


normally distributed. These sets of random numbers can be used to analyze
outcomes.
 x = rand(n): generates uniformly distributed random numbers in the range [0,1].
To generate uniformly distributed random numbers over the interval [a,b]:
 x = randn(n): generates normally distributed random numbers that have a mean
of μ= 0 and a standard deviation of σ= 1.

18

Uniform distribution: Describes a form of probability distribution where every possible outcome has an equal likelihood of happening.
RANDOM NUMBER GENERATION

 Every time the rand function is used, a different result will be obtained.
For example,
rand
ans =
0.6161
rand
ans =
0.5184

19
RANDOM NUMBER GENERATION

Type rand(n) to obtain an n x n matrix of uniformly distributed random numbers


in the interval [0, 1].
Type rand(m,n) to obtain an m x n matrix of random numbers.
For example,
>> rand(5)

ans =

0.8147 0.0975 0.1576 0.1419 0.6557


0.9058 0.2785 0.9706 0.4218 0.0357
0.1270 0.5469 0.9572 0.9157 0.8491
0.9134 0.9575 0.4854 0.7922 0.9340
0.6324 0.9649 0.8003 0.9595 0.6787 20
RANDOM NUMBER GENERATION

The general formula for generating a uniformly distributed random number y in


the interval [a, b] is

y = (b - a) x + a

where x is a random number uniformly distributed in the interval [0, 1]. For
example, to generate a vector y containing 1000 uniformly distributed random
numbers in the interval [2, 10], you type y = 8*rand(1,1000) + 2.

21
RANDOM NUMBER GENERATION

If x is a random number with a mean of 0 and a standard deviation of 1, use the


following equation to generate a new random number y having a standard deviation
of s and a mean of m.

y=sx+m
For example, to generate a vector y containing 2000 random numbers normally
distributed with a mean of 5 and a standard deviation of 3, you type

y = 3*randn(1,2000) + 5. 22

You might also like