
STRINGS AND FILE I/O

Programming with Matlab


ENSC180 CALENDAR

Unit What
1 Introduction to MATLAB
2 Flow control & Data Structures
3 Plotting
4 Strings & File IO
5 Complex Numbers
6 Combinatorics
7 Linear Algebra
8 Statistics & Data Analysis
9 Polynomial Approximation, Curve Fitting
10 Root Finding, Numerical Differentiation
11 Numerical Integration
12 MATLAB GUI

CHARACTERS

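• A character is represented by a numeric code. ASCII defines codes 0 to 127, extended 8-bit character sets use 0 to 255, and MATLAB stores each character as a 16-bit Unicode value.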

>> s1 = 'h'
s1 =
h
>> s2 = 'H'
s2 =
H
>> uint16(s1)
ans =
104
>> uint16(s2)
ans =
72

• We see above that the character h is represented by MATLAB as 104 and H is
represented as 72.
• In MATLAB’s default character encoding scheme, each character occupies two
bytes.

>> whos
Name Size Bytes Class Attributes
s1 1x1 2 char
s2 1x1 2 char

CHARACTERS

• Unlike C++, MATLAB is dynamically and weakly typed. This means that you can
assign a value of a different type to an existing variable, and operators will implicitly convert their operands.
• Since characters have this numeric representation, many of MATLAB’s operators
are valid with characters.

>> 'h' + 'H'
ans =
176
>> 'h'*3
ans =
312
>> 'b' > 'a'
ans =
1
>> 'b' > 'c'
ans =
0

• This sort of freedom may yield unexpected results, since it’s not intuitively obvious
what will occur in these situations.
• It’s generally unnecessary to operate on characters in this manner; proceed with
caution.
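
The conversions between characters and numeric codes can be sketched as follows. This is a minimal illustration that assumes only the built-in functions double and char; the variable names are hypothetical.

```matlab
% Convert between characters and their numeric codes.
c = 'h';
code = double(c);          % numeric code of 'h'
back = char(code);         % back to the character 'h'

% Arithmetic on a char returns a double, not a char.
shifted = c + 1;           % class double
next = char(shifted);      % the character after 'h'

% Lowercase and uppercase ASCII letters differ by 32.
upperC = char(c - 32);     % 'H'
```

Wrapping the result in char makes the intent explicit and avoids surprises from the implicit conversion to double.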

STRINGS

• A string is simply an array (row vector) of numeric codes that represent each of the
characters.

>> s1 = 'Example'
s1 =
Example
>> s2 = 'String'
s2 =
String

• As a result, the general rules and syntax for matrices apply here.
• For example, in this case we may concatenate horizontally:

>> s3 = [s1 s2]


s3 =
ExampleString

• However, vertical concatenation [s1; s2] will yield an error because s1 is 1x7 and s2
is 1x6.
• If it’s necessary to store many strings into a matrix format, cell arrays are the best option:
{s1; s2}.
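
A short sketch of storing strings of different lengths, assuming the s1 and s2 from this slide; the other variable names are illustrative. It also shows char, which pads shorter strings with spaces when a rectangular character matrix is really needed.

```matlab
s1 = 'Example';
s2 = 'String';

% A cell array holds strings of different lengths.
words = {s1; s2};
second = words{2};                 % curly braces return the char array
lens = cellfun(@length, words);    % length of each stored string

% char pads shorter rows with trailing spaces to build a char matrix.
padded = char(s1, s2);             % 2x7 char matrix
```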

STRINGS

• Many numerical and logical operators can be applied to strings.

>> str = 'aardvark';


>> 'a' == str
ans =
1 1 0 0 0 1 0 0

>> str(str == 'a') = 'Z'


str =
ZZrdvZrk

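• Note that the == operator compares element by element and produces a logical array instead of a
single true/false value; it also raises an error when the two strings have different lengths (unless one of
them is a single character). This can have undesired consequences when testing the equality of two strings.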
• To check equality, use the strcmp function:

>> strcmp('a', str)


ans =
0
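
The difference between == and strcmp can be illustrated with a few hypothetical strings. This sketch assumes only the built-in strcmp and strcmpi functions.

```matlab
a = 'cat';
b = 'car';
c = 'cattle';

elementwise = (a == b);           % one logical result per character
sameAB = strcmp(a, b);            % a single logical value
sameAC = strcmp(a, c);            % safe even though the lengths differ;
                                  % a == c would raise a size-mismatch error
sameNoCase = strcmpi('Cat', a);   % comparison that ignores case
```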

QUIZ 1

• What is the output of the following command?


>> str = 'aardvark';
>> find(str=='a',5,'last')

A. Syntax Error
B. 1
C. 126
D. 12600
E. 'aaa'

QUIZ 1

• Answer: C. str == 'a' is true at positions 1, 2, and 6; find(..., 5, 'last') returns at most the
last five of these indices, so the result is 1 2 6.
QUIZ 2

• What is the output P of the following command?


>> str = 'aardvark';
>> P=str(str>'f');

A. Syntax Error
B. P='rvrk'
C. P='ffrfvfrk'
D. P='aafdfrk'
E. P='aada'

QUIZ 2

• Answer: B. Characters are compared by their numeric codes, so str > 'f' selects the letters that
come after f in the alphabet: r, v, r, and k. Hence P = 'rvrk'.

QUIZ 3

• Write a simple MATLAB script that reverses any given string:

• S1 = 'I like the letter E' → S2 = 'E rettel eht ekil I'

QUIZ 3

• Solution:

% Pro solution: index the string backwards
S2 = S1(length(S1):-1:1);

% Amateur solution: copy one character at a time
S2 = S1;   % preallocate the output
for ii = 1:length(S1)
    S2(length(S1)+1-ii) = S1(ii);
end
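
Two further ways to reverse a string, shown as a small sketch. It assumes the built-in fliplr function and the end keyword; S1 is the quiz string and the other names are illustrative.

```matlab
S1 = 'I like the letter E';

S2 = fliplr(S1);       % flip a row vector left to right
S3 = S1(end:-1:1);     % same result using the end keyword
```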

FPRINTF VS. SPRINTF

• fprintf and sprintf are string formatting functions, common to many
environments, scripting languages, and programming languages.
• Formatted output goes back to early languages such as FORTRAN, COBOL, and BASIC; the printf
family itself comes from C (1970s) and is found in Java (1990s) and most scripting languages
(Python, Tcl, Perl, ...).
• FPRINTF stands for «Print Formatted to File»
• SPRINTF stands for «Print Formatted to String»
• Many languages also have PRINTF, which writes to the screen
• In MATLAB, fprintf writes to the screen if no file is specified

• fprintf and sprintf are high-level commands built on the same set of routines. Only
the top-level interface (or API, Application Programming Interface) is different
• They are used to produce a «Formatted» string from many different data
types, which are converted to fit the string format according to the user's directives

• The only difference is that fprintf writes to a file or the screen, while sprintf returns a string variable:
fprintf('I like the letter E')
s = sprintf('I like the letter E'); disp(s);

FORMATTING DATA INTO STRINGS

• The following syntax is used: sprintf(formatSpec, A1, ..., An).


• formatSpec is the format of the output field specified as a string.
• A1, ..., An represents the data arrays that will be placed inside the string.

data = [2.3, 1e-7, pi];
for ii = 1:length(data)
    str = sprintf('data(%d) = %g\n', ...
        ii, data(ii));
    disp(str)
end

Output:
data(1) = 2.3
data(2) = 1e-007
data(3) = 3.14159

• Data is placed inside the string sequentially, starting with the integer ii that
represents the index of the array data.
• We place this output into the string using the % symbol followed by the conversion
character d, which formats output ii into a signed integer.
• The second output data(ii) corresponds to %g in our format string, which
formats floating-point outputs in a compact manner.
• \n is a special newline character.
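
As a complement to the loop above, sprintf also accepts a whole array: the format string is reused for each element. The sketch below assumes the same data vector; row and msg are illustrative names.

```matlab
data = [2.3, 1e-7, pi];

% One call formats every element: the format is reused for each value.
row = sprintf('%g ', data);

% Mixing text and numbers in a single format string.
msg = sprintf('%s has %d elements', 'data', numel(data));
```

The number of exponent digits printed by %g for 1e-7 depends on the platform and MATLAB release, so no output is shown for row.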
QUIZ 4

• What is the output of the following MATLAB command?

fprintf('Not that I have anything against D');

A. Not that I have anything against D>>


B. Not that I have anything against D
C. 'Not that I have anything against D'

QUIZ 4

• Answer: A. The command prompt >> appears on the same line, immediately after the printed text.

• That is due to the missing \n at the end of the string.


• Note that disp appends the newline for you, while fprintf and sprintf print exactly what the
format string contains, so you must add \n yourself whenever you want a line break.

QUIZ 5

• Write down the content of variable s :

dude(3).name='maude';
dude(3).gender='f';
dude(3).birth=1960;
s = sprintf('%s was born in %d \n', dude[3].name, dude[3].birth);

QUIZ 5


SYNTAX ERROR! dude[3] should be dude(3): square brackets are used to define
arrays, round brackets to index them.
If we fix the error, then

s = 'maude was born in 1960 ' followed by a newline character (the format string has a space and \n after %d).

FORMATTING OPERATORS

• When formatting floating-point data into a string, it may be necessary to alter the
notation or number of decimal places.
• The conversion characters %f and %e format data to fixed-point and exponential
notation, respectively.
• The field width and precision can be modified for these formats using the syntax:
%<field width>.<precision>f.

>> A = [100*pi, 2.09e-8];


sprintf('%.2f \n %12.5f \n %e \n %.2e', A(1),A(1),A(2),A(2))

ans =
314.16
314.15927
2.090000e-008
2.09e-008
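
Field width is mostly useful for aligning columns. A minimal sketch, with hypothetical values, that right-aligns three numbers in a field of 10 characters:

```matlab
vals = [3.14159, 42.5, 1234.5678];
rows = cell(numel(vals), 1);
for k = 1:numel(vals)
    % %10.3f: at least 10 characters wide, 3 digits after the decimal point
    rows{k} = sprintf('%10.3f', vals(k));
end
```

Every entry of rows has the same length, so the decimal points line up when the strings are printed one per line.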

REGULAR EXPRESSIONS

A regular expression is a string of characters that describes a certain pattern within text.
Regular expressions can be used to find or alter pieces of text that
match the described pattern.

• The function used to match a regular expression is regexp:

[out] = regexp(str, expr, outselect)

• str is the string of text that you want to search.


• expr is the regular expression that specifies the pattern you want to match.
• outselect is the type of output you want from the function, e.g., starting indices of each
substring that matches the desired pattern.

• If only starting indices or end indices of the matching substrings are desired, the
outselect argument can be omitted:

[startIndex, endIndex] = regexp(str, expr)

REGULAR EXPRESSIONS

• Consider the following string:

str = 'Food coloring, colours the colorless icing';

• We wish to search str for all instances of the word “color” and its derivatives, e.g.,
“colorful”, without discriminating between British and American spelling
(color/colour).

• Consider the following regular expression: expr = 'colou?r\w*';


• It defines the following pattern:
• The first 4 characters must be colo.
• This is optionally followed by the letter u which is then followed by the letter r.
• \w denotes an alphabetic, numeric, or underscore character (a to z, A to Z, 0
to 9, and _) and * denotes 0 or more times consecutively. Hence, it can end in
any number of word characters.

>> startInd = regexp(str, expr)


startInd =
6 16 28
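
The same search can also return the end indices and the matched text. This sketch reuses str and expr from the slide and assumes the 'match' output selector of regexp; the other names are illustrative.

```matlab
str = 'Food coloring, colours the colorless icing';
expr = 'colou?r\w*';

[startIdx, endIdx] = regexp(str, expr);   % where each match starts and ends
matches = regexp(str, expr, 'match');     % cell array with the matched text
firstWord = str(startIdx(1):endIdx(1));   % extract a match by indexing
```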

REGULAR EXPRESSIONS

• Useful metacharacters:
• [c1c2c3] – any character contained within the brackets, e.g., '[kf#]ind'
matches 'kind', 'find', or '#ind'.
• [^c1c2c3] – any character not contained within the brackets, e.g.,
'[^kf]ind' can match 'mind' or '#ind' but not 'kind' or 'find'.
• \w – any alphabetic, numeric or underscore character.
• \W – any character that is not alphabetic, numeric, or underscore.
• \s – any whitespace character.
• . – any single character, including whitespace.

• Useful quantifiers:
• expr* – match 0 or more times consecutively.
• expr? – match 0 times or 1 time.
• expr{m,n} – match at least m times, but no more than n times consecutively,
e.g., '\w{3,6}ing[\s,.]' matches words of 6 to 9 letters that end in “ing” (or the last 6 to 9
letters of a longer one), followed by a whitespace character, comma, or period.
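
A few of these metacharacters and quantifiers in action. The test strings are made up for illustration; the patterns are the ones listed above.

```matlab
words = 'kind find mind #ind';

inSet = regexp(words, '[kf#]ind', 'match');    % first character must be k, f, or #
notKF = regexp(words, '[^kf]ind', 'match');    % first character must not be k or f

% Quantifiers are greedy: each match takes as many characters as allowed.
runs = regexp('aaa b aaaaa', 'a{2,3}', 'match');
```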

REGULAR EXPRESSIONS

• Grouping operators () can be used to search for consecutive patterns.


• For example, we wish to match two consecutive patterns of a vowel followed by a
nonvowel.

>> startIndex = regexp('refrigerator', '[aeiou][^aeiou]{2}')


startIndex =
2

• Without grouping, '[aeiou][^aeiou]{2}' matches one vowel, followed by two
nonvowels. Here, it matches “efr” in “refrigerator”.

>> startIndex = regexp('refrigerator', '([aeiou][^aeiou]){2}')


startIndex =
5 9

• Using the grouping operator, we obtain our desired match.


• '([aeiou][^aeiou]){2}' – a vowel followed by a nonvowel, twice in a row,
matches “iger” and “ator” in “refrigerator”.

REGULAR EXPRESSIONS

• To replace the text within a string that matches the desired regular expression use
the regexprep function.

newStr = regexprep(str, expr, replace)

• The parts of str that match expr are replaced by the string replace and the
resulting string is returned in newStr.

• For example, replace all instances of “color” with the British spelling “colour”:

>> str = 'Food coloring, colours the colorless icing';


>> newStr = regexprep(str, 'color', 'colour')
newStr =
Food colouring, colours the colourless icing
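
regexprep becomes more useful when the search term is a real pattern. The sketch below normalizes both spellings to the American one and, as a separate made-up example, reorders a date using the $1, $2, $3 token references that regexprep supports in the replacement string.

```matlab
str = 'Food coloring, colours the colorless icing';

% 'colou?r' matches both spellings, so every variant becomes 'color'.
american = regexprep(str, 'colou?r', 'color');

% Parenthesized groups are captured as tokens and reused as $1, $2, $3.
isoDate = '2024-01-15';                                   % hypothetical input
dmy = regexprep(isoDate, '(\d+)-(\d+)-(\d+)', '$3/$2/$1');
```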

FILE I/O

[Diagram: MATLAB Workspace]

• MATLAB variables are handled by the tool as part of its internal workspace. A variable may sit in
computer memory or be swapped to the hard disk, but that is all handled by
MATLAB, which keeps those variables IMPLICITLY accessible to us at any time.

[Diagram: MATLAB and the File System]

fileID = fopen('File', …);
[…]
fread/fwrite(fileID, …);
[…]
fclose(fileID);

• At times, we may want to create, read, or write files on the file system EXPLICITLY. To
do that, we use explicit FILE HANDLING commands in our scripts.

LOW LEVEL I/O

• To open a file for read/write access, use the command fopen:

fileID = fopen(filename, permission)

• permission is a string that can specify:


• r – open file for reading.
• w – open or create new file for writing (discard existing contents, if any).
• a – open or create new file for writing and append data to the end of file.
• Adding a + to the above specifications allows both reading and writing, e.g., 'r+'
opens an existing file for reading and writing.
• Attaching the letter t to the permission, e.g. 'rt' or 'wt+', opens the file in text
mode.
• Similarly, attach the letter b to the permission to open the file in binary mode.

• Afterwards, close the file with fclose:

fclose(fileID) %closes the file with specific identifier


fclose('all') %closes all open files
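
A minimal sketch of the open, write, and close cycle with an error check. fopen returns -1 when a file cannot be opened, so testing the identifier before use is a common precaution. The scratch file name comes from tempname and is purely illustrative.

```matlab
fname = [tempname '.txt'];          % hypothetical scratch file in the temp folder

[fid, msg] = fopen(fname, 'wt');    % fid is -1 if the file cannot be opened
if fid == -1
    error('Could not open %s: %s', fname, msg);
end
fprintf(fid, 'first line\n');
fclose(fid);

fid = fopen(fname, 'at');           % append instead of overwriting
fprintf(fid, 'second line\n');
fclose(fid);

contents = fileread(fname);         % read the whole file back as a char array
delete(fname);                      % remove the scratch file
```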

LOW LEVEL I/O (TEXT FILES)

• You can save files in binary form, but it may make sense to use TEXT so that they
are human-readable. The main drawback of text files is the SIZE of the file itself, which is usually larger.

• To read data from text files, use the fscanf function:

A = fscanf(fileID, format, sizeA)

• format specifies the type of field to read, with specifiers similar to those of the sprintf
function, e.g., %d reads in signed integers, %f reads in floating-point numbers.
• To skip fields, insert an * after the %, e.g., %*d skips integers.
• To skip a set of specific characters insert them in format, e.g., to only read the floating
point number in 'sin(45) = 0.7071', specify a format of 'sin(45) = %f'.

• sizeA specifies how many elements to read into A.


• Specifying an integer N will read at most N elements into A.
• Specifying [M, N] will read at most M*N elements in column order into A.
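
The rules above can be tried on a small scratch file. This sketch writes two hypothetical lines, then reads a number that follows literal text and fills a matrix in column order; the file name and contents are assumptions made for the example.

```matlab
fname = [tempname '.txt'];              % hypothetical scratch file
fid = fopen(fname, 'wt');
fprintf(fid, 'sin(45) = 0.7071\n');
fprintf(fid, '1 2 3 4 5 6\n');
fclose(fid);

fid = fopen(fname, 'rt');
val = fscanf(fid, 'sin(45) = %f', 1);   % literal text is matched and skipped
M = fscanf(fid, '%d', [2, 3]);          % fills a 2x3 matrix column by column
fclose(fid);
delete(fname);
```

Because the matrix is filled by columns, the values 1 to 6 end up as [1 3 5; 2 4 6] rather than in reading order along the rows.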

EXAMPLE: TEXT FILES

• Assume we have the following data in a text file called exampleData.txt:

Trajectory Data
numData = 5

Time XPos YPos


1.00 16.29 3.15
1.50 18.12 19.41
2.00 2.54 19.14
2.50 18.26 9.71
3.00 12.64 16.00

• We are aware of the file’s formatting ahead of time and we wish to read in all
numerical data contained within.
• First, open the file for reading in text mode and store the file identifier:

fid = fopen('exampleData.txt', 'rt');

EXAMPLE: TEXT FILES

Trajectory Data %Skip 2 strings and go to next line


numData = 5 %Skip 'numData = ' and read the integer

• The first task is to read the integer after numData.


• This involves skipping through the two strings at the start, moving to the next line,
and reading what comes after 'numData = '.

N = fscanf(fid, '%*s %*s \n numData = %d', 1)


N =
5

• Skip through the first two strings by specifying %*s twice and move to the next line
with special character \n.
• Specify the literal characters to skip 'numData = '.
• Data is only read into N once the specifier %d is reached.
• N is a scalar; only one integer is read in.

EXAMPLE: TEXT FILES

numData = 5

Time XPos YPos


1.00 16.29 3.15

• The file pointer is positioned after the last element read, which is currently after 5.
• The next task is to advance the pointer so that it is positioned right before 1.00.

fscanf(fid, '\n \n %*s \n', 3)

• Skip two lines, bringing the pointer right before Time.
• Skip three strings by specifying %*s and a sizeA of 3, and then go to the next line.

QUIZ 6

Time XPos YPos


1.00 16.29 3.15
1.50 18.12 19.41
2.00 2.54 19.14
2.50 18.26 9.71
3.00 12.64 16.00

• Write the fscanf operation that is capable of reading the data array above

A. for ii=1:N; data = fscanf(fid, '%f %f %f'); end;
B. data=fscanf(fid,'%f %f %f\n %f %f %f\n %f %f %f\n %f %f %f\n %f %f %f\n');
C. data = fscanf(fid, '%f', [3, N]);
D. data = fscanf(fid, '%f', [3, N])';
E. data = fscanf(fid, '%f %f %f', [3 N]);

QUIZ 6

• Answer: D.

Note: A leaves data empty: the first pass reads every number, and each later pass overwrites data with an
empty result. The formats are all legal, though C and D are preferable as more readable. Any solution without
the [3 N] creates a single vector rather than a matrix. The ' after the round brackets in solution D is necessary
to transpose data, as values are stored in data by columns.

EXAMPLE: TEXT FILES

Time XPos YPos


1.00 16.29 3.15
1.50 18.12 19.41
2.00 2.54 19.14
2.50 18.26 9.71
3.00 12.64 16.00

• Finally, read in the remaining numerical data into a matrix.


• Since we know the format of the data file ahead of time, we know there are three
columns of data and that the number of rows is specified by numData.

data = fscanf(fid, '%f', [3, N])';


fclose(fid);

• Since the floating-point elements are read in column order, transpose data.
• Close the file after finishing.
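
For comparison, here is an end-to-end sketch that uses a different technique for the header: fgetl reads and discards whole lines, and sscanf parses numData from the second line. Only the final fscanf call is the one from the slides. The script first recreates the example file in a temporary location, which is an assumption made so that it runs on its own.

```matlab
% Create the example file (contents copied from the slides).
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, 'Trajectory Data\nnumData = 5\n\nTime XPos YPos\n');
fprintf(fid, '%.2f %.2f %.2f\n', [1.00 16.29 3.15; 1.50 18.12 19.41; ...
    2.00 2.54 19.14; 2.50 18.26 9.71; 3.00 12.64 16.00]');
fclose(fid);

fid = fopen(fname, 'rt');
fgetl(fid);                                % skip 'Trajectory Data'
N = sscanf(fgetl(fid), 'numData = %d');    % parse the count from line 2
fgetl(fid);                                % skip the blank line
fgetl(fid);                                % skip the column headers
data = fscanf(fid, '%f', [3, N])';         % N rows, 3 columns
fclose(fid);
delete(fname);
```

Reading the header line by line is a little longer than a single format string, but each step is easy to check when the layout of the file changes.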

LOW LEVEL I/O (BINARY FILES)

• The low-level file I/O functions fread and fwrite offer the most control over
reading and writing data to files.
• These functions operate directly on the raw bytes of the file.
• For example, we wish to read the first 10 bytes of our exampleData.txt:

>> fid = fopen('exampleData.txt');


>> bytesData = fread(fid, 10, 'uint8')

bytesData =
84 114 97 106 101 99 116 111 114 121

• This corresponds to ASCII codes for the word “Trajectory”:

>> uint8('Trajectory')

ans =
84 114 97 106 101 99 116 111 114 121
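
A round trip with fwrite and fread, as a minimal sketch. The values and the scratch file name are hypothetical; the 'double' precision string tells both functions to use 8 bytes per value.

```matlab
fname = [tempname '.bin'];             % hypothetical scratch file
vals = [1.5, -2.25, 1e6];

fid = fopen(fname, 'w');
count = fwrite(fid, vals, 'double');   % number of elements written
fclose(fid);

fid = fopen(fname, 'r');
readBack = fread(fid, 3, 'double')';   % fread returns a column vector
fclose(fid);

info = dir(fname);                     % info.bytes is the file size
nBytes = info.bytes;
delete(fname);
```

Binary storage keeps the values exact and compact (3 values take 24 bytes here), at the cost of a file that cannot be read in a text editor.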

EXCEL FILES

• Excel files are such a common format for datasets that MATLAB offers a specific
set of commands for them.
• To import data from an Excel spreadsheet, use the xlsread function:

[num, txt, raw] = xlsread(filename, sheet, xlRange)

• This reads data from the specified worksheet, sheet, and range, xlRange.
• Numeric data is returned in num, text fields are returned in cell array txt, and
unprocessed data (numbers and text) is returned in cell array raw.
• To export data to an Excel spreadsheet, use the xlswrite function:

xlswrite(filename, A, sheet, xlRange)

• Writes the data in array A to the specified worksheet, sheet, and range, xlRange, of
Excel file filename.
• The array A can be a cell array if text data or mixed data (text and numbers) needs to
be written.
• Can only be a cell array of text and numbers. NOT A CELL ARRAY OF CELL ARRAYS!

EXAMPLE: EXCEL FILES

• Assume we have the data from the previous example formatted into a spreadsheet
called exampleData.xlsx:

• We wish to read in the data of the object’s position and calculate its distance from
the origin at each time.
• Afterwards, we wish to record this distance in the column to the right of “YPos”.

EXAMPLE: EXCEL FILES

• The desired data, XPos and YPos, lies in columns B to C and rows 5 to 9, i.e.,
range B5:C9.
• The data is on worksheet 1, so this time we may omit the sheet parameter.
• We only want to retrieve numerical data, so we only need to specify the first output.

>> pos = xlsread('exampleData.xlsx', 'B5:C9')

pos =

16.2900 3.1500
18.1200 19.4100
2.5400 19.1400
18.2600 9.7100
12.6400 16.0000

EXCEL FILES

• First, we need to label the new column we want to write containing distance data.
• Specify strings as a cell array, otherwise all characters are written in separate,
adjacent cells.
• The column label is placed in cell D4 and the data vector will begin in cell D5.

>> dist = sqrt(pos(:,1).^2 + ...
pos(:,2).^2)

dist =

16.5918
26.5534
19.3078
20.6812
20.3904

>> xlswrite('exampleData.xlsx', {'Dist'}, 1, 'D4')


>> xlswrite('exampleData.xlsx', dist, 1, 'D5')

IMAGES

• Images can be read into MATLAB using the imread function.

I = imread(filename)

• This reads an image with the specified filename into a matrix I.


• The argument filename can also contain a path to the desired image.

• Image data can be outputted using the imwrite function.

imwrite(I, filename, format)

• This writes the image data in matrix I to a file specified by filename.


• If a file extension is included in filename, the format of the output is inferred from it.
• Otherwise, specify the format of the output in the string argument format.

IMAGES

• Example: read an RGB image in jpeg format and display it:

>> img = imread('football.jpg');


>> figure, imshow(img)

• The matrix img is a 256x320x3 matrix.
• 256x320 corresponds to the image’s resolution.
• img is a 3D matrix because the image is colored; the 1st, 2nd, and 3rd indices of
the 3rd dimension correspond to red, green, and blue information, respectively.

• Set all red and blue values to 0 and write the result to an output image:

>> img(:,:,1) = 0;
>> img(:,:,3) = 0;
>> imwrite(img, 'footballGreen.jpg')
>> figure, imshow(img) %display result
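
The same channel manipulation can be tried without football.jpg. This sketch builds a small synthetic RGB image (hypothetical data), saves it as PNG, reads it back, and zeroes the red and blue channels. PNG is chosen here because it is lossless, so the pixel values survive the round trip exactly.

```matlab
% Build a small synthetic RGB image (hypothetical data, not football.jpg).
img = zeros(4, 6, 3, 'uint8');
img(:,:,1) = 200;              % red channel
img(:,:,2) = 150;              % green channel
img(:,:,3) = 100;              % blue channel

fname = [tempname '.png'];     % scratch file; format inferred from the extension
imwrite(img, fname);
img2 = imread(fname);
delete(fname);

greenOnly = img2;
greenOnly(:,:,[1 3]) = 0;      % zero the red and blue channels
```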

VIDEOS

• Videos can be loaded to be played directly in MATLAB using the implay function.

implay('xylophone.mp4')

• Frames from a video can be read (along with other
information) using the VideoReader function.

videoObj = VideoReader('xylophone.mp4');
frame = read(videoObj, 130); %read frame 130

• Individual frames can be written to an output video file using:
writerObj = VideoWriter(filename, profile), where profile is the output
format (e.g., 'Motion JPEG AVI', 'MPEG-4', etc.).
• Open the file for writing using the open method, e.g., open(writerObj).
• Write a frame using the writeVideo method, e.g., writeVideo(writerObj, frame).
• Close the file using the close method, e.g., close(writerObj).
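
The writing steps can be put together as follows. This is a sketch under stated assumptions: the frames are synthetic flat gray images rather than real video, the output goes to a scratch file, and the 'Motion JPEG AVI' profile is used because it is available on all platforms.

```matlab
fname = [tempname '.avi'];                          % hypothetical output file
writerObj = VideoWriter(fname, 'Motion JPEG AVI');

open(writerObj);
for k = 1:10
    % Each frame is a 64x64 RGB image that gets brighter over time.
    frame = uint8(25 * k) * ones(64, 64, 3, 'uint8');
    writeVideo(writerObj, frame);
end
close(writerObj);
```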

SPECIFYING FILEPATHS
• When a simple filename is passed to a MATLAB function, MATLAB looks for the file in the
current folder (or on the MATLAB search path), so the file must be placed there.
• It can be cumbersome and inefficient to place input files directly inside your current
folder.
• Use sprintf to create filenames that include their location:

filename = 'example.txt';
path = sprintf('C:\\Users\\Fener\\Desktop\\MatlabDrs\\EEE225\\inputs\\%s', filename);
C:\Users\Fener\Desktop\MatlabDrs\EEE225\inputs\example.txt

• Since \ is used to specify special characters, \\ is needed in formatSpec to
specifically type in a backslash.
• Using path, the file example.txt can be accessed without having to be moved.
• Files in subfolders can be accessed in a similar manner without a full path, for
example, if we are currently in folder EEE225:

path = sprintf('.\\inputs\\%s', filename);
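
An alternative sketch that avoids escaping backslashes: the built-in fullfile function joins folder and file names with the separator of the current operating system, and fileparts splits a path back into its parts. The folder name inputs is taken from the slide; the variable names are illustrative (filepath is used instead of path, which is also the name of a built-in MATLAB function).

```matlab
filename = 'example.txt';
inputsDir = fullfile('.', 'inputs');          % subfolder of the current folder
filepath = fullfile(inputsDir, filename);     % separator chosen for the current OS

[folder, name, ext] = fileparts(filepath);    % split a path into its parts
```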
