
Unit 6. External Storage.

Use of files
Programming
Bachelor in Aerospace Engineering
Course 2018-2019

INTRODUCTION

Files
Introduction

Data storage on the hard disk

• Permanent storage of program results:
– Data stored in the main memory of the computer is lost
when the program finishes.
• Reading data stored in a file, rather than asking the user
to enter it from the keyboard:
– Data in files with a proper format can be read and
assigned to the variables and structures of the program.
• Dealing with data larger than the memory size:
– Data is stored on the hard disk, which has a larger storage
capacity, and is processed in parts.

• Storage of data in variables is temporary:
– when the program is closed or
– when the computer is switched off
the data is gone!
• Data is stored permanently (more or less) on
secondary storage devices:
– flash memory
– magnetic disks
– optical disks

DEFINITION

Files
Definition

File
• Data stored on a permanent, readable and writable computer peripheral
• Usually, files are stored on a hard disk
• A file is a sequence of bytes
• Each byte has a position in the file

By itself, a file is nothing more than a series of related
bytes of data on a disk, one after the other.

“A collection of data or information that has a name”

There are many different types of files:
• data files
• text files
• program files

Different types of files store different types of information.


BINARY AND TEXT FILES

Files
Binary and Text files

Binary files
Store a sequence of bits corresponding to the binary
representation of data values
E.g.: a 32-bit integer value is represented with 4 bytes
135 >> 00000000 00000000 00000000 10000111
390 >> 00000000 00000000 00000001 10000110
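
The bit patterns above can be reproduced in MATLAB. This is only an illustrative sketch: `dec2bin` prints the binary digits of a non-negative integer, and `typecast` exposes the bytes that an `int32` occupies in memory. The order of those bytes depends on the machine, so the code does not rely on it. The variable names are made up for the example.

```matlab
% Binary representation of two integer values, padded to 32 bits (4 bytes).
bits135 = dec2bin(135, 32);
bits390 = dec2bin(390, 32);

% The bytes that an int32 occupies in memory.
% Their order (little or big endian) depends on the machine.
bytes135 = typecast(int32(135), 'uint8');

fprintf('%s\n%s\n', bits135, bits390);
fprintf('An int32 takes %d bytes\n', numel(bytes135));
```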


Text files
Store a sequence of bits corresponding to the textual
representation of a data value
E.g.: an integer value is represented with as many characters as
necessary. The number of bytes per character depends on the encoding:
1 byte in ASCII, 2 bytes in the 16-bit encoding shown here.
135 >> '1' '3' '5' >> ASCII 49, ASCII 51, ASCII 53
00000000 00110001 00000000 00110011 00000000 00110101
390 >> '3' '9' '0' >> ASCII 51, ASCII 57, ASCII 48
00000000 00110011 00000000 00111001 00000000 00110000
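
The character codes listed above can be checked with a short MATLAB sketch. `double` applied to a character array returns the numeric code of each character, and `dec2bin` shows each code as an 8-bit pattern. `num2str` and `str2double` illustrate the translation between a number and its textual form. The variable names are only illustrative.

```matlab
% Character codes of the textual representation of 135 and 390.
codes135 = double('135');
codes390 = double('390');

% Each code as an 8-bit pattern, one row per character.
bits135 = dec2bin(codes135, 8);

% Converting between a number and its text form.
txt = num2str(135);          % number -> text
val = str2double('390');     % text -> number

disp(codes135);
disp(bits135);
```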


Ultimately, all files contain just bytes

The difference appears when a file is read, since it is necessary
to interpret the bytes correctly:
– When it is a text file, each byte (or each pair of bytes in a
16-bit encoding) corresponds to a character
– When it is a binary file, it is necessary to know which kind
of values were stored and in which order

By default, typical editors assume that files are text files:
– Text files can be created and edited with text editors. If a
text file is opened with a text editor (e.g., Notepad), its
contents can be read by humans
– Binary files can be created from a C program. If a binary
file is opened with a text editor (e.g., Notepad), its contents
cannot be read by humans


Binary files
• Are usually smaller than text files holding the same data
• Need no translation between a text-based representation and the
internal byte-based representation (a time-consuming task, in
particular with float and double values)

Text files
• Are easy to interpret
• Can be edited with external tools
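
As a rough illustration of the size difference, the sketch below writes the same three integers once as text (with `fprintf`) and once as binary (with `fwrite`, a MATLAB function the slides do not cover) and compares the file sizes reported by `dir`. The file names come from `tempname` and the sample values are assumptions chosen for the example.

```matlab
values = int32([135 390 1234567]);
txtName = [tempname '.txt'];
binName = [tempname '.bin'];

% Text file: one number per line, as characters.
fid = fopen(txtName, 'w');
fprintf(fid, '%d\n', values);
fclose(fid);

% Binary file: 4 bytes per int32 value, whatever its magnitude.
fid = fopen(binName, 'w');
fwrite(fid, values, 'int32');
fclose(fid);

txtInfo = dir(txtName);
binInfo = dir(binName);
fprintf('text: %d bytes, binary: %d bytes\n', txtInfo.bytes, binInfo.bytes);

delete(txtName);
delete(binName);
```

With small values the text file can be the shorter one (a single digit plus a newline takes 2 bytes, against 4 bytes for an `int32`), which is why the advantage is stated as "usually".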


PLAIN TEXT FILES
• The lowest level of data storage is bits (binary digits): 0s and 1s
• Bits are grouped together into bytes
• Bytes are grouped into characters

BINARY FILES
• The lowest level of data storage is bits (binary digits): 0s and 1s
• Bits are grouped together into bytes, and bytes are grouped into characters
• Characters and/or bytes are grouped together to form fields
• Fields are grouped together to form records
• Records are grouped to form files


[Figure: example of a plain text file]

[Figure: example of a binary file]

Text files and binary files

Files can be categorized according to their format:
binary files or plain text files.
The difference between text and binary files is how they encode data.

Both binary and text files contain data stored as a series of bits;
however:
• in plain text files, each individual byte represents a character
(in single-byte encodings such as ASCII);
• in binary files, groups of bits represent custom data (numbers,
chars, arrays, structs, etc.).

While text files contain only textual data, binary files may contain both
textual and custom binary data.

BINARY FILES

Files
Binary files

Binary files contain a sequence of bytes. When creating a
custom file format for a program, a programmer arranges
these bytes into a format that stores the necessary
information for the application.

This data has to be interpreted, typically by a supporting
program or hardware processor that knows in
advance exactly how the data is formatted.
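
A minimal sketch of such a custom format, assuming a made-up layout: one `int32` holding the number of values, followed by that many `double` values. `fwrite` and `fread` are MATLAB's binary I/O functions (not covered in these slides). The reader only recovers the data because it knows this layout in advance.

```matlab
fname = [tempname '.bin'];
data = [4.21 6.55 6.78];

% Writer: first the count (int32), then the values (double).
fid = fopen(fname, 'w');
fwrite(fid, numel(data), 'int32');
fwrite(fid, data, 'double');
fclose(fid);

% Reader: must follow exactly the same layout.
fid = fopen(fname, 'r');
n = fread(fid, 1, 'int32');            % how many values follow
readBack = fread(fid, n, 'double')';   % the values, as a row vector
fclose(fid);

delete(fname);
disp(readBack);
```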

[Figure: vader_01.jpg opened in an image viewer and in a text editor]

Are all image files binary? No: a few formats, such as PPM, can be plain text files.

[Figure: vader_01.ppm opened in a text editor and in IrfanView]
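
To make the idea concrete, the sketch below writes a tiny 2x2 image in the plain-text variant of PPM (magic number `P3`) using only `fopen`, `fprintf` and `fclose`. The pixel values and the file name are invented for the example. The resulting file can be opened both in a text editor and in an image viewer that supports PPM.

```matlab
fname = [tempname '.ppm'];
% One RGB triplet per row: red, green, blue and white pixels.
pixels = [255 0 0; 0 255 0; 0 0 255; 255 255 255];

fid = fopen(fname, 'w');
fprintf(fid, 'P3\n');                   % magic number of plain-text PPM
fprintf(fid, '2 2\n');                  % width and height
fprintf(fid, '255\n');                  % maximum colour value
fprintf(fid, '%d %d %d\n', pixels');    % one pixel per line
fclose(fid);

% Read the first line back to confirm the file is ordinary text.
fid = fopen(fname, 'r');
magic = fgetl(fid);
fclose(fid);

delete(fname);
disp(magic);
```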

Files
Text files

A text file is a file that holds a human-readable sequence of characters.

There is no standard definition of a text file,
but the term “text file” is often used as a synonym for ASCII file,
a file whose characters are represented by their ASCII codes.

Text files are more restrictive than binary files since they
can only contain textual data.


MATLAB provides functions for the management of streams.
A stream is an abstract representation of any external source or
destination for data, including disk files.

Operations
> Open the file
> Read from file
> Write to file
> Close the file

1. OPEN

2. While the file is open: either WRITE or READ, not both

3. CLOSE


Low-level operations
• open, read, write, close

High-level operations (not allowed in this course)
• after the course: read the MATLAB® help


open
fid = fopen(<file_name> [, <permission>]);

write
fprintf(fid, '%d %7.4f %s', a, b, name);

read
variable = fscanf(fid, <format> [, <size>]);

close
result = fclose(<file_id_var>);
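
The four operations can be put together as in the following sketch. The file name comes from `tempname`, and the values of `a`, `b` and `name` are made up for the example. `fopen` returns -1 when the file cannot be opened, so that case is checked explicitly. The line is read back with `fgetl`, which returns one line of text without the newline.

```matlab
fname = [tempname '.txt'];
a = 7;
b = 3.14159;
name = 'probe';

% Open for writing, write one formatted line, close.
fid = fopen(fname, 'w');
if fid == -1
    error('Could not open %s for writing.', fname);
end
fprintf(fid, '%d %7.4f %s\n', a, b, name);
status = fclose(fid);    % fclose returns 0 on success

% Open again for reading and get the line back as text.
fid = fopen(fname, 'r');
textLine = fgetl(fid);
fclose(fid);

delete(fname);
disp(textLine);
```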


Considerations about low-level I/O

File mode
A file must be opened in reading or writing mode according to the type
of operation we are going to perform on it.
Closing the file
Always close files that were opened. When writing (or appending) to a
file, it typically must be closed before the changes become visible inside the
file.
Pointer of the file
When we open a file, the program defines a pointer in the file for reading and
writing. This read/write position is initially placed at the beginning of the
file, except when the file is opened to append. When we read from or write
to a file, the pointer moves forward by the amount of data read or written.
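
The behaviour of the file pointer can be observed with `ftell`, which returns the current position in bytes from the beginning of the file. The sketch below is an illustration under simple assumptions (a temporary file holding plain ASCII text). It also shows that text written in append mode (`'a'`) lands at the end of the existing contents. `fread` is used here only to read raw bytes.

```matlab
fname = [tempname '.txt'];

fid = fopen(fname, 'w');
fprintf(fid, 'Hello World');
fclose(fid);

% Reading: the pointer starts at 0 and advances with each read.
fid = fopen(fname, 'r');
posStart = ftell(fid);
firstWord = char(fread(fid, 5, 'uint8')');   % read 5 bytes as characters
posAfter = ftell(fid);
fclose(fid);

% Appending: new data is added after the existing contents.
fid = fopen(fname, 'a');
fprintf(fid, '!');
fclose(fid);

fid = fopen(fname, 'r');
contents = char(fread(fid, Inf, 'uint8')');
fclose(fid);

delete(fname);
fprintf('%d -> %d, contents: %s\n', posStart, posAfter, contents);
```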


Reading from files

fscanf. Read data from a text file. Syntax:

variable = fscanf(fid, format, size)

The function reads data from an open text file, interpreting the values in the file
according to the specified format, and stores the data in the array
variable (with size size). If size is omitted, the function reapplies the format
throughout the entire file and positions the file pointer at the end-of-file marker;
if size is given, reading stops once that many values have been read.
If fscanf cannot match the format to the data, it reads only the portion
that matches and stops processing.

Tip: when the * symbol is included in a format descriptor, fscanf
matches that field but does not store it. This is very useful for
skipping pieces of text, using the descriptor %*s
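
A small sketch of the `size` argument, assuming a temporary file with two rows of four numbers. `fscanf` fills the output array column by column, so reading with size `[4 Inf]` and transposing gives a matrix whose rows match the lines of the file.

```matlab
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '4.21 6.55 6.78 6.55\n9.15 0.35 7.57 7.15\n');
fclose(fid);

fid = fopen(fname, 'r');
% 4 values per column, as many columns as the file provides.
M = fscanf(fid, '%f', [4 Inf])';   % transpose: one file line per row
fclose(fid);

delete(fname);
disp(M);
```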


Reading from files

fscanf. Example:

myScript.m
filename = 'myFile.txt';
fid = fopen(filename);
N = fscanf(fid, '%*s %*s\nN=%d\n\n', 1);
fclose(fid);

myFile.txt
Hello World
N=3

4.21 6.55 6.78 6.55

N is a column vector with 1 element (one single value).
%*s skips characters until a blank space or a new line is reached.

How the format is matched against the file:
Hello World            skip 2 strings + go to next line: %*s %*s\n
N=3                    ignore 'N=', read integer: N=%d\n
(blank line)           go to next line: \n
4.21 6.55 6.78 6.55    after the statement the pointer is here
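
The example can be tried end to end with the sketch below, which first recreates the file contents in a temporary file (an assumption, so that the script is self-contained) and then runs the same `fscanf` call. A second `fscanf` call continues from where the pointer was left and reads the remaining numbers.

```matlab
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'Hello World\nN=3\n\n4.21 6.55 6.78 6.55\n');
fclose(fid);

fid = fopen(fname, 'r');
% Skip two words, match the literal 'N=' and read one integer.
N = fscanf(fid, '%*s %*s\nN=%d\n\n', 1);
% Continue reading from the current pointer position.
values = fscanf(fid, '%f');
fclose(fid);

delete(fname);
disp(N);
disp(values');
```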

11/24/2018


Exercise
The following text file represents a series of records with three elements
each: time, date and matrix. Write a MATLAB program to read the
whole file and store all the data in a vector of structures. The length of
the file (and therefore the number of records) is unknown.
myFile.txt
12:00:00
01-Jan-1977
4.21 6.55 6.78 6.55
9.15 0.35 7.57 7.15
7.92 8.49 7.43 7.06
9.59 9.33 3.92 0.31
09:10:02
23-Aug-1990
2.76 6.94 4.38 1.86
0.46 3.17 3.11 4.89
0.97 9.50 7.65 4.45
8.23 0.34 7.95 6.46
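
One possible approach to the exercise is sketched below; it is not the only solution. It assumes that every record has exactly the layout shown (one time line, one date line, four lines of four numbers) and reads the file line by line with `fgetl`, converting the numeric lines with `sscanf`. The loop stops when `fgetl` reaches the end of the file and returns -1. The script first recreates the sample file in a temporary location so that it can run on its own; the field names `time`, `date` and `matrix` are a choice made for this example.

```matlab
% Recreate the sample file so that the sketch is self-contained.
fname = [tempname '.txt'];
sample = {'12:00:00', '01-Jan-1977', ...
          '4.21 6.55 6.78 6.55', '9.15 0.35 7.57 7.15', ...
          '7.92 8.49 7.43 7.06', '9.59 9.33 3.92 0.31', ...
          '09:10:02', '23-Aug-1990', ...
          '2.76 6.94 4.38 1.86', '0.46 3.17 3.11 4.89', ...
          '0.97 9.50 7.65 4.45', '8.23 0.34 7.95 6.46'};
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', sample{:});
fclose(fid);

% Read records until the end of the file is reached.
records = struct('time', {}, 'date', {}, 'matrix', {});
fid = fopen(fname, 'r');
while true
    timeStr = fgetl(fid);
    if ~ischar(timeStr) || isempty(strtrim(timeStr))
        break;                       % end of file (or trailing blank line)
    end
    dateStr = fgetl(fid);
    M = zeros(4, 4);
    for r = 1:4
        M(r, :) = sscanf(fgetl(fid), '%f')';   % one matrix row per line
    end
    records(end + 1) = struct('time', timeStr, 'date', dateStr, 'matrix', M);
end
fclose(fid);

delete(fname);
fprintf('%d records read\n', numel(records));
```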

OTHER KINDS OF FILES: DATABASES

External data storage
Databases

From an abstract perspective, a database is a data
storage organized according to a logical structure

Relational databases: data is stored in tables

Record store database
• Album table: album id, title, year, group id, song set id
• Group table: group id, name, city, number of members
• Song set table: song set id, song id
• Songs table: song id, title, length minutes, length seconds


Database data is stored as files on the hard disk
• These files are not accessed directly
• A special program is used: the RDBMS (Relational Database Management
System)
• Oracle DBMS, MySQL, etc.

Non-relational databases: documents (MongoDB), graph-based
(Neo4j), key-value (Cassandra), etc.


The RDBMS provides a query language to retrieve
data from tables

SQL (Structured Query Language)

> Retrieve the title and year of all albums by “Arcade Fire”
SELECT Title, Year
FROM Album, "Group"   -- Group is a reserved word in SQL, so the table name is quoted
WHERE Album.Group_ID = "Group".Group_ID
  AND Name = 'Arcade Fire'
