
Files - Python

Mrs.S.Karthiga
Files

• A file is a contiguous set of bytes used to store data.
• The data is organized in a specific format and can be anything from a simple text file to a program executable.
• Ultimately, the computer stores and processes these bytes as binary 1s and 0s.
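To make the "set of bytes" idea concrete, here is a minimal sketch that writes a short text file and reads it back in binary mode ("rb"). The file name demo_bytes.txt and the temporary directory are assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical file name inside a temporary directory (illustration only).
path = os.path.join(tempfile.mkdtemp(), "demo_bytes.txt")

# Write two characters of text using an explicit encoding.
f = open(path, "w", encoding="utf-8")
f.write("hi")
f.close()

# Reading in binary mode returns the raw bytes stored on disk.
f = open(path, "rb")
raw = f.read()
f.close()

bits = [format(b, "08b") for b in raw]

print(raw)         # b'hi'
print(list(raw))   # [104, 105]
print(bits)        # ['01101000', '01101001']
```

Each character of "hi" is stored as one byte here, and each byte is just eight binary digits.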
Opening a File
• Python needs no external library to read and write files. The built-in open() function creates, reads and writes files.

Syntax:-
file = open("a.txt")

Calling open() returns a file object. With only a file name, the file is opened for reading ("r"), so it must already exist.
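As a small illustration of the call above: because open() with only a file name uses read mode, it raises FileNotFoundError when the file is missing. The path below is hypothetical, chosen so that the file does not exist.

```python
import os
import tempfile

# Hypothetical path in a fresh temporary directory; the file does not exist.
missing = os.path.join(tempfile.mkdtemp(), "a.txt")

try:
    file = open(missing)          # same as open(missing, "r")
except FileNotFoundError:
    opened = False
    print("a.txt does not exist yet")
else:
    opened = True
    file.close()
```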


Create a file
f = open("hi.txt", "w+")
• We declared the variable f to open a file named hi.txt. open() takes two arguments here: the file that we want to open, and a string that represents the kind of operation (the mode) we want to perform on the file.
• Here we used "w", which indicates write: the file is created if it does not exist in the current directory, and emptied if it does. The plus sign adds reading, so "w+" opens the file for both writing and reading.
• The other common options besides "w" are "r" for read and "a" for append. "a" also creates the file if it is missing; "r" and "r+" require the file to exist.
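The difference between the modes can be sketched as follows. The file lives in a temporary directory, and the strings written are placeholders chosen for the demonstration.

```python
import os
import tempfile

# Hypothetical file inside a temporary directory (illustration only).
path = os.path.join(tempfile.mkdtemp(), "hi.txt")

# "w" creates the file, or empties it if it already exists.
f = open(path, "w")
f.write("first\n")
f.close()

# "a" keeps the existing contents and writes at the end.
f = open(path, "a")
f.write("second\n")
f.close()

# "r+" opens an existing file for reading and writing without emptying it.
f = open(path, "r+")
after_append = f.read()
f.close()

# "w+" empties the file, then allows both writing and reading.
f = open(path, "w+")
f.write("third\n")
f.seek(0)                 # move back to the start before reading
after_w_plus = f.read()
f.close()

print(repr(after_append))   # 'first\nsecond\n'
print(repr(after_w_plus))   # 'third\n'
```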
Writing in a file

file = open("t.txt","w+")
file.write("hi")
file.close()

After this runs, t.txt contains the text "hi".


File open(), close() and iteration
def main():
    f = open("hello.txt", "w+")
    # Write ten numbered lines, one per loop iteration.
    for p in range(10):
        f.write("This is line%d\n" % p)
    f.close()

if __name__ == "__main__":
    main()

Contents of hello.txt:
This is line0
This is line1
This is line2
This is line3
This is line4
This is line5
This is line6
This is line7
This is line8
This is line9
How to Append Data to a File
def main():
    f = open("hello.txt", "a+")
    # Add three more lines after the existing contents.
    for p in range(3):
        f.write("appended line%d\n" % p)
    f.close()

if __name__ == "__main__":
    main()

Contents of hello.txt:
This is line0
This is line1
This is line2
This is line3
This is line4
This is line5
This is line6
This is line7
This is line8
This is line9
appended line0
appended line1
appended line2
• "a" opens the file for appending and creates it if it does not exist; the plus sign additionally allows reading. In our case the file already exists, so the new lines are simply added to the end.
How to Read a File
• Python can not only create a .txt file but also open one in read mode ("r").
• Ex:-
def main():
    f = open("hello.txt", "r")
    # Read only if the file was really opened in read mode.
    if f.mode == "r":
        contents = f.read()
        print(contents)
    f.close()

if __name__ == "__main__":
    main()

Output (the contents of hello.txt):
This is line0
This is line1
This is line2
This is line3
This is line4
This is line5
This is line6
This is line7
This is line8
This is line9
appended line0
appended line1
appended line2
How to Read a File line by line

• You can also read a .txt file line by line, which is easier to work with than one large string.
• readlines() returns a list with one string per line. Each string keeps its trailing newline, so print() shows a blank line after every line.
• Ex:-
def main():
    f = open("hello.txt", "r")
    f1 = f.readlines()
    for x in f1:
        print(x)
    f.close()

if __name__ == "__main__":
    main()

Output:
This is line0

This is line1

This is line2

This is line3

This is line4

This is line5

…..
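readlines() loads every line into a list at once. For large files, a common alternative is to iterate over the file object itself, which yields one line at a time. The sketch below assumes a small hypothetical file created just for the demonstration; rstrip("\n") removes the trailing newline so that print() does not produce blank lines.

```python
import os
import tempfile

# Create a small hypothetical file to read back (illustration only).
path = os.path.join(tempfile.mkdtemp(), "hello.txt")
f = open(path, "w")
for p in range(3):
    f.write("This is line%d\n" % p)
f.close()

# Iterating over the file object reads one line at a time.
lines_seen = []
f = open(path, "r")
for line in f:
    text = line.rstrip("\n")   # drop the newline that ends each line
    lines_seen.append(text)
    print(text)
f.close()
```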
Writing multiple lines to a file at once
fh = open("hello.txt","w")
lines_of_text = ["One line of text here\n", "and another line here"]
fh.writelines(lines_of_text)
fh.close()

hello.txt now contains:
One line of text here
and another line here
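Note that writelines() does not add line endings itself; each string must carry its own "\n". The following sketch, using a hypothetical list of names, shows one way to add them.

```python
import os
import tempfile

# Hypothetical file inside a temporary directory (illustration only).
path = os.path.join(tempfile.mkdtemp(), "hello.txt")

# Hypothetical data without trailing newlines.
names = ["alpha", "beta", "gamma"]

fh = open(path, "w")
# Append "\n" to each item so that every name lands on its own line.
fh.writelines(name + "\n" for name in names)
fh.close()

fh = open(path, "r")
written = fh.read()
fh.close()

print(repr(written))   # 'alpha\nbeta\ngamma\n'
```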
With statement
• You can also work with file objects using the with statement.
• It provides cleaner syntax and exception handling when working with files, which is why it is good practice to use it where applicable.
• A bonus is that any file opened this way is closed automatically when the block ends, even if an error occurs. This leaves less to worry about during cleanup.
Ex:-1
with open("hello.txt", "w") as f:
    f.write("Hello World")

Ex:-2 : To read a file line by line
with open("hello.txt", "r") as f:
    data = f.readlines()
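The automatic closing can be checked through the file object's closed attribute. This is a minimal sketch; the temporary path and the simulated error are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical file inside a temporary directory (illustration only).
path = os.path.join(tempfile.mkdtemp(), "hello.txt")

with open(path, "w") as f:
    f.write("Hello World")
closed_after_write = f.closed      # True: the with block closed the file

# The file is closed even when the block raises an exception.
try:
    with open(path, "r") as g:
        raise ValueError("simulated failure")   # hypothetical error
except ValueError:
    pass
closed_after_error = g.closed      # True

print(closed_after_write, closed_after_error)   # True True
```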
Splitting Lines in a Text File

with open("hello.text", "r") as f:
    data = f.readlines()
    for line in data:
        # split() breaks the line into words at whitespace.
        words = line.split()
        print(words)

Output:
['hello', 'world', 'how', 'are', 'you', 'today?']
['today', 'is', 'Saturday']
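split() with no argument splits on any run of whitespace and also discards the trailing newline. To split on another separator, pass it explicitly and strip the newline first. A sketch with hypothetical lines, as they might come back from readlines():

```python
# Hypothetical lines as they might come from readlines().
data = ["hello world  how are you\n", "red,green,blue\n"]

# No argument: split on whitespace; extra spaces and "\n" are ignored.
words = data[0].split()

# Explicit separator: strip the newline first, then split on commas.
fields = data[1].strip().split(",")

print(words)    # ['hello', 'world', 'how', 'are', 'you']
print(fields)   # ['red', 'green', 'blue']
```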
Word count example
wordstring = 'it was the best of times it was the worst of times '
wordstring += 'it was the age of wisdom it was the age of foolishness'

# Split the string into a list of words.
wordlist = wordstring.split()

# For each word, count how often it occurs in the whole list.
wordfreq = []
for w in wordlist:
    wordfreq.append(wordlist.count(w))

print("String\n" + wordstring + "\n")
print("List\n" + str(wordlist) + "\n")
print("Frequencies\n" + str(wordfreq) + "\n")
# zip() returns an iterator in Python 3, so convert it to a list before printing.
print("Pairs\n" + str(list(zip(wordlist, wordfreq))))
Output
String
it was the best of times it was the worst of times it was the age of wisdom it was the age of foolishness

List
['it', 'was', 'the', 'best', 'of', 'times', 'it', 'was', 'the', 'worst', 'of', 'times', 'it', 'was', 'the', 'age', 'of', 'wisdom', 'it', 'was', 'the', 'age', 'of', 'foolishness']

Frequencies
[4, 4, 4, 1, 4, 2, 4, 4, 4, 1, 4, 2, 4, 4, 4, 2, 4, 1, 4, 4, 4, 2, 4, 1]

Pairs
[('it', 4), ('was', 4), ('the', 4), ('best', 1), ('of', 4), ('times', 2), ('it', 4), ('was', 4), ('the', 4), ('worst', 1), ('of', 4), ('times', 2), ('it', 4), ('was', 4), ('the', 4), ('age', 2), ('of', 4), ('wisdom', 1), ('it', 4), ('was', 4), ('the', 4), ('age', 2), ('of', 4), ('foolishness', 1)]
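The pairs above repeat once for every occurrence of a word, and list.count() rescans the whole list each time. If one count per distinct word is wanted, collections.Counter from the standard library is a common alternative (a swapped-in technique, not the method used in the example above):

```python
from collections import Counter

wordstring = 'it was the best of times it was the worst of times '
wordstring += 'it was the age of wisdom it was the age of foolishness'

# Counter maps each distinct word to the number of times it occurs.
counts = Counter(wordstring.split())

print(counts['it'])       # 4
print(counts['times'])    # 2
print(len(counts))        # 10 distinct words
```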
Removing stop words Example
• The process of converting data into something a computer can understand is referred to as pre-processing. One of the major forms of pre-processing is filtering out useless data. In natural language processing, such useless words are referred to as stop words.
• Stop words: A stop word is a commonly used word (such as "the", "a", "an", "in") that a search engine has been programmed to ignore.
To check the list of stop words, type the following commands in the Python shell. If the corpus is not installed yet, run nltk.download('stopwords') once first.

import nltk
from nltk.corpus import stopwords
set(stopwords.words('english'))

{'ourselves', 'hers', 'between', 'yourself', 'but', 'again', 'there', 'about', 'once', 'during', 'out',
'very', 'having', 'with', 'they', 'own', 'an', 'be', 'some', 'for', 'do', 'its', 'yours', 'such', 'into',
'of', 'most', 'itself', 'other', 'off', 'is', 's', 'am', 'or', 'who', 'as', 'from', 'him', 'each', 'the',
'themselves', 'until', 'below', 'are', 'we', 'these', 'your', 'his', 'through', 'don', 'nor', 'me',
'were', 'her', 'more', 'himself', 'this', 'down', 'should', 'our', 'their', 'while', 'above', 'both',
'up', 'to', 'ours', 'had', 'she', 'all', 'no', 'when', 'at', 'any', 'before', 'them', 'same', 'and',
'been', 'have', 'in', 'will', 'on', 'does', 'yourselves', 'then', 'that', 'because', 'what', 'over',
'why', 'so', 'can', 'did', 'not', 'now', 'under', 'he', 'you', 'herself', 'has', 'just', 'where', 'too',
'only', 'myself', 'which', 'those', 'i', 'after', 'few', 'whom', 't', 'being', 'if', 'theirs', 'my',
'against', 'a', 'by', 'doing', 'it', 'how', 'further', 'was', 'here', 'than'}

Note: You can even modify the list by adding words of your choice to the english file in the stopwords directory of your NLTK data.
Ex:- removing stop words
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

example_sent = "This is a sample sentence, showing off the stop words filtration."

stop_words = set(stopwords.words('english'))
word_tokens = word_tokenize(example_sent)

# Keep only the tokens that are not stop words (list comprehension).
filtered_sentence = [w for w in word_tokens if w not in stop_words]

# The same filtering written as an explicit loop.
filtered_sentence = []
for w in word_tokens:
    if w not in stop_words:
        filtered_sentence.append(w)

print(word_tokens)
print(filtered_sentence)
Output
['This', 'is', 'a', 'sample', 'sentence', ',', 'showing', 'off', 'the', 'stop', 'words', 'filtration', '.']
['This', 'sample', 'sentence', ',', 'showing', 'stop', 'words', 'filtration', '.']
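In the output above, 'This' survives because the NLTK stop word list is lowercase and the comparison is case-sensitive. A minimal sketch of case-insensitive filtering follows. To keep it runnable without NLTK, it assumes a small hand-written stop word set and a pre-tokenized list, both hypothetical stand-ins for stopwords.words('english') and word_tokenize().

```python
# Hypothetical stand-in for set(stopwords.words('english')).
stop_words = {"this", "is", "a", "off", "the"}

# Hypothetical stand-in for word_tokenize(example_sent).
word_tokens = ['This', 'is', 'a', 'sample', 'sentence', ',', 'showing',
               'off', 'the', 'stop', 'words', 'filtration', '.']

# Lowercase each token before the membership test.
filtered_sentence = [w for w in word_tokens if w.lower() not in stop_words]

print(filtered_sentence)
# ['sample', 'sentence', ',', 'showing', 'stop', 'words', 'filtration', '.']
```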
Performing the Stopwords operations in a file
from nltk.corpus import stopwords

stop_words = set(stopwords.words('english'))

# Read the whole file into one string. (word_tokenize could be used
# instead of split(), but it accepts a string as input, not a file.)
with open("text.txt") as file1:
    line = file1.read()

words = line.split()

# Open the output file once and append every word that is not a stop word.
with open("filteredtext.txt", "a") as appendFile:
    for r in words:
        if r not in stop_words:
            appendFile.write(" " + r)
Command Line arguments
• The Python sys module provides access to command-line arguments through sys.argv:
• sys.argv is the list of command-line arguments. Its first element, sys.argv[0], is the script name.
• len(sys.argv) is the number of command-line arguments, including the script name.
• Ex:-

import sys
print("Number of arguments:", len(sys.argv), "arguments.")
print("Argument List:", str(sys.argv))

If you save this as test.py and run it from a command line:
$ python test.py arg1 arg2 arg3

Output:
Number of arguments: 4 arguments.
Argument List: ['test.py', 'arg1', 'arg2', 'arg3']
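Since the script name occupies the first position of sys.argv, the arguments typed by the user are usually taken from index 1 onward. The helper describe_args below is a hypothetical name introduced for this sketch, and it is called with a simulated list so the example runs without a command line.

```python
def describe_args(argv):
    """Return (script_name, user_args) from an argv-style list."""
    script_name = argv[0]      # first element: the script name
    user_args = argv[1:]       # the rest: arguments typed after it
    return script_name, user_args

# Simulated argv for: python test.py arg1 arg2 arg3
name, args = describe_args(['test.py', 'arg1', 'arg2', 'arg3'])

print(name)        # test.py
print(args)        # ['arg1', 'arg2', 'arg3']
print(len(args))   # 3

# In a real script: import sys, then call describe_args(sys.argv).
```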
