Lab Manual - Data Mining
Last Received
2024
2. Switch off your mobile phones during Lab class and maintain silence.
4. Do not play games, watch movies, chat, or listen to music during the class.
5. Do not change desktop settings, screen saver, or any other system settings.
1. Submission of documented lab reports related to completed lab assignments should be done
patterns as reflected in the lab rubric which eventually will benefit the students.
PO2. Problem analysis: Identify, formulate, research literature, and analyze engineering problems to
arrive at substantiated conclusions using the first principles of mathematics, natural, and engineering
sciences.
PO3. Design/Development of solutions: Design solutions for complex engineering problems and design
system components, and processes to meet the specifications with consideration for the public health and
safety, and the cultural, societal, and environmental considerations.
PO4. Conduct investigations of complex problems: Use research-based knowledge including design of
experiments, analysis and interpretation of data, and synthesis of the information to provide valid
conclusions.
PO5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern
engineering and IT tools including prediction and modelling to complex engineering activities with an
understanding of the limitations.
PO6. The engineer and society: Apply reasoning informed by the contextual knowledge to assess
societal, health, safety, legal, and cultural issues and the consequent responsibilities relevant to the
professional engineering practice.
PO7. Environment and sustainability: Understand the impact of professional engineering solutions in
societal and environmental contexts, and demonstrate the knowledge of and need for sustainable
development.
PO8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms
of the engineering practice.
PO9. Individual and team work: Function effectively as an individual, and as a member or leader in
teams, and in multidisciplinary settings.
PO10. Communications: Communicate effectively with the engineering community and with the society
at large. Be able to comprehend and write effective reports and documentation. Make effective presentations
and give and receive clear instructions.
PO11. Project management and finance: Demonstrate knowledge and understanding of engineering and
management principles and apply these to one’s work, as a member and leader in a team. Manage projects
in multidisciplinary environments.
PO12. Life-long learning: Recognize the need for, and have the preparation and ability to engage in
independent and lifelong learning in the broadest context of technological change.
PSO1: Ability to develop solutions for scientific, analytical, and research-oriented problems in the area of
Computer Science and Engineering (Data Science).
PSO2: Ability to apply suitable programming skills integrated with professional competence to develop
applications catering to the industrial and societal needs in the field of Computer Science and Engineering
(Data Science) and its allied areas.
SYLLABUS
Manipulating strings, Processing Files, Manipulating Lists, Lists and Strings, Dictionary, counting
with Dictionaries, Dictionaries and Files, Tuples, Tuples and Sorting, Regular Expressions,
Networked programs, Sockets and Applications, parsing HTML with Beautiful Soup, parsing XML
with Python, REST, JSON and APIs, extracting data from JSON, using databases with Python, Object-
oriented Python, Geocoding, Page rank and web searching, Gmane.
Experiment No: 1
Experiment: Basic recap of list, string, tuple, and dictionary using Python
The four main types of objects you will work with are strings, lists, dictionaries, and tuples.
Knowing how to manipulate these objects is incredibly important in Python. Before discussing
each object in detail, I wanted to give an overview of the four objects.
Definitions:
Immutable vs mutable:
• Strings are immutable. When using strings, bracket operators cannot be used on the
left side of an assignment to change that string.
• Lists are mutable. Unlike strings, you can use the bracket operator on the left side of
the assignment to change an element in a given list.
• Dictionaries are mutable. You can modify any item (key-value pair) in a dictionary
using the bracket operator on the left side of the assignment.
• Tuples are immutable. You can extract elements, but you cannot modify elements of a
tuple on the left side of the assignment operator.
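The difference can be checked directly. The following is a minimal sketch; the variable names and sample values are made up for illustration.

```python
# Lists and dictionaries accept item assignment; strings and tuples do not.
nums = [1, 2, 3]
nums[0] = 10                  # lists are mutable
ages = {"asha": 20}
ages["asha"] = 21             # dictionaries are mutable

word = "hello"
point = (1, 2)
errors = []
for immutable in (word, point):
    try:
        immutable[0] = "x"    # not allowed for strings or tuples
    except TypeError:
        errors.append(type(immutable).__name__)

print(nums)      # [10, 2, 3]
print(ages)      # {'asha': 21}
print(errors)    # ['str', 'tuple']
```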
The in operator:
The in operator (and the not in operator) is a boolean operator that takes two arguments and
determines whether the first argument is a substring or an element of the second argument.
• When both arguments are strings, the operator checks whether the first string appears as a
substring of the second and returns True or False.
• If the second argument is a list, the operator checks whether the first argument appears as
one of the elements of the list and returns True or False.
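A short sketch of both cases, using made-up sample values. Note that a substring match counts for a string but not for a list, where the whole element must match.

```python
city = "Boston"
cities = ["Boston", "Delhi", "Paris"]

in_string = "Bos" in city             # substring test on a string
in_list = "Bos" in cities             # element test on a list: needs an exact match
not_in_list = "London" not in cities  # negated membership test

print(in_string, in_list, not_in_list)   # True False True
```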
For strings, lists, and dictionaries, there is a set of methods you can use to manipulate the
objects. Because tuples are immutable, there are no methods to modify the objects. In general,
the notation for methods is the dot notation. The syntax is the name of the object followed
by a dot (or period) followed by the method's name.
x = "Hello Boston!"
x.split()
Here we apply the method split() to a string (making this a string method). With no argument, the
method splits the string at whitespace and returns a list, here ['Hello', 'Boston!'].
In the subsequent pages, I will compare and contrast the similarities and differences of strings,
lists, dictionaries, and tuples in greater detail. In general, lists are more common than tuples
(mostly because they are mutable), but there are a few reasons why tuples may be preferable:
1. In some contexts, like a return statement, it is syntactically simpler to create a tuple than
a list. In other contexts, you might prefer a list.
2. If you want to use a sequence as a dictionary key, use an immutable type like a tuple or
string.
3. If we pass a sequence as an argument to a function, using tuples reduces the potential
for unexpected behaviour due to aliasing.
Again, because tuples are immutable, they do not provide methods that modify the object in place,
such as append() or sort(); they only offer count() and index().
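The first two reasons can be sketched as follows. The function name min_max and the coordinate data are hypothetical and only serve as an illustration.

```python
def min_max(values):
    # Returning two values builds a tuple without extra syntax.
    return min(values), max(values)

low, high = min_max([4, 9, 1, 7])

# Tuples can be dictionary keys; the coordinates are made-up sample data.
places = {(22.57, 88.36): "Kolkata", (28.61, 77.21): "Delhi"}
name = places[(28.61, 77.21)]

try:
    bad = {[22.57, 88.36]: "Kolkata"}   # a list is unhashable
    list_key_ok = True
except TypeError:
    list_key_ok = False

print(low, high, name, list_key_ok)   # 1 9 Delhi False
```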
Q1. Write a Python script that prints prime numbers less than 20.
Ans:
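One possible answer is sketched below. It uses trial division; the helper name is_prime is my own choice and not part of the question.

```python
def is_prime(n):
    """Return True if n is a prime number (checked by trial division)."""
    if n < 2:
        return False
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return False
    return True

# Collect and print every prime below 20.
primes = [n for n in range(2, 20) if is_prime(n)]
print(primes)   # [2, 3, 5, 7, 11, 13, 17, 19]
```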
Q2. Write a program to create, concatenate, and print a string and access a substring from a
given string.
Ans:
# Create a string
user_str1 = input('Enter a string: ')
# Concatenate strings
user_str2 = user_str1 + 'World!'
# Print the concatenated string
print('Concatenated String:', user_str2)
# Access substring
start_index = int(input('Enter the starting index for the substring: '))
end_index = int(input('Enter the ending index for the substring: '))
user_substring = user_str2[start_index:end_index]
# Print the substring
print('Substring:', user_substring)
Experiment No: 2
Experiment: String and File operations
Indexing means referring to an element of an iterable by its position within the iterable. Each
of a string’s characters corresponds to an index number and each character can be accessed
using its index number. We can access the characters of a string in two ways:
Accessing Characters by Positive Index Number: In this type of indexing, we pass a positive
index (the position we want to access) in square brackets. Index numbers start from
0, which denotes the first character of the string.
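The listing for this example is not present here. A minimal sketch, assuming the string is "Techno Main Salt Lake" (the value used in other examples of this manual) and assuming the indices 0, 7, and 4, which reproduce the output that follows:

```python
a = "Techno Main Salt Lake"   # assumed sample string

first = a[0]     # index 0 is the first character
eighth = a[7]    # index 7 is the eighth character
fifth = a[4]     # index 4 is the fifth character

print(first)     # T
print(eighth)    # M
print(fifth)     # n
```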
Output:
T
M
n
Accessing Characters by Negative Index Number: In this type of indexing, we pass a
negative index (the position we want to access) in square brackets. Here index numbers start from
-1, which denotes the last character of the string. Example 2 (Negative Indexing):
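The listing for this example is also not present. A minimal sketch, again assuming the string "Techno Main Salt Lake" and assuming the indices -1, -3, and -11, which reproduce the output that follows:

```python
a = "Techno Main Salt Lake"   # assumed sample string

last = a[-1]            # -1 is the last character
third_last = a[-3]      # -3 is the third character from the end
eleventh_last = a[-11]  # same position as a[len(a) - 11]

print(last)            # e
print(third_last)      # a
print(eleventh_last)   # n
```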
Output:
e
a
n
Slicing Operator
A slice object is used to specify how to slice a sequence. We can specify where to start the
slicing and where to end. We can also specify the step, which allows us, for example, to take only
every other item.
a = "Techno Main Salt Lake"  # assumed definition of a; it is consistent with the output shown
x = slice(0, 10, 2)
print(a[x])
Output:
Tcn a
1. strip():- This method is used to delete all the leading and trailing characters
mentioned in its argument.
2. lstrip():- This method is used to delete all the leading characters mentioned in its
argument.
3. rstrip():- This method is used to delete all the trailing characters mentioned in its
argument.
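The listing for these three methods is not present here. A minimal sketch with hypothetical sample strings; note that the argument is treated as a set of characters to remove, not as a fixed prefix or suffix.

```python
s = "---Techno Main---"      # hypothetical sample string

both = s.strip("-")          # removes '-' from both ends
left = s.lstrip("-")         # removes '-' from the left end only
right = s.rstrip("-")        # removes '-' from the right end only

print(both)    # Techno Main
print(left)    # Techno Main---
print(right)   # ---Techno Main

# Every leading/trailing character found in the argument is removed.
mixed = "xyxHelloyx".strip("xy")
print(mixed)   # Hello
```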
Output:
replace():- This function replaces a substring with a new substring in the string. It takes up to
3 arguments: the substring to replace, the new substring that replaces it, and an optional maximum
number of replacements (by default unlimited).
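text = "TMSL is the best"  # assumed source string, reconstructed from the output shown
str1 = "TMSL"
str2 = "Techno Main Salt Lake"
print("The string after replacing strings is : " + text.replace(str1, str2))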
Output:
The string after replacing strings is : Techno Main Salt Lake is the best
Python String Methods (find, rfind, startswith, endswith, lower, upper, swapcase & title):
find("string", beg, end):- This function finds the position of the first occurrence of the substring
within a string. It takes 3 arguments: the substring, the starting index (by default 0), and the
ending index (by default the string length). It returns -1 if the substring is not found.
rfind("string", beg, end):- This function works like find(), but it returns the
position of the last occurrence of the substring.
Output:
The first occurrence of str2 is at : 17
The last occurrence of str2 is at : 42
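The listing that produced the output above is not present here. The following sketch shows the same two methods on a hypothetical sample string, so its positions differ from the ones above.

```python
text = "salt lake and salt water"   # hypothetical sample string

first = text.find("salt")           # position of the first occurrence
last = text.rfind("salt")           # position of the last occurrence
after_first = text.find("salt", 1)  # start searching from index 1
missing = text.find("sugar")        # -1 when the substring is absent

print(first, last, after_first, missing)   # 0 14 14 -1
```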
startswith("string", beg, end) :- This function returns True if the string begins
with the given substring (prefix); otherwise it returns False.
endswith("string", beg, end) :- This function returns True if the string ends
with the given substring (suffix); otherwise it returns False.
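txt = "Hello, welcome to TMSL"  # assumed sample string; the original definition is missing
print(txt.startswith("Hello"))
x1 = txt.endswith("Hello")
print(x1)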
Output:
True
False
lower():- This function returns a new string with all the letters converted to
lowercase.
upper():- This function returns a new string with all the letters converted to
uppercase.
swapcase():- This function swaps the case of each letter, i.e. upper case is converted to
lower case and vice versa.
title():- This function converts the string to title case, i.e. the first letter of every word is
upper-cased and all other letters are lower-cased.
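The listing for these four methods is not present here. A minimal sketch, assuming the string is "Techno Main Salt Lake", which is consistent with the output that follows:

```python
s = "Techno Main Salt Lake"   # assumed sample string

lowered = s.lower()
uppered = s.upper()
swapped = s.swapcase()
titled = s.title()

print("The lower case converted string is : " + lowered)
print("The upper case converted string is : " + uppered)
print("The swap case converted string is : " + swapped)
print("The title case converted string is : " + titled)
```

None of these calls changes s itself; each one returns a new string, because strings are immutable.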
Output:
The lower case converted string is : techno main salt lake
The upper case converted string is : TECHNO MAIN SALT LAKE
The swap case converted string is : tECHNO mAIN sALT lAKE
The title case converted string is : Techno Main Salt Lake
Python String split() method splits a string into a list of strings after breaking the given string
by the specified separator.
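text = 'Techno Main Salt Lake'
# Splits at space
print(text.split())
# The next two definitions of word are assumed; they are consistent with the output shown
word = 'Techno, Main, Salt, Lake'
# Splits at ','
print(word.split(','))
word = 'Techno: Main: Salt: Lake'
# Splitting at ':'
print(word.split(':'))
word = 'CatBatSatFatOr'
# Splitting at t
print(word.split('t'))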
Output:
['Techno', 'Main', 'Salt', 'Lake']
['Techno', ' Main', ' Salt', ' Lake']
['Techno', ' Main', ' Salt', ' Lake']
['Ca', 'Ba', 'Sa', 'Fa', 'Or']
In Python, we can use the join() method with different types of iterables such as Lists,
Tuple, String, Dictionary, and Sets. Let’s understand them one by one with the help of
examples.
joined = '-'.join('TMSL')  # avoid naming the variable str, which would shadow the built-in type
print(joined)
Output:
T-M-S-L
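The other iterable types can be sketched in the same way. The sample values below are made up; the only requirement of join() is that every item is a string.

```python
words = ["Techno", "Main", "Salt", "Lake"]

from_list = " ".join(words)                       # list of strings
from_tuple = "/".join(("2024", "01", "15"))       # tuple of strings
from_dict = ",".join({"name": "x", "city": "y"})  # a dictionary yields its keys
from_set = "".join(sorted({"b", "a", "c"}))       # sets are unordered, so sort first

# Non-string items must be converted before joining.
from_numbers = "-".join(str(n) for n in [1, 2, 3])

print(from_list)      # Techno Main Salt Lake
print(from_tuple)     # 2024/01/15
print(from_dict)      # name,city
print(from_set)       # abc
print(from_numbers)   # 1-2-3
```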
Python String index() Method allows a user to find the index of the first occurrence of an
existing substring inside a given string in Python.
Output:
The first position of geeks after 2nd index : 12
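The listing that produced the output above is not present here. The following sketch uses a hypothetical sample string, so its positions differ. It also shows the main difference from find(): index() raises ValueError when the substring is absent.

```python
text = "salt lake salt"   # hypothetical sample string

first = text.index("salt")      # first occurrence
later = text.index("salt", 2)   # first occurrence at or after index 2

try:
    text.index("sugar")
    found = True
except ValueError:
    found = False

print(first, later, found)   # 0 10 False
```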
Python String rindex() method returns the highest index of the substring inside the string if
the substring is found. Otherwise, it raises ValueError.
text = 'TMSL is the best'  # assumed sample string; it is consistent with the output shown
result = text.rindex('best')
print("Substring 'best':", result)
Output:
Substring 'best': 12
The string len() function is a built-in function in Python that is used to find the length of the
string. You can know the number of characters in a string using this function.
Output:
17
String count() function is an inbuilt function in Python programming language that returns
the number of occurrences of a substring in the given string. It is a very useful string function
that can be used for string analysis.
#initializing a string
my_string = "Techno Main Salt Lake"
#using string count() method
char_count = my_string.count('a')
#printing the result
print(char_count)
Output:
3
Python String capitalize() method returns a copy of the original string and converts the first
character of the string to a capital (uppercase) letter while making all other characters in the
string lowercase letters.
# initializing a string
name = "techno main salt lake"
#using capitalize function
print(name.capitalize())
Access modes govern the type of operations possible in the opened file, i.e. how the file
will be used once it is opened. These modes also define the location of the file handle in the
file. The file handle is like a cursor: it defines the position from which data is read or to which
data is written in the file. Writing through a file handle is also how we save Python output to a text file.
Read Only (‘r’): Open text file for reading. The handle is positioned at the beginning of the
file. If the file does not exist, Python raises FileNotFoundError (an I/O error). This is also the
default mode in which a file is opened.
Read and Write (‘r+’): Open the file for reading and writing. The handle is positioned at the
beginning of the file. Raises FileNotFoundError if the file does not exist.
Write Only (‘w’): Open the file for writing. For the existing files, the data is truncated and
over-written. The handle is positioned at the beginning of the file. Creates the file if the file
does not exist.
Write and Read (‘w+’): Open the file for reading and writing. For an existing file, data is
truncated and over-written. The handle is positioned at the beginning of the file.
Append Only (‘a’): Open the file for writing. The file is created if it does not exist. The handle
is positioned at the end of the file. The data being written will be inserted at the end, after the
existing data.
Append and Read (‘a+’): Open the file for reading and writing. The file is created if it does
not exist. The handle is positioned at the end of the file. The data being written will be inserted
at the end, after the existing data.
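The behaviour of the modes can be checked with a small experiment. This is a minimal sketch; the file names modes_demo.txt and missing.txt are hypothetical, and the file is created in a temporary directory so that nothing existing is overwritten.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")   # hypothetical file name

with open(path, "w") as f:     # 'w' creates the file (or truncates it)
    f.write("Hello\n")

with open(path, "a") as f:     # 'a' positions the handle at the end
    f.write("World\n")

with open(path, "r+") as f:    # 'r+' positions the handle at the beginning
    before = f.read()
    f.seek(0)
    f.write("J")               # overwrites the first character in place

with open(path) as f:          # the default mode is 'r'
    after = f.read()

with open(path, "w+") as f:    # 'w+' truncates, then allows reading
    f.write("new\n")
    f.seek(0)
    truncated = f.read()

try:
    open(os.path.join(os.path.dirname(path), "missing.txt"), "r")
    missing_ok = True
except FileNotFoundError:      # 'r' does not create a missing file
    missing_ok = False

print(repr(before))      # 'Hello\nWorld\n'
print(repr(after))       # 'Jello\nWorld\n'
print(repr(truncated))   # 'new\n'
print(missing_ok)        # False
```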
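# Write the sample file (reconstructed; the start of the original listing is missing
# and the file name myfile.txt is an assumption)
file1 = open("myfile.txt", "w")
file1.writelines(["Hello \n", "This is Delhi \n", "This is Paris \n", "This is London \n"])
file1.close()
file1 = open("myfile.txt", "r+")
# read(9) returns the first 9 characters
print(file1.read(9))
file1.seek(0)
print("Output of Readline(9) function is ")
print(file1.readline(9))
file1.seek(0)
# readlines function
print("Output of Readlines function is ")
print(file1.readlines())
print()
file1.close()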
Output:
Hello
Th
Output of Readline(9) function is
Hello
Output of Readlines function is
['Hello \n', 'This is Delhi \n', 'This is Paris \n', 'This is London \n']
with open("test.txt") as f:
    with open("out.txt", "w") as f1:
        for line in f:
            f1.write(line)
Output:
Case 1:
Contents of file(test.txt):
Hello world
Output(out.txt):
Hello world
Case 2:
Contents of file(test.txt):
TMSL
Output(out.txt):
TMSL
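# initializing string
test_str = "Techno Main SaltLake"
# counting each character with a dictionary (reconstructed step; the original lines are missing)
res = {}
for ch in test_str:
    res[ch] = res.get(ch, 0) + 1
# printing result
print("Count of all characters in test_str is :\n "
      + str(res))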
Output:
Q5.
Ans:
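# Function to count number of words, lines and characters in a file
def counter(fname):
    # The counting loop is reconstructed; only fragments of the original survive
    num_words = 0
    num_lines = 0
    num_charc = 0
    with open(fname, 'r') as f:
        for line in f:
            num_lines += 1
            # incrementing word count by the number of words in the line
            num_words += len(line.split())
            for ch in line:
                # assumption: whitespace is not counted as a character
                if not ch.isspace():
                    # incrementing character
                    # count by 1
                    num_charc += 1
    print("Number of words in text file:", num_words)
    print("Number of lines in text file:", num_lines)
    print("Number of characters in text file:", num_charc)
# Driver Code:
if __name__ == '__main__':
    fname = 'File1.txt'
    try:
        counter(fname)
    except FileNotFoundError:
        print('File not found')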
Output:
Number of words in text file: 25
Number of lines in text file: 4
Number of characters in text file: 91
Exp. No. | List of Experiments | CO Mapping | PO Mapping | Week
1. | | | | Week 1
Score Criteria | Excellent (100%) | Good (80%) | Average (60%) | Poor (40%) | Absent (0%) | CO Mapping | PO/PSO Mapping

Excellent (100%): Students can identify the problem/ analyze the problem/ Design the solutions and solve the problem by
Good (80%): Students can identify the problem/ analyze the problem/ Design the solutions and solve the
Average (60%): Students can identify the problem/ analyze the problem/ Design the solutions and solve the problem by applying various algorithms with
Poor (40%): The student is not able to understand/analyze/design the problem or interpret the problem in the specified language
CO Mapping: CO1, CO2
PO/PSO Mapping: PO1, PO2, PSO1, PSO2

3. Individual or teamwork
Excellent (100%): Students can work effectively, sincerely, and ethically as individuals or members of a team
Good (80%): Students can work ethically as individuals or as members of a team
Average (60%): Students can work as individuals or members of a team
Poor (40%): Students are not able to work effectively, sincerely, and ethically as individuals or members of a team
CO Mapping: CO4
PO/PSO Mapping: PO9

4. Documentation
Excellent (100%): Students will prepare effective documentation of lab classes mentioning problem statements, input-output, appropriate test cases with boundary conditions
Good (80%): Students will prepare effective documentation of lab classes mentioning problem statements, input-output, test cases
Average (60%): Students will prepare effective documentation of lab classes mentioning problem statements, input-output
Poor (40%): Students will not prepare effective documentation of lab classes mentioning objective, input-output, test cases, boundary conditions
CO Mapping: CO5
PO/PSO Mapping: PO10