
A Practical on Python Libraries and R Packages

Session :- 2021-2023
Submitted By:- Anurag Sharma
Submitted To:- Dr. Sahil Saharan Mam
Class :- MCA (2nd Sem.)
Roll No. :- 20221724

INDEX

1. Write a code to find the longest word in the file :-
   (i) with the help of string
   (ii) with the help of text file
   (iii) with the help of csv
2. Write a code to create a Series
   (i) Using Dict
   (ii) Use following operations over the created series :
       • Get values of Series
       • Length of Series
       • Use of .loc and .iloc
       • Take operation
       • Slicing
3. Write a code to create dataframe :-
   (i) using dataframe from a csv file
   (ii) finally perform the operations
       • Get the column with name
       • Difference of two columns
       • Sum of two columns
       • Access the index
4. Write the code for manipulating DataFrame Structure:-
   (i) Rename the column name
   (ii) Add the new column
       • Insert L.price*2 as the second column in the dataframe
       • Using concat command
   (iii) Remove columns:
       • Use del to delete a book column
   (iv) Use of append operation
5. Write the code to handle Indexing of data:
   (i) Show that the columns are actually index
   (ii) Types of Indexes
       • Int64Index
       • Float64Index
       • DatetimeIndex
   (iii) Working with indexing
       • Creating and using an index with a series and dataframe
       • Reindexing a pandas object
   (iv) Hierarchical Indexing
   (v) Use of update operation
6. Write a code to understand categorical data:
   (i) Creating categoricals
   (ii) Renaming categories
   (iii) Appending new categories
   (iv) Removing categories
   (v) Munging school grades
7. Write a code to measure central tendency:
   (i) Mean
   (ii) Mode
   (iii) Median
8. Write a code to find variance, standard deviation, covariance and correlation:
9. Create a package with valid name and create function including mean, mode and median
10. Create a package to insert elements in list and traverse each element and show metadata.
11. Create a package with object documentation.
12. Create small application in R package using Dataframe and show vignettes.
13. Create a package with valid test cases of factorial program.
14. Create a package in R to check the number is Armstrong or not in C++ code.
15. Create a program with the test cases of given string and show output of each command.

1. Write a code to find the longest word in the file :-


(i) with the help of string
(ii) with the help of text file
(iii) with the help of csv

(i) Using string
# 'text' is used instead of 'str' so the built-in str type is not shadowed
text = input("enter the string value")
longest_word = max(text.split(), key=len)
print("longest_word is", longest_word)
print("the length of", longest_word, "is", len(longest_word))

output:-

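(ii) using text file
# split the file into words first; max() over the file object itself would compare whole lines
with open("C:\\Users\\admin\\Desktop\\long.txt") as fin:
    print(max(fin.read().split(), key=len))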

output:-

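(iii) with the help of csv
with open("C:\\Users\\Admin\\Desktop\\jind.csv") as fin:
    text = fin.read()
# treat commas as separators too, so each csv field is split into words
words = text.replace(",", " ").split()
max_len = len(max(words, key=len))
# print every word that has the maximum length
for word in words:
    if len(word) == max_len:
        longest_word = word
        print(longest_word)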

output:-
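
The three variants above share one core step: split the input into words and keep the longest by length. A minimal sketch of that idea follows, assuming in-memory sample data rather than the files on the author's desktop. The helper names `longest_words` and `longest_in_csv` and the sample strings are hypothetical, and the csv variant uses the standard `csv` module so that each field is split properly.

```python
import csv
import io


def longest_words(words):
    """Return every word that ties for the maximum length, in input order."""
    words = list(words)
    if not words:
        return []
    max_len = max(len(word) for word in words)
    return [word for word in words if len(word) == max_len]


def longest_in_csv(handle):
    """Split every csv cell into words and return the longest word(s)."""
    words = []
    for row in csv.reader(handle):
        for cell in row:
            words.extend(cell.split())
    return longest_words(words)


# Hypothetical sample data standing in for long.txt and jind.csv.
sample_text = "pandas makes tabular data easy"
sample_csv = io.StringIO("name,city\nAnurag,Jind\nKirti,Chandigarh\n")

print(longest_words(sample_text.split()))
print(longest_in_csv(sample_csv))
```

Returning a list rather than a single word keeps ties visible, which is what the loop in part (iii) also does.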

2. Write a code to create a Series:-


(i) Using Dict
(ii) Use following operations over the created series :
• Get values of Series
• Length of Series
• Use of .loc and .iloc
• Take operation
• Slicing

(i) create series from dict
import pandas as pd
pd.Series({'Kirti': 'Daughter',
           'Monika': 'Sister',
           'Poonam': 'Friend',
           'Anjali': 'Neighbour'})

output:-

(ii) Use following operations over the created series :


• Get values of Series
• Length of Series
• Use of .loc and .iloc
• Take operation
• Slicing

• get the values in the Series


s=pd.Series([1,2,3])
s.values
output:-

• length of Series
s=pd.Series([0,1,2,3])
len(s)

output:-
4

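• get items at labels 11 and 12
# s2 is not defined in the original; a Series labelled 10 to 13 is assumed here
s2 = pd.Series([1, 2, 3, 4], index=[10, 11, 12, 13])
s2.loc[[11, 12]]

output:-

• explicitly by position
s2.iloc[[3, 2]]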

output:-

• only take specific items by position


s.take([0,1,2])

output:-

• slicing showing items at position 1 through 5


s[1:6]

output:-
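
The difference between label-based and position-based access is easiest to see when the labels are integers that do not match the positions. The sketch below assumes a hypothetical Series labelled 10 to 13; it is an illustration, not the author's data.

```python
import pandas as pd

# Hypothetical Series whose integer labels (10-13) differ from its positions (0-3).
s2 = pd.Series([1, 2, 3, 4], index=[10, 11, 12, 13])

by_label = s2.loc[[11, 12]]    # .loc looks up the labels 11 and 12
by_position = s2.iloc[[3, 2]]  # .iloc looks up positions 3 and 2, in that order
taken = s2.take([0, 1, 2])     # take() is also purely positional
sliced = s2.iloc[1:3]          # positional slice: positions 1 and 2

print(by_label.tolist(), by_position.tolist(), taken.tolist(), sliced.tolist())
```

Using `.iloc` for the slice keeps the intent explicit, since plain `[]` slicing on an integer-labelled Series can be confusing to read.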

3. Write a code to create dataframe :-


(i) using dataframe from a csv file
(ii) finally perform the operations
• Get the column with name
• Difference of two columns
• Sum of two columns
• Access the index

(i) using dataframe from a csv file


# read in the data, using the Symbol column as the index
# and only reading the columns at positions 0, 2 and 3
import pandas as pd
sp500 = pd.read_csv("C:\\Users\\Admin\\Desktop\\sp500.csv",
                    index_col='Symbol', usecols=[0, 2, 3])
sp500

output:-

(ii) finally perform the operations


• Get the column with name
• Difference of two columns


• Sum of two columns
• Access of a column by name

• get the column with the name
Number = pd.Series([1, 2, 3, 4, 5, 6, 7])
c = pd.Series([11, 12, 13, 14, 15, 16, 17])
df = pd.DataFrame({"Number": Number, "Rollno": c})
print(df["Rollno"])

output:-

• difference of two columns
differ = df.Number - df.Rollno
print(differ)
df['Difference'] = differ
print(df)

output:-

• sum of two columns
# 'total' is used instead of 'sum' so the built-in sum() is not shadowed
total = df.Number + df.Rollno
print(total)
df['Sum'] = total
print(df)

output:-
•Access of a column by name


sp500.Bookvalue[:5]

output:-
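
Because sp500.csv itself is not included, the whole of program 3 can be tried on an in-memory stand-in. The sketch below assumes a hypothetical four-column file with made-up symbols and values; only the column names `Symbol`, `Price` and `Bookvalue` are taken from the code above.

```python
import io

import pandas as pd

# Hypothetical stand-in for sp500.csv; the real file's contents may differ.
csv_text = """Symbol,Sector,Price,Bookvalue
AAA,Tech,10.0,4.0
BBB,Tech,20.0,5.0
CCC,Energy,30.0,6.0
"""

# Same call shape as above: Symbol becomes the index, columns 0, 2 and 3 are kept.
stocks = pd.read_csv(io.StringIO(csv_text), index_col="Symbol", usecols=[0, 2, 3])

price = stocks["Price"]                        # get a column by name
gap = stocks["Price"] - stocks["Bookvalue"]    # difference of two columns
total = stocks["Price"] + stocks["Bookvalue"]  # sum of two columns
labels = list(stocks.index)                    # access the index

print(stocks)
print(labels)
```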
4. Write the code for manipulating DataFrame Structure:-


(i) Rename the column name
• Use of in-place after renaming
(ii) Add the new column
• Insert L.price*2 as the second column in the dataframe
• Using concat command
(iii) Remove column:-
• Use del to delete a book column
(iv) Use of append operation
(v) Use of update operation

(i) Rename the column name
# rename() returns a new DataFrame; sp500 itself is unchanged
newSP500 = sp500.rename(columns={'Bookvalue': 'Book Value'})
newSP500[:2]

output:-

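• use of in-place after renaming
# rename a copy in place, so that the later steps can still refer to 'Bookvalue' on sp500
renamed = sp500.copy()
renamed.rename(columns={'Bookvalue': 'Book Value'}, inplace=True)
renamed.columns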

output:-

(ii) Add the new column:


• Insert L.price*2 as the second column in the dataframe
• Using concat command
import pandas as pd
rounded_price = pd.DataFrame({'RoundedPrice': sp500.Price.round()})
concatenated = pd.concat([sp500, rounded_price], axis=1)
concatenated[:2]

output:-
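
The first bullet, inserting price*2 as the second column, has no code of its own above. One way to do it is `DataFrame.insert`, sketched here on a hypothetical two-row frame; the column name `DoublePrice` and the sample values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical miniature frame; only the Price column matters here.
frame = pd.DataFrame(
    {"Price": [10.0, 20.0], "Bookvalue": [4.0, 5.0]},
    index=["AAA", "BBB"],
)

# insert(loc, column, value) modifies the frame in place;
# loc=1 places the new column second, right after Price.
frame.insert(1, "DoublePrice", frame["Price"] * 2)

print(frame)
```

Unlike `pd.concat`, which always adds the new column at the end, `insert` lets the position be chosen.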

(iii)Remove columns:
• Use del to delete a book column
#deleting column
copy=sp500.copy()
del copy['Bookvalue']
copy[:3]

output:-

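(iv) use of append operation
df1 = sp500.iloc[0:3].copy()
df2 = sp500.iloc[[10, 11, 2]]
# DataFrame.append() was removed in pandas 2.0; pd.concat() appends the rows of df2 to df1
appended = pd.concat([df1, df2])
appended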

output:-
5. Write the code to handle Indexing of data:


(i)Show that the column are actually index
(ii) Types of Indexes
• Int64Index
• Float64Index
• DateTimeIndex
• create a dataframe using index
(iii)working with indexing
• Creating and Using an index with a series and DataFrame
• Reindexing a pandas object
(iv)Hierarchical Indexing
(v)Use of update operation

(i) Show that the columns are actually an index
temps = pd.DataFrame({"city": ["Missoula", "philadelphia"],
                      "Temperature": [60, 80]})
temps
# the column labels are themselves held in an Index object
temps.columns

output:

(ii) Types of Indexes


• Int64Index
# explicitly create an Int64Index
import numpy as np
df_i64 = pd.DataFrame(np.arange(20, 30), index=np.arange(0, 10))
df_i64[:4]

output:-
• Float64Index
# indexes using Float64Index
df_f64 = pd.DataFrame(np.arange(0, 1000, 5),
                      np.arange(0.0, 100.0, 0.5))
df_f64.iloc[:5]

output:-

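• DatetimeIndex
• create a dataframe using index
# date_times is not defined in the original; five hourly timestamps from an arbitrary start are assumed
date_times = pd.date_range('2016-04-24 05:00', periods=5, freq=pd.Timedelta(hours=1))
df_date_times = pd.DataFrame(np.arange(0, len(date_times)),
                             index=date_times)
df_date_times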

output:-
(iii)working with indexing


• Creating and Using an index with a Series and DataFrame
import pandas as pd
sp500 = pd.read_csv("C:\\Users\\Admin\\Desktop\\sp500.csv",
                    index_col='Symbol',
                    usecols=[0, 2, 3])
sp500

output:-

• Reindexing a pandas object
# reindex to have the MMM, AFT and MCA index labels
reindexed = sp500.reindex(index=['MMM', 'AFT', 'MCA'])
reindexed
output:-

(iv) Hierarchical Indexing
# reindex columns
sp500.reindex(columns=['Price', 'Bookvalue'])[:3]

output:-
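
The snippet under (iv) reindexes columns rather than building a multi-level index. As a minimal sketch of hierarchical indexing proper, the example below uses a hypothetical frame in which sector and symbol together form a two-level row index; all names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical data: Sector and Symbol together form a two-level row index.
frame = pd.DataFrame(
    {
        "Sector": ["Tech", "Tech", "Energy"],
        "Symbol": ["AAA", "BBB", "CCC"],
        "Price": [10.0, 20.0, 30.0],
    }
)
multi = frame.set_index(["Sector", "Symbol"]).sort_index()

tech = multi.loc["Tech"]                         # all rows under the outer label
bbb_price = multi.loc[("Tech", "BBB"), "Price"]  # one cell via a full tuple key
by_symbol = multi.xs("CCC", level="Symbol")      # cross-section on the inner level

print(multi)
print(tech)
print(bbb_price)
```

Sorting the index after `set_index` keeps label lookups on the MultiIndex efficient and avoids performance warnings.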

(v) Use of update operation
df = pd.DataFrame([[11, 12, 13, 14],
                   ['a', 'b', 'c', 'd']])
df
df1 = pd.DataFrame([[18],
                    ['c']])
# update() overwrites the matching cells of df (here column 0) with the values from df1
df.update(df1)
df
output:-

6. Write the code to understand categorical data:


(i) Create Categorical
(ii) Rename the categories
(iii) Appending categories
(iv) Removing categories
(v) Munging school grades

(i) create a dataframe with a categorical value
df_categorical = pd.DataFrame({'A': np.arange(6),
                               'B': list('aabbca')})
# astype('category', categories=...) is no longer accepted; pass a CategoricalDtype instead
df_categorical['B'] = df_categorical['B'].astype(
    pd.CategoricalDtype(categories=list('cab')))
df_categorical

output:-

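(ii) Rename the categories
# the original line is garbled; the starting values "a", "b", "c" are assumed
cat = pd.Categorical(["a", "b", "c", "a"], categories=["a", "b", "c"])
cat = cat.rename_categories(["bronze", "silver", "gold"])
cat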

output:-

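(iii) Appending categories
# metals is not defined in the original; this definition is assumed
metals = pd.Categorical(["bronze", "silver", "gold"])
with_platinum = metals.add_categories(["platinum"])
with_platinum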

output:-
(iv) Removing categories


no_bronze =metals.remove_categories(["bronze"])
no_bronze

output:-

(v) Munging school grades


np.random.seed(123456)
names = ['ilvana', 'Norris', 'Ruth', 'Skye', 'Sol', 'Dylan', 'Katina', 'Alissa', 'Mare']
grades = np.random.randint(50, 101, len(names))
scores = pd.DataFrame({'Name': names, 'Grade': grades})
scores

output:-
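
The code above only builds the table of numeric grades. A common next step when munging grades is to bin them into ordered letter categories with `pd.cut`. The sketch below shows one way to do that, assuming fixed sample scores and hypothetical bin edges and letter labels that are not taken from the original.

```python
import pandas as pd

# Hypothetical scores; the bin edges and letter labels below are assumptions.
scores = pd.DataFrame(
    {"Name": ["Ruth", "Skye", "Sol", "Dylan"], "Grade": [51, 68, 85, 97]}
)

bins = [0, 59, 69, 79, 89, 100]       # each bin includes its upper edge
letters = ["F", "D", "C", "B", "A"]   # listed from lowest to highest
scores["Letter"] = pd.cut(scores["Grade"], bins=bins, labels=letters)

# The result is an ordered categorical, so sorting follows F < D < C < B < A.
ranked = scores.sort_values("Letter", ascending=False)
print(ranked)
```

Because the categories are ordered, sorting by `Letter` ranks students by grade band rather than alphabetically.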

7. Write a code to measure central tendency:-
(i) Mean
(ii) Median
(iii) Mode

(i) Mean
import pandas as pd
# Creating the dataframe of student's marks
df = pd.DataFrame({"John - Marks": [98, 87, 76, 88, 96],
                   "Adam - Marks": [88, 52, 69, 79, 80],
                   "David - Marks": [90, 92, 71, 60, 64],
                   "Rahul - Marks": [88, 85, 79, 81, 91]})
df.mean(axis=0)
output:-

(ii) Median
# Creating the dataframe of student's marks
df = pd.DataFrame({"John - Marks": [98, 87, 76, 88, 96],
                   "Adam - Marks": [88, 52, 69, 79, 80],
                   "David - Marks": [90, 92, 71, 60, 64],
                   "Rahul - Marks": [88, 85, 79, 81, 91]})
# calc. the median of the values in each column
df.median(axis=0)
output:-

(iii) Mode
# Creating the dataframe of student's marks
df1 = pd.DataFrame({"John - Marks": [98, 87, 87, 76, 88],
                    "Adam - Marks": [88, 52, 69, 79, 79],
                    "David - Marks": [90, 92, 71, 71, 64],
                    "Rahul - Marks": [88, 85, 85, 81, 91]})
df1.mode()
output:-
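
As a cross-check on the pandas results, the same three measures can be computed for a single column with Python's standard `statistics` module. The sketch below reuses John's marks from the examples above; the variable names are chosen here for illustration.

```python
import statistics

john = [98, 87, 76, 88, 96]        # "John - Marks" from the mean and median examples
john_mode = [98, 87, 87, 76, 88]   # "John - Marks" from the mode example

john_mean = statistics.mean(john)            # sum of the marks divided by their count
john_median = statistics.median(john)        # middle value of the sorted marks
john_modes = statistics.multimode(john_mode) # every value tied for most frequent

print(john_mean, john_median, john_modes)
```

`multimode` returns a list because a column can have more than one mode, which is also why `df1.mode()` returns a DataFrame rather than a single row of numbers.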

8. Write a code to find variance, standard deviation, covariance and correlation:-
(i) variance
(ii) standard deviation
(iii) covariance
(iv) correlation

(i)variance
#calc. the variance of the values in each column
df.var()

output:-

(ii) standard deviation:
# calc. the standard deviation
df.std()

output:-

(iii) covariance:-
df = pd.DataFrame([[10, 20, 30, 40], [7, 14, 21, 28], [55, 15, 8, 12],
                   [15, 14, 1, 8], [7, 1, 1, 8], [5, 4, 9, 2]],
                  columns=['Apple', 'Orange', 'Banana', 'Pear'],
                  index=['Basket1', 'Basket2', 'Basket3', 'Basket4', 'Basket5', 'Basket6'])
print("Calculating Covariance ")
print(df.cov())
print("Between 2 columns ")
# Covariance of Apple vs Orange
print(df.Apple.cov(df.Orange))

output:-

(iv) correlation:-
# create dataframe with 3 columns
data = pd.DataFrame({
"column1": [12, 23, 45, 67],
"column2": [67, 54, 32, 1],
"column3": [34, 23, 56, 23]})
# correlation between column 1 and column2
print(data['column1'].corr(data['column2']))

output:-
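
The four measures in this program are related: the standard deviation is the square root of the variance, and the Pearson correlation is the covariance divided by the product of the two standard deviations. A minimal sketch that checks both relationships on column1 and column2 from the example above (the variable names are chosen here for illustration):

```python
import pandas as pd

data = pd.DataFrame({"column1": [12, 23, 45, 67], "column2": [67, 54, 32, 1]})
x, y = data["column1"], data["column2"]

variance_x = x.var()             # sample variance (pandas divides by n - 1)
std_x, std_y = x.std(), y.std()  # sample standard deviations
cov_xy = x.cov(y)                # sample covariance
r_builtin = x.corr(y)            # Pearson correlation computed by pandas
r_manual = cov_xy / (std_x * std_y)

print(variance_x, std_x ** 2)
print(r_builtin, r_manual)
```

The correlation comes out negative here because column2 falls as column1 rises.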

9. Create a package with a valid name and create a function including mean, mode and median.
Steps:-
1. Click on the File menu -> New Project.
2. Choose New Directory as shown below:

3. Next, select R Package, which is the second option, as shown below:

4. Finally, name your package statistics and click Create Project:

5. The package named statistics now appears in the file folder as shown below:

6. Now open this package; the interface will be shown as:

7. Now create a function named Tendency in a .R file in the R/ directory as shown below:

8. Now run this function; the output will be shown as:

10. Create a package to insert elements in a list, traverse each element of the list, and show metadata about your package.
Steps:-
1. First, create a package named “mypackage” as discussed earlier; the interface will be shown as:

2. Now create a function named Traverse in a .R file in the R/ directory to create a list and traverse it, as shown below:

3. Run the function Traverse(); the output will be shown as below:

4. Now show the metadata of the package “mypackage” by clicking on DESCRIPTION as shown below:

11. Create a package with object documentation and show the output of each step.
Steps:-
1. First, create a package named “mypackage” as discussed in program 9.
2. Create a .R file in the R/ directory, save it as Add.R, and add roxygen comments to it as shown below:

3. Run the command ‘devtools::document()’ to convert the roxygen comments into .Rd files. The output will be shown as:

4. Now go to the man/ directory of your package; you will see that the add.Rd file has been created automatically as shown below:

5. At last, open the add.Rd file; it will be shown as:

12. Create a package with valid commands and show the output, and additionally show the created vignettes of the application given below:

➢ Create a csv file, import the data into a dataframe, rename a column, sort by a column, add one new row and one new column, and finally show the output.

Steps:-
1. Create a package as discussed in program 9.
2. Create a function named dataframe in a .R file in the R/ directory. The function will be shown as:

3. Run the function dataframe; the output will be shown as:

4. After that, to create a vignette, first run the command “usethis::use_vignette("MyVignette")”; the output will be shown as:

5. This creates a vignettes/ directory in your package. Open this directory, click on the MyVignette.Rmd file and edit it as you want; the interface will be shown as:

6. Now click on the Knit option; an HTML page opens as shown below:

13. Create a package with valid test cases of the factorial program.
Steps:-
1. Write a function named factorial in a .R file in the R/ directory as shown below:

2. Load it with the command “devtools::load_all()” as shown below:

3. To set up your package to use testthat, run the command “usethis::use_testthat()” as shown below:

4. This will create the tests/testthat directory as shown below:

5. Now create a file test_fact.R in the testthat/ directory as shown below:

6. Finally, Run your tests using the command “devtools::test()” or directly click on Run Tests
option and the output will be shown as:
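
The same idea, a function plus test cases that pin down its behaviour, can be sketched in Python. This is an analogue for illustration only, not the R testthat code that the steps above create; the function body and the chosen test values are assumptions.

```python
def factorial(n):
    """Return n! for a non-negative integer n."""
    if not isinstance(n, int) or n < 0:
        raise ValueError("n must be a non-negative integer")
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


def test_factorial():
    """Test cases covering the base cases, a typical value and invalid input."""
    assert factorial(0) == 1
    assert factorial(1) == 1
    assert factorial(5) == 120
    try:
        factorial(-1)
    except ValueError:
        pass
    else:
        raise AssertionError("negative input should raise ValueError")


test_factorial()
print("all factorial tests passed")
```

Covering 0, 1, a typical value and an invalid input mirrors the kind of cases an `expect_equal` / `expect_error` test file would usually hold.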

14. Create a package in R to check whether a number is Armstrong or not in C++.

Steps:-
1. To set up your package with Rcpp, run the command “usethis::use_rcpp()”; the output will be shown as:

2. Create a src/ directory to hold the .cpp files as shown below:

3. Add Rcpp to the LinkingTo and Imports fields in the DESCRIPTION of your package as shown below:

4. Once you are set up, the basic workflow is to create a new C++ file, as shown below:

5. Now create a function in C++ named checkArmstrongNumber to check whether the given number is Armstrong or not, as shown below:

6. Finally, Run the code and the output will be shown as:
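
For reference, the Armstrong check itself is short: a number is an Armstrong number when the sum of its digits, each raised to the power of the digit count, equals the number. The sketch below shows that logic in Python; it is not the C++ checkArmstrongNumber function from the steps above, and the name `is_armstrong` is hypothetical.

```python
def is_armstrong(number):
    """Return True if number equals the sum of its digits raised to the digit count."""
    if number < 0:
        return False
    digits = str(number)
    power = len(digits)
    return number == sum(int(digit) ** power for digit in digits)


for candidate in (153, 154, 9474):
    print(candidate, is_armstrong(candidate))
```

For example, 153 has three digits and 1**3 + 5**3 + 3**3 = 1 + 125 + 27 = 153, so it qualifies, while 154 does not.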

15. Create a program with valid test cases of a given string and show the output of each command.
Steps:-
1. Write a function named Len in a .R file in the R/ directory as shown below:

2. Load it with the command “devtools::load_all()” as shown below:

3. To set up your package to use testthat, run the command “usethis::use_testthat()” as shown below:

4. This will create the tests/testthat directory as shown below:

5. Now create a file test_Strlen.R in the testthat/ directory as shown below:

6. Finally, run your tests using the command “devtools::test()” or directly click on the Run Tests option; the output will be shown as:
