How To Create DataFrame in Python
The pandas library for Python provides data structures such as Series and DataFrame. In this article, we are going to read about DataFrames.
As we know, Python also supports data structures. For beginners, let’s first discuss what a data structure is. A
data structure is basically a way of storing data so that it can be easily accessed and worked with. For example:
To store data so that we can quickly access the last item, we create a STACK (LIFO).
To store data so that we can quickly access the first item, we create a QUEUE (FIFO).
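The two access patterns above can be sketched in a few lines. This is only an illustration, assuming a plain list as the stack and `collections.deque` as the queue; the variable names are not from the original article.

```python
from collections import deque

# Stack (LIFO): the last item pushed is the first one popped.
stack = []
stack.append("a")
stack.append("b")
stack.append("c")
last_item = stack.pop()       # removes and returns the most recent item

# Queue (FIFO): the first item enqueued is the first one dequeued.
queue = deque()
queue.append("a")
queue.append("b")
queue.append("c")
first_item = queue.popleft()  # removes and returns the oldest item

print(last_item, first_item)
```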
Pandas provides the following data structures:
Series
Data Frame
Data Frame
It is a 2-dimensional labeled array that stores an ordered collection of columns, and each column can hold data of a different type.
It has two indexes, or we can say two axes - a row index and a column index.
A Data Frame is “Value-Mutable” and “Size-Mutable”, i.e., we can change its values as well as its size.
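To make the two kinds of mutability concrete, here is a small sketch. The column names and values are made up for illustration; it only assumes the standard `DataFrame.loc` indexer and column assignment.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Raj", "Raman"], "Marks": [75, 89]})

# Value-mutable: overwrite a single cell in place.
df.loc[0, "Marks"] = 80

# Size-mutable: add a new column, then a new row.
df["Grade"] = ["B", "A"]
df.loc[2] = ["Rahul", 90, "A"]

print(df)
print(df.shape)
```

After these steps the DataFrame has grown from 2 rows and 2 columns to 3 rows and 3 columns.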
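import pandas as pd
# Each Series becomes one column; all three share the same index.
Students = pd.Series(['Raj', 'Raman', 'Rahul'], index=[1, 2, 3])
Marks = pd.Series([75, 89, 90], index=[1, 2, 3])
Contact = pd.Series(['9899', '9560', '9871'], index=[1, 2, 3])
# Build a dictionary that maps column names to the Series.
# The column names are assumed here; they mirror the variable names.
data = {'Students': Students, 'Marks': Marks, 'Contact': Contact}
df = pd.DataFrame(data)
print(df)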
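Note: The index should be the same for all Series. Where the indexes differ, pandas aligns the Series on the union of their index labels and fills the missing values with NaN.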
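To illustrate what happens when the indexes of the Series differ, the sketch below leaves one label out of the second Series. The data is a shortened variant of the example above and is for illustration only.

```python
import pandas as pd

students = pd.Series(["Raj", "Raman", "Rahul"], index=[1, 2, 3])
marks = pd.Series([75, 89], index=[1, 2])  # no entry for label 3

# pandas aligns the Series on their index labels when building the frame.
df = pd.DataFrame({"Students": students, "Marks": marks})
print(df)
```

The row labelled 3 is still present, but its Marks value is missing (NaN).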
import pandas as pd
# Each key becomes a column name; each list holds that column's values.
dictObj = {
    'EmpCode': ['E01', 'E02', 'E03', 'E04'],
    'EmpName': ['Raj', 'Raman', 'Rahul', 'Rohit'],
    'EmpDept': ['HR', 'Accounts', 'IT', 'HR']
}
df = pd.DataFrame(dictObj)
print(df)
As we can see in the output, pandas generates the row index automatically, and the keys of the 2-D dictionary become the column names.
We can also change the index values by passing the index to DataFrame(), like
df = pd.DataFrame(dictObj, index=['I', 'II', 'III', 'IV'])
Note: The number of index values must match the number of rows; otherwise, pandas raises an error.
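The length check described in the note can be seen with a short sketch. It passes three index labels for four rows and catches the resulting `ValueError`; the names `dict_obj` and `error_raised` are illustrative.

```python
import pandas as pd

dict_obj = {
    "EmpCode": ["E01", "E02", "E03", "E04"],
    "EmpName": ["Raj", "Raman", "Rahul", "Rohit"],
}

try:
    # Four rows of data but only three index labels.
    pd.DataFrame(dict_obj, index=["I", "II", "III"])
    error_raised = False
except ValueError as exc:
    error_raised = True
    print("ValueError:", exc)
```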
Creating a DataFrame from a 2-D dictionary whose values are dictionaries (a nested dictionary),
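import pandas as pd
yr2018 = {'NoOfArticles': 1200, 'NoOfBlogs': 1000, 'NoOfNews': 700}
yr2019 = {'NoOfArticles': 1500, 'NoOfBlogs': 1500, 'NoOfNews': 900}
yr2020 = {'NoOfArticles': 2000, 'NoOfBlogs': 1800, 'NoOfNews': 1000}
# Outer dictionary: one entry per year, each holding that year's dictionary.
Published = {2018: yr2018, 2019: yr2019, 2020: yr2020}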
In the above code, we first created three dictionaries - yr2018, yr2019 and yr2020. After that, we created a
“Published” dictionary which contains the other dictionaries. We can also create the same dictionary as shown below.
Published = {
    2018: {'NoOfArticles': 1200, 'NoOfBlogs': 1000, 'NoOfNews': 700},
    2019: {'NoOfArticles': 1500, 'NoOfBlogs': 1500, 'NoOfNews': 900},
    2020: {'NoOfArticles': 2000, 'NoOfBlogs': 1800, 'NoOfNews': 1000}
}
df = pd.DataFrame(Published)
print(df)
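With a nested dictionary, the outer keys become the columns and the inner keys become the row index. The sketch below checks this and also shows `DataFrame.from_dict` with `orient="index"`, which is one way to get one row per year instead; the variable names `published` and `by_year` are illustrative.

```python
import pandas as pd

published = {
    2018: {"NoOfArticles": 1200, "NoOfBlogs": 1000, "NoOfNews": 700},
    2019: {"NoOfArticles": 1500, "NoOfBlogs": 1500, "NoOfNews": 900},
    2020: {"NoOfArticles": 2000, "NoOfBlogs": 1800, "NoOfNews": 1000},
}

df = pd.DataFrame(published)
print(list(df.columns))  # outer keys (the years)
print(list(df.index))    # inner keys (the counters)

# Alternative orientation: outer keys become the rows.
by_year = pd.DataFrame.from_dict(published, orient="index")
print(by_year)
```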
import numpy as np
import pandas as pd
# Build a 2-D NumPy array (4 rows, 3 columns) and wrap it in a DataFrame.
arr = np.array([[11, 12, 13], [14, 15, 16], [17, 18, 19], [20, 21, 22]])
df = pd.DataFrame(arr)
print(df)
As we can see in the output, pandas automatically assigns row indexes and column indexes, both starting from 0. We can also
change the column names and row names, like,
df = pd.DataFrame(arr, columns=['One', 'Two', 'Three'], index=['I', 'II', 'III', 'IV'])
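Note: If the number of elements differs from row to row, pandas creates just a single column in the DataFrame object, and
the type of that column is object. Recent NumPy versions require dtype=object to build such an array, like,
import numpy as np
import pandas as pd
# Rows of unequal length: NumPy stores each row as a separate list object.
arr = np.array([[2, 3], [7, 8, 9], [3, 6, 5]], dtype=object)
df = pd.DataFrame(arr)
print(df)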
4. Creating DataFrame from another DataFrame
We can also create a new DataFrame from an existing DataFrame, like
df2=pd.DataFrame(df)
print(df2)
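Whether a DataFrame built this way shares its underlying data with the original can depend on the pandas version and settings. If an independent copy is what is needed, `DataFrame.copy()` is the explicit route. The sketch below is an illustration with made-up data, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({"One": [11, 14], "Two": [12, 15]})

# copy() makes a deep copy by default, so df2 has its own data.
df2 = df.copy()
df2.loc[0, "One"] = 99

print(df)   # original is unchanged
print(df2)  # only the copy holds the new value
```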
Conclusion
Now, we have learned about DataFrames in Python and how we can create them. After reading this article, I hope you are able
to create a DataFrame in Python.
All queries related to this article and the sample files are always welcome. Thanks for reading!