Cheat Sheet Template

Community
Data Science Cheat Sheet

Python Basics

Variables and Data Types

Variable Assignment
>>> x = 5
>>> x
5

Calculations With Variables
>>> x + 2                    Sum of two variables
7
>>> x - 2                    Subtraction of two variables
3
>>> x * 2                    Multiplication of two variables
10
>>> x ** 2                   Exponentiation of a variable
25
>>> x % 2                    Remainder of a variable
1
>>> x / float(2)             Division of a variable
2.5

Types and Type Conversion
str()     '5', '3.45', 'True'     Variables to strings
int()     5, 3, 1                 Variables to integers
float()   5.0, 1.0                Variables to floats
bool()    True, True, True        Variables to booleans

Help
>>> help(str)

Lists
>>> a = 'is'
>>> b = 'nice'
>>> my_list = ['my', 'list', a, b]
>>> my_list2 = [[4,5,6,7], [3,4,5,6]]

Selecting List Elements (index starts at 0)
Subset
>>> my_list[1]               Select item at index 1
>>> my_list[-3]              Select 3rd last item
Slice
>>> my_list[1:3]             Select items at index 1 and 2
>>> my_list[1:]              Select items after index 0
>>> my_list[:3]              Select items before index 3
>>> my_list[:]               Copy my_list
Subset Lists of Lists
>>> my_list2[1][0]           my_list[list][itemOfList]
>>> my_list2[1][:2]

List Operations
>>> my_list + my_list
['my', 'list', 'is', 'nice', 'my', 'list', 'is', 'nice']
>>> my_list * 2
['my', 'list', 'is', 'nice', 'my', 'list', 'is', 'nice']
>>> my_list2 > 4             Python 2 only (True); raises TypeError in Python 3

List Methods
>>> my_list.index(a)         Get the index of an item
>>> my_list.count(a)         Count an item
>>> my_list.append('!')      Append an item at a time
>>> my_list.remove('!')      Remove the first matching item
>>> del(my_list[0:1])        Remove the items in a slice
>>> my_list.reverse()        Reverse the list
>>> my_list.extend('!')      Append each item of an iterable
>>> my_list.pop(-1)          Remove and return the last item
>>> my_list.insert(0,'!')    Insert an item at index 0
>>> my_list.sort()           Sort the list
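The list entries above mix two kinds of operation: slicing, which returns a new list, and methods such as append, extend and pop, which change the list in place. A minimal sketch of that difference, reusing the cheat sheet's my_list (the names middle, tail, working_copy and last are illustrative only):

```python
# Build the list the same way the cheat sheet does.
a = 'is'
b = 'nice'
my_list = ['my', 'list', a, b]

# Slicing returns a new list; the stop index is excluded.
middle = my_list[1:3]        # items at index 1 and 2
tail = my_list[1:]           # every item after index 0
working_copy = my_list[:]    # shallow copy of the whole list

# Methods mutate the list they are called on and (except pop) return None.
working_copy.append('!')              # adds one item
working_copy.extend(['so', 'far'])    # adds each item of the iterable
last = working_copy.pop(-1)           # removes and returns the last item

print(middle)         # ['list', 'is']
print(tail)           # ['list', 'is', 'nice']
print(working_copy)   # ['my', 'list', 'is', 'nice', '!', 'so']
print(last)           # far
print(my_list)        # ['my', 'list', 'is', 'nice'] (the original is untouched)
```

Working on a copy made with my_list[:] keeps the original list intact, which is usually what you want when experimenting in the interpreter.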
Strings
>>> my_string = 'thisStringIsAwesome'
>>> my_string
'thisStringIsAwesome'

String Operations
>>> my_string * 2
'thisStringIsAwesomethisStringIsAwesome'
>>> my_string + 'Innit'
'thisStringIsAwesomeInnit'
>>> 'm' in my_string
True

Selecting String Elements (index starts at 0)
>>> my_string[3]
>>> my_string[4:9]

String Methods
>>> my_string.upper()              String to uppercase
>>> my_string.lower()              String to lowercase
>>> my_string.count('w')           Count String elements
>>> my_string.replace('e', 'i')    Replace String elements
>>> my_string.strip()              Strip leading and trailing whitespace
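The string entries above do not show their results, so here is a small sketch that runs them on the cheat sheet's my_string. The variable names on the left and the padded example string are illustrative only. Strings are immutable, so every method returns a new string and my_string itself never changes.

```python
my_string = 'thisStringIsAwesome'

# Indexing and slicing work exactly as they do for lists.
fourth_char = my_string[3]     # character at index 3
piece = my_string[4:9]         # characters at index 4 through 8

# String methods return new strings instead of modifying my_string.
shouted = my_string.upper()
w_count = my_string.count('w')
swapped = my_string.replace('e', 'i')
trimmed = '  padded  '.strip()   # hypothetical input with surrounding spaces

print(fourth_char)   # s
print(piece)         # Strin
print(shouted)       # THISSTRINGISAWESOME
print(w_count)       # 1
print(swapped)       # thisStringIsAwisomi
print(trimmed)       # padded
print(my_string)     # thisStringIsAwesome
```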
Libraries
Import libraries
>>> import numpy
>>> import numpy as np
Selective import
>>> from math import pi

[Library logos: data analysis, machine learning, scientific computing, 2D plotting]

Install Python
Anaconda: leading open data science platform powered by Python
[Logo]: free IDE that is included with Anaconda
[Logo]: create and share documents with live code, visualizations, text, ...
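The three import forms listed under Libraries differ only in the name they bind. The sketch below uses the standard-library math module for all three so that it runs without any extra installation. The alias m is an arbitrary choice for illustration, not a convention like np.

```python
import math              # bind the module under its own name
import math as m         # bind the same module under an alias
from math import pi      # bind a single name from the module

radius = 2
area_full = math.pi * radius ** 2
area_alias = m.pi * radius ** 2
area_direct = pi * radius ** 2

# All three expressions read the same constant, so the results are identical.
print(area_full == area_alias == area_direct)   # True
print(m is math)                                # True
```

The same pattern applies to import numpy as np: np is simply a shorter name for the numpy module.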
@aadhi06 my profile
@cloudnloud @cloudnloud info@cloudnloud.com @cloudnloud cloudnloud
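The Numpy Arrays section that follows relies on behaviour that plain lists do not have: arithmetic and comparisons apply to each element. A minimal sketch of that contrast, assuming NumPy is installed and imported as np. The values in other_array are made up for illustration, since the cheat sheet never defines that name.

```python
import numpy as np

my_list = [1, 2, 3, 4]
my_array = np.array(my_list)

# The same operator means different things for lists and arrays.
doubled_list = my_list * 2       # repetition: the list twice over
doubled_array = my_array * 2     # element-wise multiplication

# Comparisons produce a boolean array that can be used as a filter.
mask = my_array > 3
selected = my_array[mask]

# np.append, np.insert and np.delete return new arrays; my_array is unchanged.
other_array = np.array([5, 6])   # hypothetical second array
appended = np.append(my_array, other_array)
inserted = np.insert(my_array, 1, 5)     # put 5 before index 1
deleted = np.delete(my_array, [1])       # drop the item at index 1

print(doubled_list)             # [1, 2, 3, 4, 1, 2, 3, 4]
print(doubled_array.tolist())   # [2, 4, 6, 8]
print(selected.tolist())        # [4]
print(appended.tolist())        # [1, 2, 3, 4, 5, 6]
print(inserted.tolist())        # [1, 5, 2, 3, 4]
print(deleted.tolist())         # [1, 3, 4]
print(np.mean(my_array))        # 2.5
```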
Numpy Arrays
>>> my_list = [1, 2, 3, 4]
>>> my_array = np.array(my_list)
>>> my_2darray = np.array([[1,2,3],[4,5,6]])

Selecting Numpy Array Elements (index starts at 0)
Subset
>>> my_array[1]                         Select item at index 1
2
Slice
>>> my_array[0:2]                       Select items at index 0 and 1
array([1, 2])
Subset 2D Numpy arrays
>>> my_2darray[:,0]                     my_2darray[rows, columns]
array([1, 4])

Numpy Array Operations
>>> my_array > 3
array([False, False, False,  True])
>>> my_array * 2
array([2, 4, 6, 8])
>>> my_array + np.array([5, 6, 7, 8])
array([ 6,  8, 10, 12])

Numpy Array Functions
>>> my_array.shape                      Get the dimensions of the array
>>> np.append(my_array, other_array)    Append items to an array
>>> np.insert(my_array, 1, 5)           Insert items in an array
>>> np.delete(my_array, [1])            Delete items in an array
>>> np.mean(my_array)                   Mean of the array
>>> np.median(my_array)                 Median of the array
>>> np.corrcoef(my_array)               Correlation coefficient
>>> np.std(my_array)                    Standard deviation

Python For Data Science Cheat Sheet
Pandas Basics

Pandas
The Pandas library is built on NumPy and provides easy-to-use data structures and data analysis tools for the Python programming language.

Use the following import convention:
>>> import pandas as pd

Pandas Data Structures
Series
A one-dimensional labeled array capable of holding any data type.

Index
  a     3
  b    -5
  c     7
  d     4

>>> s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])

DataFrame
A two-dimensional labeled data structure with columns of potentially different types.

         Columns
         Country   Capital     Population
Index
  0      Belgium   Brussels      11190846
  1      India     New Delhi   1303171035
  2      Brazil    Brasília     207847528
>>> data = {'Country': ['Belgium', 'India', 'Brazil'],
...         'Capital': ['Brussels', 'New Delhi', 'Brasília'],
...         'Population': [11190846, 1303171035, 207847528]}
>>> df = pd.DataFrame(data, columns=['Country', 'Capital', 'Population'])

Getting (also see NumPy Arrays)
>>> s['b']                   Get one element
-5
>>> df[1:]                   Get subset of a DataFrame
  Country    Capital  Population
1   India  New Delhi  1303171035
2  Brazil   Brasília   207847528
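The Series and DataFrame definitions above can be run as-is. The sketch below, which assumes pandas is installed, builds both and then inspects what the two "getting" expressions return. The names one_value and subset are illustrative only.

```python
import pandas as pd

# The Series and DataFrame from the cheat sheet.
s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])
data = {'Country': ['Belgium', 'India', 'Brazil'],
        'Capital': ['Brussels', 'New Delhi', 'Brasília'],
        'Population': [11190846, 1303171035, 207847528]}
df = pd.DataFrame(data, columns=['Country', 'Capital', 'Population'])

# A label in square brackets returns one element of a Series.
one_value = s['b']

# A slice in square brackets selects rows of a DataFrame by position.
subset = df[1:]

print(one_value)                    # -5
print(subset.index.tolist())        # [1, 2]
print(subset['Country'].tolist())   # ['India', 'Brazil']
print(df.shape)                     # (3, 3)
```

Note that the sliced rows keep their original index labels (1 and 2) rather than being renumbered from 0.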
Selecting, Boolean Indexing & Setting
By Position
>>> df.iloc[0, 0]                        Select a single value by row & column
'Belgium'
>>> df.iat[0, 0]
'Belgium'
By Label
>>> df.loc[0, 'Country']                 Select a single value by row & column labels
'Belgium'
>>> df.at[0, 'Country']
'Belgium'
By Label/Position (.ix was removed in pandas 1.0; use .loc or .iloc)
>>> df.loc[2]                            Select a single row of a subset of rows
Country          Brazil
Capital        Brasília
Population    207847528
>>> df.loc[:, 'Capital']                 Select a single column of a subset of columns
0     Brussels
1    New Delhi
2     Brasília
>>> df.loc[1, 'Capital']                 Select rows and columns
'New Delhi'
Boolean Indexing
>>> s[~(s > 1)]                          Series s where value is not >1
>>> s[(s < -1) | (s > 2)]                s where value is <-1 or >2
>>> df[df['Population']>1200000000]      Use filter to adjust DataFrame
Setting
>>> s['a'] = 6                           Set index a of Series s to 6

Dropping
>>> s.drop(['a', 'c'])                   Drop values from rows (axis=0)
>>> df.drop('Country', axis=1)           Drop values from columns (axis=1)

Sort & Rank
>>> df.sort_index()                      Sort by labels along an axis
>>> df.sort_values(by='Country')         Sort by the values along an axis
>>> df.rank()                            Assign ranks to entries

Retrieving Series/DataFrame Information
Basic Information
>>> df.shape                             (rows, columns)
>>> df.index                             Describe index
>>> df.columns                           Describe DataFrame columns
>>> df.info()                            Info on DataFrame
>>> df.count()                           Number of non-NA values
Summary
>>> df.sum()                             Sum of values
>>> df.cumsum()                          Cumulative sum of values
>>> df.min(), df.max()                   Minimum/maximum values
>>> df.idxmin(), df.idxmax()             Minimum/maximum index value
>>> df.describe()                        Summary statistics
>>> df.mean()                            Mean of values
>>> df.median()                          Median of values
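To tie the selection, filtering, dropping and sorting entries above together, here is a minimal sketch on the cheat sheet's own s and df, assuming a current pandas release. It uses .loc, .iloc, .at and .iat with square brackets, because .ix is no longer available in pandas 1.0 and later. The result variable names are illustrative only.

```python
import pandas as pd

s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])
df = pd.DataFrame({'Country': ['Belgium', 'India', 'Brazil'],
                   'Capital': ['Brussels', 'New Delhi', 'Brasília'],
                   'Population': [11190846, 1303171035, 207847528]})

# Single values: .iat/.iloc take positions, .at/.loc take labels.
by_position = df.iat[0, 0]
by_label = df.at[0, 'Country']
capital = df.loc[1, 'Capital']

# Boolean indexing keeps the rows where the condition is True.
not_above_one = s[~(s > 1)]
large = df[df['Population'] > 1200000000]

# drop and sort_values return new objects; s and df are left unchanged.
s_dropped = s.drop(['a', 'c'])
df_dropped = df.drop('Country', axis=1)
by_country = df.sort_values(by='Country')

print(by_position, by_label, capital)   # Belgium Belgium New Delhi
print(not_above_one.to_dict())          # {'b': -5}
print(large['Country'].tolist())        # ['India']
print(s_dropped.index.tolist())         # ['b', 'd']
print(df_dropped.columns.tolist())      # ['Capital', 'Population']
print(by_country.index.tolist())        # [0, 2, 1]
print(df['Population'].idxmax())        # 1
```

Summary methods such as mean and median are easiest to read on a single numeric column (for example df['Population']), since this DataFrame also holds text columns.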
Who We Are
Cloudnloud Tech Community was started by Vijayabalan, a cancer survivor, in 2011 with the aim of saving
children diagnosed with cancer by creating a sustainable source of funds for their treatment. The
community helps Indian IT aspirants to dream bigger than the monotonous job search system and instead
build a name for themselves. Over the years, the community has grown
globally, and its members speak at tech events around the world on trending
technologies.
Officially registered as a Pvt Ltd company in January 2015, the community runs with the motto of helping children with cancer using
the revenue generated from the events, courses, and programs it organizes. Today Cloudnloud is living that
dream with more than 7000 child cancer survivors.
About Me
I am a tech noob developer: when I get an error, I look for a cheat sheet for the syntax.
I have diversified skills in eating, sleeping, and coding... sometimes even running.
Currently sniffing around metaverse, AI, ML, DS, AR/VR, and blockchain tech.
Aadhityaa S.B.
flowcv.me/aadhi
@aadhi06