Cheat Sheet Template

Community

Data Science Cheat Sheet


Python Basics

Variables and Data Types

Variable Assignment
>>> x = 5
>>> x
5

Calculations With Variables
>>> x + 2                Sum of two variables
7
>>> x - 2                Subtraction of two variables
3
>>> x * 2                Multiplication of two variables
10
>>> x ** 2               Exponentiation of a variable
25
>>> x % 2                Remainder of a variable
1
>>> x / float(2)         Division of a variable
2.5

Types and Type Conversion
str()      '5', '3.45', 'True'      Variables to strings
int()      5, 3, 1                  Variables to integers
float()    5.0, 1.0                 Variables to floats
bool()     True, True, True         Variables to booleans

Help
>>> help(str)

Lists
>>> a = 'is'
>>> b = 'nice'
>>> my_list = ['my', 'list', a, b]
>>> my_list2 = [[4,5,6,7], [3,4,5,6]]

Selecting List Elements (index starts at 0)

Subset
>>> my_list[1]           Select item at index 1
>>> my_list[-3]          Select 3rd last item

Slice
>>> my_list[1:3]         Select items at index 1 and 2
>>> my_list[1:]          Select items after index 0
>>> my_list[:3]          Select items before index 3
>>> my_list[:]           Copy my_list

Subset Lists of Lists
>>> my_list2[1][0]       my_list[list][itemOfList]
>>> my_list2[1][:2]

List Operations
>>> my_list + my_list
['my', 'list', 'is', 'nice', 'my', 'list', 'is', 'nice']
>>> my_list * 2
['my', 'list', 'is', 'nice', 'my', 'list', 'is', 'nice']
>>> my_list2 > 4         Python 2 only; raises TypeError in Python 3
True

List Methods
>>> my_list.index(a)         Get the index of an item
>>> my_list.count(a)         Count an item
>>> my_list.append('!')      Append an item at a time
>>> my_list.remove('!')      Remove an item
>>> del(my_list[0:1])        Remove an item
>>> my_list.reverse()        Reverse the list
>>> my_list.extend('!')      Append an item
>>> my_list.pop(-1)          Remove an item
>>> my_list.insert(0,'!')    Insert an item
>>> my_list.sort()           Sort the list
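
The list entries above can be tried end to end. What follows is a minimal sketch, assuming Python 3; it reuses the sheet's `my_list` and `my_list2`, while the names `first_two`, `copy_of_list`, `nested_item`, `nested_slice`, `doubled`, and `last` are made up for illustration.

```python
# Rebuild the lists used in this sheet.
a = 'is'
b = 'nice'
my_list = ['my', 'list', a, b]
my_list2 = [[4, 5, 6, 7], [3, 4, 5, 6]]

# Slicing returns a new list; the stop index is excluded.
first_two = my_list[:2]          # ['my', 'list']
copy_of_list = my_list[:]        # shallow copy, a different list object

# Nested lists are indexed one level at a time.
nested_item = my_list2[1][0]     # 3
nested_slice = my_list2[1][:2]   # [3, 4]

# + and * build new lists and leave my_list unchanged.
doubled = my_list * 2

# append, insert, remove, and pop change the list in place.
copy_of_list.append('!')
copy_of_list.insert(0, '!')
copy_of_list.remove('!')         # removes only the first '!'
last = copy_of_list.pop(-1)      # removes and returns the last item

print(first_two, nested_item, nested_slice)
print(len(doubled), copy_of_list, last)
```

Working on `copy_of_list` keeps `my_list` intact, which is why the sketch copies with `my_list[:]` before calling the in-place methods.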
Strings
>>> my_string = 'thisStringIsAwesome'
>>> my_string
'thisStringIsAwesome'

String Operations
>>> my_string * 2
'thisStringIsAwesomethisStringIsAwesome'
>>> my_string + 'Innit'
'thisStringIsAwesomeInnit'
>>> 'm' in my_string
True

String Operations (index starts at 0)
>>> my_string[3]
>>> my_string[4:9]

String Methods
>>> my_string.upper()              String to uppercase
>>> my_string.lower()              String to lowercase
>>> my_string.count('w')           Count String elements
>>> my_string.replace('e', 'i')    Replace String elements
>>> my_string.strip()              Strip whitespaces
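
The string entries above leave out their results, so here is a small sketch that runs them on the sheet's `my_string`. The names `fourth_char`, `middle`, `upper`, `replaced`, `w_count`, `padded`, and `stripped` are hypothetical, chosen only for this example.

```python
my_string = 'thisStringIsAwesome'

# Indexing and slicing work the same way as for lists.
fourth_char = my_string[3]                 # 's'
middle = my_string[4:9]                    # 'Strin'

# Strings are immutable: each method returns a new string.
upper = my_string.upper()                  # 'THISSTRINGISAWESOME'
replaced = my_string.replace('e', 'i')     # 'thisStringIsAwisomi'
w_count = my_string.count('w')             # 1

# strip() removes leading and trailing whitespace only.
padded = '  ' + my_string + '  '
stripped = padded.strip()

print(fourth_char, middle, upper)
print(replaced, w_count, stripped == my_string)
```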
Install Python
[Logo] Leading open data science platform powered by Python
[Logo] Free IDE that is included with Anaconda
[Logo] Create and share documents with live code, visualizations, text, ...

Libraries

Import libraries
>>> import numpy
>>> import numpy as np

Selective import
>>> from math import pi

[Library logos: Data analysis, Machine learning, Scientific computing, 2D plotting]
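
The import styles listed above differ only in which name ends up in your namespace. This is a minimal sketch using the standard-library `math` module in place of `numpy`, so that it runs without extra installs; the alias `m` and the variable `area` are made up for illustration.

```python
import math                 # whole module, accessed as math.<name>
import math as m            # same module under a shorter alias
from math import pi         # selective import: only the name pi

# All three spellings refer to the same value.
print(math.pi == m.pi == pi)

area = pi * 2 ** 2          # area of a circle with radius 2
print(round(area, 2))
```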

@aadhi06 my profile
@cloudnloud @cloudnloud info@cloudnloud.com @cloudnloud cloudnloud
Numpy Arrays
>>> my_list = [1, 2, 3, 4]
>>> my_array = np.array(my_list)
>>> my_2darray = np.array([[1,2,3],[4,5,6]])

Selecting Numpy Array Elements (index starts at 0)

Subset
>>> my_array[1]                  Select item at index 1
2

Slice
>>> my_array[0:2]                Select items at index 0 and 1
array([1, 2])

Subset 2D Numpy arrays
>>> my_2darray[:,0]              my_2darray[rows, columns]
array([1, 4])

Numpy Array Operations
>>> my_array > 3
array([False, False, False,  True])
>>> my_array * 2
array([2, 4, 6, 8])
>>> my_array + np.array([5, 6, 7, 8])
array([ 6,  8, 10, 12])

Numpy Array Functions
>>> my_array.shape                      Get the dimensions of the array
>>> np.append(my_array, other_array)    Append the items of other_array to an array
>>> np.insert(my_array, 1, 5)           Insert items in an array
>>> np.delete(my_array, [1])            Delete items in an array
>>> np.mean(my_array)                   Mean of the array
>>> np.median(my_array)                 Median of the array
>>> np.corrcoef(my_array)               Correlation coefficient
>>> np.std(my_array)                    Standard deviation

Python For Data Science Cheat Sheet
Pandas Basics

Pandas
The Pandas library is built on NumPy and provides easy-to-use data structures and data analysis tools for the Python programming language.

Use the following import convention:
>>> import pandas as pd

Pandas Data Structures

Series
A one-dimensional labeled array capable of holding any data type.

Index   Value
a        3
b       -5
c        7
d        4

>>> s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])

DataFrame
A two-dimensional labeled data structure with columns of potentially different types.

Index   Country   Capital     Population
0       Belgium   Brussels    11190846
1       India     New Delhi   1303171035
2       Brazil    Brasília    207847528

>>> data = {'Country': ['Belgium', 'India', 'Brazil'],
...         'Capital': ['Brussels', 'New Delhi', 'Brasília'],
...         'Population': [11190846, 1303171035, 207847528]}
>>> df = pd.DataFrame(data, columns=['Country', 'Capital', 'Population'])

Getting (also see NumPy Arrays)
>>> s['b']                       Get one element
-5
>>> df[1:]                       Get subset of a DataFrame
  Country    Capital  Population
1   India  New Delhi  1303171035
2  Brazil   Brasília   207847528
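
Because the Getting entries point back to NumPy arrays, it may help to see the two side by side. The sketch below assumes `numpy` and `pandas` are installed; it reuses the sheet's `my_array`, `s`, `data`, and `df`, while `array_tail`, `df_tail`, `value_b`, and `longer` are hypothetical names.

```python
import numpy as np
import pandas as pd

my_array = np.array([1, 2, 3, 4])
s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])
data = {'Country': ['Belgium', 'India', 'Brazil'],
        'Capital': ['Brussels', 'New Delhi', 'Brasília'],
        'Population': [11190846, 1303171035, 207847528]}
df = pd.DataFrame(data, columns=['Country', 'Capital', 'Population'])

# Position-based slices look the same on an array and a DataFrame.
array_tail = my_array[1:]        # items at index 1, 2, 3
df_tail = df[1:]                 # rows 1 and 2

# A Series can also be read by its label.
value_b = s['b']                 # -5

# NumPy functions such as np.append return a new array
# and leave the original untouched.
longer = np.append(my_array, [5, 6])

print(array_tail, value_b, longer, my_array)
print(df_tail)
```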
Selecting, Boolean Indexing & Setting

By Position
>>> df.iloc[0, 0]                Select a single value by row & column
'Belgium'
>>> df.iat[0, 0]
'Belgium'

By Label
>>> df.loc[0, 'Country']         Select a single value by row & column labels
'Belgium'
>>> df.at[0, 'Country']
'Belgium'

By Label/Position (.ix was removed in pandas 1.0; .loc is used here)
>>> df.loc[2]                    Select a single row of a subset of rows
Country          Brazil
Capital        Brasília
Population    207847528
>>> df.loc[:, 'Capital']         Select a single column of a subset of columns
0     Brussels
1    New Delhi
2     Brasília
>>> df.loc[1, 'Capital']         Select rows and columns
'New Delhi'

Boolean Indexing
>>> s[~(s > 1)]                        Series s where value is not >1
>>> s[(s < -1) | (s > 2)]              s where value is <-1 or >2
>>> df[df['Population']>1200000000]    Use filter to adjust DataFrame

Setting
>>> s['a'] = 6                   Set index a of Series s to 6

Dropping
>>> s.drop(['a', 'c'])           Drop values from rows (axis=0)
>>> df.drop('Country', axis=1)   Drop values from columns (axis=1)

Sort & Rank
>>> df.sort_index()                 Sort by labels along an axis
>>> df.sort_values(by='Country')    Sort by the values along an axis
>>> df.rank()                       Assign ranks to entries

Retrieving Series/DataFrame Information

Basic Information
>>> df.shape                     (rows, columns)
>>> df.index                     Describe index
>>> df.columns                   Describe DataFrame columns
>>> df.info()                    Info on DataFrame
>>> df.count()                   Number of non-NA values

Summary
>>> df.sum()                     Sum of values
>>> df.cumsum()                  Cumulative sum of values
>>> df.min()/df.max()            Minimum/maximum values
>>> df.idxmin()/df.idxmax()      Minimum/maximum index value
>>> df.describe()                Summary statistics
>>> df.mean()                    Mean of values
>>> df.median()                  Median of values
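
The selection, boolean indexing, and dropping entries can be checked together. This is a minimal sketch, assuming a recent pandas release; it rebuilds the sheet's `s` and `df`, and the names `by_position`, `by_label`, `big`, `not_above_one`, `without_country`, and `mean_population` are made up for illustration.

```python
import pandas as pd

s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])
df = pd.DataFrame({'Country': ['Belgium', 'India', 'Brazil'],
                   'Capital': ['Brussels', 'New Delhi', 'Brasília'],
                   'Population': [11190846, 1303171035, 207847528]})

# Scalar access: iloc/iat take positions, loc/at take labels.
by_position = df.iloc[0, 0]                  # 'Belgium'
by_label = df.loc[0, 'Country']              # 'Belgium'

# A boolean mask keeps the rows where the condition is True.
big = df[df['Population'] > 1200000000]      # only the India row
not_above_one = s[~(s > 1)]                  # only label 'b'

# drop returns a new object; df itself keeps all its columns.
without_country = df.drop('Country', axis=1)

# Summaries are clearest on a single numeric column.
mean_population = df['Population'].mean()

print(by_position, by_label)
print(big)
print(not_above_one)
print(list(without_country.columns), mean_population)
```

Selecting the `Population` column before calling `mean()` sidesteps the question of how a summary should treat the text columns.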


Who We Are

Cloudnloud Tech Community was started by Vijayabalan, a cancer survivor, in 2011 with the aim of saving children diagnosed with cancer by creating a sustainable source of funds for their treatment. The community helps Indian IT aspirants to dream bigger than the monotonous job search system and instead reach great heights by building a name for themselves. Over the years, the community has grown globally, and its members speak on trending technologies at tech events across the globe.

Officially registered as a Pvt Ltd in January 2015, the community runs with a motto of helping children with cancer using the revenue generated from the events, courses, and programs it organizes. Today Cloudnloud is living that dream with more than 7,000 child cancer survivors.

About Me

I am a tech noob developer: when I get an error, I find a cheat sheet for the syntax.
I have diversified skills in eating, sleeping, and coding... sometimes even running.
Currently sniffing around the metaverse, AI, ML, DS, AR/VR, and blockchain tech.
Aadhityaa S.B.
flowcv.me/aadhi

@aadhi06
