Project Ip 2019-20
A NumPy array is simply a grid that contains values of the same (homogeneous) type.
1d array
A one-dimensional array is a named group of contiguous elements having the same data type.
2d array
Example:
import numpy as np
List = [1, 2, 3, 4]
a1 = np.array(List)
print(a1)
Output:
[1 2 3 4]
NumPy refers to the dimensions of its arrays as axes. The axes of an ndarray also describe the order of
indexes in multidimensional ndarrays.
Every new dimension added gets the next axis number (as you can see in the three-dimensional
ndarray ndarray3 given above).
The data type and item size are related: the item size follows from the data type. For example, for data
type int16, the item size is 2 bytes (equal to 16 bits).
Shape
The shape of an array is a tuple of integers giving the size of the array along each dimension.
Data type
Another important term associated with ndarrays is the data type, also called dtype, which tells about
the type of data stored in the ndarray. Recall that ndarrays (NumPy arrays) store homogeneous elements,
that is, all elements have the same data type.
Syntax:
Example :
In: a.shape
Out: (4,)
In: a2.shape
Out: (2, 3)
Out: 4
In: a1.dtype
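As a minimal sketch of the attributes described above, the snippet below builds two small arrays and inspects their shape, dtype and itemsize. The array contents and the explicit int16 type are illustrative assumptions, not the original example data.

```python
import numpy as np

# Hypothetical arrays chosen only for illustration
a1 = np.array([1, 2, 3, 4], dtype=np.int16)              # 1D, 4 elements
a2 = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int16)    # 2D, 2 rows x 3 columns

print(a1.shape)     # (4,)
print(a2.shape)     # (2, 3)
print(a1.dtype)     # int16
print(a1.itemsize)  # 2 bytes, because int16 uses 16 bits
print(a2.ndim)      # 2 axes: axis 0 (rows) and axis 1 (columns)
```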
To create an ndarray from sequences of all types (numeric sequences, string sequences, dictionaries,
etc.), you can use the fromiter() function.
Syntax:
Example :
The above statement will create an ndarray from the keys of the dictionary adict, having a NumPy integer
data type. You can check this yourself by using the attributes dtype and itemsize with the ndarray name.
Also, you can now either access individual elements of the ndarray using indexes or display it in full, just
the way you normally do.
In: ar5.dtype, ar5.itemsize
Out: (dtype('int32'), 4)
In [2]: print(ar5[0], ar5[3])
Out: 1 4
Out: [1 2 3 4 5]
Example 2
In [3]: print(ar6)
['t' 'h' 'i' 's' 'i' 's' 'm' 'y' 'h' 'o' 'm' 'e']
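The fromiter() calls themselves are missing from the text above, so here is a hedged sketch of how such arrays could be built. The dictionary contents and the string are assumptions made for illustration; only the names adict, ar5 and ar6 come from the text.

```python
import numpy as np

# Hypothetical dictionary; iterating over a dict yields its keys
adict = {1: "a", 2: "b", 3: "c", 4: "d", 5: "e"}
ar5 = np.fromiter(adict, dtype=np.int32)
print(ar5)                       # [1 2 3 4 5]
print(ar5.dtype, ar5.itemsize)   # int32 4
print(ar5[0], ar5[3])            # 1 4

# A string is also a sequence: one single-character element per letter
ar6 = np.fromiter("home", dtype="U1")
print(ar6)                       # ['h' 'o' 'm' 'e']
```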
The arange() function is similar to Python's range() function, but it returns an ndarray instead of the
range object that Python's range() returns (a list in Python 2). In other words, arange() creates a NumPy
array with evenly spaced values within a specified numerical range.
Example:
arr5 = np.arange(7)
print(arr5)
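A short sketch of arange() with one argument and with start, stop and step follows. The second call and its values are an added illustration, not part of the original example.

```python
import numpy as np

arr5 = np.arange(7)            # 0 up to, but not including, 7
print(arr5)                    # [0 1 2 3 4 5 6]

evens = np.arange(2, 21, 2)    # start, stop (exclusive), step
print(evens)                   # [ 2  4  6  8 10 12 14 16 18 20]
```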
Syntax:
Example :
print(a1)
Array slicing
Array slicing refers to the process of extracting a subset of elements from an existing array and
returning the result as another array, possibly in a different dimension from the original.
1D array slicing
Array[3:7]
Out: array([ 8, 10, 12, 14])
Array[:5]
Out: array([ 2,  4,  6,  8, 10])
2D array slices
In: Array
[12,14,16,18,20],
[22,24,25,28,30],
[32,34,36,38,40] ])
[18,20],
[28,30] ] )
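The slicing steps can be sketched as follows. The arrays here are assumptions: they are built from even numbers so that the 1D results match the outputs shown above, and the 2D slice expression is an illustration rather than the original (missing) statement.

```python
import numpy as np

arr = np.arange(2, 21, 2)      # [ 2  4  6  8 10 12 14 16 18 20]
print(arr[3:7])                # [ 8 10 12 14]
print(arr[:5])                 # [ 2  4  6  8 10]

# 4 rows x 5 columns of even numbers from 2 to 40
grid = np.arange(2, 41, 2).reshape(4, 5)
sub = grid[:3, 3:]             # rows 0 to 2, columns 3 and 4
print(sub)
```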
The above example showed you the joining of 1D arrays. You can also use hstack() and vstack() to join arrays.
Example :
[3,4,5] ] )
[13,14,15] ] )
Array 3
[3,4,5],
[10,11,12],
[13,14,15] ] )
Array 4
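Since the example statements are incomplete, here is a minimal sketch of vstack() and hstack() on two 2D arrays. The names ar1 and ar2 and the first row of ar1 are assumptions.

```python
import numpy as np

ar1 = np.array([[0, 1, 2], [3, 4, 5]])
ar2 = np.array([[10, 11, 12], [13, 14, 15]])

v = np.vstack((ar1, ar2))   # stack vertically: 4 rows x 3 columns
h = np.hstack((ar1, ar2))   # stack horizontally: 2 rows x 6 columns
print(v)
print(h)
```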
Series
A Series is a pandas data structure that represents a one-dimensional array-like object containing an
array of data of any NumPy data type and an associated array of data labels, called its index.
Data frame
A data frame is a two-dimensional, labelled, array-like pandas data structure that stores an ordered
collection of columns that can store data of different types.
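A small sketch of both structures follows. All labels and values are hypothetical and serve only to show a labelled 1D Series next to a DataFrame whose columns hold different types.

```python
import numpy as np
import pandas as pd

# Series: one-dimensional data plus an index of labels
s = pd.Series([10, 20, 30], index=["a", "b", "c"])
print(s["b"])        # 20

# DataFrame: ordered columns, each with its own data type
df = pd.DataFrame(
    {"name": ["Asha", "Ravi"], "marks": [91.5, 78.0]},
    index=["r1", "r2"],
)
print(df.shape)      # (2, 2)
print(df.dtypes)     # one dtype per column
```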
The min() and max() functions find the minimum or maximum value, respectively, from a given set of
data, a data frame in our case.
<DataFrame>.min(axis=None, skipna=None, numeric_only=None)
Parameters:
axis. (0 or 1) By default, the minimum or maximum is calculated along axis 0; index (0), columns (1).
skipna. (True or False) Exclude NA/null values when computing the result.
numeric_only. (True or False) Include only float, int and boolean columns; if None, will attempt to use
everything, then use only numeric data.
Example :
In: sal_df.min()    In: sal_df.max()
Out:                Out:
2016    56000.0
2017    59000.0
2018    58500.0
2019    61000.0
dtype: float64
The function count() counts the non-NA entries for each row or column. The values None, NaN, NaT,
etc. are considered as NA in pandas.
Syntax:
Example :
In: sal_df.count()
Out:
2016    4
2017    4
2018    4
2019    1
dtype: int64
The function sum() returns the sum of the values for the requested axis. The syntax for sum() is:
<DataFrame>.sum(axis=None, skipna=None, numeric_only=None, min_count=0)
Example:
In: sal_df.sum()
Out:
2016    186500.0
2017    207000.0
2018    221000.0
2019    61000.0
dtype: float64
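The aggregate functions above can be tried on a small frame. This is a sketch with made-up numbers (demo_df is a hypothetical stand-in, not the sal_df of the examples); NaN marks a missing entry so that count() and sum() show how NA values are skipped.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the 2019 column has only one non-NA entry
demo_df = pd.DataFrame(
    {2018: [100.0, 200.0, 300.0, 400.0], 2019: [150.0, np.nan, np.nan, np.nan]},
    index=["Qtr1", "Qtr2", "Qtr3", "Qtr4"],
)

print(demo_df.min())         # smallest value in each column
print(demo_df.max())         # largest value in each column
print(demo_df.count())       # non-NA entries: 4 for 2018, 1 for 2019
print(demo_df.sum())         # column totals: 1000.0 and 150.0
print(demo_df.sum(axis=1))   # row totals; NaN is skipped by default
```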
What are quantiles?
Quantiles are points in a distribution that relate to the rank order of values in that distribution.
The quantile of a value is the fraction of observations less than or equal to the value. The quantile of the
median is 0.5, by definition. The 0.25 quantile (also known as the 25th percentile; percentiles are just
quantiles multiplied by 100) and the 0.75 quantile are known as quartiles, and the difference between
them is the interquartile range.
Example :
Out:
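A minimal sketch of quartiles and the interquartile range using the pandas quantile() function. The five values are hypothetical and chosen so the results are easy to check by hand.

```python
import pandas as pd

values = pd.Series([1, 2, 3, 4, 5])   # hypothetical data

q1 = values.quantile(0.25)      # first quartile: 2.0
median = values.quantile(0.5)   # median: 3.0
q3 = values.quantile(0.75)      # third quartile: 4.0
iqr = q3 - q1                   # interquartile range: 2.0
print(q1, median, q3, iqr)
```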
The var() function computes the variance and returns the unbiased variance over the requested axis.
Example:
In: sal_df.var()
Out:
2016    8.022917e+07
2017    5.299000e+07
2018    1.075000e+07
2019    NaN
dtype: float64
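The NaN for 2019 above follows from the unbiased formula, which divides by n - 1 and so is undefined for a single value. A small sketch with hypothetical numbers shows the same effect.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column "b" has a single non-NA value
demo = pd.DataFrame({"a": [2.0, 4.0, 6.0], "b": [5.0, np.nan, np.nan]})

v = demo.var()          # unbiased variance (divides by n - 1)
print(v)                # a -> 4.0, b -> NaN
```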
Creating histogram
You have read about histograms in frequency diagrams (statistics under economics) in class XI.
For a continuous dataset, you can create a histogram using the pandas hist() function.
There are many ways to create histograms in Python. In this section, we shall talk about hist() of pandas,
and in a later chapter, we shall create histograms using the pyplot library's histogram method.
A histogram is a plot that lets you discover, and show, the underlying frequency distribution (shape) of
a set of continuous data. Consider the following histogram that has been computed using the following
dataset containing the ages of 20 people.
To create a histogram from a data frame, you can use the hist() function of the data frame, which draws
one histogram for each data frame column. This function calls the pyplot library's hist() on each series in
the data frame, resulting in one histogram per column.
Syntax:
Parameters:
column. String or sequence; if passed, will be used to limit data to a subset of columns.
by. Object, optional; if passed, then used to form histograms for separate groups.
bins. Integer or sequence, default 10; number of histogram bins to be used. If an integer is given,
bins+1 bin edges are calculated and returned. If bins is a sequence, it gives the bin edges, including the
left edge of the first bin and the right edge of the last bin. In this case, bins is returned unmodified.
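To illustrate the bins rule without opening a plot window, the sketch below uses numpy.histogram as a stand-in for the binning that hist() performs; it is not the pandas hist() call itself. The 20 ages are hypothetical.

```python
import numpy as np

# Hypothetical ages of 20 people
ages = [3, 7, 12, 15, 18, 21, 24, 26, 29, 31,
        33, 35, 38, 41, 44, 47, 52, 56, 63, 70]

# Integer bins: 5 bins produce 5 + 1 = 6 bin edges
counts, edges = np.histogram(ages, bins=5)
print(len(counts), len(edges))     # 5 6

# Sequence bins: the given edges are used as they are
counts2, edges2 = np.histogram(ages, bins=[0, 20, 40, 60, 80])
print(counts2)                     # [5 8 5 2]
print(edges2)                      # [ 0 20 40 60 80]
```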
● apply() is a series function, so it applies the given function to one row or one column of the dataframe
(as a single row/column of a dataframe is equivalent to a series).
● applymap() is an element function, so it applies the given function to each individual element,
separately, without taking into account other elements.
Syntax:
Parameters
funcname: the function to be applied on the series inside the data frame, i.e., on rows and columns. It
should be a function that works with series and similar objects.
Example: apply()
Out[48]:
2016    46625.0
2017    51750.0
2018    55250.0
2019    61000.0
dtype: float64
Out[47]:
For apply(), by default the axis is 0, so the function is applied on each individual column. To apply the
function row-wise, you may write:
Example:
Out[58]:
Qtr1    48725.000000
Qtr2    51033.333333
Qtr3    53666.666667
Qtr4    55500.000000
dtype: float64
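The input statements for these outputs are missing, so here is a hedged sketch of column-wise and row-wise apply() on a hypothetical frame. The function used (range of a series) is an illustrative choice, not the one from the original example.

```python
import pandas as pd

# Hypothetical data
demo = pd.DataFrame(
    {"x": [1.0, 2.0, 3.0], "y": [10.0, 20.0, 30.0]},
    index=["r1", "r2", "r3"],
)

def value_range(series):
    """Difference between the largest and smallest value of a series."""
    return series.max() - series.min()

by_column = demo.apply(value_range)          # axis=0 (default): one result per column
by_row = demo.apply(value_range, axis=1)     # axis=1: one result per row
print(by_column)    # x -> 2.0, y -> 20.0
print(by_row)       # r1 -> 9.0, r2 -> 18.0, r3 -> 27.0
```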
Function groupby()
The groupby() function rearranges data into groups based on some criteria and stores the rearranged data
in a new groupby object. You can apply aggregate functions on the groupby object using agg().
Syntax:
axis. {0 or index, 1 or columns}, default 0; split along rows (0) or columns (1).
If you want to create groups on multiple columns, all you need to do is pass a sequence to groupby()
that contains the name of the first group column, followed by the second group column, followed by the
third, and so on.
Often in data science you need to have summary statistics in the same table. You can achieve this using
the agg() method on the groupby object created using the groupby() method.
The agg() method aggregates the data of the data frame using one or more operations over the
specified axis. The syntax for using agg() is:
Parameters
axis. {0 or index, 1 or columns}, default 0. If 0 or index: apply the function to each column. If 1 or
columns: apply the function to each row.
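A minimal sketch of groupby() with agg(), including grouping on two columns. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical employee data
emp = pd.DataFrame({
    "dept": ["A", "A", "B", "B", "B"],
    "city": ["X", "Y", "X", "X", "Y"],
    "salary": [10, 20, 30, 40, 50],
})

# Group on one column, then compute several summary statistics at once
summary = emp.groupby("dept")["salary"].agg(["sum", "mean", "count"])
print(summary)

# Group on multiple columns by passing a sequence of column names
by_dept_city = emp.groupby(["dept", "city"])["salary"].sum()
print(by_dept_city)
```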
When you create a data frame object, it gets its row numbers (or the index) and column labels
automatically. But sometimes we are not satisfied with the row and column labels of a data frame. For
this, pandas offers you a major functionality: you may change the row index and column labels as and
when you require.
Index refers to the labels of axis 0, i.e., row labels, and columns refers to the labels of axis 1, i.e., column
labels. There are many similar methods provided by the pandas library that help you change, rearrange
or rename indexes or column labels. Thus, you should read the following lines carefully to know the
difference between the working of these methods.
1. rename(): a method that simply renames the index and/or column labels in a dataframe.
2. reindex(): a method that can specify the new order of existing indexes and column labels and
also create new indexes/column labels.
3. reindex_like(): a method for creating indexes/column labels based on another dataframe
object.
The rename() function renames the existing indexes/column labels in a data frame. The old and new
index/column labels are to be provided in the form of a dictionary where the keys are the old indexes/row
labels and the values are the new names for the same, e.g.
Syntax:
Parameters
inplace. Boolean, default False (which returns a new dataframe with renamed index/labels). If True,
then changes are made in the current data frame and a new data frame is not returned.
labels. Array-like, optional; new labels/index to conform to the axis specified by axis.
index, columns. Array-like, optional; new labels/index to conform to; should be specified using
keywords. Preferably an Index object to avoid duplicating data.
fill_value. The value to be filled in the newly added rows/columns.
ndf.reindex(['Qtr4', 'Qtr1', 'Qtr3', 'Qtr1'])
ndf.reindex(['Qtr4', 'Qtr1', 'Qtr3', 'Qtr2'], axis=0)
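The difference between rename() and reindex() can be sketched as follows. The frame contents are hypothetical; only the name ndf and the Qtr labels come from the text.

```python
import pandas as pd

# Hypothetical data frame with quarter labels as its index
ndf = pd.DataFrame({2018: [1, 2, 3, 4]},
                   index=["Qtr1", "Qtr2", "Qtr3", "Qtr4"])

# rename(): old label -> new label; returns a new frame unless inplace=True
renamed = ndf.rename(index={"Qtr1": "Q1"}, columns={2018: "Y2018"})

# reindex(): put existing rows in a new order
reordered = ndf.reindex(["Qtr4", "Qtr1", "Qtr3", "Qtr2"])

# reindex() with a label that does not exist yet: the new row gets fill_value
extended = ndf.reindex(["Qtr1", "Qtr5"], fill_value=0)

print(renamed)
print(reordered)
print(extended)
```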
The reindex_like() function works on a data frame and reindexes its data as per the argument data
frame passed to it. This function does the following things:
(a) If the current data frame has some row indexes/column labels matching those of the passed data
frame, then it retains the index/label and its data.
(b) If the current data frame has some row indexes/column labels which are not in the passed
data frame, it drops them.
(c) If the current data frame does not have some row indexes/column labels which are in the passed
data frame, then it adds them to the current data frame with values as NaN.
(d) reindex_like() ensures that the current data frame object conforms to the same indexes/
labels on all axes.
Syntax:
Parameters
other. Name of a data frame as per which the current <DataFrame> is to be reindexed.
Now if you reindex ndf2 as per sal_df, you will issue the command as:
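Rules (a) to (d) can be checked on two small frames. This is a sketch with hypothetical data; the names current and other are placeholders and do not reproduce ndf2 or sal_df.

```python
import pandas as pd

# Hypothetical frames
current = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["r1", "r2"])
other = pd.DataFrame({"b": [0, 0, 0], "c": [0, 0, 0]}, index=["r2", "r3", "r4"])

result = current.reindex_like(other)
print(result)
# Row r2 / column b exist in both frames, so that value (4) is retained.
# Row r1 and column a are not in other, so they are dropped.
# Rows r3, r4 and column c are new, so they are filled with NaN.
```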
namely histograms, bar charts, power spectra, error charts, etc. It is used along with NumPy to provide
an environment that is an open-source alternative to MATLAB.
Pyplot provides the state-machine interface to the plotting library in matplotlib. It means that figures
and axes are implicitly and automatically created to achieve the desired plot. For example,
calling plot from pyplot will automatically create the necessary figure and axes to achieve the desired
plot. Setting a title will then automatically set that title to the current axes object.
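Example program
import numpy as np
import matplotlib.pyplot as plt
# Placeholder labels: the original label list is missing from the source
label = ['A', 'B', 'C', 'D', 'E', 'F']
per = [94, 85, 45, 25, 50, 54]
index = np.arange(len(label))
plt.bar(index, per)
plt.xticks(index, label, rotation=30)
plt.show()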
Histogram in Python –
There are various ways to create a histogram in Python pandas. One of them is using the matplotlib
Python library. Using this library we can easily create a histogram with just a few statements.
So install the matplotlib library using the following statement at the command prompt:
> pip install matplotlib
After installation we can create a histogram. If pip does not work, then copy the pip.exe file to the folder
where we want to run the above command, or move to the folder of pip.exe and then write the above command.
E.g. Program in Python. Develop a Python program with the code below and execute it.
import numpy as np
import matplotlib.pyplot as plt
data = [1, 11, 21, 31, 41]
plt.hist(data, edgecolor="red")
plt.show()
Scatter plots
A scatter plot is a two-dimensional data visualization that uses dots to represent the values obtained for
two different variables - one plotted along the x-axis and the other plotted along the y-axis.
Example program
import matplotlib.pyplot as plt
weight1 = [93.3, 67, 62.3, 43, 71, 71.8]
height1 = [116.3, 110.7, 124.8, 176.3, 137.1, 113.9]
plt.scatter(weight1, height1)
plt.show()
Function application
The pandas library of Python offers many handy functions and functionalities, which are very useful for
carrying out many data science related tasks. One such functionality is function application. By function
application, it means that a function (a library function or user-defined function) may be applied on a
data frame in multiple ways:
For the above-mentioned three types of function application, pandas offers the following three functions:
(a) pipe(): data frame wise function application.
(b) apply(): row-wise/column-wise function application.
(c) applymap(): individual element wise function application.
Other than the above three, there are two more function application functions: aggregation through
groupby() and transform().
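As a minimal sketch of data frame wise application, pipe() hands the whole frame to a function and passes on any extra arguments. The helper add_bonus and the data are hypothetical.

```python
import pandas as pd

def add_bonus(frame, amount):
    """Hypothetical helper: add a fixed amount to every value of the frame."""
    return frame + amount

demo = pd.DataFrame({"x": [1, 2], "y": [3, 4]})

# pipe() calls add_bonus(demo, 10) and returns its result
result = demo.pipe(add_bonus, 10)
print(result)
```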
Agile software development refers to the software process models that are: people focused,
communication oriented, flexible (ready to adapt to expected change at any time), speedy (encourage
rapid and iterative development of the product in small releases), lean (focused on shortening timeframe
and cost and on improved quality), responsive (react appropriately to expected and unexpected changes)
and learning (focused on improvement during and after product development).
Pair programming
Pair programming is a practice of software development wherein two programmers work as a pair to
develop the software while sitting at the same workstation. One programmer thinks and the other
codes. Both programmers keep swapping their roles.
• User Satisfaction
• Flexibility in development
Scrum
In software terms, Scrum is an agile software product development strategy that organizes software
developers as a team to reach a common goal of creating a ready-for-market product.
The Sprint
The main part of Scrum is a Sprint, where a usable and potentially releasable product Increment is
created. Sprints can be one week to one month in length. There are three events in each Sprint:
Daily Scrum – The Development Team meets for 5 to 15 minutes daily to inspect progress toward the Goal.
Sprint Review – The Team reviews the tasks completed as per the Backlog.
Sprint Retrospective – The Team discusses what went right and wrong in development, and how to improve.
Use case diagrams are a formal way of representing how a business system interacts with its
environment by illustrating the activities that are performed by the users of the system.
The main purpose of a use case diagram is to determine the functional requirements of a system.
The various elements of a use case diagram are actors, use cases, communication and relationships.
1. Actor. It is a person or thing which is outside the system and is involved in a task, e.g., in a
banking system, account holders are not part of the system but they are involved in banking
tasks such as deposits, withdrawals and so forth.
2. Use case. It represents a task of the system with which actors interact; in other words, it is the
way to depict one functionality of the system.
3. Communication. It is the linking line joining an actor and its task.
4. Boundary of system. It depicts the system in totality, that is, what all use cases together make a
system. Remember, actors are not part of the system. They are outside the system and use the
system.
5. Types of relationship. Two use cases may have their own relationship without involving any external
actor. Use cases generally have relationships of the following three types:
(a) Include relationship
(b) Extend relationship
(c) Generalisation relationship
[Figure: use case diagram in which two actors are joined by communication lines to the use case "Buy ticket"]
• <<include>> use case: The time to use the <<include>> relationship is after you have completed the
first-cut description of all your main use cases.
• <<extend>> use case: The <<extend>> use case accomplishes this by conceptually inserting additional
action sequences into the base use case sequence.
• Abstract and generalized use case: The general use case is abstract. It cannot be instantiated, as it
contains incomplete information. The title of an abstract use case is shown in italics.
Version control systems are a specific, specialized set of software tools that help a software team
manage changes to source code over time.
Features of VCS
• Easy to trace changes in code to find the version that introduced a bug
Software engineering
Software engineering is a structured, systematic approach for the design, development and
maintenance of software systems.
Software process refers to a set of logically related activities, which are carried out in a systematic
order that leads to the production of the software to be delivered.
There are some fundamental activities that are common to all software processes, known as process
activities. These are:
The waterfall model is a linear and sequential approach to software development where
software develops systematically from one phase to another in a downward fashion. This model
is divided into different phases, and the output of one phase is used as the input of the next
phase.
Advantages of waterfall model
Easy to understand
Easy to arrange tasks
Clearly defined stages
Easy to manage
Well understood milestones
Evolutionary model
In the evolutionary model, an initial software implementation is rapidly developed from very
abstract specifications, which is then iteratively modified according to the users' appraisal of the
software.
Advantages of Evolutionary Model
Error reduction: Versions are tested at each incremental cycle.
User satisfaction: Users are given a full chance of experimenting with the partially
developed system.
High quality: Quality is maintained due to thorough testing.
Low risk: There is a significant reduction of risk as versions are implemented.
Reduced cost: It reduces cost by providing structured and disciplined steps.
Disadvantages
Requirement changes affect the software development.
Control over the system evolution is lost.
Delivery model
The delivery models that iteratively update the system for changes are also known as process
iteration models.
There are two such delivery models that support process iteration:
Incremental model
The spiral model comprises software process activities organised in a spiral and has many
cycles. This model combines the features of the prototyping model and the waterfall model and is
advantageous for large, complex and expensive software systems.
SQL
In order to access data within the Oracle database, all programs and users must use Structured Query
Language (SQL). SQL is the set of commands that is recognised by nearly all RDBMSs. SQL commands
can be divided into the following categories.
DDL commands. The Data Definition Language commands, as the name suggests, allow you to perform
tasks related to data definition. That is, through these commands you can perform tasks like create, alter
and drop schema objects, grant and revoke privileges, etc.
DML commands. As the name suggests, these are used to manipulate data. That is, DML commands query
and manipulate data in existing schema objects.
TCL commands. These allow you to manage and control transactions (a transaction is one complete
unit of work involving many steps). Example:
Creating table
Order date,
Customer id integer,
Amount integer,
SQL join
Syntax:
Example :
Syntax:
Example :
To update the roll and QOH for items having code less than '1040', we shall write:
UPDATE items
SET roll = 400, QOH = 700
WHERE <code column> < '1040';
To delete some data from tables, you can use the SQL DELETE command. The DELETE command removes
rows from a table. This removes the entire row, not individual field values, so no field argument is
needed or accepted.
Ex:
Altering tables
Syntax:
Example
Group by function
The GROUP BY clause combines all those records that have identical values in a particular field or a group
of fields. This grouping results in one summary record per group if group functions are used with it. In
other words, the GROUP BY clause is used in SELECT statements to divide the table into groups. Grouping
can be done by column name or with aggregate functions, in which case the aggregate produces a value
for each group.
For example, to calculate the number of employees in each job, you use the command:
SELECT job, COUNT(*)
FROM employee
GROUP BY job;
Output:
Job Count(*)
Analyst 2
Clerk 4
Manager 3
President 1
Salesman 4
Now consider the following query, which also groups records, this time based on deptno:
From employee
Group by deptno;
Output:
Having function:
The HAVING clause places conditions on groups, in contrast to the WHERE clause, which places conditions
on individual rows. While WHERE conditions cannot include aggregate functions, HAVING conditions can do so.
For example:
To display the jobs where the number of employees is less than 3, you can use the command:
From employee
Group by job
Group by deptno
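The GROUP BY and HAVING behaviour described above can be tried from Python. This is a sketch that uses the standard sqlite3 module with an in-memory SQLite database as a stand-in for Oracle; the table layout and the employee names are assumptions, and the rows are made up so that the per-job counts match the output listed earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ename TEXT, job TEXT)")

# Hypothetical rows: 2 analysts, 4 clerks, 3 managers, 1 president, 4 salesmen
jobs = (["ANALYST"] * 2 + ["CLERK"] * 4 + ["MANAGER"] * 3
        + ["PRESIDENT"] + ["SALESMAN"] * 4)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(f"emp{i}", job) for i, job in enumerate(jobs)],
)

# One summary row per job
per_job = conn.execute(
    "SELECT job, COUNT(*) FROM employee GROUP BY job ORDER BY job"
).fetchall()

# HAVING filters whole groups: keep only jobs with fewer than 3 employees
small_jobs = conn.execute(
    "SELECT job, COUNT(*) FROM employee "
    "GROUP BY job HAVING COUNT(*) < 3 ORDER BY job"
).fetchall()
conn.close()

print(per_job)
print(small_jobs)   # [('ANALYST', 2), ('PRESIDENT', 1)]
```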
Consider the following table named "SOFTDRINK". Write SQL commands for the following.
6. To display the names and drink codes of those drinks that have more than 120
calories.
7. To display the drink codes, names and calories of all drinks, in descending order of
calories.
8. To display the names and prices of drinks that have a price in the range 12 to 18 (both
12 and 18 included).
9. Increase the price of all drinks in the given table by 10%.
Answers
1. SELECT DNAME, DRINKCODE
FROM SOFTDRINK
FROM SOFTDRINK
4. UPDATE SOFTDRINK
SET PRICE = PRICE + 0.10 * PRICE;
Answer :
The general principles of the Agile Method
• Developers and users must work together throughout the software development.
• At regular intervals, the team focuses on how to become more effective, so as to tune and adjust
Answer :
e.g. program
import numpy as np
A = np.array([1, 2, 3, 4, 5, 6])
B = np.reshape(A, (2, 3))
print(B)
OUTPUT
[[1 2 3]
[4 5 6]]
[Figure: use case diagram with the actors Manager, Publisher, Product manager, Chief editor and Online customer, and the use cases "Manage titles" and "Create orders"]