
Hindi Vidya Prachar Samiti’s

Ramniranjan Jhunjhunwala College of Arts, Science


& Commerce
(Empowered Autonomous College)

Affiliated to
UNIVERSITY OF MUMBAI

DEPARTMENT OF INFORMATION TECHNOLOGY


2023-2024

T.Y.B.Sc. (IT) SEM VI

PAPER RJSUITp603 – Business Intelligence

Name:- Shivam Vishwakarma

Roll No:- 6415


Shivam Vishwakarma 6415 BI

Hindi Vidya Prachar Samiti’s


Ramniranjan Jhunjhunwala College of Arts, Science &
Commerce
(Empowered Autonomous College)

Affiliated to
UNIVERSITY OF MUMBAI

This is to certify that Mr./Ms. Vishwakarma Shivam Suresh Sushila,
Roll No. 6415 of TY BSc IT class, has completed the required number of
experiments of the practical course Business Intelligence, in partial
fulfillment of the requirements for the award of the degree of Bachelor
of Science (Information Technology) during the academic year 2023-2024.

College seal Sign of Co-ordinator

R.J.College

Index
Business Intelligence
2023-24

Sr. No. Title Date Remark

1 Data Analysis using Excel 07/12/2023

2 What-if Analysis using Excel and Power BI 14/12/2023

3 Creating Pivot Table and Pivot Chart using Excel and Power BI 04/01/2024

4 Import data from legacy data structures. 11/01/2024

5 ETL and Data Modeling in Power BI 11/01/2024

6 Creating Reports and Charts in Power BI 11/01/2024

7 Creating Dashboards in Power BI 11/01/2024

8 DAX Queries in Power BI 18/01/2024

9 Implementation of Regression Algorithms using Python 25/01/2024

10 Implementation of Classification Algorithms using Python 08/02/2024

11 Implementation of Clustering Algorithms using Python 23/02/2024


Practical 1 - Data Analysis using Excel

Dec 7, 2023
Demonstrate the use of various features of Excel that are required to do the data analysis.
Example: Result Analysis

Result Analysis

Suppose we start with the following data set:

Marks

Roll no Name Sub 1 Sub 2 Sub 3 Sub 4

1 Vikas Yadav 25 35 60 56

2 Amit Mishra 45 60 55 78

3 Sunita Yadav 56 42 40 84

4 Priyanka Gupta 45 35 20 38

5 Anita Shetty 37 40 35 90

6 Ranjan Yadav 56 56 35 60

7 Ankita Mishra 34 30 60 61

8 Vaibhav Yadav 89 78 90 89

9 Soniya Gupta 24 78 35 45

10 Sunil Shetty 45 56 40 89

Operation 1
We want the total marks of each student.

Use either form of the sum formula:

1) =SUM(start cell:end cell)

Or
2) =(cell1 + cell2 + cell3 + …)

The sum of the marks from Sub 1 to Sub 4 now appears in the Total Marks column.

You do not need to retype the formula for the other students.
Select the cell that contains the formula and drag it down the column to the rows where
you want to apply the same formula.

After you release the mouse button, the total marks of every student appear in the
Total column.

Operation 2
Now let's derive the percentage of each student.
The percentage is calculated as
(marks obtained in all subjects / maximum marks of all subjects) * 100
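Operations 1 and 2 can be cross-checked outside Excel with a short Python sketch. The marks are copied from the table above; `MAX_PER_SUBJECT = 100` is an assumption, because the journal does not state the maximum marks per subject.

```python
# Marks from the Result Analysis table: name -> [Sub 1, Sub 2, Sub 3, Sub 4]
marks = {
    "Vikas Yadav": [25, 35, 60, 56],
    "Amit Mishra": [45, 60, 55, 78],
    "Sunita Yadav": [56, 42, 40, 84],
    "Priyanka Gupta": [45, 35, 20, 38],
    "Anita Shetty": [37, 40, 35, 90],
    "Ranjan Yadav": [56, 56, 35, 60],
    "Ankita Mishra": [34, 30, 60, 61],
    "Vaibhav Yadav": [89, 78, 90, 89],
    "Soniya Gupta": [24, 78, 35, 45],
    "Sunil Shetty": [45, 56, 40, 89],
}

MAX_PER_SUBJECT = 100  # assumption: not stated in the journal


def total(scores):
    """Same job as =SUM(start cell:end cell) for one row."""
    return sum(scores)


def percentage(scores, max_per_subject=MAX_PER_SUBJECT):
    """(marks obtained / maximum marks of all subjects) * 100."""
    return total(scores) / (max_per_subject * len(scores)) * 100


totals = {name: total(scores) for name, scores in marks.items()}
percentages = {name: percentage(scores) for name, scores in marks.items()}

for name in marks:
    print(f"{name:15} total={totals[name]:3d} percentage={percentages[name]:.2f}")
```

If the real maximum per subject is different, only `MAX_PER_SUBJECT` needs to change.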


Operation 3
The percentage is displayed with decimal places. To show it as a whole number only,
apply the formula
=ROUNDUP(cell, number of digits)
with 0 as the number of digits.

The percentage now appears as a whole number.
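Excel's ROUNDUP always rounds away from zero, which is not the same as ordinary rounding. A minimal Python sketch of that behaviour is given below; the helper name `roundup` is my own, and `decimal` is used so that values such as 1.1 are not disturbed by binary floating point.

```python
from decimal import Decimal, ROUND_UP


def roundup(number, num_digits=0):
    """Round away from zero to num_digits decimal places, like =ROUNDUP()."""
    quantum = Decimal(1).scaleb(-num_digits)  # e.g. 0.01 when num_digits is 2
    return float(Decimal(str(number)).quantize(quantum, rounding=ROUND_UP))


print(roundup(59.5))        # percentage with decimals -> whole number
print(roundup(51.75))
print(roundup(3.14159, 2))
```

With 0 digits, a percentage such as 51.75 becomes 52, so every student with a fractional percentage is rounded upward.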

Operation 4
Suppose you want to mark the students who fail in individual subjects.
A student fails a subject if he or she scores less than 40 marks in it.

Step 1: Select your table

Step 2: Go to the ‘Data’ tab and select the ‘Filter’ option

After you click Filter, a down-arrow symbol appears in each column heading of the table;
it opens the filtering options for that column

Step 3: Click the arrow of the ‘Sub 1’ column, uncheck ‘Select All’, select only the marks
that match the fail criterion, and click OK

Only the students who scored less than 40 marks in Sub 1 are now visible.
Select these Sub 1 marks and color them red; these students fail Sub 1 because they scored
less than 40 marks.

Now click the Sub 1 column arrow again, choose ‘Select All’, and click OK so that all the
records are visible again. This is important: if you forget, later operations act only on
the filtered rows, not on the entire data.

Repeat the same steps for the Sub 2, Sub 3, and Sub 4 columns.

After filtering the data for failed students, the students whose marks are colored red in a
subject column have failed that particular subject.

To remove the filtering option, select the data table again, go to the Data tab, and
deselect the Filter option. The filter arrows are then removed from the column
headings.
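The fail list for every subject in Operation 4 can be reproduced with a simple filter. This is only a sketch to verify the red-marked cells; the marks come from the table at the start of the practical and the pass mark of 40 from the fail criterion above.

```python
PASS_MARK = 40  # fail criterion from Operation 4: less than 40 marks

subjects = ["Sub 1", "Sub 2", "Sub 3", "Sub 4"]
marks = {
    "Vikas Yadav": [25, 35, 60, 56],
    "Amit Mishra": [45, 60, 55, 78],
    "Sunita Yadav": [56, 42, 40, 84],
    "Priyanka Gupta": [45, 35, 20, 38],
    "Anita Shetty": [37, 40, 35, 90],
    "Ranjan Yadav": [56, 56, 35, 60],
    "Ankita Mishra": [34, 30, 60, 61],
    "Vaibhav Yadav": [89, 78, 90, 89],
    "Soniya Gupta": [24, 78, 35, 45],
    "Sunil Shetty": [45, 56, 40, 89],
}


def failed_students(subject_index):
    """Names of students whose mark in the given subject is below PASS_MARK."""
    return [name for name, scores in marks.items()
            if scores[subject_index] < PASS_MARK]


failures = {sub: failed_students(i) for i, sub in enumerate(subjects)}

for sub, names in failures.items():
    print(f"{sub}: {', '.join(names)}")
```

Unlike the Excel filter, nothing has to be reset between subjects, because each subject is filtered from the full data.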

Operation 5
Let's highlight the percentages that are greater than 80%.

Select the cells

Go to Home tab → Conditional Formatting → Highlight Cells Rules → Greater Than

I formatted the matching cells with an orange font color and bold style

After clicking OK, the matching cells are highlighted

Operation 6
Let's highlight the percentages that are less than 45%.
Select the cells again
Go to Home tab → Conditional Formatting → Highlight Cells Rules → Less Than

Apply custom formatting to values less than 45%: red font color and bold style

Result


Operation 7
Let's find the highest and the lowest scorer based on the total marks obtained by the students.
We will mark the highest scorer's cell in yellow and the lowest scorer's cell in red.

Select the cells

Go to Home tab → Conditional Formatting → Top/Bottom Rules → Top 10 Items

The following dialog box opens. Enter the number of items you want and choose the
formatting style.
I chose the top 1 item with Yellow Fill with Dark Yellow Text and clicked OK

The highest scorer is now highlighted in yellow

Do the same for the lowest scorer

Select the cells

Go to Home tab → Conditional Formatting → Top/Bottom Rules → Bottom 10 Items
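The cells that Operations 5 to 7 should highlight can be predicted with a few comparisons. In this sketch the totals are the row sums of the marks table; `MAX_TOTAL = 400` (4 subjects of 100 marks each) is an assumption, since the journal does not state the maximum marks.

```python
# Row sums of the marks table (Sub 1 + Sub 2 + Sub 3 + Sub 4)
totals = {
    "Vikas Yadav": 176,
    "Amit Mishra": 238,
    "Sunita Yadav": 222,
    "Priyanka Gupta": 138,
    "Anita Shetty": 202,
    "Ranjan Yadav": 207,
    "Ankita Mishra": 185,
    "Vaibhav Yadav": 346,
    "Soniya Gupta": 182,
    "Sunil Shetty": 230,
}

MAX_TOTAL = 400  # assumption: 4 subjects x 100 marks each

percentages = {name: t / MAX_TOTAL * 100 for name, t in totals.items()}

above_80 = [name for name, p in percentages.items() if p > 80]  # Operation 5
below_45 = [name for name, p in percentages.items() if p < 45]  # Operation 6
highest = max(totals, key=totals.get)                           # Operation 7, top 1
lowest = min(totals, key=totals.get)                            # Operation 7, bottom 1

print("Above 80%:", above_80)
print("Below 45%:", below_45)
print("Highest:", highest, "| Lowest:", lowest)
```

The comparison uses the unrounded percentage; if the rounded-up values from Operation 3 are formatted instead, borderline students can move across a threshold.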


Operation 8
Let's find the top 10% scorers in each subject in the same way as above,
but this time with custom formatting

I chose a blue font color and a light blue fill

Click OK, and then OK again

The top 10% of each subject now appears with the blue cell formatting


Operation 9
Let's go through the following three options, which are used for formatting cells

1) Data Bars

2) Color Scales

3) Icon Sets

Use the Clear Rules option to remove the conditional formatting rules from the cells

Operation 10
Data Visualization
Select the Total Marks column
Go to Insert → Charts and select a bar chart

The following chart appears

Format the chart as you prefer

Another chart example

Line chart example


Link to Spreadsheets
https://docs.google.com/spreadsheets/d/1P-j0sW05YQ_-scZfc0OyWJEIkTMXljc4/edit?usp=drive_link&ouid=106011563062589671609&rtpof=true&sd=true

https://docs.google.com/spreadsheets/d/1GtJMYp1Kb6Vww85lPQPAa8AUg-3M593h/edit?usp=drive_link&ouid=106011563062589671609&rtpof=true&sd=true


Practical 2 - What-if Analysis using Excel and Power BI

Dec 14, 2023


Demonstrate what-if analysis using goal seek, scenario manager, and data table for your data.

Example 1 - Goal Seek


Semester wise SGPA scored

Problem Statement - Suppose you are in semester 6 and want to finish your degree with an
aggregate CGPA of 9.5. You want to know what SGPA you have to score in SEM 6 to reach
that aggregate CGPA.

Go to Data → What-If Analysis → Goal Seek..

This window will appear

Set cell → final CGPA cell


To value → CGPA score you want to achieve

By changing cell → Sem 6 grade point cell

Click on OK

Conclusion
To achieve an aggregate CGPA of 9.5, you need to score an SGPA of 9.20 in SEM 6

Example 2 - Goal Seek

You are given the profit of Company ABC for each year.
You want to know how much profit the company has to make in the year 2024 - 25 to reach
an aggregate profit of Rs 25,00,000


Go to → Data → What-If analysis → Goal Seek..

Set cell → total


To value → 25 lakh
By changing cell → profit of year 2024 - 25

Conclusion
The company has to make a profit of Rs 5,98,000 in the year 2024 - 25 to reach an
aggregate profit of Rs 25,00,000
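Goal Seek searches for the input value that makes a formula cell hit a target. The sketch below imitates that with plain bisection, which is a stand-in I chose for illustration and not necessarily the method Excel uses. `PRIOR_YEARS_TOTAL` is not listed in the journal; it is the value implied by the conclusion (25,00,000 - 5,98,000).

```python
def goal_seek(formula, target, lo, hi, tol=1e-6, max_iter=200):
    """Find x in [lo, hi] with formula(x) close to target by bisection.

    Assumes formula is increasing on [lo, hi].
    """
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        if formula(mid) < target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2


PRIOR_YEARS_TOTAL = 1_902_000  # implied by the conclusion, not stated directly


def aggregate_profit(profit_2024_25):
    """The 'Set cell': total profit including the changing cell."""
    return PRIOR_YEARS_TOTAL + profit_2024_25


required = goal_seek(aggregate_profit, target=2_500_000, lo=0, hi=2_500_000)
print(f"Required profit in 2024-25: {required:.0f}")
```

Here `aggregate_profit` plays the role of the Set cell, `target` is the To value, and the argument being searched is the By changing cell.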


Example 3 - Scenario Manager

Go to
Data → What-If Analysis → Scenario Manager

Click on Add

Name your scenario and specify the changing cell; in my case the number of customers is the
changing cell
Click on OK

Enter the value for the number of customers (75 in my case)

The scenario is now added to the Scenario Manager

Add a few more scenarios

I have a total of 3 scenarios

Select a scenario and click on Show

I chose the scenario for 84 customers


Example 4 - Scenario Manager


The Total column uses the formula

Rate * Qty

and the Total row uses the formula

Wheat Total + Rice Total


Scenario 1 Result

Scenario 2 Result
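The idea behind the Scenario Manager is to keep several named sets of input values and recompute the same formulas for each set. A rough sketch for the Wheat and Rice sheet follows. The rates and quantities are hypothetical (the real values are only visible in the screenshots), and treating Qty as the changing cells is also an assumption.

```python
def grand_total(rates, quantities):
    """Total column = Rate * Qty per item; Total row = sum of the item totals."""
    return sum(rates[item] * quantities[item] for item in rates)


rates = {"Wheat": 30, "Rice": 50}  # hypothetical rates

# Each scenario stores its own values for the changing cells (here: Qty)
scenarios = {
    "Scenario 1": {"Wheat": 10, "Rice": 5},   # hypothetical quantities
    "Scenario 2": {"Wheat": 20, "Rice": 8},   # hypothetical quantities
}

results = {name: grand_total(rates, qty) for name, qty in scenarios.items()}

for name, value in results.items():
    print(f"{name}: grand total = {value}")
```

Clicking Show in Excel corresponds to picking one entry of `scenarios` and recalculating.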


Example 5 - Data Table

Balance

Interest


One-variable data table example

Activity 1 → Calculate the balance for different initial investments

Select the cells as shown below

Go to
Data → What-If Analysis → Data Table

The dialog box below appears. Select the cell that holds the original initial investment, i.e. 2000, and click on OK

You get the balance for the different initial investments

Activity 2 → Calculate the balance for different annual rates of interest

Select the cells as shown below

Go to
Data → What-If Analysis → Data Table

Set the column input cell to the original annual rate of interest, i.e. 5%

Activity 3 → Calculate the balance for different numbers of years

Select the cells as shown below

Go to
Data → What-If Analysis → Data Table

Set the column input cell to the original number of years, i.e. 5
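A one-variable data table recomputes one formula while a single input runs through a list of trial values. The sketch below does the same for the three activities. The base inputs (2000, 5%, 5 years) are taken from the steps above; the balance formula (interest compounded once a year) and the trial values are assumptions, because the sheet's formula is only visible in the screenshots.

```python
def balance(principal, annual_rate, years):
    """Assumed model: interest compounded once a year."""
    return principal * (1 + annual_rate) ** years


# Original input cells from the activities above
BASE = {"principal": 2000, "annual_rate": 0.05, "years": 5}


def one_variable_table(input_name, trial_values):
    """Recompute the balance for each trial value, keeping other inputs at BASE."""
    return {value: balance(**{**BASE, input_name: value})
            for value in trial_values}


by_principal = one_variable_table("principal", [1000, 2000, 3000])  # Activity 1
by_rate = one_variable_table("annual_rate", [0.04, 0.05, 0.06])     # Activity 2
by_years = one_variable_table("years", [3, 5, 10])                  # Activity 3

for value, result in by_principal.items():
    print(f"initial investment {value}: balance {result:.2f}")
```

The `input_name` argument corresponds to the input cell chosen in the Data Table dialog box.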

Two-variable data table example

Activity 1 → Calculate the balance and interest for different initial investments

Select the cells as shown below

Go to
Data → What-If Analysis → Data Table

The dialog box below appears. Select the cell that holds the original initial investment, i.e. 2000, and click on OK

Example 6 - Data table


Activity 1 → Check the total profit for different percentages of units sold at the highest price

Select the cells as shown below

Go to
Data → What-If Analysis → Data Table

The dialog box below appears. Select the cell of % units sold at the highest price, i.e. 60%

Activity

Select the cells as shown below

Go to
Data → What-If Analysis → Data Table

The dialog box below appears

As the rows represent the highest price,
set the row input cell to the original highest price, i.e. 50,
and the column input cell to the original % sold, i.e. 60

And the result


Link to Spreadsheet
https://docs.google.com/spreadsheets/d/1ITi_MkKjpcGj8WzbrNvjEEiuPHwOmnAZ/edit?usp=drive_link


Practical 3 - Creating Pivot Table and Pivot Chart using Excel and Power BI

Jan 4, 2024
Demonstrate the creation of one-dimensional and two-dimensional pivot tables and pivot charts to
perform analysis using Microsoft Excel and Power BI for any sample data, such as fruit sales data.
Pivot Table

Order id Product Category Amount Country

1 Carrot Veg 4 United States

2 Broccoli Veg 8 United States

3 Banana Fruit 1 United States

4 Banana Fruit 3 Canada

5 Beans Veg 3 Germany

6 Orange Fruit 4 United States

7 Broccoli Veg 7 Australia

8 Banana Fruit 2 New Zealand

9 Apple Fruit 2 France

10 Carrot Veg 3 United States

11 Broccoli Veg 3 Canada

12 Banana Fruit 3 Germany

13 Banana Fruit 3 United States

14 Beans Veg 3 Australia

15 Orange Fruit 3 New Zealand


16 Broccoli Veg 3 France

17 Banana Fruit 3 Canada

18 Apple Fruit 3 New Zealand

Select your entire table

Similarly, select your entire table
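The numbers a pivot table should show for this data can be worked out with two dictionaries. This is a minimal sketch, assuming the pivot sums Amount by Category (one-dimensional) and by Category and Country (two-dimensional); the screenshots may use different row and column fields.

```python
from collections import defaultdict

# (order id, product, category, amount, country) rows from the table above
orders = [
    (1, "Carrot", "Veg", 4, "United States"),
    (2, "Broccoli", "Veg", 8, "United States"),
    (3, "Banana", "Fruit", 1, "United States"),
    (4, "Banana", "Fruit", 3, "Canada"),
    (5, "Beans", "Veg", 3, "Germany"),
    (6, "Orange", "Fruit", 4, "United States"),
    (7, "Broccoli", "Veg", 7, "Australia"),
    (8, "Banana", "Fruit", 2, "New Zealand"),
    (9, "Apple", "Fruit", 2, "France"),
    (10, "Carrot", "Veg", 3, "United States"),
    (11, "Broccoli", "Veg", 3, "Canada"),
    (12, "Banana", "Fruit", 3, "Germany"),
    (13, "Banana", "Fruit", 3, "United States"),
    (14, "Beans", "Veg", 3, "Australia"),
    (15, "Orange", "Fruit", 3, "New Zealand"),
    (16, "Broccoli", "Veg", 3, "France"),
    (17, "Banana", "Fruit", 3, "Canada"),
    (18, "Apple", "Fruit", 3, "New Zealand"),
]

by_category = defaultdict(int)                                # one-dimensional pivot
by_category_country = defaultdict(lambda: defaultdict(int))   # two-dimensional pivot

for _order_id, _product, category, amount, country in orders:
    by_category[category] += amount
    by_category_country[category][country] += amount

print(dict(by_category))
for category, countries in by_category_country.items():
    print(category, dict(countries))
```

The one-dimensional result has one total per category, while the two-dimensional result adds Country as a second axis, which is what placing a second field in the Columns area does in Excel.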


Practical 4 - Import data from legacy data structures

Jan 11, 2024


Aim: Import legacy data from different sources (such as Excel, Web, XML, etc.) into Power BI.

Import the legacy data from different data sources (such as flat file, Excel, Web, XML, JSON,
OData, etc.) and perform the Extraction, Transformation, and Loading (ETL) process to load it into
Power BI.

Note: Use your own data for Flat File, Excel, Web, XML, JSON, OData

Feed for OData - https://services.odata.org/v2/northwind/northwind.svc/

1) Excel
Data in Excel

Open Power BI
Get Data → All → Excel

Click on Connect

Select the desired Excel (.xlsx) file from File Explorer

Click on Load

2) XML

Content of XML file


Open Power BI
Get Data → All → XML

Click on Connect


Click on Load

3) JSON

Content in JSON file


Open Power BI
Get Data → All → JSON

Click on Connect
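What Power BI does when it imports XML or JSON is to flatten the nested file into rows and columns. The same idea can be sketched with Python's standard library. The file contents below are hypothetical, since the real files appear only as screenshots.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical file contents; the real files are only shown as screenshots
json_text = '[{"id": 1, "name": "Asha"}, {"id": 2, "name": "Ravi"}]'
xml_text = """<students>
  <student><id>1</id><name>Asha</name></student>
  <student><id>2</id><name>Ravi</name></student>
</students>"""

# JSON: an array of objects maps directly to a list of row dictionaries
json_rows = json.loads(json_text)

# XML: every <student> element becomes a row, every child tag a column
xml_rows = [
    {child.tag: child.text for child in student}
    for student in ET.fromstring(xml_text)
]

print(json_rows)
print(xml_rows)
```

Note that the XML values arrive as text (`"1"`, not `1`), which is why a data type change is usually one of the first transformation steps after loading.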


Practical 5 - ETL and Data Modeling in Power BI

Jan 11, 2024


Aim: Perform Extract, Transformation, and Loading processes to construct the database in
Power BI.

Create a data model for the student database in Power BI and import data from an Excel
worksheet. Also, perform ETL and prepare data for result analysis.

Loading and Transformation


1. Load the Financial Sample data into Power BI
→ Open Power BI
→ Select Get data

→ Navigate to the desired Excel workbook (filename.xlsx) and select it

→ The data of the Excel sheets is displayed

1. Create the data model by relating all tables if possible

→ Both sheets contain the same data, so no relationship needs to be created

2. Open the financial data table for transformations.

3. Transform the ‘Units Sold’ column by changing its data type to Whole Number.
Select the Units Sold column


4. Transform Segment column data to uppercase


5. Shorten the column name from Month Name to just Month

6. The product Montana was discontinued last month, so exclude the data of the Montana
product from the table by deselecting the product in the column filter.

7. Each transformation has been added to the list under Query Settings in
Applied Steps.

8. Back on the Home tab, select Close & Apply


You get the final table after the data transformations
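The transformation steps above (change data type, uppercase, rename, filter) can be sketched as one small Python function. The rows below are hypothetical and only shaped like the columns named in the steps; they are not taken from the Financial Sample workbook.

```python
# Hypothetical rows shaped like the columns named in the steps above
raw_rows = [
    {"Segment": "Government", "Product": "Carretera",
     "Units Sold": "1618.0", "Month Name": "January"},
    {"Segment": "Midmarket", "Product": "Montana",
     "Units Sold": "888.0", "Month Name": "June"},
    {"Segment": "Enterprise", "Product": "Paseo",
     "Units Sold": "2470.0", "Month Name": "June"},
]


def transform(row):
    """Apply steps 3 to 5 to one row."""
    return {
        "Segment": row["Segment"].upper(),               # step 4: uppercase
        "Product": row["Product"],
        # step 3: whole number (Python's rounding rule, which may differ
        # from Power BI's for values ending in .5)
        "Units Sold": round(float(row["Units Sold"])),
        "Month": row["Month Name"],                      # step 5: rename column
    }


# Step 6: exclude the Montana product, then transform what is left
clean_rows = [transform(row) for row in raw_rows if row["Product"] != "Montana"]

for row in clean_rows:
    print(row)
```

Each line of `transform` corresponds to one entry that Power BI records under Applied Steps.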


Practical 6 - Creating Reports and Charts in Power BI

Jan 11, 2024


Demonstrate the report creation in Power BI for data of any subject.
(For ex: Financial Sample and Retail Data)

Report Creation
1. Add the report title “Executive Summary - Finance Report”
● On the Insert ribbon, select Text box to add a title to the report and type “Executive
Summary - Finance Report”.

● Select the text you typed. Set the font size to 20 and make it bold

● Resize the box to fit on one line.

2. To check profit by date, add a line chart to see which month and year had the highest
profit.

● From the Fields pane, drag the Profit field to a blank area on the report canvas. By
default, Power BI displays a column chart with one column, Profit.
After dragging the Profit column to the canvas

● Drag the Date field to the same visual. If you created a Calendar table with DAX
(see ‘Create a new table in data model’), drag the Date field from your Calendar table
instead.
Power BI updates the column chart to show profit by the two years.

After dragging the Date column of the Calendar table

After dragging the Date column of the same sheet to the same visual

3. Change the visualization type to Line chart (in the Visualizations pane, change the chart
type).

Select the column chart and click on Line chart in Visualizations

The chart is converted to a line chart

4. Create a map visual to check the highest-profit Country or Region.


● From the Fields pane, drag the Country field to a blank area on your report canvas
to create a map.
● Drag the Profit field to the map.
● Power BI creates a map visual with bubbles representing the relative profit of
each location. Compare the bubble size to identify the highest profit country.


5. Create a bar chart to check sales by product and segment and determine which companies
and segments to invest in.
● Drag the two charts you've created to be side by side in the top half of the canvas.
Save some room on the left side of the canvas.
● Select a blank area in the lower half of your report canvas.
● In the Fields pane, select the Sales, Product, and Segment fields.

Power BI automatically creates a clustered column chart.


● Drag the chart so it's wide enough to fill the space under the two upper charts.

6. Add a date slicer to the report to filter the data year-wise or month-wise.
● In the Fields pane, select the Date field in the Financials table. Drag it to the blank
area on the left of the canvas.
● In the Visualizations pane, choose Slicer. Power BI automatically creates a
numeric range slicer.
● You can drag the ends to filter, or select the arrow in the upper-right corner and
change it to a different type of slicer.

Date slicer using the DAX table


● In the Fields pane, select the Date field in the Calendar table. Drag it to the blank
area on the left of the canvas.
● In the Visualizations pane, choose Slicer.
● In the Fields section of the Visualizations pane, select the drop-down in Fields.
Remove Quarter and Day so only Year and Month are left.


Practical 7 - Creating Dashboards in Power BI

Jan 11, 2024


Demonstrate the creation of a dashboard in Power BI for data on any subject.

Format the report and charts created in Practical 6

Use View → change the theme to Executive.

Then edit the chart and report properties from Format visual


Final Dashboard


Practical 8 - DAX Queries in Power BI

Jan 11, 2024


Execute the following DAX queries on food sales data.
1. Create a measure to display the quarter-wise sum of sales amount
Previous Quarter Sales = CALCULATE(SUM(Sales[SalesAmount]),
PREVIOUSQUARTER(Calendar[DateKey]))
2. Create a measure to display the Minimum sales amount (Sales Amount)
3. Create a measure to display the maximum sales amount (Sales Amount)
4. Display the maximum product cost of the product.
5. Display a measure to count distinct brand names.
6. Create a measure to display the total of blank values for product manufacturers.
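Tasks 2 to 6 ask for simple aggregations, and the expected answers can be cross-checked in Python before the measures are written. In DAX these would usually map to MIN, MAX, DISTINCTCOUNT, and COUNTBLANK; that mapping is my reading of the tasks, not something the journal states. The rows and column names below are hypothetical.

```python
# Hypothetical food sales rows; column names are assumptions for illustration
sales = [
    {"SalesAmount": 120.0, "ProductCost": 40.0,
     "BrandName": "Contoso", "Manufacturer": "Contoso Ltd"},
    {"SalesAmount": 75.5, "ProductCost": 55.0,
     "BrandName": "Fabrikam", "Manufacturer": None},
    {"SalesAmount": 310.0, "ProductCost": 20.0,
     "BrandName": "Contoso", "Manufacturer": "Contoso Ltd"},
    {"SalesAmount": 42.0, "ProductCost": 61.5,
     "BrandName": "Litware", "Manufacturer": "Litware Inc"},
]

min_sales = min(row["SalesAmount"] for row in sales)          # task 2
max_sales = max(row["SalesAmount"] for row in sales)          # task 3
max_cost = max(row["ProductCost"] for row in sales)           # task 4
distinct_brands = len({row["BrandName"] for row in sales})    # task 5
blank_manufacturers = sum(                                    # task 6
    1 for row in sales if row["Manufacturer"] is None
)

print(min_sales, max_sales, max_cost, distinct_brands, blank_manufacturers)
```

A measure that returns a different number on the same rows points to a filter context issue rather than a data issue.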


Practical: Creation of Power BI data model and a basic report


Jan 11, 2024


Writing DAX Expressions

1. Create a new measure named Total Units Sold to add up all the numbers in the Units Sold
column.
● On the Home ribbon, select New Measure.
● Type the following expression.
Total Units Sold = SUM(financials[Units Sold])
● Select the check mark beside the expression box to commit the expression


Create a new table in data model


1. Create a new table named ‘Calendar’ to generate a calendar table of all dates between
January 1, 2013, and December 31, 2014.

● Select the Data view on the left.

● On the Home ribbon, select New Table.
● Type the following expression in the expression box.
Calendar = CALENDAR(DATE(2013,01,01),DATE(2014,12,31))
● Select the check mark to commit.
● Now select the Model view on the left
● Update the data model to link the Calendar table to the Financial Sample table
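CALENDAR returns one row for every date from the start date to the end date, both included. A small Python sketch of the same table is useful for checking the expected row count; the function name `calendar_table` is my own.

```python
from datetime import date, timedelta


def calendar_table(start, end):
    """All dates from start to end inclusive, like CALENDAR(start, end)."""
    number_of_days = (end - start).days + 1
    return [start + timedelta(days=offset) for offset in range(number_of_days)]


dates = calendar_table(date(2013, 1, 1), date(2014, 12, 31))

print(len(dates), dates[0], dates[-1])
```

Neither 2013 nor 2014 is a leap year, so the table should contain 365 + 365 rows.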



Practical 9 - Implementation of Regression Algorithms using Python

Jan 30, 2024


Simple Linear Regression
Problem Statement - Given the dataset sal2.csv that has two variables: salary (dependent variable)
and experience (independent variable), solve
the following queries:
1. Find out if there is any correlation between two variables: salary and experience.
2. Find the best fit line for the dataset.
3. Demonstrate how the dependent variable is changing by changing the independent variable.

Multiple Linear Regression


Problem Statement - Given a dataset complist.csv of 50 start-up companies with five main
attributes: R&D Spend, Administration Spend, Marketing Spend, State, and Profit for a financial
year, create a model that can easily determine which company has the maximum profit, and which
factor affects the profit of a company the most.

Link to Google Colab Notebook


https://colab.research.google.com/drive/1ydGtsNBCSsVHKDkk4rEMX8lMhnMWJ-yw#scrollTo=DqRDVBd4SewG&uniqifier=


Checking Sample_data Folder

Basics

Working Directory


Variables

List

Functions


NUMPY - Library


Simple Linear Regression


MATPLOTLIB - Library


Colab file link for Simple and Multiple Linear Regression

https://colab.research.google.com/drive/1JtOgJ_Q0OAKWAYHDDcbTC6UB6huBAcj-#scrollTo=mSg2w7gf7SBe

Simple Linear Regression

Link to CSV File


https://drive.google.com/file/d/1YAsH2BpruK2lbagd0LT_ONbR3-2ap_fq/view?usp=drive_link

Given the dataset Salary_Data.csv that has two variables: salary (dependent variable) and
experience (Independent variable) solve the following queries

1) Find out if there is any correlation between two variables: salary and experience.
2) Find the best fit line for the dataset.
3) Demonstrate how the dependent variable is changing by changing the independent
variable.


Combined Code
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Load the dataset
data = pd.read_csv('/content/sample_data/Salary_Data.csv')

# Display the dataset
print(data)

# Get summary statistics of the dataset
data_description = data.describe()
print(data_description)

# Extract the feature and the target variable
x = data['YearsExperience']
y = data['Salary']

# Query 1: correlation between experience and salary
print(f"Correlation: {x.corr(y)}")

# Visualize the dataset
plt.scatter(x, y)
plt.xlabel('Years of Experience', fontsize=12)
plt.ylabel('Salary', fontsize=12)
plt.title('Scatter Plot of Salary vs Years of Experience', fontsize=14)
plt.show()

# Reshape the feature into a 2-D array, as scikit-learn expects
x = np.array(x).reshape(-1, 1)

# Query 2: initialize and fit the linear regression model (best fit line)
model = LinearRegression()
model.fit(x, y)

# Get the coefficient, intercept, and R-squared score of the model
coef = model.coef_
intercept = model.intercept_
score = model.score(x, y)
print(f"Coefficient: {coef}\nIntercept: {intercept}\nR-squared: {score}")

# Query 3: predicted salary for each value of experience
y_predict = model.predict(x)

# Visualize the regression line
plt.scatter(x, y)
plt.plot(x, y_predict, color='red')
plt.xlabel('Years of Experience', fontsize=12)
plt.ylabel('Salary', fontsize=12)
plt.title('Linear Regression Fit', fontsize=14)
equation = f"y = {intercept:.2f} + {coef[0]:.2f}x\nR-squared: {score:.2f}"
# Position the label in axes-relative coordinates so it stays inside the plot
plt.text(0.05, 0.85, equation, fontsize=12, transform=plt.gca().transAxes)
plt.show()
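The numbers behind queries 1 and 2 can also be worked out by hand, which helps in understanding what the library reports. The sketch below computes the Pearson correlation and the least-squares line with plain Python. The five data points are hypothetical and are not taken from Salary_Data.csv.

```python
from math import sqrt

# Hypothetical (years of experience, salary) pairs for illustration only
experience = [1.0, 2.0, 3.0, 4.0, 5.0]
salary = [40000.0, 45000.0, 52000.0, 56000.0, 62000.0]


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Correlation coefficient: covariance scaled by both spreads."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares slope and intercept of the best fit line."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


r = pearson_r(experience, salary)
slope, intercept = fit_line(experience, salary)

print(f"r = {r:.4f}")
print(f"salary = {intercept:.2f} + {slope:.2f} * experience")
```

The slope answers query 3 directly: it is the change in predicted salary for one additional year of experience.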


Multiple Linear Regression

Given a dataset complist.csv of 50 start-up companies with five main attributes: R&D Spend,
Administration Spend, Marketing Spend, State, and Profit for a financial year, create a model
that can easily determine which company has the maximum profit, and which factor affects the
profit of a company the most.

Link to CSV File


https://drive.google.com/file/d/1WG__uKMZLiNTDFr-V4iwg0iSmN53CneJ/view?usp=drive_link
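Multiple linear regression fits one coefficient per attribute plus an intercept. The sketch below solves the normal equations with plain Python so that the arithmetic is visible; in the notebook, scikit-learn's LinearRegression would be the usual choice. All numbers are hypothetical and not taken from complist.csv, and the categorical State column is left out because it would first need to be encoded as numbers.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit(features, target):
    """Least squares via the normal equations (X^T X) w = X^T y."""
    rows = [[1.0] + list(f) for f in features]  # leading 1.0 is the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical (R&D, Administration, Marketing) spends
spends = [(10.0, 5.0, 7.0), (20.0, 3.0, 2.0), (30.0, 8.0, 6.0),
          (40.0, 1.0, 9.0), (50.0, 9.0, 1.0), (60.0, 4.0, 5.0)]

# Profit generated from known weights so that the fit can be checked
true_weights = [100.0, 0.8, 0.1, 0.3]  # intercept, R&D, admin, marketing
profit = [true_weights[0] + sum(w * s for w, s in zip(true_weights[1:], row))
          for row in spends]

weights = fit(spends, profit)
print([round(w, 4) for w in weights])
```

Comparing raw coefficients only indicates the most affecting factor when the attributes are on a similar scale; otherwise the columns should be standardized first.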


Practical 10 - Implementation of Classification Algorithms using Python

Feb 8, 2024
Problem Statement: Classify the iris species by using the following algorithms.
(Given Dataset: iris.csv)
a. Logistic Regression
b. K-Nearest Neighbors (KNN)
c. Naive Bayes
d. Decision Tree
e. Support Vector Machine
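To make one of the listed algorithms concrete, here is a minimal K-Nearest Neighbors sketch in plain Python. It is a teaching sketch and not the notebook's code. The six training rows are typed in by hand in the usual iris layout (sepal length, sepal width, petal length, petal width); the exact column order and label spelling in iris.csv are assumptions.

```python
from collections import Counter
from math import dist

# A few hand-typed rows in iris layout; labels may be spelled differently
# in iris.csv
training = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"),
    ((5.8, 2.7, 5.1, 1.9), "virginica"),
]


def knn_predict(sample, k=3):
    """Label a sample by majority vote among its k nearest training rows."""
    nearest = sorted(training, key=lambda row: dist(row[0], sample))[:k]
    votes = Counter(label for _features, label in nearest)
    return votes.most_common(1)[0][0]


print(knn_predict((5.0, 3.4, 1.5, 0.2)))
print(knn_predict((6.5, 3.0, 5.5, 2.0)))
```

With the full dataset, the choice of `k` and a train/test split decide how well the classifier generalizes.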

Colab file link
https://colab.research.google.com/drive/163Tos5ap05SpnBa-5MzC5bC_aY8-QZ9q#scrollTo=GZKGK6_j-NL3

Link to CSV file (iris.csv)


https://drive.google.com/file/d/136rOQftInylEGN6gLsVIST7dDkPUkHGN/view?usp=drive_link


Practical 11 - Implementation of Clustering Algorithms using Python

Feb 15, 2024


a. K-means Clustering Algorithm
Given a dataset of Mall_Customers, which is the data of customers who visit the mall and spend
there, implement the k-means clustering algorithm.

b. Agglomerative Clustering
Implement an agglomerative hierarchical clustering algorithm using Python for the Mall_Customers
dataset. The dataset contains information about customers who have visited a mall for shopping.
The mall owner wants to find patterns or particular behaviors of his customers
using the dataset information. (Dataset: Mall_Customers.csv)
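The k-means loop itself is short: assign every point to its nearest centroid, move each centroid to the mean of its points, and repeat until nothing changes. A minimal sketch in plain Python follows. The six points are hypothetical two-feature customers and not rows of Mall_Customers.csv, and starting from the first k points is a simplification (library implementations normally pick the starting centroids more carefully).

```python
from math import dist

# Hypothetical two-feature customers; not taken from Mall_Customers.csv
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]


def kmeans(points, k, max_iter=100):
    centroids = list(points[:k])  # simple deterministic start: first k points
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # converged: centroids stopped moving
            break
        centroids = new_centroids
    return labels, centroids


labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

Even though both starting centroids lie in the same group here, the update step pulls one of them toward the far group within a couple of iterations.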

Link to Colab File

https://colab.research.google.com/drive/1PApHnDRIByUB0nxO1V7Hcjt5zvzHzvtg#scrollTo=CA4MnPwvAiuf
