6433 Bi
Affiliated to
UNIVERSITY OF MUMBAI
This is to certify that Ms. Khan Ariba Mohammed Hakim, Roll No.
6433 of the TY BSc IT class, has completed the required number of
experiments of the Business Intelligence practical, in partial
fulfilment of the requirements for the award of the degree of
Bachelor of Science (Information Technology) during the academic
year 2023-2024.
3 Creating Pivot Table and Pivot Chart using Excel and Power BI 04/01/24
Practical No : 1
Demonstrate the use of various features of Excel that are required to do the data analysis.
Example: Result Analysis
Open Excel
FORMULAE:
Formula to calculate values of Total = SUM(C3:F3)
Formula to calculate values of Percentage = G3/400*100
Formula to calculate values of Rounded % = ROUNDUP(H3,0)
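The three worksheet formulas can be cross-checked outside Excel. The sketch below is a minimal Python equivalent, assuming four subjects marked out of 100 each (which matches the divisor 400). The marks are hypothetical, and math.ceil stands in for ROUNDUP(H3,0), which gives the same result only for non-negative values.

```python
import math

# Hypothetical marks for one student in four subjects (cells C3:F3), each out of 100.
marks = [78, 65, 91, 55]

total = sum(marks)                # =SUM(C3:F3)
percentage = total / 400 * 100    # =G3/400*100
rounded = math.ceil(percentage)   # =ROUNDUP(H3,0), valid for non-negative values

print(total, percentage, rounded)
```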
Data Bars : This will add the coloured data bar to represent the value.
Select the values of the Total column. On the Home tab, choose Conditional Formatting, then Data Bars, and pick a colour.
Do the same for the Percentage column, but select Pie chart instead of Stacked Bar.
Select the values from Sub1 Marks to Sub2 Marks to represent them in a line chart.
Practical No : 2
Demonstrate what-if analysis using goal seek, scenario manager and data table for your own data.
Select the CGPA value. On the Data tab, choose What-If Analysis, then Goal Seek.
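Goal Seek works backwards from a desired result to the input that produces it. The sketch below illustrates the idea with a simple bisection search. The CGPA model (a mean of four semester grade points), the known values and the function names are all hypothetical, and bisection is used only for illustration, not as a description of how Excel searches internally.

```python
def goal_seek(func, target, low, high, tol=1e-6, max_iter=100):
    """Find x in [low, high] with func(x) close to target, assuming func is increasing."""
    for _ in range(max_iter):
        mid = (low + high) / 2
        value = func(mid)
        if abs(value - target) < tol:
            return mid
        if value < target:
            low = mid      # result too small, so search the upper half
        else:
            high = mid     # result too large, so search the lower half
    return (low + high) / 2

# Hypothetical grade points for three completed semesters.
known = [7.2, 8.0, 7.6]

def cgpa(fourth):
    # The fourth semester plays the role of the changing cell.
    return (sum(known) + fourth) / 4

needed = goal_seek(cgpa, target=8.0, low=0.0, high=10.0)
print(round(needed, 2))
```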
c. Academic year
2. Scenario Manager : This will create different groups of values or scenarios and switch between them.
a. Room and Amount
Click Add
Now add a scenario and select the value cell of No. of customers in the Changing cells textbox
b. Grain:
Enter the below data
Add a scenario on the value of Total named Monday, in which the rate of wheat is changed from 50 to 52
Add a second scenario named Tuesday, in which the rate of rice is changed from 10 to 76
Add a second scenario named Tuesday with the value of Rate /kg of Rice
Add the third scenario named Wednesday with the changing cell as the value of Rate /kg of Rice
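The scenario idea can be sketched in Python as a set of named overrides applied to a base worksheet. Only the wheat rates (50 and 52) and the Tuesday rice rate (76) come from the steps above. The base rice rate, the quantities and the helper name total are placeholders chosen for illustration.

```python
# Base worksheet: rate per kg and quantity in kg for each grain.
# The base rice rate and both quantities are placeholder values.
base_rates = {"Wheat": 50, "Rice": 70}
quantities = {"Wheat": 10, "Rice": 5}

# Each scenario lists only its changing cells, as in Scenario Manager.
scenarios = {
    "Monday": {"Wheat": 52},
    "Tuesday": {"Rice": 76},
}

def total(rates):
    # Total = sum of rate * quantity over all grains.
    return sum(rates[grain] * quantities[grain] for grain in rates)

summary = {"Base": total(base_rates)}
for name, changes in scenarios.items():
    # Overlay the scenario's changing cells on the base rates.
    summary[name] = total({**base_rates, **changes})

for name, value in summary.items():
    print(f"{name}: {value}")
```

The summary dictionary plays the role of the Scenario Summary report: one result per named scenario.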
3. Data Table : Displays the results of multiple inputs at the same time
a. Stationery:
Enter the below data with the formula for total profit = B4*C4+B5*C5
Make another table to find the total highest price as shown below.
Select the value cells
Go to the Data tab, now select the Data Table from What-If Analysis.
Enter the highest value of % sold as column input cell in the Data Table window.
Click OK
Now make a new table whose first Balance value equals the Balance value from the original table
Apply Data Table to it as well and add column input cell as defined
Make a table
Apply Data Table, add both row and column input cell value as defined
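A data table re-evaluates one formula for a list of substitute inputs. The sketch below mimics a one-variable and a two-variable data table for the total profit formula =B4*C4+B5*C5. The cell values and the candidate inputs are placeholders, and treating C4 and C5 as the input cells is an assumption made for illustration.

```python
def total_profit(b4, c4, b5, c5):
    # Mirrors the worksheet formula =B4*C4+B5*C5.
    return b4 * c4 + b5 * c5

# Placeholder worksheet values.
b4, c4, b5, c5 = 100, 5, 50, 8

# One-variable table: substitute each candidate into C4 (the column input cell).
one_var = {c: total_profit(b4, c, b5, c5) for c in (4, 5, 6)}

# Two-variable table: the first key varies C4, the second varies C5.
two_var = {(r, c): total_profit(b4, r, b5, c)
           for r in (4, 5, 6) for c in (7, 8, 9)}

print(one_var)
print(two_var[(6, 9)])
```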
Practical No : 3
Creating Pivot Table and Pivot Chart using Excel and Power BI
Demonstrate the creation of one dimensional and two dimensional pivot table and pivot chart to perform
analysis using Microsoft Excel and PowerBI for any sample data like fruits sale data.
a. Microsoft Excel
Sale data:
Enter the below data
Select the whole data range. On the Insert tab, choose PivotTable, then From Table/Range
Click OK
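For comparison, the same one-dimensional and two-dimensional summaries can be sketched with pandas. The fruit sales rows and the column names Fruit, Region and Amount are hypothetical, since the original data sheet is not reproduced here.

```python
import pandas as pd

# Hypothetical fruit sales data.
sales = pd.DataFrame({
    "Fruit":  ["Apple", "Apple", "Banana", "Banana", "Mango", "Mango"],
    "Region": ["East", "West", "East", "West", "East", "West"],
    "Amount": [120, 80, 60, 90, 150, 110],
})

# One-dimensional pivot: total amount per fruit.
one_dim = sales.pivot_table(index="Fruit", values="Amount", aggfunc="sum")

# Two-dimensional pivot: fruits as rows, regions as columns.
two_dim = sales.pivot_table(index="Fruit", columns="Region",
                            values="Amount", aggfunc="sum")

print(one_dim)
print(two_dim)
```

With matplotlib installed, two_dim.plot(kind="bar") would draw a chart comparable to a pivot chart.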
b. Power BI
Practical No : 4
Import legacy data from different sources (Excel, Web, XML, etc.) into Power BI.
Import legacy data from different data sources (flat file, Excel, Web, XML, JSON, OData, etc.) and
perform the Extraction, Transformation and Loading (ETL) process to load it into Power BI.
Note: Use your own data for flat file, Excel, Web, XML, JSON and OData
File for oData - https://services.odata.org/v2/northwind/northwind.svc/
2. CSV file
Download iris.csv file from Google. Select File type Text/CSV from the Get Data window.
3. Excel file
Load Stud_info.xlsx :
4. XML document
Select XML
5. OData
File for oData - https://services.odata.org/v2/northwind/northwind.svc/
6. Web
7. JSON
Practical No : 5
Perform the Extract, Transform, Load (ETL) process to construct the database in Power BI.
Create a data model for the student database in Power BI and import data from an Excel worksheet. Also perform
ETL and prepare the data for result analysis.
ETL on Stud_info.xlsx :
Load Stud_info.xlsx :
First we need to change the format of student names. For that select Capitalize Each Word from Format of
Transform tab.
Before : After :
Also we need to change the format of the students department. Similar steps need to be followed, but instead
of Capitalize Each Word we need to select UPPERCASE from the Format of Transform tab.
Before : After :
Now we need to change the Roll Nos from 1-5 to 6433-6437. For that, select the first value of the Roll No column
and choose Replace Values on the Transform tab. Repeat for the remaining values.
Before : After :
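The three transformations above can be sketched in pandas as follows. The rows stand in for Stud_info.xlsx, and the column names and student values are assumed for illustration. Only the Roll No mapping (1-5 to 6433-6437) is taken from the steps above.

```python
import pandas as pd

# Placeholder rows standing in for Stud_info.xlsx; column names are assumed.
students = pd.DataFrame({
    "Roll No": [1, 2, 3, 4, 5],
    "Name": ["asha patil", "JOHN DOE", "mary ann", "ravi kumar", "sara ali"],
    "Department": ["it", "cs", "It", "cs", "it"],
})

# Capitalize Each Word
students["Name"] = students["Name"].str.title()

# UPPERCASE
students["Department"] = students["Department"].str.upper()

# Replace Values: map each old roll number to its new value.
roll_map = dict(zip(range(1, 6), range(6433, 6438)))
students["Roll No"] = students["Roll No"].replace(roll_map)

print(students)
```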
ETL on iris.csv
2) Data groups
4) Split Column
2) Reverse
4) Replace value
Practical No : 6
Demonstrate the report creation in Power BI for data of any subject. (For ex: Financial Sample and Retail
Data)
1. Perform
Q.2. Create the Data Model by relating all the tables if possible
Q.4. Transform the ‘Units Sold’ column by changing its data type to Whole Number
Before: After:
After:
Q.6. Shorten the column name from month name to just month
Before:
After:
Q.7. The product Montana did not continue last month. So exclude the data of the Montana product from the
table by deselecting the product from the column filter.
Before:
After:
Q.8. Observe that each data transformation has been added to the Applied Steps list in Query Settings.
Q. Create a new measure named Total Units Sold to add all the numbers in the Units Sold column.
Go to Home tab → Select a bar chart from Visualizations → Click Total Units Sold from Fields
Q. Create a new table named ‘Calendar’ to generate a Calendar table of all dates between January 1 2013, and
December 31, 2014.
Update data model to link Finances table Date to Date Table Date.
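In Power BI the Calendar table is typically generated with the DAX CALENDAR function. As a cross-check of what that table should contain, the sketch below builds the same date range in pandas. The extra Year and Month columns are illustrative additions, not part of the exercise.

```python
import pandas as pd

# One row per day from January 1, 2013 to December 31, 2014 (inclusive).
calendar = pd.DataFrame(
    {"Date": pd.date_range("2013-01-01", "2014-12-31", freq="D")}
)

# Illustrative helper columns for grouping by year or month.
calendar["Year"] = calendar["Date"].dt.year
calendar["Month"] = calendar["Date"].dt.month

print(len(calendar))
print(calendar.head())
```

Neither 2013 nor 2014 is a leap year, so the table holds 730 rows.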
On the Home ribbon, select the Text Box to add title to the report and type “Executive Summary - Finance
Report”.
Select the text you typed. Set the Font Size to 20 and Bold. Resize the box to fit on one line.
Q. To check Profit by Date, add a line chart to see which month and year had the highest profit.
From the Fields pane, drag the Profit field to a blank area on the report canvas. By default, Power BI displays
a column chart with one column, Profit. Drag the Date field to the same visual. If you created the Calendar
table earlier, drag the Date field from your Calendar table instead.
Power BI updates the column chart to show profit by the two years.
In the Fields section of the Visualizations pane, select the drop-down in the X-axis value. Change Date from
Date Hierarchy to Date.
Power BI updates the column chart to show profit for each month.
Q. Change the visualization type to Line chart. (In the Visualizations pane change the chart type)
From the Fields pane, drag the Country field to a blank area on your report canvas to create a map.
Drag the Profit field to the map.
Power BI creates a map visual with bubbles representing the relative profit of each location.
Compare the bubble size to identify the highest profit country.
Q. Create a bar chart to check sales by product and segment and determine which companies and segments to
invest in.
Drag the two charts you’ve created to be side by side in the top half of the canvas.
Save some room on the left side of the canvas.
Select a blank area in the lower half of your report canvas.
In the Fields pane, select the Sales, Product, and Segment fields.
Power BI automatically creates a clustered column chart.
Drag the chart so it’s wide enough to fill the space under the two upper charts.
Q. Add a date slicer to the report to filter the data by year or by month.
In the Fields pane, select the Date field in the Financials table. Drag it to the blank area on the left of the
canvas.
In the Visualizations pane, choose Slicer. Power BI automatically creates a numeric range slicer.
You can drag the ends to filter, or select the arrow in the upper-right corner and change it to a different type of
slicer.
Expand each year and resize the visual, so all months are visible.
Practical No : 7
Open Power BI
Under Visualization → Format your report page → Canvas Background → Change the background color.
Now create a new Table in which we will execute all our measures that will be used further in the dashboard.
Under Home Section → Enter Data.
As you can see the new Table with one Column1 is created under Data.
Under New Measure, write the DAX expression shown below. It uses the DAX COUNTROWS function to
count the rows in the table.
Total Customer = COUNTROWS('6433_Mall_Customers')
Create further measures in the same way, writing a DAX expression for each value the dashboard needs.
Check the Total Customer checkbox, a stacked column chart will be displayed as below
Format the visual, place it in the first square box, and add a text label to it
Drag and drop the Age column in X-axis and Male and Female measures in Y-axis
By clicking on a bar, we can find the total no. of customers, no. of male customers and no. of female
customers
Now rename the first page as Home, then add a new page
Copy the above two rounded rectangles from the Home Dashboard and paste them in the Spending and Income
Dashboard. Also select a clustered column chart as we did on the Home page. Select the Spending Score(1-100)
column for the X-axis and Max of Annual Income (k$) for the Y-axis. Format the visual as needed.
For navigating between the dashboards, we will now add a Blank Button from Insert tab
Enable the Action option, select Page Navigation for Type and None (because we are adding this button in the
Spending and Income Dashboard itself) in Destination.
Likewise, make another button in the Spending and Income Dashboard, set its text to Home, set its action
type to Page Navigation and select Home in Destination
Do the same in the Home Dashboard, taking care to select the correct Destination for each button
Practical No : 8
Click on Commit
Go to Report view
We can also display the minimum value using a card visualisation tool.
6. Create a measure to display the total of blank values for product manufacturers.
Blank values = CALCULATE(COUNTROWS('Product'), ISBLANK('Product'[Manufacturer]))
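The same count can be cross-checked in pandas. The sketch below uses a small hypothetical Product table in which missing manufacturers are stored as None. The rows are placeholders, not the practical's data.

```python
import pandas as pd

# Hypothetical Product table with two missing manufacturers.
product = pd.DataFrame({
    "Product": ["P1", "P2", "P3", "P4"],
    "Manufacturer": ["Acme", None, "Globex", None],
})

# Count the rows whose Manufacturer is blank (missing).
blank_values = int(product["Manufacturer"].isna().sum())
print(blank_values)
```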
Practical No : 9 (30/01/24)
Create a BI folder in your drive. Now right click and select Google Colaboratory option
Rename it as 6433_DataAnalysisUsingPython
We can also execute Linux commands; each command must be preceded by an exclamation mark (!)
We can also use file icon to check the directories and files
Right click on sample_data directory and select upload option to upload a .csv file. Double click on the .csv
file to see its data.
Simple practice:
1. int variable
2. String variable
5. array
1D array with multiple values:
2D array
Row,column
3D array:
Array,row,column
reshape() of 1D array
matmul() :
dot() :
add()
add function
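The array operations listed above can be sketched with NumPy as follows. The values are arbitrary examples, and the variable names are chosen for illustration.

```python
import numpy as np

a1 = np.array([1, 2, 3, 4, 5, 6])                     # 1D array with multiple values
a2 = np.array([[1, 2, 3], [4, 5, 6]])                 # 2D array, shape (rows, columns)
a3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])   # 3D array, shape (arrays, rows, columns)

reshaped = a1.reshape(2, 3)    # reshape() of a 1D array into 2 rows and 3 columns

m = np.array([[1, 2], [3, 4]])
n = np.array([[5, 6], [7, 8]])

product = np.matmul(m, n)      # matrix product
dot = np.dot(m, n)             # same result as matmul for 2D inputs
total = np.add(m, n)           # element-wise sum

print(a2.shape, a3.shape)
print(reshaped)
print(product)
print(total)
```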
Importing libraries
Extract the independent and dependent variables from the dataset. For x, the column slice ends at -1 because
we want to drop the last column of the dataset. For y, we pass the index 1 because we want to extract the
second column, and indexing starts from zero.
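A minimal sketch of this extraction, assuming pandas iloc indexing and a two-column salary dataset. The column names and values are placeholders.

```python
import pandas as pd

# Placeholder dataset: experience in years and salary.
data = pd.DataFrame({
    "YearsExperience": [1.1, 2.0, 3.2, 4.5],
    "Salary": [39000, 43500, 54400, 61100],
})

x = data.iloc[:, :-1].values   # every column except the last, as a 2D array
y = data.iloc[:, 1].values     # second column (index 1), as a 1D array

print(x.shape, y.shape)
```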
Values of x and y :
1. Find out if there is any correlation between two variables: salary and experience.
3. Demonstrate how the dependent variable is changing by changing the independent variable.
y = mx + c [where m is coefficient or slope and c is intercept]
Method 1:
Method 2 :
Example 1
The fit() method takes the training data as arguments and trains a machine learning model on the given dataset.
intercept_ is the value at which the regression line crosses the y-axis.
predict() is used to make predictions on new data, based on a trained model. It accepts one argument, the new
data X_new, and returns the learned label for each object in the array.
Pyplot is a submodule of Matplotlib, a Python library used to create data visualisations. Pyplot provides a
number of functions for creating and manipulating plots
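The calls described above can be sketched as follows, assuming scikit-learn's LinearRegression, which the names fit(), intercept_ and predict() suggest. The data points are placeholders that lie exactly on y = 2x + 1, so the fitted slope m and intercept c are easy to verify.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Placeholder data lying exactly on y = 2x + 1.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

model = LinearRegression()
model.fit(X, y)                 # train on the data

m = model.coef_[0]              # slope
c = model.intercept_            # intercept, where the line crosses the y-axis
pred = model.predict(np.array([[5.0]]))   # prediction for new data

print(round(m, 2), round(c, 2), pred.round(2))
```

With matplotlib installed, pyplot's scatter() and plot() functions could then draw the data points and the fitted line.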
Example 2
Example 3
Problem Statement - Given a dataset complist.csv of 50 start-up companies with five main attributes: R&D
Spend, Administration Spend, Marketing Spend, State, and Profit for a financial year, create a model that can
determine which company has the maximum profit and which factor affects a company's profit the
most.
Practical No : 10
Problem Statement: Classify the iris species by using the following algorithms. (Given Dataset: iris.csv)
a. Logistic Regression :
Logistic regression is a linear model that can be used to model the probability of a binary outcome. It uses a
sigmoid function to map the linear combination of the input features to a value between 0 and 1, which can be
interpreted as the likelihood of the positive class.
c. Naive Bayes :
Naive Bayes is a probabilistic algorithm that can be used to classify an observation based on Bayes’
theorem and the assumption of conditional independence among the input features. It calculates the posterior
probability of each class given the observation and chooses the class with the highest probability.
d. Decision Tree :
Decision tree is a hierarchical algorithm that can be used to classify an observation based on a series of rules
derived from the input features. It splits the data into subsets based on the best feature and threshold at each
node, until a leaf node is reached that contains only one class or a predefined minimum number of samples.
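The three classifiers described above can be compared with a short scikit-learn sketch. It loads scikit-learn's bundled copy of the iris data instead of iris.csv, and the 70/30 split, random_state and max_iter values are assumptions made for illustration. Because iris has three species, the logistic regression here is fitted as a multiclass model rather than a binary one.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Bundled iris data: 150 samples, 4 features, 3 species.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)                   # train on the training split
    scores[name] = model.score(X_test, y_test)    # accuracy on the test split
    print(name, round(scores[name], 3))
```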
Practical No : 11
a. K-Means Clustering
Given a dataset of Mall_Customers, which is the data of customers who visit the mall and spend there,
implement the k-means clustering algorithm.
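A minimal k-means sketch with scikit-learn follows. Instead of reading Mall_Customers.csv, it generates synthetic points that stand in for the Annual Income (k$) and Spending Score (1-100) columns. The three cluster centres, the choice of k = 3 and the random seeds are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in for two columns: annual income (k$) and spending score (1-100).
centers = np.array([[20, 80], [60, 50], [100, 20]])
points = np.vstack([c + rng.normal(scale=3.0, size=(30, 2)) for c in centers])

# Fit k-means with k = 3 and assign each point to a cluster.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(points)

print(np.bincount(labels))
print(kmeans.cluster_centers_.round(1))
```

On the real dataset the number of clusters would normally be chosen first, for example with the elbow method on kmeans.inertia_.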
b. Agglomerative Clustering
Implement the agglomerative hierarchical clustering algorithm using Python for the Mall_Customers dataset. The
dataset contains information about customers who have visited a mall for shopping. The mall owner wants
to find patterns or particular behaviour among the customers using the dataset information. (Dataset:
Mall_Customers.csv)
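A minimal agglomerative clustering sketch with scikit-learn follows. As a stand-in for Mall_Customers.csv it uses synthetic income and spending score points. The cluster centres, the choice of three clusters and the Ward linkage are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(1)

# Synthetic stand-in for annual income (k$) and spending score (1-100).
centers = np.array([[20, 80], [60, 50], [100, 20]])
points = np.vstack([c + rng.normal(scale=3.0, size=(30, 2)) for c in centers])

# Merge points bottom-up until three clusters remain (Ward linkage).
model = AgglomerativeClustering(n_clusters=3, linkage="ward")
labels = model.fit_predict(points)

print(np.bincount(labels))
```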