Ba Zc420 Ec-3r First Sem 2023-2024

Q 1. Set A.

With reference to the following figure, what would you do if some Countries/Provinces (any geographical entity) are
missing and displaying a null when you use map view? (5 Marks) OR
With reference to the following figure, what is the problem that the visualization is facing? How would you solve it?

When some geographical entities are missing and displaying as null in a map view, you have a couple of options to
handle this situation:
Filter Data: Exclude the null values from the view and calculations. This will remove the entities with missing data
from the map, providing a cleaner visualization.
Show at Default Position: Display the entities with null values at a default position on the axis. This allows you to
retain the data points but indicates that their exact locations are unknown.
Choose the option that best suits your needs for data analysis and visualization. If filtering out the data, ensure it does
not significantly impact the overall analysis. If showing at a default position, consider how to clearly communicate
this in your report or presentation to avoid confusion.

The visualization is facing an issue with null values for the “Longitude (generated)” data. To address this problem,
you have two options:
Filter Data: This will exclude the null values from the view and calculations, resulting in a map that only displays
entities with complete data.
Show at Default Position: This option will show the entities with null values at a predetermined default position on
the map.
To solve the issue, filter the data if the null values are not significant, or show them at a default position if they
are essential for the analysis but lack specific location details. State your choice clearly when presenting the
data.
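
The two options can also be pictured outside Tableau. The following Python sketch is only an illustration: the record layout, the helper names `filter_nulls` and `show_at_default`, and the default position of (0.0, 0.0) are assumptions made for the example, not a description of Tableau's internals.

```python
# Hypothetical records: one entity has no generated coordinates.
places = [
    {"name": "Ontario", "latitude": 51.25, "longitude": -85.32},
    {"name": "Quebec", "latitude": 52.94, "longitude": -73.55},
    {"name": "Unknown Province", "latitude": None, "longitude": None},
]


def has_location(place):
    """True when both coordinates are present."""
    return place["latitude"] is not None and place["longitude"] is not None


def filter_nulls(rows):
    """Option 1: exclude entities with null coordinates from the view."""
    return [row for row in rows if has_location(row)]


def show_at_default(rows, default=(0.0, 0.0)):
    """Option 2: keep every entity, placing null ones at a default position."""
    placed = []
    for row in rows:
        if has_location(row):
            placed.append(dict(row))
        else:
            placed.append({**row, "latitude": default[0], "longitude": default[1]})
    return placed


filtered = filter_nulls(places)
defaulted = show_at_default(places)
```

Filtering shrinks the data set, while the default-position option keeps the row count but marks the unknown entity at an artificial location, which is why the choice should be stated in the report.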
Q 2. Set A.
In Tableau, what are aggregation and disaggregation? (5 marks)
In Tableau, aggregation and disaggregation are two different ways of representing and analyzing data at different
levels of detail [1]:
Aggregation: This involves summarizing or combining multiple data points into a single value. It allows you to see
the overall picture or a higher-level summary of your data. Aggregated data is often used for creating charts,
visualizations, and performing calculations such as sums, averages, counts, and percentages [1]. For example, if you
have a dataset with sales transactions, you can aggregate the data by summing up the sales amount for each product
category to see the total sales for each category [1].

In Tableau, you can aggregate measures or dimensions, though it’s more common to aggregate measures. Whenever you add a
measure to your view, an aggregation is applied automatically to its values. Sum, average, and median are common
aggregations. The type of aggregation applied varies depending on the context of the view.
You can aggregate a dimension in the view as Minimum, Maximum, Count, or Count (Distinct). When you aggregate a
dimension, you create a new temporary measure column, so the dimension actually takes on the characteristics of a measure.
Tableau provides a set of predefined aggregations that are shown in the table. You can set the default aggregation for
any measure that isn’t a calculated field that itself contains an aggregation, such as AVG([Discount]). See Set the
Default Aggregation for a Measure. You can also set the aggregation for a field already in the view. For details,
see Change the Aggregation of a Measure in the View.
Aggregation | Description | Result for a measure that contains 1, 2, 2, 3
Attribute | Returns the value of the given expression if it only has a single value for all rows in the group, otherwise it displays an asterisk (*) character. Null values are ignored. This aggregation is useful when aggregating a dimension. To set a measure in the view to this aggregation, right-click (control-click on Mac) the measure and choose Attribute. The field then changes to show the text ATTR. | *
Dimension | Returns all unique values in a measure or dimension. | 3 values (1, 2, 3)
Sum | Returns the sum of the numbers in a measure. Null values are ignored. | 1 value (8)
Average | Returns the arithmetic mean of the numbers in a measure. Null values are ignored. | 1 value (2)
Count (Distinct) | Returns the number of unique values in a measure or dimension. When applied to a dimension, Tableau creates a new temporary column that is a measure because the result of a count is a number. You can count numbers, dates, booleans, and strings. Null values are ignored in all cases. This aggregation isn't available for the following types of workbooks: workbooks created before Tableau Desktop 8.2 that use Microsoft Excel or Text File data sources; workbooks that use legacy connections; workbooks that use Microsoft Access data sources. If you're connected to a workbook that uses one of these types, Count (Distinct) is unavailable and Tableau shows the message "Requires extract." To use this aggregation, extract your data. See Extract Your Data. | 1 value (3)
Minimum | Returns the smallest number in a measure or continuous dimension. Null values are ignored. | 1 value (1)
Maximum | Returns the largest number in a measure or continuous dimension. Null values are ignored. | 1 value (3)
Std. Dev | Returns the standard deviation of all values in the given expression based on a sample population. Null values are ignored. Returns a Null if there are fewer than 2 members in the sample that aren't Null. Use this function if your data represents a sample of the population. | 1 value (0.8165)
Std. Dev (Pop.) | Returns the standard deviation of all values in the given expression based on a biased population. Assumes that its arguments consist of the entire population. Use this function for large sample sizes. | 1 value (0.7071)
Variance | Returns the variance of all values in the given expression based on a sample. Null values are ignored. Returns a Null if there are fewer than 2 members in the sample that are not Null. Use this function if your data represents a sample of the population. | 1 value (0.6667)
Variance (Pop.) | Returns the variance of all values in the given expression based on a biased population. Assumes that its arguments consist of the entire population. Use this function for large sample sizes. | 1 value (0.5000)
Disaggregate | Returns all records in the underlying data source. To disaggregate all measures in the view, select Aggregate Measures from the Analysis menu (to clear the check mark). |

Tableau allows you to view data in disaggregated form (relational databases only). When data are disaggregated, you
can view the individual rows of your data source. For example, after discovering that the sum of sales for rubber
bands is $14,600, you might want to see the distribution of individual sales transactions. To answer this question,
you must create a view that shows individual rows of data. That is, you must disaggregate the data (see How to
Disaggregate Data). Another way to look at disaggregated data is to view the underlying data for all or part of a
view. For more details, see View Underlying Data.
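
The result column of the aggregation table can be checked by hand. The sketch below uses Python's standard `statistics` module on the same four values (1, 2, 2, 3); it is only a cross-check of the arithmetic, not Tableau code, and the dictionary keys simply reuse the aggregation names from the table.

```python
import statistics

values = [1, 2, 2, 3]

results = {
    "Sum": sum(values),
    "Average": statistics.mean(values),
    "Count (Distinct)": len(set(values)),
    "Minimum": min(values),
    "Maximum": max(values),
    # Sample statistics divide by n - 1, population statistics by n.
    "Std. Dev": round(statistics.stdev(values), 4),
    "Std. Dev (Pop.)": round(statistics.pstdev(values), 4),
    "Variance": round(statistics.variance(values), 4),
    "Variance (Pop.)": round(statistics.pvariance(values), 4),
}

for name, value in results.items():
    print(f"{name}: {value}")
```

The sample and population versions differ only in the divisor, which is why Variance (0.6667) is larger than Variance (Pop.) (0.5) for the same four values.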

Disaggregation: This involves displaying and analyzing individual data points at their most granular level. It allows
you to see each individual record or data point in your dataset [1]. Disaggregated data provides detailed information and
enables you to drill down into specific data elements for in-depth analysis [1]. For example, if you have a dataset with
customer transactions, you can disaggregate the data to see each individual transaction, including the customer name,
product purchased, date, and amount [1].
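
As a rough illustration of the difference, the Python sketch below contrasts the two levels of detail on a small, made-up list of transactions. The data and the function name `aggregate_by_category` are hypothetical; in Tableau the same switch is made through the view, not through code.

```python
from collections import defaultdict

# Hypothetical transaction rows (the disaggregated level of detail).
transactions = [
    {"customer": "Ana", "category": "Furniture", "amount": 120.0},
    {"customer": "Ben", "category": "Furniture", "amount": 80.0},
    {"customer": "Ana", "category": "Technology", "amount": 300.0},
    {"customer": "Cho", "category": "Technology", "amount": 50.0},
    {"customer": "Cho", "category": "Office Supplies", "amount": 25.0},
]


def aggregate_by_category(rows):
    """Aggregated view: one summed value (one mark) per category."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["category"]] += row["amount"]
    return dict(totals)


aggregated = aggregate_by_category(transactions)

# Disaggregated view: every row is its own mark.
disaggregated_marks = len(transactions)
aggregated_marks = len(aggregated)
```

Here the aggregated view has three marks (one per category) while the disaggregated view has five (one per transaction).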

In Tableau, you can switch between data aggregation and disaggregation by manipulating the level of detail in your
visualizations.
Whenever you add a measure to your view, an aggregation is applied to that measure by default. This default is
controlled by the Aggregate Measures setting in the Analysis menu.
If you decide you want to see all the marks in the view at the most detailed level of granularity, you can disaggregate
the view. Disaggregating your data means that Tableau displays a separate mark for every data value in every row of
your data source.
To disaggregate all measures in the view: Clear the Analysis > Aggregate Measures option. If it’s already selected,
click Aggregate Measures one time to deselect it.

Disaggregating your data can be useful for analyzing measures that you may want to use both independently and
dependently in the view. For example, you may be analyzing the results from a product satisfaction survey with the
Age of participants along one axis. You can aggregate the Age field to determine the average age of participants or
disaggregate the data to determine at what age participants were most satisfied with the product.
Disaggregating data can be useful when you’re viewing data as a scatter plot. See Example: Scatter Plots,
Aggregation, and Granularity.

Q.3 Set B
Give five types of Quick Table Calculation that are possible in Tableau. What purpose does Quick Table Calculation
serve in Tableau?
Quick table calculations allow you to quickly apply a common table calculation to your visualization using the most
typical settings for that calculation type so that you can continue with your analysis.
You can only perform quick table calculations on measures in the view.
The following quick table calculations are available in Tableau for you to use:
Running total
Difference
Percent difference
Percent of total
Rank
Percentile
Moving average
YTD total
Compound growth rate
Year over year growth
YTD growth
Running Total: This calculation provides a running total of the measure across your data [1].
Difference: This calculation shows the difference between the measure's values at two points in your data [1].
Percent Difference: This calculation shows the percentage difference between the measure's values at two points in
your data [1].
Percent of Total: This calculation shows each value as a percentage of the total of all the values in your data [1].
Moving Average: This calculation provides a moving average of the measure across your data [1].
This feature saves time and effort compared with creating a dedicated calculation [2]. It also makes operations such
as running total and moving average more robust [2]. With traditional table calculations, you can apply the same
settings, but you must apply them manually [1].
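
To make the five calculations concrete, here is a minimal Python sketch of what each one computes over an ordered list of values. The function names are hypothetical, and two details are assumptions for the example rather than a statement of Tableau's exact defaults: differences are taken relative to the previous value, and the moving average covers the current value plus the two before it.

```python
def running_total(values):
    """Cumulative sum along the list."""
    total, out = 0, []
    for v in values:
        total += v
        out.append(total)
    return out


def difference(values):
    """Change from the previous value; the first item has no predecessor."""
    return [None] + [cur - prev for prev, cur in zip(values, values[1:])]


def percent_difference(values):
    """Change from the previous value as a fraction of it (previous must be non-zero)."""
    return [None] + [(cur - prev) / abs(prev) for prev, cur in zip(values, values[1:])]


def percent_of_total(values):
    """Each value as a fraction of the grand total."""
    total = sum(values)
    return [v / total for v in values]


def moving_average(values, previous=2):
    """Mean of the current value and up to `previous` values before it."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - previous): i + 1]
        out.append(sum(window) / len(window))
    return out


yearly_sales = [100, 200, 300, 400]  # hypothetical yearly totals
print(running_total(yearly_sales))
print(difference(yearly_sales))
print(percent_of_total(yearly_sales))
print(moving_average(yearly_sales))
```

A quick table calculation applies one of these patterns with preset options, which is the manual work it saves.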
Set up the visualization
Open Tableau Desktop, connect to the Sample-Superstore data source (which comes with Tableau), and navigate to a new
worksheet.
From the Data pane, drag Order Date to the Columns shelf.
Drag State to the Rows shelf.
Drag Sales to Text on the Marks Card.
Drag Profit to Color on the Marks Card.
On the Marks card, click the Mark Type drop-down and select Square.
On the Marks card, right-click SUM(Profit) and select Quick Table Calculation > Moving Average.
A delta symbol appears on the field to indicate that a quick table calculation is being applied to the field. The colors in the
visualization update to show the moving average of profit across the years.
Q 3. Set A.
What does bin mean in Tableau? How do you create bins? (5 Marks)
Sometimes it's useful to convert a continuous measure (or a numeric dimension) into bins.
Any discrete field in Tableau can be considered as a set of bins. For example, suppose you create a view
with Profit on Rows and State on Columns. You could consider the State field as a set of bins—each profit value is
sorted into a bin corresponding to the state from which the value was recorded. But if you want to see values
for Profit assigned to bins without reference to a dimension, you can create a numeric bin, with each individual bin
corresponding to a range of values.
Note: You can bin data only for relational data sources and binned fields cannot be used in calculations. However, it
is possible to create a calculated field that will replicate a bin of a specific size. For
example: (FLOOR([Sales]/1000)*1000) will create bins with a size of 1000. By dragging this calculation to the
dimensions pane, you can use these bins with cube data sources and calculated fields.
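
The calculated-field approach is ordinary floor arithmetic. As a minimal sketch, the Python below mirrors `FLOOR([Sales]/1000)*1000`; the function name `bin_lower_bound` and the sample sales figures are made up for the example.

```python
import math
from collections import Counter


def bin_lower_bound(value, size=1000):
    """Lower edge of the bin containing `value`, like FLOOR(value / size) * size."""
    return math.floor(value / size) * size


sales = [250.0, 999.99, 1000.0, 2500.0, 2999.0]  # hypothetical values
bins = [bin_lower_bound(s) for s in sales]

# Count how many values fall into each bin, keyed by the bin's lower edge.
counts = Counter(bins)
```

Each bin is labelled by its lower limit, and that limit is inclusive: 1000.0 lands in the 1000 bin, while 999.99 stays in the 0 bin.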
When you create bins from a measure you create a new dimension. That's because you are creating a field with a limited and
discrete set of possible values out of a field with an unlimited, continuous range of values. However, once the dimension is
created, you can convert it to a continuous dimension.
Create a Binned Dimension:
In the Data pane, right-click (control-click on Mac) a measure and select Create > Bins
In the Create Bins dialog box, accept the proposed New field name or specify a different name for the new field. On the web,
the dialog box is named Edit Bins and has a slightly different appearance, but the options are the same.
If Tableau can perform the optimizing calculation quickly enough, the value you see initially in Size of bins is
Tableau's estimate of the optimal bin size.
If Tableau cannot perform the optimizing calculation quickly, the Size of bins field defaults to 10. In this case you
can click Suggest Bin Size to have Tableau perform the optimizing calculation.
The formula that Tableau uses to calculate an optimal bin size is
Number of Bins = 3 + log2(n) * log(n)
In the formula, n is the number of distinct rows in the table. The size of each bin is determined by dividing the
difference between the smallest and the largest values by the number of bins.
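
The formula can be evaluated directly. The sketch below is a plain Python transcription under one stated assumption: the unsubscripted `log(n)` is read as a base-10 logarithm. The source formula does not give the base, so treat the numbers as illustrative. The function names are hypothetical.

```python
import math


def suggested_bin_count(n):
    """3 + log2(n) * log(n), with log assumed to be base 10 (an assumption)."""
    return 3 + math.log2(n) * math.log10(n)


def suggested_bin_size(min_value, max_value, n):
    """Range of the field divided by the suggested number of bins."""
    return (max_value - min_value) / suggested_bin_count(n)


# Hypothetical field: values from 0 to 5000 with 100 distinct rows.
count = suggested_bin_count(100)
size = suggested_bin_size(0, 5000, 100)
print(round(count, 2), round(size, 2))
```

The inputs correspond to the read-only values in the Create Bins dialog: Min, Max, and CntD (used here as n).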
The four read-only fields in the lower part of the Create Bins dialog box show you the data that Tableau uses to
suggest a bin size. You can also consider these values if you want to set a bin size manually. The values are:
Min The field's minimum value.
Max The field's maximum value.
Diff The difference between the field's minimum and maximum values.
CntD The number of distinct values (rows) in the data.
After you click OK to dismiss the Create Bins dialog box, a new binned field appears in the Dimensions area of
the Data pane.
When you add a binned dimension to the view, each bin acts as an equal-sized container that summarizes data for a
specific range of values. Column or row headers are created, where each bin label designates the lower limit of the
range of numbers that is assigned to the bin. Note that the lower limit is inclusive.

In Tableau, a bin refers to a data manipulation technique used to group continuous numeric data into discrete ranges
or bins [1]. This is particularly useful when you want to analyze data in a more aggregated or categorical manner [1]. Bins
are containers of equal size that hold the data values falling within each bin's range [2]. Bins thus divide a set of data
into groups of equal interval or size, giving a systematic distribution of the data [2].
To create bins in Tableau, follow these steps [2][3][4][5][6]:
Navigate to the Data pane and right-click the field you want to bin [6].
Select the Create option and then click on Bins [6].
In the Create Bins dialog box, you can specify the size of the bins, and Tableau will automatically create a new field
with the binned data values [3].
You can either enter a value in the Size of bins field or have Tableau calculate a value for you [3].
After you click OK to dismiss the Create Bins dialog box, a new binned field appears in the Dimensions area of the
Data pane [3].
Please note that you can bin data only for relational data sources and binned fields cannot be used in calculations [3].
However, it is possible to create a calculated field that will replicate a bin of a specific size [3]. For example:
(FLOOR([Sales]/1000)*1000) will create bins with a size of 1000 [3]. By dragging this calculation to the dimensions pane,
you can use these bins with cube data sources and calculated fields [3].

Q 3. Set B.
Give an example how storyline can be used in Tableau. (5 Marks)
Consider an example of how a storyline can be used in Tableau to examine a trend [1]. This example walks
you through building a story about earthquake trends over time [1].
Frame the Story: The purpose of the story is to answer the question: "Are big earthquakes becoming more
common?" [1].
Build the Story: Use Tableau Desktop to open the Earthquake Trend Story workbook that you downloaded [1]. The
workbook has three dashboards that you’ll use to build your story [1].
State the Question: By default, Tableau uses the worksheet name as the story title [1]. In this case, replace it with the
following: "Are big earthquakes on the rise?" [1].
Start Big: The first story point will show the broadest possible viewpoint: all earthquakes, across the entire planet [1].
On the Story pane, double-click “Map dashboard” to place it on the story sheet [1].
The story feature in Tableau is a great way to showcase this type of analysis because it has a step-by-step format
which lets you move your audience through time [1]. Rather than showing you how to create all the views and
dashboards from scratch, this example starts from an existing workbook [1]. What you’ll do is pull the story together [1].
This is just one example of how you can use storylines in Tableau. Many other stories are possible, depending on your
data and the insights you want to convey [1].
Q 4. Set A.
Differentiate parameters and filters in Tableau. (5 marks)
In Tableau, parameters and filters are both used to manipulate the view of your data, but they serve different purposes
and have different functionality [1][2][3][4]:
Parameters are workbook variables, such as a number, date, or string, that allow users to replace a
constant value in a calculation, filter, or reference line [1]. They are more global in nature, meaning they can affect
multiple worksheets or dashboards within a workbook [4]. Parameters are independent of the data source and allow
users to set their own values and boundaries [1]. They can be used in calculated fields [1] and are single-select: the
user picks one value from a list or enters one [1]. Parameter actions let the user change the parameter's value, and
so the view, by hover, select, or menu [1].

Filters, on the other hand, are designed to narrow down data based on specific conditions or criteria [3]. They allow
users to focus on a subset of their data, making it easier to analyze and draw insights from the visualization [3]. Filters
are typically applied to individual worksheets [4] and are dependent on the data source [1]. The values they offer are
exactly those derived from the data source [1]. Filters can have multi-select values [1] and are by default at the Worksheet
level [1]. However, filters can be applied to different worksheets if needed [1].
In summary, while parameters offer interactivity and flexibility, filters are designed to narrow down the data based on
specific conditions or criteria [3]. Both are powerful tools in Tableau that can greatly enhance your data visualization
and analysis [1][2][3][4].

Q 4. Set C.
How can parameters be used along with bins in Tableau? Explain with an example. (5 Marks)
In Tableau, parameters can be used along with bins to create dynamic bin sizes, which can be adjusted interactively
by the user [1][2][3][4]. This can be particularly useful when you want to allow the user to explore different levels of
granularity in the data [1][2][3][4].
A parameter is a workbook variable such as a number, date, or string that can replace a constant value in a
calculation, filter, or reference line.
For example, you may create a calculated field that returns True if Sales is greater than $500,000 and otherwise
returns False. You can replace the constant value of “500000” in the formula with a parameter. Then, using the
parameter control, you can dynamically change the threshold in your calculation.
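
The idea of replacing a constant with a parameter can be sketched in Python as a function argument. This is only an analogy: the names `is_large_sale_fixed` and `is_large_sale`, and the sample sales figures, are hypothetical.

```python
def is_large_sale_fixed(sales):
    """Calculation with the constant baked in: Sales > 500000."""
    return sales > 500000


def is_large_sale(sales, threshold):
    """Same calculation with the constant replaced by a parameter."""
    return sales > threshold


region_sales = [250000, 600000, 750000]  # hypothetical values

# With the parameter set to 500000 the two calculations agree.
fixed_flags = [is_large_sale_fixed(s) for s in region_sales]
param_flags = [is_large_sale(s, 500000) for s in region_sales]

# Moving the parameter control to 700000 changes the result without editing the formula.
raised_flags = [is_large_sale(s, 700000) for s in region_sales]
```

The formula itself never changes; only the value supplied for the threshold does, which is what the parameter control exposes to the user.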
To use a parameter, you must:
Set up the initial parameter.
Use the parameter in an element such as a calculation, filter, or reference line.
Adjust the parameter value by:
Showing the parameter control
Using a parameter action
Setting a dynamic parameter to update automatically
Create a parameter
To create a parameter from the Data pane:
In the Data pane, click the drop-down arrow in the upper right corner and select Create Parameter.
In the Create Parameter dialog box, give the field a Name
Specify the data type for the values it accepts
Specify the display format to use in the parameter control (Tableau Desktop only)
Specify how the parameter accepts values. You can select from the following options:
All: The parameter control is a simple text field.
List: The parameter control provides a list of possible values for you to select from.
If you select List, you must specify the list of values. Click in the left column to type your list of values, or you can
add members of a field or paste from the clipboard by selecting Add values from.
Range: The parameter control lets you select values within a specified range.
If you select Range, you must specify a minimum, maximum, and step size. The step size controls the jumps between
values, such as letting you choose each number (5, 6, 7...) or going from 5 to 10 to 15.
To refresh the parameter’s list of values (or domain) whenever the workbook opens, select List or Range, and then
select When the workbook opens. Notice that some options are grayed out because the workbook is dynamically
pulling values from the data source.
When finished, click OK.

Here’s an example of how you can use parameters with bins in Tableau [1][2][3][4]:
Create a Parameter for Bin Size: Right-click in the Data pane and select “Create Parameter”. In the dialog box, you
can specify the name, current value, and allowable values for the parameter [1][2][3][4].
Create a Calculated Field for Bins: Right-click on the measure you want to bin and select “Create > Calculated
Field”. In the calculation, use the formula INT([your measure]/[your parameter])*[your parameter] [1][2][3][4]. This will create
dynamic bins based on the parameter value [1][2][3][4].
Use the Calculated Field in Your Visualization: You can now use the calculated field in your visualization. The bin
size will change dynamically based on the parameter value [1][2][3][4].
Show Parameter Control: Right-click on the parameter in the Data pane and select “Show Parameter Control”. This
will allow the user to adjust the bin size interactively [1][2][3][4].
This way, parameters can be used with bins to create more flexible and interactive visualizations in Tableau [1][2][3][4].
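
The steps above can be sketched in Python, with the bin size passed in as an argument to play the role of the parameter. The function names `dynamic_bin` and `histogram` and the sample values are hypothetical; `int()` is used because, like Tableau's INT, it truncates toward zero.

```python
from collections import Counter


def dynamic_bin(value, bin_size):
    """Mirror of INT([measure] / [parameter]) * [parameter]."""
    return int(value / bin_size) * bin_size


def histogram(values, bin_size):
    """Number of values per bin, keyed by bin start, for the chosen bin size."""
    counts = Counter(dynamic_bin(v, bin_size) for v in values)
    return dict(sorted(counts.items()))


order_values = [120, 480, 510, 990, 1500]  # hypothetical measure values

by_500 = histogram(order_values, 500)    # parameter control set to 500
by_1000 = histogram(order_values, 1000)  # parameter control moved to 1000
```

Because truncation rounds toward zero, a negative value such as -200 with a bin size of 500 falls into bin 0 rather than bin -500; a FLOOR-based formula avoids this when the measure can be negative.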

Q 5. Set A.
What is union in tableau, how does it differ from joins? (5 marks)
In Tableau, a union is used to combine two or more tables by appending values (rows) from one table to another [1]. To
union your data in a Tableau data source, the tables must come from the same connection [1]. Unions effectively lengthen
the data by adding more rows [2]. This method is helpful when dealing with data that has the same structure but is split
across multiple files or tables [2].
On the other hand, joins in Tableau are used to combine related data on common fields [3]. The result of combining data
using a join is a table that’s typically extended horizontally by adding fields of data [3]. Joins work by matching
columns from one table with columns from another [4]. This operation can be done anywhere in the flow and is often
used early in a flow to understand your data sets and expose areas that need attention right away [3].
In summary, while unions work with rows and are used to append data vertically, joins work with columns and are
used to combine data horizontally [4].
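
The contrast can be seen on small tables. The Python sketch below is an illustration only: the tables, field names, and the helper functions `union` and `inner_join` are made up, and the join shown is a simple inner join on one common field.

```python
# Two tables with the same structure (candidates for a union).
orders_jan = [{"order_id": 1, "sales": 100}, {"order_id": 2, "sales": 250}]
orders_feb = [{"order_id": 3, "sales": 80}]

# A related table sharing the order_id field (candidate for a join).
customers = [{"order_id": 1, "customer": "Ana"}, {"order_id": 3, "customer": "Ben"}]


def union(*tables):
    """Append the rows of each table: the result gets longer."""
    return [dict(row) for table in tables for row in table]


def inner_join(left, right, key):
    """Match rows on a common field: the result gets wider."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


all_orders = union(orders_jan, orders_feb)
orders_with_customers = inner_join(all_orders, customers, "order_id")
```

The union has three rows with the original two fields; the join keeps only the matching orders but adds the customer field to each.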

Q 5. Set B.
Suppose a table is spread across 6 pages in a PDF file. Can you use Tableau to access this table and visualize it?
If yes, give the steps involved. (5 marks)
Yes, you can use Tableau to access and visualize a table spread across multiple pages in a PDF file. Here are the steps
involved:

1. **Connect to the PDF File**:


- Open Tableau Desktop and navigate to the **Connect** pane.
- Click on **PDF File** under **Connect**.
- Select the PDF file that contains the table you want to access.
- In the **Scan PDF File** dialog box, specify the pages in the file that you want Tableau to scan for tables. You
can choose to scan all pages, a single page, or a range of pages. Note that the scan counts the first page of the file as
page 1, similar to most PDF readers[1].

2. **Scan for Tables**:


- Tableau will automatically identify tables in the specified pages of the PDF file. The tables are given unique
names and are displayed in the left pane after a scan[1].

3. **Select the Table**:


- In the left pane, select the table you want to access and visualize.
- If the table spans across multiple pages, Tableau interprets it as multiple tables. To resolve this issue, you can use
a union to combine the tables[1].

4. **Start Your Analysis**:


- Once you've selected the table, click on the **Sheet** tab to start your analysis.
- You can now drag the table onto the canvas and start building your visualization[1].

5. **Validate the Data**:


- Make sure to validate the data in the table by using either the data grid or if you used the Data Interpreter, the
results workbook[1].

6. **Visualize the Data**:


- Use Tableau's various visualization tools to create a meaningful representation of your data. This could include
charts, graphs, maps, or other types of visualizations[1].
By following these steps, you can effectively access and visualize a table spread across multiple pages in a PDF file
using Tableau Desktop.

Q 6. Set A.
What do you understand by the term “Use Data Interpreter” in Tableau? Give example of two use cases when this
feature can be used, explain how? (10 Marks)
Cleanup Excel Datasheets
Cleanup tabular data from PDF
Extract Sub Tables
In Tableau, the Data Interpreter is a feature that helps clean up your data [1]. It can detect things like titles, notes,
footers, empty cells, and so on, and bypass them to identify the actual fields and values in your data set [1]. It can even
detect additional tables and sub-tables so that you can work with a subset of your data independently of the other
data [1]. After Data Interpreter has done its magic, you can check its work to make sure it captured the data that you
wanted and identified it correctly [1].
Here are two use cases where the Data Interpreter can be used:
Cleaning Excel Data: Excel files can be structured in various ways and often contain extra footer and header rows
around the data [1]. The Data Interpreter can help identify the structure of the data in the Excel file and turn it into the
proper format for analysis in Tableau [2]. It takes care of cleaning and transforming that data for you [2].
Working with Formatted Tables: In some cases, data tables are formatted and not easily understandable by Tableau [3].
The Data Interpreter can be used for cleaning the data in such scenarios where data tables are formatted [3].
To use the Data Interpreter, you connect to an Excel spreadsheet or other connector that supports Data Interpreter
such as Text (.csv) files, PDF files or Google Sheets [1]. Then, on the Data Source page, in the left pane, you select the
Use Data Interpreter check box to see if Data Interpreter can help clean up your data [1]. After you clean your data with
Data Interpreter, you can review the results [1].
From Tableau Help
Clean Data from Excel, CSV, PDF, and Google Sheets with Data Interpreter
When you track data in Excel spreadsheets, you create them with the human interface in mind. To make your spreadsheets easy
to read, you might include things like titles, stacked headers, notes, maybe empty rows and columns to add white space, and you
probably have multiple tabs of data too.
When you want to analyze this data in Tableau, these aesthetically pleasing attributes make it very difficult for Tableau to
interpret your data. That’s where Data Interpreter can help.
What does Data Interpreter do?
Data Interpreter can give you a head start when cleaning your data. It can detect things like titles, notes, footers, empty cells, and
so on and bypass them to identify the actual fields and values in your data set.
It can even detect additional tables and sub-tables so that you can work with a subset of your data independently of the other
data.
After Data Interpreter has done its magic, you can check its work to make sure it captured the data that you wanted and
identified it correctly. Then, you can make any necessary adjustments.
After you select the data that you want to work with, you might also need to do some additional cleaning steps like pivoting
your data, splitting fields, or adding filters to get the data in the shape you want before starting your analysis.
Turn on Data Interpreter and review results
From the Connect pane, connect to an Excel spreadsheet or other connector that supports Data Interpreter such as Text (.csv)
files, PDF files or Google sheets.
Drag a table to the canvas (if needed), then on the Data Source page, in the left pane, select the Use Data Interpreter check
box to see if Data Interpreter can help clean up your data.
In the Data pane, click the Review the results link to review the results of the Data Interpreter.
A copy of your data source opens in Excel on the Key for the Data Interpreter tab. Review the key to find out how to read the
results.
Click each tab to review how Data Interpreter interpreted the data source.
If Data Interpreter found additional tables, also called found tables or sub-tables, they are identified in the <sheet
name>_subtables tab by outlining their cell ranges. A separate tab is also included for each sub-table, color coded to identify the
header and data rows.
If Data Interpreter does not provide the expected results, clear the Cleaned with Data Interpreter check box to use the original
data source.
To replace the current table with any of the found tables, drag the current table off the canvas and then drag the found table that
you want to use to the canvas.
If Data interpreter has misidentified the range of the found table, after you drag the found table to the canvas, click the drop-
down arrow on that table, and then select Edit Found Table to adjust the corners of the found table (the top-left cell and
bottom-right cell of the table).
Data Interpreter Example
In this example we are connecting to an Excel spreadsheet with violent crime data by city and state for the year 2016. This
spreadsheet includes multiple tables on one sheet and some extra formatting.

Title
Merged header cells
Extra white space
Sub-tables
The extra formatting in this spreadsheet makes it difficult for Tableau to determine what the field headers and values are.
Instead, it reads the data vertically and assigns each column the default value F1, F2, F3 (Field 1, Field 2, Field 3) and so on.
Blank cells are read as null values.

To see if Data Interpreter can help clean this data set, we select Use Data Interpreter.
Data Interpreter detected the proper headings for the fields, removed the extra formatting and found several sub-tables. The sub-
tables are listed in the Sheets section in the Data pane and are named using the original sheet name and the cell ranges for each
sub-table.
In this example there are three sub-tables: Crimes 2016 A4:H84, Crimes 2016 K5:L40, and Crimes 2016 O5:P56.
To examine the results of the Data Interpreter more closely, we click the Review the results link in the Data pane to view an
annotated copy of the spreadsheet.
Here we see a copy of the original data, color coded to identify which data was identified as header data and which data was
identified as field values.

The next tab shows us the sub-tables that Data Interpreter found, outlined by the cell ranges.

In this example the first sub-table, Crimes 2016 A4:H84, has the main data that we want to work with. To use this table as our
data table, we can simply drag the original table off the canvas and then drag the new table to the canvas.
Once we have the data that we want to work with in the canvas, we can do some additional clean up on the data. For example
we can:
Change the field names so that they represent city, state, and month names.
Pivot the months fields.
Drag in the third sub-table Crimes 2016 O5:P56 and join it to our first sub-table on the State field to include state populations
for our analysis.
Hide any duplicate fields that were added as a result of the join.
The results might look something like this:

Now we are ready to start analyzing our data in Tableau.

Q 6. Set B.
Which Tableau 9 feature allows you to change the format of the data from where the years are in columns to when
the years have their own column? Explain the steps involved. (10 Marks).
The Tableau 9 feature that allows you to change the format of the data from columns to rows is called Pivot (documented as "Pivot Data from Columns to Rows"). This feature is particularly useful when working with data sources such as Microsoft Excel, text files, Google Sheets, and .pdf files, where the data is initially presented in a crosstab format.
Here are the steps involved in pivoting data from columns to rows in Tableau:
1. **Connect to the Data Source**:
- Open Tableau Desktop and navigate to the **Connect** pane.
- Select the data source that contains the data you want to pivot, such as Microsoft Excel, text file, Google Sheets, or .pdf file.
2. **Select the Columns to Pivot**:
- On the **Data Source** page, in the data grid, select two or more columns that hold the years (or any other columns you want to pivot).
3. **Pivot the Columns**:
- Click the drop-down arrow next to one of the selected column names (or right-click the header; control-click on Mac) and select **Pivot**.
4. **Review the New Columns**:
- Tableau replaces the selected columns with two new columns: **Pivot Field Names**, which holds the former column headers (the years), and **Pivot Field Values**, which holds the values.
5. **Adjust the Data Structure**:
- Rename the new columns as needed (for example, to Year and Value) and check that their data types are correct.
6. **Visualize the Data**:
- Once the data is pivoted, you can create a visualization using the pivoted data. This could include charts, graphs, maps, or other types of visualizations.
By following these steps, you can effectively change the format of your data from columns to rows in Tableau, making it easier to analyze and visualize your data.
The feature in Tableau 9 that allows you to change the format of the data from where the years are in columns to when the years have their own column is called Pivot. Pivoting changes the data from a cross-tab format (where the years are in columns) to a columnar format (where the years have their own column).
Here are the steps to pivot your data in Tableau:
Connect to your data source in Tableau.
In the data source tab, select the columns that you want to pivot.
Right-click on the header of one of the selected columns and choose Pivot.
Tableau will create two new columns: one for the years (Pivot Field Names) and one for the values (Pivot Field Values).
You can rename these columns by double-clicking on the header.
Please note that the Pivot option is only available for certain types of data sources. If you don’t see the Pivot option, it might be because you’re connected to a data source that doesn’t support pivoting.
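The effect of a pivot can be sketched outside Tableau. The following Python snippet is only an illustration of the reshaping, not Tableau's implementation; the table `wide_rows`, its numbers, and the helper `pivot_columns_to_rows` are hypothetical, while the output column names mirror the Pivot Field Names and Pivot Field Values columns described above.

```python
# Hypothetical cross-tab rows: one column per year (made-up numbers).
wide_rows = [
    {"Country": "A", "2015": 10, "2016": 12, "2017": 15},
    {"Country": "B", "2015": 7, "2016": 9, "2017": 8},
]


def pivot_columns_to_rows(rows, id_field, pivot_fields):
    """Turn each pivoted column into its own row (wide -> long)."""
    long_rows = []
    for row in rows:
        for field in pivot_fields:
            long_rows.append({
                id_field: row[id_field],
                "Pivot Field Names": field,        # the former column header
                "Pivot Field Values": row[field],  # the cell under that header
            })
    return long_rows


long_rows = pivot_columns_to_rows(wide_rows, "Country", ["2015", "2016", "2017"])
for r in long_rows:
    print(r)
```

Two rows with three year columns become six rows with a single year column, which is the shape a line chart over time needs.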

Q 7. Set A.
What is implicit and explicit measures in Power BI? How is it related to calculated field in Tableau? How do you
create an explicit measure in Tableau? (10 marks)
Implicit and explicit measures are two types of measures used in Power BI to analyze data. Implicit measures are
automatically created by Power BI based on the fields used in a visualization. These measures are quick and easy to
use but offer limited flexibility. They are often used for simple aggregations such as sum, count, or average.

Explicit measures, on the other hand, are created by writing a DAX formula. They provide greater flexibility and
control over the calculation. Explicit measures can be used to perform complex calculations that are not possible with
simple aggregations. They also offer better performance and consistency across reports.

In Tableau, calculated fields are similar to explicit measures in Power BI. They are used to perform custom
calculations on data. Calculated fields are created by writing a formula using Tableau's calculation language, which is
different from DAX used in Power BI. Calculated fields can be used to perform complex calculations, such as
aggregations, filtering, and data manipulation.

To create an explicit measure in Tableau, you would typically follow these steps:

1. Open your Tableau workbook and go to a worksheet.
2. Select Analysis > Create Calculated Field (or click the drop-down arrow at the top of the Data pane and select "Create Calculated Field").
3. Enter a name for your calculated field.
4. Write your formula using Tableau's calculation language. This could include functions like SUM, AVG, COUNT, or custom logic using IF statements, etc.
5. Check that the editor reports the calculation as valid, then click "OK". The new field appears in the Data pane.
6. Drag the calculated field to the "Columns" or "Rows" shelf of your visualization to display the results.

Here's an example of a simple calculated field in Tableau that calculates the average sales by region:

```
// Calculated field name: Average Sales by Region
AVG([Sales Amount])
```
With Region in the view, this formula returns the average of the Sales Amount values in each region. Tableau does not allow one aggregation to be nested directly inside another, so an expression such as AVG(SUM(...)) is rejected as invalid.
In summary, implicit and explicit measures in Power BI are used for data analysis, while calculated fields in Tableau
are used for custom data manipulation. Both are powerful tools for data analysis and visualization, but they have
different syntax and functionality.

Answer 2:
In Power BI, implicit measures are automatically created when a numeric field is dragged and dropped into a visual. Power BI identifies numeric fields in the data model and automatically marks them with a Sigma sign. This means that these column values will be summarized once you drag them to a report visual. You can define what type of aggregation you want to apply to a specific column, or choose not to aggregate at all.
On the other hand, explicit measures are manually created. Whenever you create a measure yourself in Power BI, that is called an explicit measure. Explicit measures offer more flexibility and control over data modeling. They are reusable and can support more complex calculations.
In Tableau, a similar concept to explicit measures in Power BI is the calculated field. A calculated field in Tableau allows you to create new data from data that already exists in your data source. When you create a calculated field, you are essentially creating a new field (or column) in your data source, where the values in the field are determined by a calculation that you control.
To create an explicit measure (or a calculated field) in Tableau, follow these steps:
Connect to your data source.
In a worksheet, select Analysis > Create Calculated Field.
Name the field and enter a formula that uses an aggregation, for example SUM([Profit]) / SUM([Sales]).
Click OK. The new field appears in the Data pane and can be dragged into any view.
This way, you can create explicit measures in Tableau, similar to how you would in Power BI.
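The difference between an implicit and an explicit measure can be sketched in Python. This is an illustration under stated assumptions, not Power BI or Tableau code: the rows, the numbers, and the helper names (`implicit_sum`, `profit_ratio`, `by_region`) are hypothetical. The implicit measure is a default aggregation applied to a column; the explicit measure is a named formula that is reused wherever it is needed.

```python
# Hypothetical sales rows (made-up numbers).
rows = [
    {"Region": "East", "Sales": 100.0, "Profit": 20.0},
    {"Region": "East", "Sales": 300.0, "Profit": 30.0},
    {"Region": "West", "Sales": 200.0, "Profit": 50.0},
]


def implicit_sum(rows, field):
    """Implicit measure: the tool applies a default aggregation (here SUM)."""
    return sum(r[field] for r in rows)


def profit_ratio(rows):
    """Explicit measure: a named formula that can be reused in any view.

    Comparable to a Tableau calculated field SUM([Profit]) / SUM([Sales]).
    """
    return sum(r["Profit"] for r in rows) / sum(r["Sales"] for r in rows)


def by_region(rows, measure):
    """Evaluate a measure once per Region, as a view grouped by Region would."""
    regions = sorted({r["Region"] for r in rows})
    return {reg: measure([r for r in rows if r["Region"] == reg]) for reg in regions}


total_sales = implicit_sum(rows, "Sales")
ratio_by_region = by_region(rows, profit_ratio)
print(total_sales, ratio_by_region)
```

Because the ratio is defined once as a function, every view that uses it computes it the same way, which is the consistency benefit attributed to explicit measures above.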
Q 7. Set B.
Explain what is relationships in Power BI. How are relationships defined in Power BI Desktop? Explain What is a
One to Many relationships. (10 marks)
Relationships in Power BI are connections between tables that enable users to access data from multiple tables with
ease. These connections are established via a common attribute, which is a column that exists in both tables. This
attribute serves as the bridge between the tables, allowing data from one table to be linked to data in another table.

In Power BI Desktop, relationships are defined by identifying the common attribute between two tables. This process
is automatic in many cases, as Power BI can detect relationships based on the data it contains. However, users can
also manually define relationships by selecting the tables and the common attribute in the "Manage Relationships"
dialog box.

One to Many relationships are a type of relationship where one entity in the first table relates to many entities in the
second table. This is a common scenario in data modeling, where a single entity can have multiple related entities.
For example, a customer can have many orders, but each order is related to only one customer. This type of
relationship is often used in business data settings, where a single entity can have multiple transactions or interactions
with other entities.

To illustrate this concept, consider the following example:

"A one to many relationship might give you insight about a specific quarter’s sales and inventory. The sales column from one table may relate to several inventory items from the second table."

In this example, a value in the sales table relates to multiple rows in the inventory table. This is a one to many relationship: one sales record can be related to multiple inventory items, but each inventory item is related to only one sales record.
In summary, relationships in Power BI are essential for connecting tables and enabling data analysis. One to Many
relationships are a common type of relationship where one entity in the first table relates to many entities in the
second table, often used in business data settings to model transactions and interactions between entities.

In Power BI, relationships refer to the connections established between different tables of data. These relationships are essential for creating meaningful visualizations, reports, and dashboards by allowing you to combine and analyze data from multiple tables coherently. They are similar to joining in SQL, merging in Power Query, or using a VLOOKUP in Excel.
Relationships in Power BI are defined in the following ways:
Autodetect during load: If you query two or more tables at the same time, when the data is loaded, Power BI Desktop attempts to find and create relationships for you. Power BI Desktop looks at column names in the tables you’re querying to determine if there are any potential relationships. If there are, those relationships are created automatically.
Create a relationship manually: On the Modeling tab, select Manage relationships > New. In the Create relationship dialog box, select a table and the column you want to use in the relationship. Then, select the other table you want in the relationship and the other column you want to use. By default, Power BI Desktop automatically configures the options Cardinality (direction), Cross filter direction, and Make this relationship active for your new relationship.
A One to Many relationship in Power BI is a type of relationship where the column in one table has only one instance of a particular value, and the other related table can have more than one instance of a value. This is a common type of relationship and allows data to be aggregated across tables. For example, in a sales database, a single customer can make multiple purchases, but each purchase is made by only one customer. This would be a One (customer) to Many (purchases) relationship.
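The customer and purchases example can be sketched in Python to show what "one" and "many" mean for the key column. The tables, names, and amounts below are hypothetical, and the helpers `is_one_side` and `total_by_customer` are illustrative only; Power BI performs the equivalent checks and aggregation internally.

```python
# "One" side: each CustomerID appears exactly once (hypothetical data).
customers = [
    {"CustomerID": 1, "Name": "Asha"},
    {"CustomerID": 2, "Name": "Ben"},
]
# "Many" side: the same CustomerID can appear on several rows.
purchases = [
    {"PurchaseID": 101, "CustomerID": 1, "Amount": 50},
    {"PurchaseID": 102, "CustomerID": 1, "Amount": 25},
    {"PurchaseID": 103, "CustomerID": 2, "Amount": 40},
]


def is_one_side(rows, key):
    """True when every value of `key` is unique, as required on the 'one' side."""
    values = [r[key] for r in rows]
    return len(values) == len(set(values))


def total_by_customer(customers, purchases):
    """Aggregate the 'many' table, grouped by a column from the 'one' table."""
    names = {c["CustomerID"]: c["Name"] for c in customers}
    totals = {name: 0 for name in names.values()}
    for p in purchases:
        totals[names[p["CustomerID"]]] += p["Amount"]
    return totals


totals = total_by_customer(customers, purchases)
print(is_one_side(customers, "CustomerID"), is_one_side(purchases, "CustomerID"), totals)
```

The key is unique in `customers` and repeated in `purchases`, so the relationship lets purchase amounts be summed per customer name even though the name is stored in a different table.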
Q 1 SET 1. Creating a foreground and background in a visualization satisfies which Visual Perception Theory and
why? Explain other principles of the Visual Perception Theory. (5 marks)

Creating a foreground and background in a visualization satisfies the Figure-Ground Organization principle of the Visual Perception Theory. This principle states that we tend to segment our visual world into figure (object of focus) and ground (background). The figure is what we focus on and stands out against the ground. This principle allows us to focus on what we believe is important while everything else recedes into the background.
Other principles of the Visual Perception Theory include:

1. Proximity: Objects that are close to each other tend to be grouped together.
2. Similarity: We tend to group together objects that look similar.
3. Closure: We tend to see complete figures even when part of the information is missing.
4. Symmetry: Symmetrical elements are more likely to be grouped together.
5. Common Fate: Elements moving in the same direction are perceived as a group.
6. Continuity: Lines are seen as following the smoothest path.
7. Good Gestalt: Elements are grouped together if they form a pattern that is regular, simple, and orderly.
8. Past Experience: Our perceptions are influenced by our past experiences.

These principles guide the way we organize and interpret visual information, helping us make sense of the world around us.

Q 1 SET 1. How does Visual Perception Theory influence the design and effectiveness of data visualizations, and
what are some key principles from this theory that data visualization designers should consider when creating
visualizations for data analysis and communication? (5 marks)

Visual Perception Theory plays a crucial role in the design and effectiveness of data visualizations. Here’s how:
1. Understanding of Visual Perception: Visual perception is the ability to interpret the surrounding environment by processing information that is contained in visible light. In data visualization, understanding visual perception can help us create visuals that communicate information effectively.
2. Aid to Decision Making: The main purpose of data visualization is to aid in good decision making. To make good decisions, we need to be able to understand trends, patterns, and relationships from a visual. This understanding is facilitated by our visual perception.
3. Balance between Perception and Cognition: Data visualization shifts the balance between perception and cognition to use our brain’s capabilities to its advantage. This means more use of visual perception and less use of cognition.
Key principles from Visual Perception Theory that data visualization designers should consider include:

1. Similarity: Objects with the same visual properties are assumed to be similar and are grouped together.
2. Proximity: Objects that are close together are perceived as a group.
3. Continuity: Elements that are aligned (on the same line, curve, or plane) are perceived to be more closely related to each other than to other elements.
4. Closure: We tend to see complete figures even when part of the information is missing.
5. Symmetry: Symmetrical elements are more likely to be grouped together.
6. Common Fate: Elements moving in the same direction are perceived as a group.
7. Good Gestalt: Elements are grouped together if they form a pattern that is regular, simple, and orderly.
8. Past Experience: Our perceptions are influenced by our past experiences.

These principles guide the way we organize and interpret visual information, helping us make sense of the world around us. They can help designers create more effective and intuitive data visualizations.

Q 2 SET 1.
What is denoted by the following icons with respect to datatypes in Tableau? (5 marks)
Sl. No Icon
1. This icon represents a string data type. String data is text data, such as
names, addresses, or descriptions.
2. This icon represents a date data type. Date data includes just the year,
month, and day.
3. This icon represents a date and time data type. Date and time data
includes the year, month, day, hour, minute, and second.
4. This icon represents a number data type. Number data can be integers
(whole numbers) or decimals.
5. This icon represents a geographic data type. Geographic data includes
locations such as countries, states, and cities.

Q 2 SET 2. What are the different data types available in Tableau, and how does choosing the appropriate data type
for a field impact data analysis and visualization in Tableau? (5 marks)

Tableau supports seven primary data types:

1. String Values: These are alphanumeric characters. They can be divided into Char and Varchar types.
2. Number/Integer Values: These include both integer and floating-point numbers.
3. Date Values: Tableau supports all forms of dates.
4. Date & Time Values: These are a combination of date and time values.
5. Boolean Values: These are the result of logical comparisons and can be either true or false.
6. Geographic Values: These are used with maps.
7. Cluster or Mixed Values: These are a combination of different data types.

Choosing the appropriate data type for a field is crucial for effective data analysis and visualization in Tableau.
Here’s why:

- Accuracy of Visualizations: Different data types can lead to different kinds of visualizations. For example, numeric data can be represented as bar charts or line graphs, while geographic data can be represented as maps.
- Appropriate Calculations: Certain calculations can only be performed on specific data types. For example, you can’t perform arithmetic operations on string data.
- Efficient Data Import: Choosing the appropriate data type can significantly impact the efficiency of your data import process.
- Data Consistency: Consistent data types ensure that the data can be compared and analyzed accurately.
- Enhanced Understanding: Correct data types can enhance understanding, communication, and collaboration among team members.
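Why the data type matters for calculations can be shown with a small Python sketch. It is an analogy rather than Tableau behavior, and the values are hypothetical: the same characters give different results depending on whether they are typed as text, numbers, or dates.

```python
from datetime import date

raw_sales = ["100", "250", "75"]               # hypothetical values read in as strings
joined_text = raw_sales[0] + raw_sales[1]      # on strings, "+" concatenates
total_sales = sum(int(v) for v in raw_sales)   # on numbers, the values are added

raw_date = "2016-03-15"                        # hypothetical date stored as text
parsed = date.fromisoformat(raw_date)          # typed as a date, its parts become usable
order_year = parsed.year

print(joined_text, total_sales, order_year)
```

A field left as a string cannot be summed or placed on a numeric axis, and a date left as text cannot be broken down by year or month until its type is changed.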

Q 3 SET1.
What is denoted by the blue and the green pill in Tableau as shown the following image? Explain. (5 marks)

In Tableau, the pill color shows whether a field is discrete or continuous, not whether it is a dimension or a measure: a blue pill is a discrete field and a green pill is a continuous field. Because dimensions are discrete by default and measures are continuous by default, blue pills are usually dimensions and green pills are usually measures.
Dimensions are qualitative data such as names, dates, and geographical data. They are used to categorize, segment, and reveal the details in your data. In the given image, “Product Name” and “YEAR(Order Date)” are dimensions.
Measures are quantitative data like numbers that you can perform calculations on. They hold the values that you want to analyze. In the given image, “SUM(Profit)” is a measure.
Choosing the right dimensions and measures is crucial for creating meaningful visualizations in Tableau. The color-coding (blue for discrete and green for continuous) helps users quickly see how each field will behave: discrete fields add headers to the view, while continuous fields add axes.

In this example, "Product Name" is a discrete field (blue pill) because it holds individual product names. "YEAR(Order Date)" is blue when the year is used as a discrete date part and green when the date is treated as continuous. "SUM(Profit)" is a continuous field (green pill).
Q 3 SET2. How does Tableau distinguish between 'dimensions' and 'measures'? Provide example of when and why
you would use each type of data in your visualizations. (5 marks)

In Tableau, each field is identified as a dimension or measure in the Data pane, depending on the type of data the field contains.
Dimensions are typically categorical values that act as descriptive data points. They can be text, dates, or geographical data, among others, that provide a way to slice and dice measures. Dimensions determine the level of detail in your visualization. For example, in a toy store dataset, dimensions could be Toy Type (e.g., Action Figures, Dolls, Board Games), Brand (e.g., LEGO, Barbie, Hasbro), Color (e.g., Red, Blue, Green), or Country of Origin (e.g., USA, China, Germany). These entities help to segregate data into understandable chunks.
Measures are numerical data fields that represent quantitative values. They are the fields on which calculations, aggregations, and statistical operations are performed. Measures are typically used to create quantitative visualizations and provide a quantitative aspect to the data analysis. For example, in the same toy store dataset, measures could be Price (e.g., $10.99, $24.99), Number of Units Sold (e.g., 50, 100, 200), Revenue (e.g., $550, $2,499), Average Rating (e.g., 4.2, 3.9, 4.5), or Profit Margin (e.g., 20%, 15%, 30%). These represent the quantitative aspects of the toys.
Choosing the right dimensions and measures is crucial for creating meaningful visualizations in Tableau. The color-coding of fields (blue for discrete and green for continuous; dimensions are discrete by default and measures are continuous by default) helps users quickly identify and understand the role of each field in their visualization. This enhances the effectiveness of data analysis and communication.

In Tableau, 'dimensions' and 'measures' are two different types of data that you can use in your visualizations.
Dimensions are qualitative values. They are often categorical and used to segment or categorize your data. Examples
of dimensions include names, dates, or geographical data. You can think of dimensions as the "what" of your data.
For instance, if you're analyzing sales data, 'Product Category', 'Region', and 'Salesperson' could be dimensions. You
would use dimensions to group or filter your data in your visualizations. For instance, you might want to see sales
data segmented by region or product category.
Measures, on the other hand, are quantitative values. They are numerical data that can be measured, aggregated, or
subjected to mathematical operations. Examples of measures include sales, profit, or quantity. You can think of
measures as the "how much" or "how many" of your data. In the sales data example, 'Sales' and 'Profit' could be
measures. You would use measures to perform calculations or create metrics in your visualizations. For instance, you
might want to see the total sales or average profit.
In summary, you would use dimensions to segment your data, and measures to provide the numerical metrics you
want to analyze within those segments. The choice between using a dimension or a measure in a visualization
depends on what question you're trying to answer with your data.
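The division of labor between dimensions and measures can be sketched in Python. The rows, the numbers, and the helper `sum_measure` are hypothetical; the point is only that the dimensions decide how the rows are grouped and the measure is what gets aggregated inside each group.

```python
from collections import defaultdict

# Hypothetical sales rows (made-up numbers).
rows = [
    {"Region": "East", "Category": "Toys", "Sales": 100},
    {"Region": "East", "Category": "Games", "Sales": 300},
    {"Region": "West", "Category": "Toys", "Sales": 200},
]


def sum_measure(rows, dimensions, measure):
    """Group rows by the given dimensions and SUM the measure in each group."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


by_region = sum_measure(rows, ["Region"], "Sales")
by_region_category = sum_measure(rows, ["Region", "Category"], "Sales")
print(by_region)
print(by_region_category)
```

Adding a second dimension does not change the measure; it changes the level of detail, so the same Sales values are split into more, smaller groups.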
In Tableau, dimensions and measures are two fundamental types of data fields used for creating visualizations. Dimensions represent categorical data and provide context and structure to the data, while measures represent numerical data and are used for calculations and aggregations.
Dimensions are typically used on the rows and columns shelves of visualizations, where they define the headers and structure the layout of charts such as bar charts, scatter plots, and pie charts. Dimensions are not aggregated by default and represent specific values, such as product categories, customer names, or sales categories.
Measures, on the other hand, are used as the numerical values determining the size, position, or color of marks in a visualization. They are used for creating quantitative elements in the chart, like bar lengths, data points, or color gradients. Measures are continuous by default, since they deal with numerical values that can be measured across a continuous range, although a measure can be converted to discrete. Examples of measures include sales revenue, quantity sold, and profit.
When creating visualizations, you can use dimensions and measures together to analyze and understand your data. For example, you might create a bar chart with a dimension like "Product Category" on the rows shelf and a measure like "Sales Revenue" on the columns shelf to show the sales revenue for each product category.
In summary, dimensions and measures serve different roles in Tableau visualizations. Dimensions provide context and structure to the data, while measures are used for calculations and aggregations. Understanding the key differences between dimensions and measures is essential for effectively working with Tableau.
Q4 SET1.
What does the following screenshots show in Tableau? Explain what is Logical Layer in Tableau. (5 Marks)

The screenshots provided in the image are from Tableau, a popular data visualization tool.
Screenshot A appears to show a data model in Tableau, specifically a logical layer. The logical layer is where you
define the relationships between your tables. In this case, it seems like the “Book” table is related to both the
“Checkouts” and “Ratings” tables.
Screenshot B seems to show the data source tab in Tableau. This is where you connect to your data and set up your
data source. The “Book is inside of 3 tables” message suggests that the “Book” table is a logical table made up of
three physical tables.
The Logical Layer in Tableau is the top-level layer of its data model and the default view of the canvas on the Data Source page. In this layer you combine logical tables by defining relationships between them. Related tables are not merged into a single table: they stay distinct, and Tableau queries only the tables a visualization needs, at the appropriate level of detail. Joins and unions are not defined in this layer; they belong to the physical layer, which you open by double-clicking a logical table.

Q4 SET2.
What does the following screenshots show in Tableau? Explain what is Physical Layer in Tableau. (5 Marks)

The screenshots provided in the image are from Tableau, a popular data visualization tool.
Screenshot A appears to show a data model in Tableau, specifically a logical layer. The logical layer is where you
define the relationships between your tables. In this case, it seems like the “Book” table is related to both the
“Checkouts” and “Ratings” tables.
Screenshot B seems to show the data source tab in Tableau. This is where you connect to your data and set up your
data source. The “Book is inside of 3 tables” message suggests that the “Book” table is a logical table made up of
three physical tables.
The Physical Layer in Tableau is the layer of the data model in which physical tables are combined using joins and unions. Every logical table contains at least one physical table, and you open the physical layer by double-clicking a logical table on the Data Source page, as shown in Screenshot B. Tables combined in this layer are merged into a single, flat table that defines the logical table. The physical layer matters for both correctness and performance: a join is applied at the row level before any aggregation, so a poorly chosen join can duplicate or drop rows, and joining large tables can slow down queries.

Q 5 SET1. How is blend different from join in Tableau? Mention the default Data Blending Join and explain how it is
different from other types of join. (5 marks)

In Tableau, both blending and joining are used to combine data from multiple tables, but they work in different ways:
Joining combines tables into a single table in the physical layer of the canvas. The result contains the columns of both tables. Joins are done at the row level, and the two tables need a common field to join on. There are four types of joins that you can use in Tableau: inner, left, right, and full outer.
Blending, on the other hand, is done at the aggregate level. It queries each data source independently, aggregates the results to the appropriate level, then presents the results together visually in the view. Blending is particularly useful when the blend relationship (the linking fields) needs to vary on a sheet-by-sheet basis, or when combining published data sources.
The default Data Blending Join in Tableau is a LEFT join. This means the primary data source should contain all possible values of the linking field. However, by switching which data source is primary, or by filtering nulls, it is possible to emulate left, right, and inner joins. Blending differs from the other types of join in that a blend is always aggregated, while joins work (mostly) at the row level. Also, blending can combine published data sources, whereas joins are defined within a single data source, using a cross-database join when the tables come from different connections.

Blend and join in Tableau are both used to combine data, but they work in different ways and are used for different
purposes.
Data blending in Tableau is used when the data resides in different databases or when it's not possible to join them for
other reasons. It's a way to visually combine data without physically joining it. Data blending happens at the
worksheet level and it doesn't create a persistent join that can be used across multiple worksheets. The default data
blending join is a left join.
On the other hand, a join in Tableau is used when you need to combine data from the same database or different
databases that support cross-database joins. It happens at the data source level and creates a persistent join that can be
used across multiple worksheets. Joins can be inner, left, right, or full outer.
The main difference between the default data blending join and other types of join is that the default data blending
join is a left join, which includes all records from the primary data source and the matching records from the
secondary data source. If there's no match, the result is NULL from the secondary data source. However, other types
of join (right, inner, outer) include different sets of records based on the join condition.
In Tableau, blending is a method of combining data from multiple data sources without the need for any special coding. It is a left outer join by default, which means that the primary data source should contain all possible values, and the fields from the secondary data source may contain null values. Blending is useful when the blend relationship (the linking fields) needs to vary on a sheet-by-sheet basis. It can handle different levels of detail and work with published data sources.
On the other hand, joining tables is a more traditional method of combining data from the same database. It is typically more performant because the work is done by the database rather than the local machine. Tables from different data sources can usually be joined using a cross-database join, but there are limitations, such as the inability to join published data sources, most extract-only data sources, or cube data sources.
When deciding between joining tables and blending data, consider factors such as the number of connections in your workbook, the relationship between the data in each source, the number of records, and the types of views you will need to create. In most scenarios, traditional table joins are the best choice, but if the secondary table has a large amount of data, data blending may be faster because it aggregates the data first.
In summary, blending and joining are two different ways to combine data in Tableau. Blending is a flexible method for quickly changing the linking fields on the fly for ad-hoc analysis, while joining is more performant and suitable for larger datasets with a 1:1 relationship between tables.
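The difference between a row-level join and an aggregate-level blend can be sketched in Python. This is a simplified model under stated assumptions, not Tableau's query engine: the two sources, their numbers, and the helpers `sum_by`, `blend`, and `row_level_left_join` are hypothetical, and `None` stands in for a null.

```python
from collections import defaultdict

# Primary source: one row per order (made-up numbers).
orders = [
    {"Region": "East", "Sales": 100},
    {"Region": "East", "Sales": 300},
    {"Region": "West", "Sales": 200},
    {"Region": "North", "Sales": 50},
]
# Secondary source: several target rows per region, and none for "North".
targets = [
    {"Region": "East", "Target": 150},
    {"Region": "East", "Target": 150},
    {"Region": "West", "Target": 250},
]


def sum_by(rows, key, field):
    """SUM `field` for each distinct value of `key`."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[key]] += r[field]
    return dict(totals)


def blend(primary, secondary, key, p_field, s_field):
    """Aggregate each source on its own, then left-join the results on `key`."""
    p = sum_by(primary, key, p_field)
    s = sum_by(secondary, key, s_field)
    return {k: (v, s.get(k)) for k, v in p.items()}  # None = no match (null)


def row_level_left_join(left, right, key):
    """Match rows before any aggregation; unmatched left rows get None."""
    out = []
    for l in left:
        matches = [r for r in right if r[key] == l[key]] or [None]
        for r in matches:
            out.append({**l, "Target": r["Target"] if r else None})
    return out


blended = blend(orders, targets, "Region", "Sales", "Target")
joined = row_level_left_join(orders, targets, "Region")
joined_east_sales = sum(r["Sales"] for r in joined if r["Region"] == "East")
print(blended)
print(len(joined), joined_east_sales)
```

In the blend, East shows sales of 400 against a target of 300. In the row-level join, each East order is repeated once per matching target row, so summing Sales afterwards gives 800. Avoiding this kind of duplication is one reason to prefer a blend when the two sources are at different levels of detail.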
Q5 SET2. What is Right Outer Join in Tableau? How does it differ from other types of joins? How is blend different
from join in Tableau? Mention the default Data Blending Join. (5 marks)
A Right Outer Join in Tableau returns all records from the right table and the matching records from the left
table. If a record in the right table has no match in the left table, the left-table fields are returned as null values.
This join type is useful when you want to retain all the data from the right table and associate it with relevant
information from the left table.
The Right Outer Join differs from the other join types in Tableau as follows:
Inner Join: Returns only the records that have matching values in both tables.
Left Outer Join: Returns all the records from the left table and the matching records from the right table.
Full Outer Join: Returns all records from both tables, with nulls wherever one side has no match.
Blending in Tableau is different from joining because it is done at the aggregate level. Blending queries each data
source independently, aggregates the results to the appropriate level, then presents the results together visually in the
view. Blending can handle different levels of detail and also works with published data sources. It is particularly
useful when the blend relationship (the linking fields) needs to vary on a sheet-by-sheet basis, or when combining
published data sources.
The default Data Blending Join in Tableau is a LEFT join. This means the primary data source should contain all
possible values of the linking fields. However, by switching which data source is primary, or by filtering nulls, it is
possible to emulate left, right, and inner joins.
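
The four join types described above can be illustrated outside Tableau with a minimal Python sketch. The two tables, their contents, and the `join` helper are hypothetical and exist only for illustration. The sketch assumes each key appears at most once per table, and `None` stands in for NULL.

```python
# Two tiny hypothetical tables keyed by customer id (illustrative data only).
left = {1: "Alice", 2: "Bob", 3: "Chen"}   # id -> customer name
right = {2: 250.0, 3: 90.0, 4: 40.0}       # id -> order amount


def join(left, right, how):
    """Return sorted (key, left_value, right_value) rows; None plays the role of NULL."""
    if how == "inner":
        keys = left.keys() & right.keys()   # keys present in both tables
    elif how == "left":
        keys = left.keys()                  # every key of the left table
    elif how == "right":
        keys = right.keys()                 # every key of the right table
    elif how == "full":
        keys = left.keys() | right.keys()   # every key of either table
    else:
        raise ValueError(f"unknown join type: {how}")
    return [(k, left.get(k), right.get(k)) for k in sorted(keys)]


for how in ("inner", "left", "right", "full"):
    print(how, join(left, right, how))
```

In the right outer join, id 4 is kept even though it has no customer name, so its left-side value is `None`. Id 1 is dropped because it exists only in the left table.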

A Right Outer Join in Tableau includes all the records from the right (or second) table and the matching records from
the left (or first) table. If there is no match, the result is NULL on the left side.
This differs from other types of joins. For instance, an Inner Join only includes records where the key exists in both
tables, while a Left Outer Join includes all records from the left table and matching records from the right table. A
Full Outer Join includes all records from both tables, with NULL filled in wherever one side has no match.
Data blending in Tableau is used when the data resides in different databases or when it's not possible to join them for
other reasons. It's a way to visually combine data without physically joining it. Data blending happens at the
worksheet level and it doesn't create a persistent join that can be used across multiple worksheets.
The default data blending join is a left join, which includes all records from the primary data source and the matching
records from the secondary data source. If there's no match, the result is NULL from the secondary data source. This
is different from a join operation which happens at the data source level and creates a persistent join.
In Tableau, a right outer join is a type of join that combines all values from the right table and corresponding matches
from the left table. When a value in the right table doesn't have a corresponding match in the left table, you see a null
value in the data grid. This type of join is useful when you want to include all data from the right table and match it to
the left table if possible.
Right outer joins are different from other types of joins, such as inner, left, and full outer
joins, in that they prioritize the preservation of data from the right table. Inner joins only include rows where there is
a match in both tables, left joins include all rows from the left table and corresponding matches from the right table,
and full outer joins include all rows from both tables.
Data blending in Tableau is a method of combining data from
multiple data sources without the need for any special coding. It behaves like a left outer join by default, which means
that the primary data source should contain all possible values of the linking fields, and fields from the secondary
data source may be null. Blending is useful when the blend relationship (the linking fields) needs to vary on a
sheet-by-sheet basis. It can handle different levels of detail and work with published data sources.
The default Data Blending Join in Tableau is therefore a left outer join. Because the linking fields are set per
worksheet, they can be changed quickly for ad-hoc analysis. A blend has no join dialog, however: to get the effect of a
right outer join, switch which data source is primary, and to get the effect of an inner join, filter out the nulls. If
you need a true right or full outer join, use a join on the Data Source page instead.
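
The difference between blending and joining can be sketched in plain Python: each source is aggregated on its own, and only then are the aggregated results combined with left-join behaviour. The data, the region linking field, and the `aggregate` and `blend` helpers are hypothetical. This is a conceptual illustration, not how Tableau is implemented.

```python
from collections import defaultdict

# Hypothetical primary source: one row per order (region, sales).
primary_rows = [("East", 100), ("East", 50), ("West", 70), ("North", 30)]
# Hypothetical secondary source: one row per target entry (region, target).
secondary_rows = [("East", 120), ("West", 40), ("West", 40), ("South", 60)]


def aggregate(rows):
    """SUM the measure per linking-field value, independently for one source."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value
    return dict(totals)


def blend(primary_rows, secondary_rows):
    """Aggregate each source first, then left-join the secondary onto the primary."""
    primary = aggregate(primary_rows)
    secondary = aggregate(secondary_rows)
    # Only keys from the primary source survive; missing matches become None.
    return {key: (total, secondary.get(key)) for key, total in primary.items()}


blended = blend(primary_rows, secondary_rows)
print(blended)
```

"North" appears with a `None` target because the secondary source has no matching row, and "South" is absent because it exists only in the secondary source. Swapping the two arguments to `blend` emulates making the other source primary.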
Q 6 SET1. Name the type of chart that is displayed in the following screenshot. How can it be created? Explain. (5
marks)
The type of chart displayed in the screenshot is a Dual-Axis Chart. This chart type allows you to plot two measures
with different scales and magnitudes on two different axes in the same visualization. In this case, it’s a combination
of a bar graph (for “Shipping Cost”) and a line graph (for “Profit”).
Here are the steps to create a Dual-Axis Chart in Tableau:
Drag the Dimensions and Measures: First, drag the dimension (“Order Date”) into the Columns region or just double
click over it. Then, bring the first measure (“Shipping Cost”) into the Rows region.
Add the Second Measure: Drag the second measure (“Profit”) to the right of the first measure in the Rows region.
You should now see two graphs in your view.
Create the Dual Axis: Right-click the second measure (“Profit”) on the Rows shelf and select “Dual Axis”. This will
overlay the two graphs on top of each other.
Set the Mark Types: On the Marks card, set the mark type for “Shipping Cost” to Bar and for “Profit” to Line to get
the bar-and-line combination.
Synchronize the Axes: If the two measures should share the same scale, right-click the secondary axis and select
“Synchronize Axis”.
Remember, dual-axis charts are useful for analyzing two measures with different scales. They allow you to compare
these two measures and visualize their relationships.

The chart displayed in the screenshot is a line chart.


Here's how you can create a line chart in Tableau:
Connect to your data: In Tableau, you'll first need to connect to your data source, which could be a spreadsheet,
database, or other file type.
Drag data to the X and Y axes: Once connected, drag the dimension you want on the X-axis (typically time-based or
categorical data) to the Columns shelf and the measure you want on the Y-axis (typically numerical data) to the Rows
shelf. In this example, "Month" (X-axis) and "SUM(Profit)" (Y-axis) are used.
Adjust chart type (optional): Depending on the fields, Tableau might create a bar chart by default. You can change this
to a line chart by selecting Line from the mark type drop-down on the Marks card.
Add additional elements (optional): You can customize your line chart further by adding things like titles, labels,
legends, or filters.
Here are some additional points to consider when creating line charts:
Line charts are effective for showing trends over time or comparing multiple measures across a common category.
You can add multiple lines to a chart to compare trends between different groups.
Line charts can also be used with dual axes, which allows you to plot two measures on different scales.
By following these steps, you can create informative line charts in Tableau to effectively visualize trends and patterns
in your data.
Q6 SET2. Explain what do you understand by the Dual Axis chart in Tableau. How can it be created? (5 marks)
To create a Dual Axis chart in Tableau, follow these steps:
Drag a dimension to the Columns shelf.
Drag a measure to the Rows shelf. Tableau creates a vertical axis.
Drag another measure to the Rows shelf, to the right of the first measure. Tableau creates a second vertical axis in a
separate pane.
Right-click the second measure on the Rows shelf and select "Dual Axis".
Right-click the secondary axis and select "Synchronize Axis" if you want the two axes to have the same scale.
Synchronize the axes when both measures use the same unit, so that the marks are directly comparable. When the scales
differ greatly, leave the axes unsynchronized, otherwise the smaller measure is flattened.
A dual-axis chart in Tableau is a type of chart that represents two sets of data overlaid on top of each other, with a
shared axis. This allows you to compare two measures with different scales. For example, you might have an X-axis
for date (months) values, and two separate Y-axes representing separate measures, such as sales and profit, one on each
side of your chart. This allows you to compare profit and sales figures over a period.
To create a dual-axis chart in Tableau, follow these steps:
Create a graph for one of your measures.
Drag your second measure onto your Rows shelf.
Tableau will generate a second graph in a separate pane. Right-click the second measure on the Rows shelf and select
"Dual Axis" to overlay the two graphs.
Synchronize the axes by right-clicking on the secondary axis and selecting "Synchronize Axis." This aligns the scale
of the secondary axis to the scale of the primary axis.
Dual-axis charts are particularly useful for analyzing two measures that have different scales. Measures with the same
scale can instead share a single axis, known as a blended axis. Blending measures in Tableau involves
dragging one measure or axis and dropping it onto an existing axis. Instead of adding rows and columns, Tableau will
show your second measure on that same continuous axis.
Here's an example of a dual-axis chart:
On the left hand side, you have your current year sales as a line chart.
On the right hand side, you have your prior year sales as a bar chart.
By synchronizing the axes, you can ensure that the scales of the two measures are aligned, allowing for accurate
comparison. You can also hide one of the axes and display the values on the graph area, or change the order of your
fields in the row shelf to alternate which field represents the primary and secondary axis.
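
To see why synchronizing matters for the current-year versus prior-year example, here is a small Python sketch of the idea. The sales figures and both helper functions are hypothetical, and Tableau's real axis logic (padding, tick rounding) is not reproduced. The sketch only compares independent axis ranges with one shared range.

```python
def axis_range(*measures):
    """Return a (low, high) range covering every value passed in, plus zero."""
    values = [v for measure in measures for v in measure] + [0]
    return min(values), max(values)


def axis_position(value, rng):
    """Fraction of the axis height at which a value is drawn (0 = bottom, 1 = top)."""
    low, high = rng
    return (value - low) / (high - low)


current_year = [120, 150, 90]   # hypothetical monthly sales
prior_year = [100, 110, 130]    # hypothetical monthly sales

own = axis_range(prior_year)                   # prior year on its own axis
shared = axis_range(current_year, prior_year)  # both measures on one synchronized range

print(own, shared)
print(axis_position(130, own), axis_position(130, shared))
```

On its own axis the prior-year peak of 130 reaches the top of the chart, the same height as the current-year peak of 150. On the shared range it sits lower, which is the accurate comparison when both measures use the same unit.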
A Dual Axis Chart in Tableau is a type of chart that allows you to plot two measures with different scales and
magnitudes on two different axes in the same visualization. This is particularly useful when you want to compare
these two measures and visualize their relationships.
Here are the steps to create a Dual Axis Chart in Tableau:
Load the Dataset: First, load the requisite dataset into Tableau. You can do this by clicking on the Data menu, and
under it, click on the “New Data Source” option.
Select the Data Source Type: Select the requisite data source type.
Examine the Data: The data gets loaded and can be seen in the “Data Source” tab. Examine the data prior to
proceeding to the analysis.
Drag the Dimensions and Measures: Moving to the sheet tab, one can find the dimensions and the measures present
in the requisite sections. First, drag the dimension into the Columns region or just double click over it. Then, bring
the measures into the analysis.
Create the Dual Axis: To add a measure as a dual axis, drag the field to the right side of the view and drop it when
you see a black dashed line appear. You can also right-click (control-click on Mac) the measure on the Columns or
Rows shelf and select Dual Axis.
Remember, dual-axis charts are useful for analyzing two measures with different scales. To ensure both measures
are visible and comparable, you might need to synchronize the axes.
Q: How do you use Data Interpreter to clean data from PDF files in Tableau?
To use the Data Interpreter feature to clean data from PDF files in Tableau, follow these steps:

1. **Connect to the PDF File**: Open Tableau and under Connect, click on "PDF File." Select the PDF file you want
to connect to and click "Open."
2. **Specify Pages to Scan**: In the "Scan PDF File" dialog box, specify the pages in the file that you want Tableau
to scan for tables. You can choose to scan all pages, a single page, or a range of pages. Note that the scan counts the
first page of the file as page 1.

3. **Review Data Source**: After scanning, review the data source. If your file contains one table, click the sheet tab
to start your analysis. If there are multiple tables, drag a table onto the canvas from the left pane and then click the
sheet tab to begin analysis.

4. **Use Data Interpreter**: In the data source page, check if the Data Interpreter option is available. If Tableau
detects unique formatting or extraneous info that needs to be removed for data optimization, it will prompt you to use
Data Interpreter.

5. **Clean Data**: Activate the Data Interpreter by checking the box. The Data Interpreter will detect sub-tables and
remove unique formatting that could hinder analysis. It will clean the data by removing unnecessary components like
headers, footers, and empty rows/columns.

6. **Review Results**: After using the Data Interpreter, review the results to ensure that the data has been cleaned
correctly. Click the "Review the results" link in the Data pane to see how Data Interpreter read the file and which
changes it made.

By following these steps, you can effectively use the Data Interpreter feature in Tableau to clean data from PDF files,
making it ready for analysis and visualization.
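
The kind of cleanup described in step 5 can be sketched in Python to make it concrete. This is not Tableau's Data Interpreter algorithm. `clean_table` and the sample rows are hypothetical, and the sketch assumes the header is the first row in which every cell is filled.

```python
def clean_table(rows):
    """Drop blank rows/columns, title lines above the header, and trailing notes.

    Assumption: the header is the first row whose cells are all non-empty.
    """
    def is_blank(cell):
        return cell is None or str(cell).strip() == ""

    # 1. Remove rows that are entirely blank.
    rows = [row for row in rows if not all(is_blank(c) for c in row)]
    if not rows:
        return []
    # 2. Pad ragged rows, then remove columns that are entirely blank.
    width = max(len(row) for row in rows)
    rows = [list(row) + [None] * (width - len(row)) for row in rows]
    keep = [i for i in range(width) if not all(is_blank(row[i]) for row in rows)]
    rows = [[row[i] for i in keep] for row in rows]
    # 3. Skip title lines above the header (the first fully populated row).
    for start, row in enumerate(rows):
        if not any(is_blank(c) for c in row):
            break
    else:
        return []  # no fully populated row, so no table was found
    body = rows[start:]
    # 4. Drop trailing footer or note rows that are not fully populated.
    while body and any(is_blank(c) for c in body[-1]):
        body.pop()
    return body


raw = [
    ["Quarterly Sales Report", "", "", ""],
    ["", "", "", ""],
    ["Region", "", "Sales", "Profit"],
    ["East", "", "100", "20"],
    ["West", "", "80", "15"],
    ["", "", "", ""],
    ["Source: internal", "", "", ""],
]
for row in clean_table(raw):
    print(row)
```

The title line, the empty rows, the empty second column, and the source note are all removed, leaving the header and the two data rows.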
Common Data Visualization Techniques


1. Bar Charts and Column Charts: Bar charts and column charts represent data using horizontal
or vertical bars of varying lengths. They are effective for comparing data across different
categories or groups.
2. Line Charts: Line charts are used to depict trends and changes over time. They connect data
points with lines, allowing users to observe patterns and fluctuations in the data.
3. Pie Charts: Pie charts represent data as slices of a circle, with each slice representing a category
or proportion of the whole. They are useful for displaying relative proportions or percentages.
4. Scatter Plots: Scatter plots display the relationship between two variables as points on a graph.
They help in identifying correlations or clusters in data.
5. Heat Maps: Heat maps use color gradients to represent data values in a matrix or table format.
They are effective for visualizing patterns and distributions in large datasets.
6. Maps and Geospatial Visualizations: Maps and geospatial visualizations display data in a
geographic context. They are useful for analyzing spatial patterns and relationships.

Importance of Data Visualization


Data visualization plays a crucial role in understanding and interpreting data effectively. Here are some
key reasons why data visualization is important:
1. Improved Data Understanding: Visual representations of data enable users to grasp complex
information quickly and easily. Visualizations help in identifying patterns, outliers, and
correlations, making it easier to understand the underlying data.
2. Enhanced Decision-Making: Visualizing data allows decision-makers to gain insights and make
informed decisions. By visualizing data, patterns and trends become more apparent, enabling
decision-makers to identify opportunities and address challenges.
3. Effective Communication: Visuals have a powerful impact on communication. Data
visualizations make it easier to convey information to a diverse audience, regardless of their
technical background. Visuals are more engaging and memorable than rows of numbers or text.
4. Identification of Relationships and Patterns: Visualizing data helps in identifying relationships
and patterns that might be hidden in raw data. Visuals allow users to explore data from different
angles and perspectives, enabling the discovery of meaningful insights.
5. Quick Detection of Anomalies: Visualizations can quickly reveal outliers and anomalies in data.
Data points that deviate from the norm are easier to spot in a chart than in a table, which can
be critical for identifying data errors or unusual events that require further investigation.

Benefits of Data Visualization


1. Clarity and Simplicity: Visualizations simplify complex data by presenting it in a visually
appealing and understandable format. They condense large amounts of data into a concise and
meaningful representation.
2. Engagement and Understanding: Visual representations are more engaging and memorable
than textual or numerical data. Visualizations captivate the audience’s attention and facilitate
better comprehension of the data.
3. Efficient Data Exploration: Visualizations allow users to explore data interactively. Users can
drill down, filter, and manipulate visualizations to gain deeper insights into the underlying data,
enabling them to ask and answer specific questions.
4. Storytelling and Persuasion: Data visualizations have a narrative quality that can convey stories
and influence decision-making. They help in presenting data-driven arguments and persuading
stakeholders by providing compelling visual evidence.
5. Broad Applicability: Data visualization is applicable across various industries and domains. It
can be used in fields such as business, finance, healthcare, marketing, and more, to analyze data
and drive better outcomes.
