SQL SERVER - A Tricky Question and Even Trickier Answer - Index Intersection - Partition Function


Here is the question: Write a SELECT statement that uses a single table, references that table
only once, and contains no JOIN keywords, yet generates an execution plan with two join operators.
Use AdventureWorks as the sample database.
Here is the answer from Alphonso Jones:
SELECT ROW_NUMBER() OVER (ORDER BY OBJECT_ID) num,
       RANK() OVER (ORDER BY OBJECT_ID DESC) num2
INTO #tmp
FROM sys.columns
-- Enable Execution Plan with CTRL+M
SELECT num, SUM(num2) OVER (PARTITION BY num)
FROM #tmp
When I saw this answer I was very happy, because I had not thought of it as a solution when I
asked the question. In the execution plan of the T-SQL code above, it is easy to see that there are
multiple joins because of the Partition Function (the SUM ... OVER (PARTITION BY ...) clause) used
in the query. What excellent participation by Alphonso Jones.
Here is the answer I had in mind when I asked the question. I ran the following query against the
AdventureWorks database, and it generated an execution plan with multiple joins:
USE AdventureWorks2012
GO
SELECT *
FROM [Purchasing].[PurchaseOrderHeader]
WHERE [EmployeeID] = 258 AND [VendorID] = 1580
GO
Look at the execution plan of the above query. You can see the joins even though I am using a
single table and there is no join syntax in the query.
Personally, I liked the solution of Alphonso Jones, as it will always generate multiple joins
due to the Partition Function. My solution, on the other hand, is a bit trickier: it only produces
joins when the table [Purchasing].[PurchaseOrderHeader] has suitable indexes on the filtered
columns, so that the optimizer chooses index intersection. Index Intersection is a technique which
uses more than one index on a table to satisfy a given query.
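The idea can be tried on a small table of your own. The following is only a sketch: the table dbo.OrderDemo and its index names are hypothetical and are not part of AdventureWorks. Listing two indexes in one INDEX table hint asks SQL Server to combine them, so the plan should show two index seeks joined together even though the query has no join syntax. Without the hint, whether the optimizer picks index intersection depends on the data and statistics.

```sql
-- Hypothetical demo table (names are illustrative assumptions).
CREATE TABLE dbo.OrderDemo
(
    OrderID    INT IDENTITY PRIMARY KEY,
    EmployeeID INT NOT NULL,
    VendorID   INT NOT NULL
);

-- One nonclustered index per filtered column.
CREATE NONCLUSTERED INDEX IX_OrderDemo_EmployeeID ON dbo.OrderDemo (EmployeeID);
CREATE NONCLUSTERED INDEX IX_OrderDemo_VendorID   ON dbo.OrderDemo (VendorID);

-- Naming both indexes in the hint requests that both be used for this query.
SELECT OrderID, EmployeeID, VendorID
FROM dbo.OrderDemo WITH (INDEX (IX_OrderDemo_EmployeeID, IX_OrderDemo_VendorID))
WHERE EmployeeID = 258 AND VendorID = 1580;
```

Enable the actual execution plan (CTRL+M) before running the SELECT to inspect the join operator between the two index seeks.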
BACKGROUND
This article demonstrates some SQL queries that are commonly asked in job interviews. I will
cover some common but tricky queries:
(i) Finding the nth highest salary of an employee.
(ii) Finding TOP X records from each group.
(iii) Deleting duplicate rows from a table.
NOTE : All the SQL mentioned in this article has been tested under SQL Server 2005.
(i) Finding the nth highest salary of an employee.

Create a table named Employee_Test and insert some test data:
CREATE TABLE Employee_Test
(
Emp_ID INT Identity,
Emp_name Varchar(100),
Emp_Sal Decimal (10,2)
)
INSERT INTO Employee_Test VALUES ('Anees',1000);
INSERT INTO Employee_Test VALUES ('Rick',1200);
INSERT INTO Employee_Test VALUES ('John',1100);
INSERT INTO Employee_Test VALUES ('Stephen',1300);
INSERT INTO Employee_Test VALUES ('Maria',1400);
It is very easy to find the highest salary:
--Highest Salary
select max(Emp_Sal) from Employee_Test
Now, if you are asked to find the 3rd highest salary, the query is:
--3rd Highest Salary
select min(Emp_Sal) from Employee_Test where Emp_Sal in
(select distinct top 3 Emp_Sal from Employee_Test order by Emp_Sal desc)
The result is: 1200
To find the nth highest salary, replace the top 3 with top (@n), where @n is an integer
variable (1, 2, 3 etc.):
--nth Highest Salary
declare @n int
set @n = 3 -- which highest salary to return
select min(Emp_Sal) from Employee_Test where Emp_Sal in
(select distinct top (@n) Emp_Sal from Employee_Test order by Emp_Sal desc)
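As an alternative to the TOP approach, the same question can be answered with a ranking function. This is a minimal sketch, not the article's original method; the variable @n and the aliases sal_rank and ranked are names introduced here for illustration. DENSE_RANK() numbers the distinct salaries from highest to lowest, so filtering on the rank picks the nth one.

```sql
-- Sketch: nth highest distinct salary using DENSE_RANK().
DECLARE @n INT;
SET @n = 3;  -- which highest salary to return (assumed input)

SELECT DISTINCT Emp_Sal
FROM (
    SELECT Emp_Sal,
           DENSE_RANK() OVER (ORDER BY Emp_Sal DESC) AS sal_rank
    FROM Employee_Test
) AS ranked
WHERE sal_rank = @n;
```

One difference in behaviour: if the table holds fewer than @n distinct salaries, this query returns no rows, whereas the min/top query returns the lowest salary present.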
(ii) Finding TOP X records from each group

Create a table named photo_test and insert some test data:
create table photo_test
(
pgm_main_category_id int,
pgm_sub_category_id int,
file_path varchar(MAX)
)
insert into photo_test values(17,15,'photo/bb1.jpg');
insert into photo_test values(17,16,'photo/cricket1.jpg');
insert into photo_test values(17,17,'photo/base1.jpg');
insert into photo_test values(18,18,'photo/forest1.jpg');
insert into photo_test values(18,19,'photo/tree1.jpg');
insert into photo_test values(18,20,'photo/flower1.jpg');
insert into photo_test values(19,21,'photo/laptop1.jpg');
insert into photo_test values(19,22,'photo/camer1.jpg');
insert into photo_test values(19,23,'photo/cybermbl1.jpg');
insert into photo_test values(17,24,'photo/F1.jpg');
The table contains three groups of pgm_main_category_id: 17 (four records), 18 (three records)
and 19 (three records).
Now, if you want to select the top 2 records from each group, the query is as follows:
select pgm_main_category_id,pgm_sub_category_id,file_path from
(
select pgm_main_category_id,pgm_sub_category_id,file_path,
rank() over (partition by pgm_main_category_id order by pgm_sub_category_id asc) as rankid
from photo_test
) photo_test
where rankid < 3 -- returns the top 2 per group; use rankid < X+1 for the top X
order by pgm_main_category_id,pgm_sub_category_id
The result is:
pgm_main_category_id   pgm_sub_category_id   file_path
17                     15                    photo/bb1.jpg
17                     16                    photo/cricket1.jpg
18                     18                    photo/forest1.jpg
18                     19                    photo/tree1.jpg
19                     21                    photo/laptop1.jpg
19                     22                    photo/camer1.jpg
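Note that rank() gives tied rows the same rank, so a group could return more than X rows if two records shared the same pgm_sub_category_id. The following sketch uses ROW_NUMBER() instead, which never repeats a number within a partition. The variable @x and the aliases row_num and numbered are names introduced here for illustration; this is one possible variant, not the only way to write it.

```sql
-- Sketch: top X rows per group with ROW_NUMBER().
DECLARE @x INT;
SET @x = 2;  -- how many rows to keep per group (assumed input)

SELECT pgm_main_category_id, pgm_sub_category_id, file_path
FROM (
    SELECT pgm_main_category_id, pgm_sub_category_id, file_path,
           ROW_NUMBER() OVER (PARTITION BY pgm_main_category_id
                              ORDER BY pgm_sub_category_id ASC) AS row_num
    FROM photo_test
) AS numbered
WHERE row_num <= @x
ORDER BY pgm_main_category_id, pgm_sub_category_id;
```

With the test data above there are no ties inside a group, so both queries return the same six rows.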
(iii) Deleting duplicate rows from a table

A table with a primary key does not contain duplicates. But if, for some reason, the
keys have to be disabled, or if duplicates appear in the table data when importing from
other sources, it is often necessary to get rid of such duplicates.
This can be achieved in two ways:
(a) Using a temporary table.
(b) Without using a temporary table.
(a) Using a temporary or staging table
Let the table employee_test1 contain some duplicate data:
CREATE TABLE Employee_Test1
(
Emp_ID INT,
Emp_name Varchar(100),
Emp_Sal Decimal (10,2)
)
INSERT INTO Employee_Test1 VALUES (1,'Anees',1000);
INSERT INTO Employee_Test1 VALUES (2,'Rick',1200);
INSERT INTO Employee_Test1 VALUES (3,'John',1100);
INSERT INTO Employee_Test1 VALUES (4,'Stephen',1300);
INSERT INTO Employee_Test1 VALUES (5,'Maria',1400);
INSERT INTO Employee_Test1 VALUES (6,'Tim',1150);
INSERT INTO Employee_Test1 VALUES (6,'Tim',1150);
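Before removing anything, it can help to confirm which rows are actually duplicated. A minimal sketch, assuming a duplicate means that all three columns match (the alias copies is a name introduced here):

```sql
-- List every row value that occurs more than once, with its number of copies.
SELECT Emp_ID, Emp_name, Emp_Sal, COUNT(*) AS copies
FROM Employee_Test1
GROUP BY Emp_ID, Emp_name, Emp_Sal
HAVING COUNT(*) > 1;
```

With the test data above, only the row for Tim (Emp_ID 6) is reported, with two copies.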
Step 1: Create an empty temporary (staging) table with the same structure as the main table:
select top 0 * into employee_test1_temp from employee_test1
Step 2: Insert the result of the GROUP BY query into the temporary table:
insert into employee_test1_temp
select Emp_ID,Emp_name,Emp_Sal
from employee_test1
group by Emp_ID,Emp_name,Emp_Sal
Step 3: Truncate the original table:
truncate table employee_test1
Step 4: Fill the original table with the rows of the temporary table:
insert into employee_test1
select * from employee_test1_temp
Now the duplicate rows have been removed from the main table. Running
select * from employee_test1
gives the result:
Emp_ID   Emp_name   Emp_Sal
1        Anees      1000
2        Rick       1200
3        John       1100
4        Stephen    1300
5        Maria      1400
6        Tim        1150
(b) Without using a temporary table

-- Number the rows within each Emp_ID, then delete every row after the first
;with T as
(
select * , row_number() over (partition by Emp_ID order by Emp_ID) as row_num
from employee_test1
)
delete
from T
where row_num > 1
The result is:
Emp_ID   Emp_name   Emp_Sal
1        Anees      1000
2        Rick       1200
3        John       1100
4        Stephen    1300
5        Maria      1400
6        Tim        1150
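The query in (b) treats any two rows with the same Emp_ID as duplicates. If a duplicate should instead mean that the whole row repeats, as in the GROUP BY of approach (a), the partition can list every column. This is a sketch of that variant; the alias row_num is a name introduced here.

```sql
-- Sketch: delete rows that repeat in all three columns, keeping one copy of each.
;WITH T AS
(
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY Emp_ID, Emp_name, Emp_Sal
                              ORDER BY Emp_ID) AS row_num
    FROM employee_test1
)
DELETE FROM T
WHERE row_num > 1;
```

With this version, two rows that share an Emp_ID but differ in name or salary are both kept.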
