Yuhik
For each order_id, the respective minimum and maximum product prices have been added to
every record.
Each function can also be used on its own.
d. SUM():
Example:
Query:
SELECT order_id, name, product_id, price,
SUM(price) OVER (PARTITION BY order_id) AS Total_Order_Price
FROM retails;
Explanation:
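A minimal sketch of the SUM() window query, run through Python's built-in sqlite3 module so it can be tried without a database server. The sample rows are made up for illustration (the original data is not shown here), the name column is left out for brevity, and the alias total_order_price is an assumption. Window functions need SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical sample rows: (order_id, product_id, price).
ROWS = [
    (1112, "P01", 10), (1112, "P02", 20), (1112, "P03", 30),
    (1114, "P04", 5), (1114, "P05", 5), (1114, "P06", 15),
    (1114, "P07", 25), (1114, "P08", 50),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE retails (order_id INTEGER, product_id TEXT, price INTEGER)")
conn.executemany("INSERT INTO retails VALUES (?, ?, ?)", ROWS)

# SUM(...) OVER (PARTITION BY ...) keeps every row and repeats the
# per-order total next to it, unlike GROUP BY, which collapses rows.
query = """
SELECT order_id, product_id, price,
       SUM(price) OVER (PARTITION BY order_id) AS total_order_price
FROM retails
ORDER BY order_id, price, product_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Every row of an order carries the same total, so the value can be compared with the row's own price directly in the same SELECT.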
a. RANK():
Returns the rank of a value within a group of values, based on the ORDER BY expression in the
OVER clause (see Query 1).
If PARTITION BY is specified, each value is ranked within its own partition (see Query 2).
Rows with equal values for the ranking criteria receive the same rank.
After a tie, the following ranks are skipped, e.g. RANK(): 1, 1, 3, 4, 5.
Example:
Query 2: Rank the products by price within each order (i.e. partition by order_id).
Explanation:
As both queries show, the ORDER BY clause states the expression used to rank the values.
Query 1:
Query 2:
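The two RANK() variants could look roughly like the sketch below, which runs them through Python's built-in sqlite3 module. The rows are hypothetical, ascending price order is an assumption, and the alias price_rank is invented for illustration. The first query ranks all rows together; the second ranks within each order_id.

```python
import sqlite3

# Hypothetical sample rows: (order_id, product_id, price).
ROWS = [
    (1112, "P01", 10), (1112, "P02", 20), (1112, "P03", 30),
    (1114, "P04", 5), (1114, "P05", 5), (1114, "P06", 15),
    (1114, "P07", 25), (1114, "P08", 50),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE retails (order_id INTEGER, product_id TEXT, price INTEGER)")
conn.executemany("INSERT INTO retails VALUES (?, ?, ?)", ROWS)

# Rank every product by price across the whole table (no partition).
overall = conn.execute("""
SELECT product_id, price,
       RANK() OVER (ORDER BY price) AS price_rank
FROM retails
ORDER BY price, product_id
""").fetchall()

# Rank products by price separately inside each order.
per_order = conn.execute("""
SELECT order_id, product_id, price,
       RANK() OVER (PARTITION BY order_id ORDER BY price) AS price_rank
FROM retails
ORDER BY order_id, price, product_id
""").fetchall()

print([r[2] for r in overall])
print([(r[0], r[3]) for r in per_order])
```

With these sample rows the two products priced 5 tie at rank 1 and rank 2 is skipped, both in the overall ranking and inside order 1114.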
b. DENSE_RANK():
Like RANK(), DENSE_RANK() returns the rank of a value within a group of values, based on the
ORDER BY expression in the OVER clause, and each value is ranked within its PARTITION BY
partition.
The difference is that, although rows with equal values still receive the same rank, a tie
does not cause the following rank to be skipped.
Example: DENSE_RANK(): 1, 1, 2, 3, 4
Example:
Query: Apply DENSE_RANK to the products by price within each order (i.e. partition by order_id).
Explanation:
As the output shows, each row is ranked by the ORDER BY expression (price) within each
order_id (PARTITION BY order_id).
order_id 1114 has 5 products, 2 of which have the same price, so they tie at rank 1.
The next rank is 2, not 3: DENSE_RANK() does not skip rank numbers after a tie. This is the
main difference between RANK() and DENSE_RANK().
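To see the difference side by side, the sketch below computes RANK() and DENSE_RANK() in one query using Python's built-in sqlite3 module. The rows are hypothetical, arranged so that order 1114 has five products with two tied on the lowest price; the aliases are assumptions.

```python
import sqlite3

# Hypothetical sample rows: (order_id, product_id, price).
ROWS = [
    (1112, "P01", 10), (1112, "P02", 20), (1112, "P03", 30),
    (1114, "P04", 5), (1114, "P05", 5), (1114, "P06", 15),
    (1114, "P07", 25), (1114, "P08", 50),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE retails (order_id INTEGER, product_id TEXT, price INTEGER)")
conn.executemany("INSERT INTO retails VALUES (?, ?, ?)", ROWS)

# Both functions use the same window; only the handling of the rank
# that follows a tie differs.
query = """
SELECT order_id, product_id, price,
       RANK()       OVER (PARTITION BY order_id ORDER BY price) AS price_rank,
       DENSE_RANK() OVER (PARTITION BY order_id ORDER BY price) AS price_dense_rank
FROM retails
ORDER BY order_id, price, product_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

For order 1114 the RANK() column reads 1, 1, 3, 4, 5 while the DENSE_RANK() column reads 1, 1, 2, 3, 4.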
c. CUME_DIST():
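Calculates the relative rank (cumulative distribution) of the current row within a window
partition, using the formula:
CUME_DIST = (number of rows that precede or tie with the current row in the ORDER BY order) / (total number of rows in the partition)
Example:
Query: Apply CUME_DIST, i.e. compute the relative rank of each product by price within each
order (i.e. partition by order_id).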
Explanation:
Consider order_id 1112, which has 3 products; the relative rank of each product is
calculated using the CUME_DIST formula:
Similarly, products with the same price receive the same relative rank; see order_id 1114
in the output screenshot.
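As a rough illustration, CUME_DIST() divides the number of rows at or before the current row (ties included) by the number of rows in the partition. The sketch below uses Python's built-in sqlite3 module with hypothetical rows; it assumes the three products of order 1112 have distinct prices, which gives 1/3, 2/3 and 3/3, and that two products of order 1114 share the lowest price.

```python
import sqlite3

# Hypothetical sample rows: (order_id, product_id, price).
ROWS = [
    (1112, "P01", 10), (1112, "P02", 20), (1112, "P03", 30),
    (1114, "P04", 5), (1114, "P05", 5), (1114, "P06", 15),
    (1114, "P07", 25), (1114, "P08", 50),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE retails (order_id INTEGER, product_id TEXT, price INTEGER)")
conn.executemany("INSERT INTO retails VALUES (?, ?, ?)", ROWS)

# Relative rank of each product's price inside its own order.
query = """
SELECT order_id, product_id, price,
       CUME_DIST() OVER (PARTITION BY order_id ORDER BY price) AS relative_rank
FROM retails
ORDER BY order_id, price, product_id
"""
result = conn.execute(query).fetchall()
for order_id, product_id, price, relative_rank in result:
    print(order_id, product_id, price, round(relative_rank, 3))
```

The two tied products in order 1114 both get 2/5 = 0.4, because each of them counts the other as a row at or before itself.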
d. ROW_NUMBER():
Returns the ordinal number of the current row within its partition, based on the ORDER BY
expression in the OVER clause.
Rows are numbered within each PARTITION BY partition.
Rows with equal values for the ORDER BY expressions receive different row numbers, assigned
nondeterministically.
Example:
Query: Assign a ROW_NUMBER to the products by price within each order (i.e. partition by
order_id).
Explanation:
As the output screenshot shows, row numbers are assigned by price (the ORDER BY
expression) within each order (PARTITION BY order_id).
ROW_NUMBER() does not check whether values are equal; it simply gives every row in the
partition its own number.
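A small sketch of ROW_NUMBER(), again through Python's built-in sqlite3 module with hypothetical rows. Order 1114 contains two products with the same price; they still receive different row numbers, and which one comes first is not guaranteed.

```python
import sqlite3

# Hypothetical sample rows: (order_id, product_id, price).
ROWS = [
    (1112, "P01", 10), (1112, "P02", 20), (1112, "P03", 30),
    (1114, "P04", 5), (1114, "P05", 5), (1114, "P06", 15),
    (1114, "P07", 25), (1114, "P08", 50),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE retails (order_id INTEGER, product_id TEXT, price INTEGER)")
conn.executemany("INSERT INTO retails VALUES (?, ?, ?)", ROWS)

# Number the products of each order by ascending price.
query = """
SELECT order_id, product_id, price,
       ROW_NUMBER() OVER (PARTITION BY order_id ORDER BY price) AS row_num
FROM retails
ORDER BY order_id, row_num
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

If a repeatable numbering is needed, one option is to add a tie-breaker to the window ordering, for example ORDER BY price, product_id.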
e. NTILE():
Divides the rows of each window partition, as equally as possible, into a specified
number of ranked groups.
Requires an ORDER BY clause in the OVER clause.
All rows are first sorted by the column or expression in the ORDER BY clause (ascending by
default), and group numbers are then assigned as evenly as possible.
Example:
Query: Assign a group (cluster/bucket) number to every row, splitting the rows into 10 groups
based on the product price.
Explanation:
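A hedged sketch of NTILE() using Python's built-in sqlite3 module. The query in the text asks for 10 groups; the sketch uses 3 groups instead so that the effect is visible on only eight hypothetical rows. product_id is added to the ordering purely as a tie-breaker, which is an assumption and not part of the original query.

```python
import sqlite3

# Hypothetical sample rows: (order_id, product_id, price).
ROWS = [
    (1112, "P01", 10), (1112, "P02", 20), (1112, "P03", 30),
    (1114, "P04", 5), (1114, "P05", 5), (1114, "P06", 15),
    (1114, "P07", 25), (1114, "P08", 50),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE retails (order_id INTEGER, product_id TEXT, price INTEGER)")
conn.executemany("INSERT INTO retails VALUES (?, ?, ?)", ROWS)

# Sort all rows by price and split them into 3 buckets. Eight rows do
# not divide evenly, so the earlier buckets receive the extra rows.
query = """
SELECT product_id, price,
       NTILE(3) OVER (ORDER BY price, product_id) AS price_bucket
FROM retails
ORDER BY price, product_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With eight rows and three buckets the sizes come out as 3, 3 and 2. Changing NTILE(3) to NTILE(10) on a larger table gives the ten price groups described in the query.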
f. PERCENT_RANK():
Example:
Query: Calculate the percentage rank of every row based on the product price.
Explanation:
Consider order_id 1114, which has 5 products. The percentage rank of each row is
(rank - 1) / (number of rows in the partition - 1):
Row no. 9, first product: (1 - 1) / (5 - 1) = 0
Row no. 10, second product: (1 - 1) / (5 - 1) = 0
Row no. 11, third product: (3 - 1) / (5 - 1) = 0.5
Row no. 12, fourth product: (4 - 1) / (5 - 1) = 0.75
Row no. 13, fifth product: (5 - 1) / (5 - 1) = 1
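The arithmetic above can be reproduced with PERCENT_RANK(), sketched here through Python's built-in sqlite3 module. The rows are hypothetical, chosen so that order 1114 has five products with two tied on the lowest price; partitioning by order_id is assumed from the worked example.

```python
import sqlite3

# Hypothetical sample rows: (order_id, product_id, price).
ROWS = [
    (1112, "P01", 10), (1112, "P02", 20), (1112, "P03", 30),
    (1114, "P04", 5), (1114, "P05", 5), (1114, "P06", 15),
    (1114, "P07", 25), (1114, "P08", 50),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE retails (order_id INTEGER, product_id TEXT, price INTEGER)")
conn.executemany("INSERT INTO retails VALUES (?, ?, ?)", ROWS)

# PERCENT_RANK() = (rank - 1) / (rows in partition - 1); RANK() is
# selected alongside it so the inputs to the formula are visible.
query = """
SELECT order_id, product_id, price,
       PERCENT_RANK() OVER (PARTITION BY order_id ORDER BY price) AS pct_rank,
       RANK()         OVER (PARTITION BY order_id ORDER BY price) AS price_rank
FROM retails
ORDER BY order_id, price, product_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

For order 1114 the pct_rank column comes out as 0, 0, 0.5, 0.75, 1, matching the hand calculation.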
a. LAG() or LEAD():
Syntax:
LAG and LEAD return the value from the row before or after the current row in a
partition, respectively.
If no such row exists, NULL is returned.
Example:
Query: Add new features holding the 1-step LAG and LEAD of the product price within each
order (i.e. PARTITION BY order_id).
Explanation:
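A minimal sketch of the 1-step LAG() and LEAD() features, run through Python's built-in sqlite3 module. The rows are hypothetical, the aliases prev_price and next_price are assumptions, and product_id is added to the window ordering only to keep the result repeatable when prices tie.

```python
import sqlite3

# Hypothetical sample rows: (order_id, product_id, price).
ROWS = [
    (1112, "P01", 10), (1112, "P02", 20), (1112, "P03", 30),
    (1114, "P04", 5), (1114, "P05", 5), (1114, "P06", 15),
    (1114, "P07", 25), (1114, "P08", 50),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE retails (order_id INTEGER, product_id TEXT, price INTEGER)")
conn.executemany("INSERT INTO retails VALUES (?, ?, ?)", ROWS)

# LAG(price, 1) reads the previous row of the same order and
# LEAD(price, 1) reads the next one; rows at the edge get NULL.
query = """
SELECT order_id, product_id, price,
       LAG(price, 1)  OVER (PARTITION BY order_id ORDER BY price, product_id) AS prev_price,
       LEAD(price, 1) OVER (PARTITION BY order_id ORDER BY price, product_id) AS next_price
FROM retails
ORDER BY order_id, price, product_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)  # SQL NULL is shown as None in Python
```

The first row of each order has no previous price and the last row has no next price, so those cells come back as NULL. The values never cross from one order into the next because of PARTITION BY order_id.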