Business Intelligence Blogs

View blogs by industry experts on topics such as SSAS, SSIS, SSRS, Power BI, Performance Tuning, Azure, Big Data and much more! You can also sign up to post your own business intelligence blog.

Ranking Functions: ROW_NUMBER vs RANK vs DENSE_RANK vs NTILE

  • 4 February 2010
  • Author: Marco Francisco
  • Number of views: 39054
  • 0 Comments

*** Edited ***

Videos now available with title Ranking Functions: http://www.sqlshare.com/profiles/26681/MarcoFrancisco.aspx

 

In SQL Server there is 4 ranking functions:

ROW_NUMBERReturns the rank of rows within the partition of a result set, without any gaps in the ranking. The rank of a row is one plus the number of distinct ranks that come before the row in question.
RANKReturns the rank of each row within the partition of a result set. The rank of a row is one plus the number of ranks that come before the row in question.
DENSE_RANKReturns the rank of rows within the partition of a result set, without any gaps in the ranking. The rank of a row is one plus the number of distinct ranks that come before the row in question.
NTILEDistributes the rows in an ordered partition into a specified number of groups. The groups are numbered, starting at one. For each row, NTILE returns the number of the group to which the row belongs.

below is the syntax:

ROW_NUMBER ( )     OVER ( [ ] )

RANK ( )    OVER ( [ < partition_by_clause > ] < order_by_clause > )

DENSE_RANK ( )    OVER ( [ < partition_by_clause > ] < order_by_clause > ) 

NTILE (integer_expression)    OVER ( [ ] < order_by_clause > )

 

Ranking functions returns a ranking value for each row either in a partition, or not. We can produce the very same result for all functions, but they have different purpose.

In attachment is the example I will work on. This example is the group stage standing of UEFA Champions League (http://www.uefa.com/competitions/ucl/standings/index.html). 

What I want to know is wich team was the best in group stage and which was the worst. To achieve this information I won't need the partition clause, because I don't want to restart ranking and it's all about to all teams in Champions League:

SELECT [Team]
      ,[Country]
      ,[Games Played]
      ,[Wons]
      ,[Draws]
      ,[Losses]
      ,[Goals]
      ,[Goals soffered]
      ,[Points]
	  ,ROW_NUMBER() OVER(ORDER BY Points desc) AS ROW_NUMBER
	  ,RANK() OVER(ORDER BY Points desc) AS RANK
	  ,DENSE_RANK() OVER(ORDER BY Points desc) AS DENSE_RANK
	  ,NTILE(2) OVER(ORDER BY Points desc) AS NTILE
  FROM [UEFA]

Notice that I've just ordered by Points in descent order, resulting:

 Order by Points

As you can see, row_number returns a count to each team, without any gaps. The total is 32.

Rank maintains the rank but is counting behind the scenes. You can see that among rows 5 to 9, there are 5 teams in the 4th place. But when there is a new rank (FC Porto), it introduces a gap because it jumped into 9th place. As the definition says " The rank of a row is one plus the number of ranks that come before the row in question".

Something that dense_rank does not. FC Porto happears in the 5th place. As in definition "The rank of a row is one plus the number of distinct ranks that come before the row in question".

The Ntile, since I've specified 2 in the parameter, it devides all data into 2 groups. 16 into 1st group and the other 16 to 2nd group. This is useful because with this I know which 16 teams went to the next stage of champions league.

Now, I will introduce more fields in the order clause so we can distinguish those teams repeated often (for example in rank=4). Columns are Wons and Goals:

SELECT [Team]
      ,[Country]
      ,[Games Played]
      ,[Wons]
      ,[Draws]
      ,[Losses]
      ,[Goals]
      ,[Goals soffered]
      ,[Points]
	  ,ROW_NUMBER() OVER(ORDER BY Points desc, Wons desc, Goals Desc) AS ROW_NUMBER
	  ,RANK() OVER(ORDER BY Points desc, Wons desc, Goals Desc) AS RANK
	  ,DENSE_RANK() OVER(ORDER BY Points desc, Wons desc, Goals Desc) AS DENSE_RANK
	  ,NTILE(2) OVER(ORDER BY Points desc, Wons desc, Goals Desc) AS NTILE
  FROM [UEFA]

Result:

 Order by Points, Wons and Goals

Now we can see that the previous teams in rank 4 were split into more ranks. Now only Lyon and Arsenal still in the same rank 5. Real Madrid took the 4th place alone. Sevila jumped into 6 and Manchester into 7.

 

Now I will introduce the partition clause. So, main objective changes. Now I want to know which team was the best in their country:

SELECT [Team]
      ,[Country]
      ,[Games Played]
      ,[Wons]
      ,[Draws]
      ,[Losses]
      ,[Goals]
      ,[Goals soffered]
      ,[Points]
	  ,ROW_NUMBER() OVER(PARTITION BY Country ORDER BY Points desc) AS ROW_NUMBER
	  ,RANK() OVER(PARTITION BY Country ORDER BY Points desc) AS RANK
	  ,DENSE_RANK() OVER(PARTITION BY Country ORDER BY Points desc) AS DENSE_RANK
	  ,NTILE(2) OVER(PARTITION BY Country ORDER BY Points desc) AS NTILE
  FROM [UEFA]

 Partition by country

You can see that Chelsea was the best English team. Manchester and Arsenal are drawn in the 2nd place.

In Italy, Fiorentina was the best at group stage, Milan and Inter also drawn at the 2nd place.

In Spain, Sevilha and Real Madrid took 1st place.

 

Print
Categories: Blogs
Tags:
Rate this article:
5.0

Please login or register to post comments.