Business Intelligence Blogs

View blogs by industry experts on topics such as SSAS, SSIS, SSRS, Power BI, Performance Tuning, Azure, Big Data and much more! You can also sign up to post your own business intelligence blog.

Using FIRST_VALUE and LAST_VALUE

  • 25 November 2012
  • Author: Kathi Kellenberger
  • Number of views: 18868
  • 0 Comments

I have written a couple of blog entries about the new T-SQL functionality introduced with SQL Server 2012. Today, I would like to talk about two more functions: FIRST_VALUE() and LAST_VALUE(). 

These functions are part of the group of Window functions that work with the OVER clause.  The FIRST_VALUE() function returns the first value within the partition and is relatively easy to use.  Here is the syntax from Books Online:

FIRST_VALUE ( [scalar_expression )
    OVER ( [ partition_by_clause ] order_by_clause [ rows_range_clause ] )

Hopefully, the syntax looks pretty straight forward to you, except for possibly the rows_range_clause. This clause was added with SQL Server 2012 and allows you to specify certain rows for the function to operate on. This is called the “frame”.  For FIRST_VALUE(), you will probably not need to specify the frame clause. Here is an example returning the first order date for each customer:


SELECT CustomerID, OrderDate,
      FIRST_VALUE(OrderDate) OVER(PARTITION BY CustomerID ORDER BY OrderDate) AS FirstOrderDate
FROM Sales.SalesOrderHeader;

It doesn’t matter if the query has an ORDER BY clause. The OVER clause determines how the FIRST_VALUE() function is applied to the data. Here are the results:


 
The LAST_VALUE() function is a bit trickier to use. If I just modify the previous query, replacing LAST_VALUE() with FIRST_VALUE(), I get results I didn’t expect.

 
SELECT CustomerID, OrderDate,
     LAST_VALUE(OrderDate) OVER(PARTITION BY CustomerID ORDER BY OrderDate) AS LastOrderDate
FROM Sales.SalesOrderHeader;

 
Instead of returning the last order date for the customer partition, I get the same order date. To get the results I expect, I will have to use the frame clause.  In this case, I need to specify the current row and the rows following the current row. The syntax for this is “RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING”. Here is the new query:

SELECT CustomerID, OrderDate,
     LAST_VALUE(OrderDate) OVER(PARTITION BY CustomerID ORDER BY OrderDate
     RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ) AS LastOrderDate
FROM Sales.SalesOrderHeader;

Here are the results:

The last order for CustomerID 11000 is November 11, 2007. That date is properly returned for each of the customer’s results when I use the frame clause.

So why did LAST_VALUE() not work as we expected without the frame clause? For functions that support the frame clause, the default value is “RANGE UNBOUNDED PRECEDING AND CURRENT ROW”.  Without specifying the frame, the LAST_VALUE() function is only seeing the previous rows and not the later rows.  So, if you don’t specify the frame, the LAST_VALUE() will be the last value up to the current row.

 

Print
Categories: SQL Server
Tags:
Rate this article:
4.5

Please login or register to post comments.