posted 8/20/2010 by MikeDavis - Views: [8604]
There is no native Transform built in to SSIS to get the previous row in the data flow. But with a little work you can do this and make it perform much better than a SQL cursor and you don’t have to use the dreadfully slow OLEDB Command transform.
In this example I have some data that shows the day a patient was admitted to the hospital and you want to figure how many days it has been since the person was in the hospital last. So you want to calculate the days since the last admit date. Here is the table.
You can see you are going to need the previous row to calculate the number of days since the patient’s last admit date. To accomplish this you are going to use the merge join transform and a little trick with row numbers.
In the data flow you will have two OLEDB sources from the same table. The query in the left source will be.
SELECT PatientID, Name, AdmitDate, DaysSinceLastAdmit, ROW_NUMBER() Over(Order by PatientID, AdmitDate) as RowNumFROM dbo.PatientOrder by PatientID, AdmitDate
The right source has the same query excecpt the row count has a one added to it:
SELECT PatientID, Name, AdmitDate, DaysSinceLastAdmit, (ROW_NUMBER() Over(Order by PatientID, AdmitDate)) + 1 as RowNumFROM dbo.PreviousRowOrder by PatientID, AdmitDate
Notice you are ordering by the patientID and then the admit date but the row number will be the top ordered by column. Here is the output of the left query.
Here are the results from the Right query. Notice the row numbers are off by one from the left query.
You will need to set up the OLEDB source as sorted in the advanced editor. The trick to this is to set the row number as sort key 1 and the Patient ID as sort key 2. If you need instructions on how to do this, check out my other blog here on using Merges. Don’t forget to set IsSorted to true.The Next transform will be a Merge join. The image below shows how the Merge Join is configured. It is using an Inner Join to eliminate null rows. It is getting the admit date from the right side which would be the previous row to the one on the left.
Here is the output from the Merge join.
Now you have the admit date and the previous admit date on the same row. It is just a simple derived column using the DateDiff function to get the days since last visit. Here is the code for that.DATEDIFF("d",[AdmitDate Prev],AdmitDate)The output after the Derived Column will look like the following image. Now you have the days.
The final Data Flow will look like the following image.
Now you can write this to a staging table and then update the Patient table using an Execute SQL Task.
Ha... Bloody brilliant work around for getting differences between current and previous record using SSIS.
Thank you for the helpful, clear explanation to this common requirement.
Here is an alternative solution with a Script Component to get the previous row values:
http://microsoft-ssis.blogspot.com/2011/04/compare-values-of-two-rows.html