Beta Release - Please Report Issues
Who is online? 47 guests and 0 members
Member login | Become a member
home » blogs » DevinKnight » Better Know A SSIS Transform - The Sort Transform
posted 11/7/2009 6:50:52 PM by DevinKnight
This is part 2 of my 29 part series called Better Know A SSIS Transform. Hopefully you will find the series informative. I will tell you a little about each transform and follow it up with a demo basic you can do on your own.
When developing SSIS packages you are bound find out that some transforms require your data to be sorted before it can accept new input rows (Ex. Merge & Merge Join). The Sort Transform may be one of the more frequently used transforms for that reason, but it really should be one of the most avoided.
Why Avoid the Sort Transform?
Hopefully after you read this section you will agree with me that the new title of this blog should be Better Know a SSIS Transform Better Not Know a SSIS Transform.
The Sort Transform is a fully blocking asynchronous transform, meaning no rows can pass this transform until all rows have been sorted. If you take a minute to think about it that makes sense. You cannot send rows further down the Data Flow until all the rows have been sorted, so the Sort Transform holds up the show.
The best way to visualize this is to think of a deck of cards. You have a Conditional Split that wants to split the face cards from the numbered cards. Before that can happen you have a Sort Transform that will sort all the cards by numbers and face value. None of the cards can reach the Conditional Split until all the cards have been sorted causing our package to stall at the Sort Transform until all the rows have been processed.
So we know now we want to avoid it when possible. The obvious way to avoid it is by using TSQL in your source, but what if we have a flat file source where you can’t write a TSQL ORDER BY. In this case we’re stuck using the Sort Transform.
How to Avoid
SELECT ProductID, TransactionDate, Quantity, ActualCost FROM Production.TransactionHistory ORDER BY ProductID
Strategies for Using
When you have no other option but to use the Sort Transform (Ex. a Flat File Source) then be sure to use it wisely. Think about the order of your Data Flow. If you already have an Aggregate Transform in the package that’s performing a Group By then make sure to use the Sort Transform is after the Aggregate Transform. This way instead of sorting 100,000 rows and then grouping them into 440 rows you can group 100,000 rows into 440 rows and only have to sort 440 rows. **The Aggregate Transform is also a asynchronous transform that should be avoided when possible**
Configuring the Sort Transform
These steps will walk you through a basic configuration of the Sort Transform.
Example Overview
Step 1: Configure Flat File Source
Step 2: Add Data Viewer
Step 3: Configure Sort Transform
Step 4: Add Data Viewer and Destination
DevinKnight (Member since: 10/27/2009 11:25:07 AM) Devin (MCTS) is a Senior BI consultant at Pragmatic Works Consulting. Previously, he has tech edited the book Professional Microsoft SQL Server 2008 Integration Services and was an author in the book Knight's 24-Hour Trainer: Microsoft SQL Server 2008 Integration Services. Devin has spoken at past conferences like PASS and at several SQL Saturday events. He is a contributing member to the Business Intelligence Special Interest Group (SIG) for PASS as a leader in the SSIS Focus Group. Making his home in Jacksonville, FL, Devin helps run his local users’ group (JSSUG) as a Vice President.
View DevinKnight 's profile
Comment (No HTML)
It's fast, easy and free! Submit articles, get your own blog, ask questions & give answers in the forums, and become a better developer, faster.
enter your email address: