View blogs by industry experts on topics such as SSAS, SSIS, SSRS, Power BI, Performance Tuning, Azure, Big Data and much more! You can also sign up to post your own business intelligence blog.
You might have heard the news that SSIS 2008 has set an ETL World record of uploading 1TB of data in less than half an hour. But these can only be acheived with proper logic, planning and applying best practices. Going ahead, sharing with you all some of the SSIS best performance practices which we should apply for each ssis package development and make a habit of this. I have divided this post into different parts, since it contains numerous stuff on SSIS performance practices.
Pulling High Volumes of Data
Recently we had to pull data from a source table which had 300 millions records to a new target table. Initially when the SSIS package started, everything looked fine, data was being transferred as expected but gradually the performance degraded and the data transfer rate went down dramatically. During analysis we found that the target table had a primary clustered key and two non-clustered keys. Because of the high volume of data inserts into the target table these indexes got fragmented heavily up to 85%-90%. We used the online index rebuilding feature to rebuild/defrag the indexes, but again the fragmentation level was back to 90% after every 15-20 minutes during the load. This whole process of data transfer and parallel online index rebuilds took almost 12-13 hours which was much more than our expected time for data transfer.
Then we came with an approach to make the target table a heap by dropping all the indexes on the target table in the beginning, transfer the data to the heap and on data transfer completion, recreate indexes on the target table. With this approach, the whole process (by dropping indexes, transferring data and recreating indexes) took just 3-4 hours which was what we were expecting.
This whole process has been graphically shown in the below flow chart. So the recommendation is to consider dropping your target table indexes if possible before inserting data to it specially if the volume of inserts is very high.
Awaitng you valuable comments..
Other posts by abhi_MSBI