I would like to ask for some help in completing a simple SSIS package that will read data from an excel file, insert it into a staging table, and then perform a check for duplicates before inserting the data into the last destination sql server table.
I have completed all the steps, but I am having difficulty figuring out the best way to check for duplicates and insert the data into the last table.
Down below is a screen shot of my data flow. If anyone could explain how I can complete the last step, possibly with detail instructions or an example, that would be awesome.
Please let me know if I need to provide more details in order to get better help.
Thank you very much.
This is my first time creating a SSIS Package.
Couple of things.
First delete the sorts, they aren't doing anything for you and they slow the process down.
Second, the red line that reads 'ole db destination error output' is the error rows for the first db destination.
If you want to update two tables you will need to use a conditional split, multicast or lookup.
For what you are trying to do I would use a lookup. When you have a row ID that matches, you send it to an OLE DB Command and use an update statement. When you don't have a match send it to an OLE DB Destination.
Hi Shawn. Thank you for your reply. I have been working on this a full day and I am getting to nowwhere, I do not have much experience but want to learn. Please bear with me if I don't understand fully and fast.
I have removed the sorts, so my new DF looks like:
I do not want to update two tables, the staging table was just an idea to store records and then read from there so that it is easy for me to perform a check for duplicates before inserting to the final destination table. May be there is a better way to do it via SSIS that I do not need to have a stagin table?
What should I use after inserting the rows to the stagin table? the lookup component? would it be possible that you show me a mock up DF of how things should look like?
Thank you again for your reply.
You say you are checking for duplicates. Can you please elaborate, do you want to check for duplicates in the source Excel file, or check to see if a record is a duplicate to a record already in the target table. I would use a different process depending on where I expected a duplicate to happen.
Hi Daniel, thanks for your reply.
My basic SSIS package will upload data in an Excel format to a table in SQL Server. The problem is data in Excel may contain duplicate data already in the SQL table. Is there a way to upload the data but if it the record already exists not to import it? The columns on the destination table are:
[Commodity Code] ,[Date] ,[Description] ,[Market Price]
There can be multiple rows with same Commodity Code, but the Date has to be different, so this two columns combined, make each row unique.
I would not mind to change the process if it helps me to get where I need to be.
Thank you again.
I followed the example on this post
and got this far on my DF
my question is, the DF seems to run ok for each escenario, but no records are being written to the database table. What part am I missing, the OLE DB Destination? If that is what I am missing, where should I place that component?
Thank you very much for taking the time to follow this question.
Unless you are going to have multiple inputs to the Union All transforms there is no need for them.
Looking at your work I am making the assumption that you want to update dates when appropriate, and insert new records. Where the record exists and the date is the same there is no need to do anything.
If this is true then there is no need at all for the "All terminate flow 1" branch. You can delete your "All Terminate Flow 3" and replace it with a OLE DB Destination to get the data into your database. Although it is not the most performant option for the sake of simplicity you can replace "All Terminate Flow 2" with an OLE DB Command to update your data.
Hi again Daniel,
"Although it is not the most performant option". What would be a good performant option? it would be good for me to learn that way. Thank you for your feedback.
Regarding the UPDATE part of this process.
If I have a record:
Description: Dark Chocolate
Then re-submit the SSIS job with a change only in price field (update):
Description: Dark Chocolate
Only when the date is different and the item number is the same, then record the new record (insert)
Any examples are welcome.
Hi, dont know if you've sorted this yet, but there's a couple of ways you can do this:
You can also do a mix between the two.