posted 1/25/2010 by sqlrunner - Views: [3379]
The problem I am trying to solve is a rather simple one. I have an incoming field from a database that looks like this:
-S HCUS_SC12 -E -d COOP2000DB -q "EXIT(DECLARE @R INT EXEC @R=USP_LoadT800 SELECT @R)"
What I am looking to do is parse out the some of the parameters, the -S and -d. I found it somewhat complex to do in T-SQL, because there are numerous variations in the above text, sometimes with slashes, other times without either an S or a D and it made the code somewhat difficult to read. This piece was already part of a larger SSIS package, so moving some of the logic into SSIS seemed like a good decision. The difficulty arose when I realized the column was defined as TEXT. Before the hooting and hollering, this is a third party app, so I have no control over the column definitions :)
Within SSIS any field that is defined as DT_TEXT/DT_NTEXT comes through not as a CLOB, but as a BLOB. You would think it would be a CLOB where you just pull out the characters from a large array, but it isn't and you can't. DT_TEXT is an array of bytes. The extraction process is not quite intuitive, but nonetheless easy enough.
First thing you do, as always, is make the incoming stream accessible to the Script Task (remember this is in the Data Flow)
As you can see from the image above, we have a single READ_ONLY variable of type DT_TEXT. Now as I stated above I am looking to pull out two pieces of information, ServerName and DatabaseName. I create those output parameters here:
Now with my input and output parameters set for my task I can start processing. The trick to getting the strings out of the binary stream is the use of the System.Text.ANSSIEncoding object. For efficiency I create the encoding object once at the top of the script vs. in the ProcessInputRow method, which gets called for every row.
Now that we are set-up to convert byte[] to char[], let's dig in.
First thing I do is pull all of the bytes off the stream and into a byte[]. I then use the encoding object to convert those bytes into a String. While I'm at it I inline taking that one string, converting all spaces to a '|' and then splitting that String into String[], which is stored in the jobparams variable. Now all I need to do is extract the pieces I want and load them up into the downstream variables. Here is the rest of the code for the sake of being complete within the post.