Posted: 4/10/2012
I appear to have hit the wall on a package for a real-time OLTP dynamic calculations app where we cache aggregations for year, month, week and day every 15-minutes to keep it fresh for our Users.
The fact the SAME data are aggregated for these four periods means I can set up a parallel series of 4 SQL tasks, one for each, and maximize parallel use of the data in the SQL buffers as against the current serial paradigm where we run daily, weekly, monthly then yearly - the aim being to cut our run time to approx 30% or lower than the current 4 minutes per 15 wall time and when we run a full refresh down from 80 hours to about 24 (or less if I can configure the package to run in three parallel chunks on thirds of the site population (90 of them, all distinct)).
The first stage is to clear our stale data - we have an "intelligent cache management" mechanism that detects when it gets data that isn't in asscending time sequence. The "trigger records" tell the 15-minute refresh SP which Meters to clear out, including any other Meters that consume the base data in one or more formulae. Once a trigger record has been created, it won't be destroyed until the cache has been refreshed.
Anyway, I wanted to follow best practice of only clearing data at the closest point to the refresh of a given Meter. So I set up a Cache Connection that pulled the candidate DELETEs into an ADO Object. Fine so far.
The list of candidate REFRESH Meters comes from another SP that I pull into an Object that is the subject of my For Loop Container.
In order to then clear the stale data and repopulate, I had envisaged running a Lookup Transform for the ID's of the Refresh Meter vs. the Delete Meter Object. That's my problem! The Lookup Transform needs a "Source", which I had hoped could be the ID's passed to the Data Flow container from the For Loop iteration. Nope!
Is there a workaround I might use to "write" the ID's to "something" that the Data Flow can consume as its Lookup "Source", then gracefully hand the Refresh ID's plus Delete ID's to the next Control Flow task that will kick off the stale data DELETE then the four parallel SP execs for the recalculations (which use dynamic SQL due to the nature of the highly complex multi-level formulae involved).
I'm open to suggestions or even a confirmation that this is a no-go...
Many thanks, in hopes...
Hunting around MSDN a possibility occurred:
Feasible?