Business Intelligence Blogs


What's New BI-wise

  • 10 July 2013
  • Author: Mike Milligan
  • Number of views: 47155
My Gift to My Readers
This is going to be the page I update when I have quick links and blurbs related to business intelligence that I'd like to share.  Put your email in the widget at the very bottom if you'd like to be notified when the content changes. 

BI from g8rpal on 8tracks.




How to Test SSRS MDX Queries in SQL Server Management Studio



SSAS Gotcha!  Today I got bit by this gotcha.  I created a default member for my date dimension for yesterday's date.  Suddenly, when I use the date dimension in the Report Filter quadrant of an Excel PivotTable, it doesn't work: it filters the data on the default member (yesterday) no matter what I select in the report filter.

Default Member Trap (be careful of what you filter)
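For reference, a default member like the one that bit me is typically set in the cube's MDX script.  A minimal sketch (the [Date].[Calendar Date] attribute name and yyyyMMdd key format are assumptions for illustration, not from the original post):

```mdx
-- Setting yesterday as the dimension's default member
-- (this is the kind of assignment that causes the Excel Report Filter gotcha)
ALTER CUBE CURRENTCUBE
UPDATE DIMENSION [Date].[Calendar Date],
    DEFAULT_MEMBER = StrToMember(
        "[Date].[Calendar Date].&[" + VBA!Format(VBA!Now() - 1, "yyyyMMdd") + "]");
```

If the PivotTable Report Filter has to work, the straightforward workaround is to not set a default member on the attribute users filter by.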



Comparing Data Warehouse Design Methodologies for Microsoft SQL Server

I like this article for its textbook definition of an ODS.




1. Exceeding the boundaries of a single screen

2. Supplying inadequate context for the data

3. Displaying excessive detail or precision

4. Expressing measures indirectly

5. Choosing inappropriate media of display

6. Introducing meaningless variety

7. Using poorly designed display media

8. Encoding quantitative data inaccurately

9. Arranging the data poorly

10. Ineffectively highlighting what’s important

11. Cluttering the screen with useless decoration

12. Misusing or overusing color

13. Designing an unappealing visual display

From:  Common Pitfalls in Dashboard Design - Stephen Few

More of the same:

Clutter, data overload put dashboard designs on path to failure

Data Visualization and Dashboard Design



"Yet standardizing the T-SQL code across an organization is essential to facilitating effective and efficient code reviews, troubleshooting T-SQL script, supporting joint development efforts, and handing off projects from one group to another. DB engineers might disagree about what the code should look like, but few will question the wisdom of implementing such standards."


"And there's another thought here as well. Within your code, it's best if you specify the schema name (even if it is the default schema or some other mechanism) along with the object when you select or activate it. It saves the Query Processor a step from having to resolve the schema name out by itself."
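A trivial sketch of the difference (table and column names are made up):

```sql
-- Schema-qualified: the Query Processor resolves dbo.Customer directly
SELECT CustomerID, CustomerName
FROM dbo.Customer;

-- Unqualified: the engine must first probe the caller's default schema,
-- then fall back to dbo, before it can resolve the object
SELECT CustomerID, CustomerName
FROM Customer;
```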

You can find more here:



Features Not Supported in a Future Version of SQL Server

The following SQL Server Database Engine features are supported in the next version of SQL Server, but will be removed in a later version. The specific version of SQL Server has not been determined.


Microsoft's new 3D data visualization BI product - GeoFlow



"The survey findings would seem to suggest that the CFO prioritizes business applications higher than the CIO does," said Bill Sinnett, senior director, research at FERF. "If the CIO does not understand this, then there's a chance the CFO will sponsor his or her own initiatives, and not coordinate them with the IT organization. This demonstrates the trend that BI is becoming less of a CIO responsibility and more of a CFO and line-of-business responsibility."

Gartner Says Business Intelligence/Analytics Is Top Area for CFO Technology Investment Through 2014


Analysis Services and the Case of the Standby Cache

Finding and killing SSAS Sessions

Dealing with Long Running SSAS Queries using Powershell


Power View for Cubes

Coming Soon: A New Edition of Information Dashboard Design



EffectiveUserName research

Using EffectiveUserName To Impersonate in SSAS

This link says you do not need to configure Kerberos on the SSAS server:

How to configure dynamic security in Analysis Services and PerformancePoint 2010 without Kerberos

Use Analysis Services EffectiveUserName in SharePoint Server

Use EffectiveUserName in PerformancePoint Services (SharePoint Server 2013)

Unit Testing Role Security in Analysis Services




I am speaking at SQL Saturday Jacksonville

Registration is Open!

My session starts at 9AM, Data Mining with DMX.


Get yesterday's date and more using MDX


-- Returns the member for yesterday's date, e.g. [Date].[Date].&[20130709]
SELECT
  StrToMember("[Date].[Date].&["
      + VBAMDX.Format(VBAMDX.Now() - 1, "yyyyMMdd") + "]") ON COLUMNS
FROM [Adventure Works];


-- The first calculated member is the value of NOW()
WITH MEMBER [Measures].[Full Date] AS 'NOW()'
-- The second calculated member is the day part of the first calculated member.
MEMBER [Measures].[What Day] AS 'DAY([Measures].[Full Date])'
-- The third calculated member is the month part of the first calculated member.
MEMBER [Measures].[What Month] AS 'MONTH([Measures].[Full Date])'
-- The fourth calculated member is the year part of the first calculated member.
MEMBER [Measures].[What Year] AS 'YEAR([Measures].[Full Date])'
SELECT {[Measures].[Full Date], [Measures].[What Day], [Measures].[What Month], [Measures].[What Year]} ON COLUMNS
FROM [Adventure Works];



SSIS VB.Net Script to download an FTP file from an AS/400 DB2 server.  (Credit goes to Michael Simon for piecing this together from various sources on the internet.)


    Public Sub Main()
        Try
            'Create the connection to the FTP server
            Dim cm As ConnectionManager = Dts.Connections.Add("FTP")

            'Set the properties like username & password
            cm.Properties("ServerName").SetValue(cm, "")
            cm.Properties("ServerUserName").SetValue(cm, "Hugh")
            cm.Properties("ServerPassword").SetValue(cm, "Janus")
            cm.Properties("ServerPort").SetValue(cm, "21")
            cm.Properties("Timeout").SetValue(cm, "0") 'The 0 setting will make it not time out
            cm.Properties("ChunkSize").SetValue(cm, "1000") '1000 KB
            cm.Properties("Retries").SetValue(cm, "1")

            'Create the FTP object that receives the files and pass it the connection created above
            Dim ftp As FtpClientConnection = New FtpClientConnection(cm.AcquireConnection(Nothing))

            'Connect to the FTP server
            ftp.Connect()

            'Build an array of all the file names to be FTP'ed (in this case only one file)
            Dim FileName(0) As String
            FileName(0) = Dts.Variables("ftpFileAndMember").Value.ToString

            'Change to the remote working directory
            Dim currentDirectory As String
            currentDirectory = Dts.Variables("ftpDirectory").Value.ToString
            ftp.SetWorkingDirectory(currentDirectory)

            'FTP the file; False = binary (not ASCII) transfer, True = overwrite an existing file
            Dim LocalPath As String
            LocalPath = Dts.Variables("localFilePath").Value.ToString
            ftp.ReceiveFiles(FileName, LocalPath, False, True)

            ftp.Close()
            Dts.TaskResult = ScriptResults.Success
        Catch ex As Exception
            Dts.TaskResult = ScriptResults.Failure
        End Try
    End Sub


Exciting PerformancePoint links!

PerformancePoint tutorial

Formatting your PerformancePoint Analytic Grid!

Using SSAS MDX Calculation Color Expressions



Extra Exciting Links!

Comparison of Queries Written in T-SQL and SQL Server MDX

SQL Server Analysis Services Period over Period Variance Analysis



On Saturday, April 27th at 9:00 AM I will be presenting a session at Jacksonville's SQL Saturday event on the DMX data mining query language.  Hope to see you there!

SQL Down Under - Be sure to check out Erin Walker's session on dimensional modeling!  Our favorite Jacksonville consulting company Pragmatic Works is mentioned.

Bunch of posts about PerformancePoint and using Excel w/ KPIs from SSAS

Unbelievable.  M$ and their licensing.  Looks like they are effectively killing PowerPivot and PowerView.

Furthermore, with the more expensive SQL Server 2012 licensing (per core) I have to wonder how many people are staying with SQL 2008 R2.  Our server has 80 cores!  Why would we switch?  Theoretically you could keep the database engine on 2008 R2 and upgrade your SSAS, SSIS, and SSRS servers to 2012.

Check out this post from Brent Ozar, in particular his recommendation that you NOT use ORDER BY and instead let your application do the sorting.  This is contrary to what used to be the best practice of letting the SQL engine do the work.  (Those of you who work with SSRS know that letting the SSRS engine sort the data usually takes longer than sorting in the SQL engine.)  The reason for his advice?  Cost per core.

"Avoid ORDER BY; sort in the app instead.  To sort your query results, SQL Server burns CPU time. SQL Server Enterprise Edition goes for about $7,000 per CPU core – not per processor, but per core. A two-socket, 6-core-each server rings up at around $84k – and that’s just the licensing costs, not the hardware costs. You can buy a heck of a lot of application servers (even ones with 256GB or more of memory) for $84k."



PerformancePoint Gotcha!  - "PerformancePoint Services could not connect to the specified data source."  When creating a new dashboard, you MUST do a Save All right after you create your data sources; otherwise you won't be able to create anything that uses that data source.


Dinky Linkys

Starting a Bike Shop (like AdventureWorks!)

Research Quality Data Sets 

How to Create and Use Temp Tables in SSIS

Programmer Bob - Latter-day Tom Sawyer or Massive Security Risk




SharePoint Gotcha! - Creating a SharePoint 2010 list from an Excel 2010 spreadsheet

This needs to be a future blogpost.  Here's the solution:

Here is some of the crud I went through to get there:

Dadgum it Micro$oft!  This product is 3 years old!  Here's a bugfix: get rid of the button if you can't make it work!  I hope it's fixed in SharePoint 2013.  All kidding aside, what am I going to do?  Switch to Cognos.  LOL.


SSAS Gotcha! - Duplicate Attribute Key

Remove all hierarchies, then add them back one by one.  Query the underlying columns through the relational engine, ordering each level of the hierarchy from the top down.  When you find inconsistencies, such as a month belonging to more than one quarter (ha ha!) or an employee having more than one manager (master), then you either need to make a compound key to force the hierarchy, or the hierarchy doesn't make sense to begin with.

This needs its own post, time permitting.
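A sketch of the kind of consistency check described above, assuming a date dimension view with MonthName and QuarterName columns (names are hypothetical):

```sql
-- A month that maps to more than one quarter will break a Month -> Quarter hierarchy
SELECT MonthName, COUNT(DISTINCT QuarterName) AS QuarterCount
FROM dbo.DimDate
GROUP BY MonthName
HAVING COUNT(DISTINCT QuarterName) > 1;
```

Run the same pattern for each parent/child pair of levels in the hierarchy; any rows returned are candidates for a compound key.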


Other interesting tidbits:

Do Not Disturb!

Cost of Interruptions:

(strangely similar)


Game theory and Batman, RIP:




How crApple sets its prices:

Home Depot is looking at Pricing!






New Years Resolutions - Professional Goals

1.  Reboot

2.  Avoid Snowflakes!

3.  Learn more data mining and the DMX query language (and blog about it.)

4.  Learn more about Master Data Services (and blog about it.)

5.  Become a master at MDX

6.  Architect an enterprise solution in which Contoso buys AdventureWorks and Northwind (and blog about it.)

7.  One presentation per quarter and at least one really good blog entry per month

8.  Mentor someone outside of the company I work for

9.  Keep a personal log in which I will plan each day and reflect on the previous day every morning. 

10.  Publish Segment Analyzer on codeplex (and blog about it.) 


Segment Analyzer is an idea I have to help analyze various customizable groups of customers, products, or any type of dimension in an Analysis Services cube.  The main force behind the solution is a .NET application that can run against any data warehouse that uses a standard table and column naming scheme.  It won't matter what the naming scheme is as long as it is consistent; that criterion would be configurable.

An analyst uses a GUI to create criteria for the segments to be analyzed.  The criteria are based on dimensional attributes and on aggregated and non-aggregated facts, and support complex parenthetical and/or logic at many levels.  Dropdowns are populated with the distinct values of the dimensional attributes and with mathematical operators for the numeric information.  Which attributes and which metrics are featured in the interface is handled through a one-time configuration process.  Segments can have parent segments, with the criteria of the parent inherited through infinite levels of children.  This will be accomplished by parsing the segment criteria into dynamic SQL that populates bridge tables to support many-to-many dimensions in Analysis Services.  The segment dimension itself uses the parent-child hierarchy feature in SSAS.  Two known query-performance killers, I know; but the features would only be used on specialized cubes or databases built just for this purpose.

Examples of segments of customers from adventure works:

Customers that visited the store within the last six months who are female, are between 16 and 24 years old, live within 25 miles of the store and have spent at least $100 in the past six months and redeemed two promotional coupons.

Customers that visited the store within the last year, are male, live in the 12345 zip code, own one car, spent at least $200 in the past year, and redeemed 0 promotions.




Tip of the Day

How to find the SSIS Package that Loads a Table

Browse to the packages folder using Windows Explorer and type “*.dtsx” into the search box.

Open Editpad Lite or a similar multi-tabbed editor.

Press Ctrl-A to select all the packages and drag them into your editor.

Press Ctrl-F and enter the table name into the search dialog.  Be sure to check the “All files” and “Loop automatically” checkboxes.

This also works for SSRS reports, stored procedures, views, etc.



Presenting on the SQL Server 2012 Data Mining Add-ins for Office 2010 at Jacksonville Code Camp on October 6th.

Free Food!



Reasons to use Money data type over


  1. 13% improvement in Analysis Services processing speed
  2. 20% improvement in SSIS processing
  3. In the context of SQL Server data compression, the money and smallmoney data types tend to compress well when the absolute value is low




Presenting at SQL Saturday #130 on Saturday 4/28 at

University of North Florida

1 UNF Drive, Building 50, Jacksonville, FL, 32224


SQL Server 2012 Data Mining Add-ins for Office


Free SQL Server 2012 eBook:  Kindle and others.


Great Podcast!  Good discussion about the roles data scientists are playing.

Big Data Monetization


Some links I recently found useful...

Using Named Sets in Analysis Services to Deliver Special Date Ranges to End Users

Dynamic named sets

Filter the Source Data for a Pivot Table

New demo of Power View is Live!

SQL Server 2012 Demo Showcase


I received this question from a past bootcamper that I thought I'd share...


"Would you happen to have a split-string function that can accept values with commas?

I have values like this that I will need to split (‘New York, Brooklyn’,’New Jersey,Hoboken’)

The functions that I currently have cannot handle data with commas."

 --First we need this function

 create function Split (@StringToSplit varchar(2048), @Separator varchar(128))
 returns table as return
 with indices as
 (
     select 0 S, 1 E
     union all
     select E, charindex(@Separator, @StringToSplit, E) + len(@Separator)
     from indices
     where E > S
 )
 select substring(@StringToSplit, S,
            case when E > len(@Separator) then E - S - len(@Separator)
                 else len(@StringToSplit) - S + 1 end) String,
        S StartIndex
 from indices
 where S > 0


 --Now we use the function

 declare @mystr as varchar(255)

 set quoted_identifier off;
 set @mystr = "'New York, Brooklyn','New Jersey,Hoboken'"
 set quoted_identifier on;

 select @mystr

 select REPLACE([String], char(39), '') from dbo.Split(@mystr, ''',''')



MS released the Excel Data Mining plug-in for Excel!  Woo Hoo!  I can't wait to try this out.  Once I do, you can be sure I'll be blogging about it.  I just can't wait to share this information.  Thanks Jorge Segarra for sharing this with me!

Also, Adam Jorgensen sent me this awesome link from Brent Ozar about table partitioning in SQL Server.  Thanks Adam!

How to Decide If You Should Use Table Partitioning

Brent Ozar's post from a long time ago, Rock Stars, Normal People and You, is my favorite blog post of all time.  It helped me pull the trigger on a decision I made about a year ago to leave a company before my first year was up and look for something more fulfilling.  Leaving a company before giving them a full year trial was something I normally would never have considered.  Now I am at Pragmatic Works and couldn't be more challenged.  I've never met Mr. Ozar; but, I have him to thank for that.  I hope I get the opportunity to thank him in person one day.




New blog post on PerformancePoint 2010 Time Intelligence!



Hey!  What happened to January!?!?  What a year already.  No time for a proper post so this will have to do for now. -------------------------------------------------------------------------------

Tip of the Day!  Use SQL Server Native Client for your SSAS data provider.  (per Adam J!  :-) )
"SQL Server Native Client is the combination of ODBC and OLEDB into one driver.  So you are really using OLEDB with the Native Client.  So you should see no difference."  Some guy on the web
ZoomIt - Screen zoom and annotation tool for technical presentations
Great blog entry on the various types of testing.  Unit Test or Integration Test and Why You Should Care
Jorge Segarra shared this link with me recently:
Digital SCRUM board: Create a digital version of cards/sticky notes we use for agile-type projects. Cool part is you can share your boards so multiple people can work on projects. Recommend watching the short intro video to see the different stuff you can do.
SSRS - Action Item Link changes from Dev to QA to Prod
Our solution has a SSRS report that includes an action item link that launches a dashboard page.  Our issue is that we do not want to have to manually hardcode the link each time we move from dev to qa and from qa to prod.
I found a possible solution here:
However, our link pointed to a SharePoint dashboard page, not a SSRS report.  Solution was to use a switch statement like so:
 Globals!ReportServerUrl = ""
 ,Globals!ReportServerUrl = ""
 ,Globals!ReportServerUrl = ""
chart pac man
Using the MERGE statement in SSIS via Stored Procedure  - Sometimes your package will run faster if you leave the heavy lifting up the SQL Engine.  It really just depends on your environment so if you are dealing with a lot of data, my recommendation is to test it both ways.
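A minimal sketch of that pattern, with hypothetical table, column, and procedure names; the SSIS package would call the proc from an Execute SQL Task instead of doing the upsert row by row in a data flow:

```sql
CREATE PROCEDURE dbo.MergeDimCustomer
AS
BEGIN
    -- Upsert staged rows into the dimension in one set-based statement
    MERGE dbo.DimCustomer AS tgt
    USING dbo.StagingCustomer AS src
        ON tgt.CustomerID = src.CustomerID
    WHEN MATCHED THEN
        UPDATE SET tgt.CustomerName = src.CustomerName
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (CustomerID, CustomerName)
        VALUES (src.CustomerID, src.CustomerName);
END
```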
TheRestartPage.Com - How many have you sat through?  One of my New Year's resolutions this year is that every time someone asks me a question about something not working, my first response will be to ask them to reboot their computer.  My wife looked at me a little funny though when the dishwasher would not turn on.

Happy New Year! It's been a long time since I've posted. Two terrible colds, Christmas, New Years, professional goals, and challenging projects are the culprits. One of my resolutions is to become a consistent blogger so you can expect more from me this year. 

My first real blog post of the New Year has just been posted:  Near Real-Time OLAP using SSAS Proactive Caching.

I need to clear my plate so here are some of the gotcha’s, tips, and links I’ve saved up over the past couple of months.

I came across the following link today browsing through Google Reader:
While I like the idea, you still have to drag the table on to the query analyzer editor.
A tool I like to use during the data analysis stage of a BI project is called database browser.
There is even a portable version you can stick on a USB stick or SD card.
Using it, all I have to do is single left click on the table to see the results.  Awesome!
Microsoft SQL Server Analysis Services has some query performance issues when dealing with many-to-many relationships, especially if the intermediate dimension is large.
Here are some links to help you with this issue.


Recently I was working on an Analysis Services solution that used a snapshot-based fact table for its main measure group.  We set the aggregation function to Last Child, but nothing was working.  This cube had multiple date dimensions, and I discovered that the date dimension we needed to slice by had to be listed first in the Dimension Usage tab for the Last Child aggregation to work properly.
Squeeze one more remote desktop connection out of RDP!  Did you know that if you add “/admin” after your shortcut to a RDP connection you can have three simultaneous RDP sessions instead of two?  The user account you use must have administrative rights on the box for this to work.
How to add a link to a SharePoint page?  Use the content editor web part.
Try this.  Open SSMS and click the new query button.  Paste some SQL code in there and hold down your ALT button while left clicking your mouse button and dragging it to select text horizontally.  You can also insert pasted clipboard contents in the same fashion by holding down the ALT key.
SharePoint master pages:
 Want to use Live Writer with
Connect with the following url:
Lost your last post?
Try finding the .wpost files you want in My Documents\My Weblog Posts\Recent Posts, and move them to the
My Weblog Posts\Drafts folder. I believe that should stop Writer from polling the server when you open the draft.
Gotcha!  PerformancePoint Time Intelligence and SSRS
I was trying to use a PerformancePoint time intelligence filter to pass values to a SSRS report.  I discovered that the MonthToDate, YearToDate, QuarterToDate syntax does not work with SSRS.  Instead use Month.FirstDay:Day, Year.FirstDay:Day, Quarter.FirstDay:Day
SSRS – weird error encountered when multiple charts are stacked on each other w/ visibility toggled by an object.  The width of the report was all messed up until I added a horizontal line (which I made invisible by making its color blend in w/ the background.)
SSRS tip: 
To open out into a new tab/window you would use ="javascript:void('http://www..... '))" in the action event
To open in the same window use:
="javascript:void('http://www..... + "','_self'))"
Oh how I love a good checklist of best practices!
Excel - How to make those neat gradient-filled data bars!
On the Home ribbon tab, inside the Styles group, click Conditional Formatting, point to Data Bars, and then, in the Gradient Fill group, select the blue (first) design.
Gotcha!  SSIS - XML
This came from a friend.  "I found out that it is a limitation in SSIS that it will not read the top most node in an XML document and pull the data.
 My solution to this was to create a XML Document called DummyNode that contained only this:
 I then merged the DummyNode.XML with the Source XML file into a merged file with the XML Task in SSIS.
 This made the top most node DummyNode and then it read the node underneath it that contained all of the Season data.
 Here is the link that helped me identify the limitation or known issue in SSIS:
Gotcha!  SSRS  - 
"Anyone come across an issue assigning view only permissions to a report library that holds SSRS reports, in which reports are not displayed?  It seems that they are only displayed if the user is given edit item permissions, which is not what we want."
The report and data source need to be published as a MAJOR VERSION in order for users with view-only permissions to see them.
SSRS - Geo Links!
Turn your addresses into Latitude and Longitude coordinates!
Pet peeves
  1. Unsorted tables.  I like all *Key (SK) columns at the top, then all *ID columns (alternate keys, business keys), then attributes, then metrics, then maintenance columns (last modified, etc.)
  2. Cubes built using views built on views.  (Push this to the ETL!!)
  3. Cubes built upon views that do a bunch of joins or data cleansing – push it to the ETL!
  4. No schemas defined in the DW.  Just dbo.
  5. Dimensions w/ no hierarchies.
  6. SSIS packages that are ugly.  At least use the auto format diagram option!
  7. SSIS packages that keep "OLE DB Source", "Data Flow Task", or "OLE DB Destination" as component names.
  8. SQL code that is not formatted.
  9. SSAS – named queries AND views
Live PerformancePoint demo site! Super cool! To get the right click functionality you have to click the compatibility icon in Internet Exploder. Go here and click on demonstrations:
Live PowerPivot demo site!
Password: P0werP!vot
What is Pragmatic Works Foundation?
The Pragmatic Works Foundation is a non-profit organization created to find passionate people who are interested in joining the technology field but cannot make the financial investments needed for training and hardware. The goal is to provide candidates the opportunity to learn new technologies from industry experts. In addition to offering free training, we work with our onsite recruiter to help place candidates who complete training with a new career in technology.
The deadline for applications is today!

Many to Many Relationships

I finally saw the movie Moneyball.  It was pretty good and I'd recommend it.  I do have two complaints. 
1.  There was only one computer in the whole movie
2.  Brad Pitt didn't play the data analyst
Bill James's sabermetrics takes a back seat to a story about an executive trying to change the way things have always been done by replacing subjectivity with science.  The movie is based on the true story of Billy Beane, the Oakland A's general manager, played by Brad Pitt.  It turns out that the actor who plays Peter Brand, the analyst, also starred in another movie called Superbad.  (I wonder if it's any good?)
This week I've been ramping up on my PowerPivot skills.  I've read two books cover to cover on it; but, the only way to truly know it is to try and solve problems with it.  I had a presentation started using data from that just needed a little more work (or so I thought.)  I did a little research on APBRmetrics, the Sabremetrics of basketball, and got to work adding value to the data in PowerPivot.  DAX is truly a whole new way of thinking and wrapping my head around how to do things in DAX instead of SQL or MDX is a real challenge.  I'd like to do a full blog entry in the future of my experience and post the spreadsheet here when I am all finished.
I added a new blog entry that compares the Task Factory Address Parse Transform Component with the SSIS+ equivalent.  Please check it out, it's a pretty good read especially in light of the recent USPS news.  We can all help them cut costs by making sure our addresses are good so they're not out wandering around in the rain trying to deliver mail to bad addresses.
mmmm looks good
Link of the day!
I have got to go see this movie today! I taught a business intelligence class last week and one of the students worked for a professional basketball team and brought their data. We used the Excel data mining add-in to explore how we could use it for draft pick analysis. An excellent webinar for doing the same for fantasy baseball can be found here:
If you are interested in this sort of technology, be sure to read SQL Server 2008 Data Mining. Chapter two especially because it shows you how to do everything using the Excel data mining add-in.
Some neat utilities I came across this morning:
RichCopy - GUI front end for RoboCopy.  Useful for copying or transferring large files.
NoReplyAll plug-in for Outlook 2007 or 2010 - Primary function is to prevent people from doing a reply-all to your message, or forwarding it.  Additionally, it includes a check for email goofs such as omitting attachments or subject lines.
SpeedLaunch - With Speed Launch, you can simply drag any file, document, or website URL into the Speed Launch bull’s eye on your desktop and tag it with any name you want. Then, any time you need it again, you can just go through the bull’s eye, type or click on the name, and there it is.
Screen Recorder - Very easy-to-use screen-to-video capture program, developed on top of Windows Media Encoder, that lets you easily capture what is going on to a small video file, which you can then send via e-mail to the appropriate person.
I found all of these at the link below so there may be some others you might find interesting. 
One tool that I absolutely find indispensable is ResophNotes.  I use it for code snippets and plain-text notes.  It syncs with Simplenote, there is an Android app called mNote, and I'm pretty sure there is an iPhone app.  I like the way I can choose how and where to store the notes, making it a great portable USB app by the way.  I can store them in a database file or individual text files in a directory I choose.  I'd love it if they came up w/ a way to have a group share of code snippets.  I guess you could use a shared folder on the network or a shared account w/ Simplenote, but if there was a way to enforce moderator approval of edits, deletions, and additions, that would be awesome.
Pragmatic Works Informercial for BIxPress!  This is too funny!
My first webinar!
Zero to Dashboard- Intro to PerformancePoint
Come check out the PerformancePoint hotness! Mike will demonstrate the functionality in PerformancePoint services in SharePoint 2010 and show you how to quickly build some dynamic dashboards and reporting functionality your end users will crave.
It's been a long time...
...Since I've updated this blog.  I'm going to blame it on the migration to the new software and my fierce dedication to the tasks at hand. 
What's New BI page format change... new posts at the top.  Makes sense, doesn't it?
First things first...
Last post I described in great detail a technique I encountered for performing parallel loading of massive amounts of data into a single destination using SSIS.  It was my intention to provide an example the very next day; but the procrastination devil on my shoulder finally gave me a good reason not to do so.
 Balanced Data Distributor:
I intend to do a whole blog posting on making SSIS Speedy in preparation for my upcoming webinar so please, stay tuned!
BI Bootcamp -
I taught my first BI Bootcamp in Jacksonville last week and I think it was a pretty good success.  A few minor hiccups and some technical difficulties; but, overall I think everyone was very happy.  I really enjoyed sharing my knowledge and the students were very enthusiastic about learning all of the tricks of the trade.  Talking all day made me very tired!
A duplicate attribute key has been found while processing...
Aaargh!  Gotcha moment today.  As soon as Mike Davis told me the answer I knew it sounded familiar...  SSAS doesn't differentiate between a null value and an empty string.  Boy, I felt silly. 
One quick fix is to change the underlying view the dimension is processing against (you are using views aren't you?) with a simple case statement:
SomeColumn =
    case
        when SomeColumn is null then 'UNKNOWN'
        when SomeColumn = '' then 'UNKNOWN'
        else SomeColumn
    end
Of course, you would eventually push this view logic to the ETL which brings me to why I seldom encounter this error when I am designing a cube... I am usually the one that designed the ETL and I do this sort of thing from the get go.  Please don't misinterpret that as arrogance.  Make no mistake, I learned the hard way.
I was curious what a simple google search would produce so I Googled: "ssas duplicate attribute key has been found" and checked the top three results.
The first link had a little bit of a different answer buried a little bit:
If you have NULL values, you may need to change the null processing for the attribute.  In the dimension editor, select your problem attribute.  In the properties window, under the key column property is "Null Processing".  By default it's set to "Automatic".  Change it to "Unknown Member" and you should be all set.
(Kudos to : Christine C. M_ )
The second link didn't help at all for this particular cause of the error. 
The third link was the best. 
Baby's crying... gotta run.
Wow!  The past two weeks have been a whirlwind.  I've been heads down working with a team of very bright individuals on an extremely interesting project.
Whoever it was that said consulting is like a rocket ship to your brain sure hit the nail on the head.  I am thoroughly enjoying the challenges presented to me and totally love the fact that I am not just delivering a solution; but, passing on valuable knowledge and experience.  The hours are long, being away from the family is hard; but, at the end of the day it is comforting to work with people that really want to learn, participate, and do the best they can for themselves and the people they work with every day.

I’d love to share more, but it’s a top-secret project: need-to-know basis.

I’ve been doing a lot of thinking about parallel processes in SSIS.  I recently modified the metadata-driven version of the ETL framework on CodePlex to support additional parallel processes.  No big deal there, but it reminded me of a package I’d recently seen from another very bright individual.  He told me he found the idea and some of the code from a Google search.  I’d love to know the link.
Basically that person developed a package to allow for configurable parallel processes at the individual package level. 
In a nutshell, he used T-SQL to partition the incoming records.  For our example, let’s say 100K rows need to be loaded.  He had an SSIS variable that held the number of parallel processes to launch (let’s say 4) and another to store the current iteration of those four.

So starting with iteration 1:
Control flow: grab the iteration variable and increment it to 2.  100K divided by 4 is 25K, so get rows 1–25K.
launch 2 parallel processes:
First: an OLE DB Source whose SQL gets the first 25K rows using the ROW_NUMBER() OVER syntax.

Second: launch another instance of the package ITSELF.
This second package, a clone of the first, reaches out to the table that holds the current iteration, gets 2, and sets it to 3.  It follows the same logic but loads rows 25K–50K, simultaneously launching another instance of itself that repeats the pattern but loads 50K–75K, which of course launches another clone that loads 75K–100K.  Basically, what you get is 4 packages loading 4 equal parts at the same time.
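Under the stated assumptions (100K rows, 4 processes), the per-instance query each clone runs might look roughly like this; @Iteration, @ParallelCount, and the table/column names are hypothetical stand-ins for the SSIS variables and source table described above:

```sql
DECLARE @Iteration INT = 1,      -- which slice this package instance loads
        @ParallelCount INT = 4;  -- total number of parallel instances

WITH Numbered AS
(
    SELECT s.*,
           ROW_NUMBER() OVER (ORDER BY s.SourceKey) AS rn,
           COUNT(*) OVER () AS total
    FROM dbo.SourceTable AS s
)
SELECT *
FROM Numbered
WHERE rn > (total / @ParallelCount) * (@Iteration - 1)
  AND (rn <= (total / @ParallelCount) * @Iteration
       OR @Iteration = @ParallelCount); -- last slice picks up any remainder rows
```

(NTILE(@ParallelCount) would slice the set just as evenly with less arithmetic, but the ROW_NUMBER() form matches what the package was described as doing.)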

Very cool stuff!  Looking at the log tables, I could see that 4 instances of the same package took different variables and loaded the data into the destination at the same time.

Up until then when I thought of parallel processes using SSIS I thought of it at the master package level launching different packages that loaded different data into different tables at the same time.  The difference here is that you are launching multiple processes to load one data set into one destination in multiple pieces at the same time.

I was unsure at first, and I still wonder about contention between multiple packages inserting data into the same table.  Apparently it wasn’t a problem in this case, because the load times decreased.  I imagine there is a sweet spot to find for each environment: you would stop launching additional parallel processing tasks once overall load times started to increase.

I’d like, time permitting of course, to take this a few steps further: create a master package ETL framework that can not only determine how many packages to launch at the same time but also identify the largest data loads and break them out into multiple processes at the package level.

So…. Wouldn’t this blog post be great if I uploaded actual examples instead of theories?   Maybe in the near future… 

Here's a totally unrelated cool link I liked:
What a week!  Great to be home!  The word of the week is PowerPivot!
Got a new tip to share too.  My esteemed colleague Jorge shared this with me:
"Trace Flag 1117 makes it so that if a file needs to auto grow and there are multiple files, they will all grow together at the same time. How to set SQL Server startup options (this is where you enable trace flags):

Paul Randal post regarding TempDB and how 1117 helps:

Another post regarding how it helps (mentions SAP but applies to SQL Server databases in general, application agnostic):
"I would recommend trace flag 1117 and 1118 for uniform extents to reduce SGAM contention and growth of all files in a filegroup at the same time, also want instant file initialization turned on."
Link of the week:

Computer learns language by playing games

By basing its strategies on the text of a manual, a computer infers the meanings of words without human supervision.
PerformancePoint Tip - Right-clicking an analytic grid in the leftmost columns lets the user select measures.  Some people may not want that, so how do you disable it?  In PerformancePoint Services Application Settings, find the Select Measures Control and set its maximum value to 0 (the default is 1000).
SSAS Tip - Implement measure security without resorting to cell level security (which can be a huge performance hit.)
I will be teaching the BI Boot camp from 09/12/11 - 09/16/11.  Sign up now!
SSRS - Oh, the joy I had working with SSRS again this past week.  Dynamic grouping.  Fun topic; I should blog about it in further detail in the future.  Here are some links if you can't wait and want to guess what I am going to say.
I will say this now. 
SSRS tip #1
SAVE OFTEN.  Furthermore, save more often than you think you have to.  Save w/ descriptive names.  By descriptive names I mean:
SomeReportForSomePurpose = SRFSP
Saving is so frustrating in SSRS because there are at least 5 steps to truly protect yourself.  As soon as you File > Save As, it insists on removing the old copy from your solution, so if you really want to keep all versions in your solution you then have to add the old one back.  I think next time I'll try it another way, perhaps using zip files.  UPDATE: Zip files work great!
SSRS tip #2
If an MDX report is taking too long and has calculated measures, consider converting those MDX calculated measures into straight facts loaded through the ETL process.  Only use MDX calculated measures when you absolutely need the OLAP-style functionality and pivot ability.  If you can get the metric into the ETL process and pull it straight into the cube as a regular measure, it will improve query performance immensely.
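As a rough illustration of what moving a calculation into the ETL might look like (fact table and column names borrowed from the AdventureWorks sample; your schema will differ), the source query simply materializes the measure as another column so the cube stores it as a plain fact:

```sql
SELECT
    f.OrderDateKey,
    f.ProductKey,
    f.SalesAmount,
    f.TotalProductCost,
    f.SalesAmount - f.TotalProductCost AS GrossMargin  -- precomputed, no MDX needed
FROM dbo.FactInternetSales AS f;
```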
Great Links:
10 General Programming Tips for Programmers
10. Do not fear.
9. Divide and conquer.
8. Assume that coding is like writing a novel and write it with a lot of interest, so the developers maintaining it will love to read it.
7. “Code as if every mistake you make will be discovered and you will be held accountable.” – Anonymous
6. Be a good listener first.
5. Be open; learn to say no if it is not possible.
4. Diving into code is the best way to figure things out.
3. Be relaxed even when solving one of the toughest problems.
2. Never give up; you will get it.
1. Keep it simple.
This looks like a neat tool for a freebie!  I think I'll play around with it some more and create a blog post about it in the future, maybe comparing it with Pragmatic Works BI xPress, which I think has similar functionality (and then some!).
Today I learned something new about SSIS, the OLE DB Destination component, and the fast load option.  Someone told me the reason they didn't have the fast load option set for some of their OLE DB Destinations was that they wanted the package to fail if column truncation was going to occur.  That sounded a little funny to me, so I did some research, and the best I could find was something about how fast load wouldn't give you as much information about errors.  So I made a quick and dirty test and, lo and behold, he was right!  Using the OLE DB Destination component without fast load caused package failure when inserting a 40-character string into a varchar(10) destination.  With fast load, it just truncated the column and did not fail the package.  The SQL Server Destination component behaved properly and also failed.