Business Intelligence Blogs

View blogs by industry experts on topics such as SSAS, SSIS, SSRS, Power BI, Performance Tuning, Azure, Big Data and much more! You can also sign up to post your own business intelligence blog.

Gotchas With Referenced Dimensions

  • 1 March 2013
  • Author: DustinRyan
  • Number of views: 9592
  • 0 Comments

image

A referenced dimension relationship allows us to create a relationship between a measure group and a dimension that is indirectly related to the measure group using an intermediate dimension. This is great, because we can now create relationships between our facts and dimensions that may not be directly related. Unfortunately, there are a couple of gotchas that you may not be aware of, which I read about here and here. I'm just offering my take on these issues, so thanks to Alberto and Jesse for the great content.

 

You can see several examples of referenced relationship types in the Adventure Works cube. One of those examples is the relationship between the Sales Target measure group and the Sales Territory dimension.

image

If you've ever created a referenced dimension relationship, you know that the Materialize option is checked by default. So one might assume this is the best option to go with.

To Materialize?

image

But before we can decide if the option to materialize the relationship is best, we must first understand what the materialize option really does. When we check the materialize option, the link between the referenced dimension and the facts is materialize. This leads to slower cube processing but better query performance. There is, however, an important thing to realize. The materialized relationship link between the facts and the correct attribute members is created during the processing of the partition, not the dimension! Why is this important? Imagine we need to update the relationships between the facts and the referenced dimension. We can no longer simply process the dimension. The link to the older referenced dimension attribute member still exists in the partition! The partition must be processed! This could be a huge issue if your cube is very large and you're only planning to process the most recent partition. If the relationships changed in the referenced dimension, you could find yourself forced to process more than just the most recent partition in order to ensure the relationships between the facts and the referenced dimension are correct.

To Not Materialize?

image

So what's the other option? We could leave the option to materialize the relationship unchecked. The downside to this is that query performance will suffer since the member will need to be looked up on the fly. The upside is that processing will be faster and we no longer need to process the partition to update the relationship since its looked up at query time.

But there is also a gotcha with using un-materialized referenced dimension relationships. Aggregations can not be created for a referenced dimension when the relationship is not materialized. If you run the Aggregation wizard, you'll notice that any referenced dimensions with un-materialized relationships will be left out. This is by design. An aggregation should not be created (which is possible to do manually) for an un-materialized referenced dimension relationship because the relationship is resolved at query time.

The Conclusion

So which option is best? In my personal opinion, neither. Don't snowflake your design and just go with a star schema. Then you won't even have to deal with the referenced dimension. 

Of course, if you don't have control over the dimensional model and you are forced to use a referenced dimension because your design is somewhat snowflaked, my personal recommendation would be to look at your cube and make an informed decision. If you're cube is extremely large featuring multiple partitions and your partitioning strategy involves you only processing a single or couple partitions, I would recommend to not materialize the relationship. This will prevent you from having to process the entire cube should something in the referenced dimension change.

If, however, your cube is relatively small or your data warehouse is completely reloaded each night, I would probably recommend materializing the relationships because the increased time to process won't be that big of a deal since the cube is small and your query times will benefit because of the materialized relationship. You'll also be able to create aggregations since the relationship is materialized. Of course, the "it depends" applies differently to every situation.

Print
Categories: Miscellaneous
Rate this article:
No rating
DustinRyan

DustinRyanDustinRyan

Other posts by DustinRyan

Please login or register to post comments.