Changes are A’comin’!
Over the last couple of days you may have noticed some slight changes on my blog. Well, believe you me when I say more are coming! You may be asking yourself, “Self, what are all these changes for? Dustin’s blog was already great! Why mess with a good thing?” That’s a good question, and one I hope to answer through this blog post.
Changes to my Blog
My WordPress blog site will be getting an upgrade! My current blog layout has served me well and I have always liked the look and feel, but I think I could do with a better layout that makes the content more easily accessible to the reader. Because of this, I’ll be meeting with the marketing team at Pragmatic Works to have them give my blog and social media outlets a look and see what improvements I can make.
You may have also noticed that I’ve added my speaking calendar to my blog. At the top of my blog you’ll find a link to Dustin’s Calendar, as well as a smaller version of the calendar in my blog’s right sidebar. The purpose of the calendar is to make my speaking schedule more available to you. I love to have interaction with the audience and readers of my blog when I’m out speaking at SQL Saturday and other PASS events so hopefully I’ll be able to increase my availability and interaction with you through the calendar.
I’m also looking at creating some new and in-depth content for the blog including some new styles of posts and whitepapers, so definitely stay on the lookout for that material. If you have some great ideas of new content or style in a specific area of the SQL Server world send me a message! I’d love to hear from you!
For the longest time, the only social media avenue I’ve really leveraged semi-professionally was my Twitter account. Sometimes I tweeted technical content, links to my blog, or interesting articles, but other times I published tweets unrelated to SQL Server. Going forward, I’ll be using my @SQLDusty Twitter account strictly for professional purposes.
I’ve always loved Twitter because it provides you and me with a unique opportunity to connect, network, and gain access to incredibly qualified experts, professionals, and celebrities in our field whom, without Twitter, we would never reach outside of an expensive conference or training event. Almost anyone will respond to a 140-character message.
Three SSAS Best Practices to Follow
A lot of work goes into performance tuning a SQL Server Analysis Services solution for a client. And even though many steps may be involved in identifying performance and management issues with a large SSAS solution, there are a few things we can quickly check and implement for some quick wins in cube performance. In this post, I’d like to point out three best practices we can follow to improve performance and create a more positive experience for our users. These are not meant to be the definitive top three, but rather three (among many) very important best practices you should follow. I believe following these three will make a difference in your solution.
Create Hierarchies with Attribute Relationships
In my opinion, creating natural hierarchies is the single most beneficial thing an SSAS developer can do to improve the performance and usability of a cube. There are several reasons correctly defined user hierarchies are beneficial, but here are a couple of the top ones.
Increased Query Performance
Creating attribute relationships between the attributes included in a user-defined hierarchy improves the performance of queries that use those attributes, for a couple of different reasons. In this case, we’ll look at the Calendar hierarchy in the Adventure Works Date dimension. The Calendar hierarchy has five levels, with the Calendar Year attribute at the top level and the Date attribute at the bottom. We can also see the attribute relationships created to give SSAS an understanding of how the members of these attributes relate to one another. Once created, these relationships give SSAS the ability to understand that a given date member is related to a single month member, a given month member relates to a single quarter, and so on.
It also means that during processing, special indexes are created that map the relationships between each member of each level in the hierarchy. So before a query is ever written, SSAS already knows to which month, quarter, semester, and year each date aggregates. These indexes are only created if you correctly define the attribute relationships between the attributes in the hierarchy.
Improved User Experience
User defined hierarchies also improve the user’s experience with the dimension because
TSQL Script to Find Foreign Key References to a Given Column
It’s always kind of a pain to have to hunt down all the foreign key references to a column so you can address any issues. So I put this script together (based on a script found on StackOverflow) to help me find all the information related to foreign keys referencing a particular column in a specified table. I’m mostly posting this for my own reference later and for anyone else who may find it useful, so enjoy!
-- Find all foreign key constraints that reference a given column in a given table
SELECT OBJECT_NAME(f.object_id) AS ForeignKeyConstraintName,
       OBJECT_NAME(fkc.referenced_object_id) AS ReferencedTableName,
       COL_NAME(fkc.referenced_object_id, fkc.referenced_column_id) AS ReferencedColumnName
FROM sys.foreign_keys AS f
INNER JOIN sys.foreign_key_columns AS fkc
    ON f.object_id = fkc.constraint_object_id
INNER JOIN sys.tables AS t
    ON fkc.referenced_object_id = t.object_id
WHERE OBJECT_NAME(fkc.referenced_object_id) = 'your table name'   -- the referenced table
  AND COL_NAME(fkc.referenced_object_id, fkc.referenced_column_id) = 'your key column name'   -- the referenced column
Here’s a picture of what the results look like. In case you were wondering, I ran this query against the ReportServer database used by SSRS.
If you want to find every foreign key in your database, just eliminate the WHERE clause to bring back all the FKs. Hopefully you found this as useful as I did.
What’s New in SQL Server Analysis Services 2016?
There’s a load of new features included in the release of SQL Server Analysis Services 2016 CTP2. I’m pretty excited about these changes, and while they’ve been public for a while now, I’d like to share my thoughts. Note that the CTP2 release does not include all the enhancements coming to SSAS 2016, and these enhancements are subject to change. You can read about the enhancements here.
Parallel Partition Processing (Tabular only)
Holy smokes, it’s about time! Prior to SSAS 2016, table partitions in SSAS Tabular were only processed serially. Partitioning a table in a Tabular model gave us no processing performance gain aside from the ability to process only the partitions with changed data. But gone are those days! Now partitions are processed in parallel, giving us a great increase in processing performance.
DBCC XMLA Command (Tabular & Multidimensional)
With the new DBCC XMLA command, we can check for database corruption in our Tabular and Multidimensional databases. This has been a need for a while, so it will be nice to have another tool in our belt to assist in diagnosing corruption issues.
New DAX Functions (Tabular, duh)
There are a ton of new DAX functions coming with the enhancements to SSAS. This may be silly, but the new DAX functions I’m probably most excited about are CALENDAR and CALENDARAUTO. The CALENDAR function returns a single column called Date containing a contiguous set of dates based on the start and end date parameters you specify. CALENDARAUTO does much the same thing, except the date range is derived from the data in the model.
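As a quick sketch of how these might be used to build a date table (the table names here are just illustrative, not from any real model):

```dax
-- Calculated table: a contiguous set of dates for calendar year 2016
DateTable = CALENDAR ( DATE ( 2016, 1, 1 ), DATE ( 2016, 12, 31 ) )

-- Calculated table: date range inferred from the dates in the model;
-- the argument (6) sets the fiscal year end month
FiscalDateTable = CALENDARAUTO ( 6 )
```

Either one gives you a proper date table you can mark as such and relate to your fact tables.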
There are some other enhancements included in the SSAS 2016 CTP2 release that you can read about here, but there’s also a load of other changes coming that aren’t mentioned at that link but were announced at Microsoft Ignite this year. These coming enhancements have been covered extensively in other blogs like this one and this one, but I figured I’d discuss my favorites.
Many to Many Relationship Support (Tabular)
Previously, many to many relationships had to be handled with DAX, which was always kind of a pain in the rear, but now many to many relationships will be supported natively. Yay!
Time Intelligence (Tabular)
Of course you could always write your own time intelligence calculations with DAX but now it’ll be easier. Pretty sweet.
Distinct Count Measure Group Performance (Multidimensional)
We always had to go through loads of work to optimize Distinct Count measure groups, but with the release of SSAS 2016 we can look forward to increased Distinct Count performance.
I hope you found this information useful. What enhancement are you most looking forward to?
Make a Difference as a Thought Leader: A Book Review
If you’ve ever spoken at an event like a SQL Saturday or Code Camp, answered a question on a forum, written a blog post, or helped lead a discussion at your place of work then on some level you are already a Thought Leader. If you’re asking what a Thought Leader really is, you’re probably a lot like I was when I first started reading Denise Brosseau’s Ready to Be a Thought Leader. A Thought Leader is someone who drives thinking and learning in a particular industry, group, or profession. These people are widely recognized as experts and authorities on their subject matter and as go-to people for learning and insight into their field. Thought Leaders are men and women who take the time to increase their knowledge, share what they’ve learned, and make a difference in the lives of others in their niche. Ready to Be a Thought Leader lays out a seven-step pattern, instructing the reader on how to become an innovative, forward-thinking, cutting-edge Thought Leader.
Why should you want to be a Thought Leader?
But before I tell you what this book is about and why I think you should read it, why would you want to be a Thought Leader? ThoughtLeadershipLab.com (Denise Brosseau’s website) defines Thought Leaders as “…the informed opinion leaders and the go-to people in their field of expertise. They are trusted sources who move and inspire people with innovative ideas; turn ideas into reality, and know and show how to replicate their success.” Thought Leaders are leaders and innovators! As a Thought Leader, you have the opportunity to make a difference in the lives of people in your field, around you, and around the world. Becoming that innovative, encouraging, leading person in a particular area gives you a unique level of credibility that opens doors to opportunities to truly make a difference and inspire others in a special way. This credibility also has a profoundly positive impact on your career, business, and life.
Learn Designing a Data Warehouse from the Ground Up at SQL Saturday 442 Orlando, FL
I’m very excited to announce that I’ve been selected to present a session entitled Designing a Data Warehouse from the Ground Up on October 10, 2015 in Orlando, Florida at Seminole State College of Florida! I’ll be presenting this session for the first time alongside Mitchell Pearson (b|t). This is going to be an amazing event with tons of amazing, FREE training for everyone, from SQL Server newbies on up to those who have been in the profession for years.

My session, Designing a Data Warehouse from the Ground Up, is designed to be an introductory-level session that will teach you the basics of designing the perfect dimensional model. Designing a data warehouse is very different from designing a transactional system because the purposes of the two types of databases are very different. You’ll learn to approach any business process in your organization with a methodical, step-by-step approach in order to design a great data warehouse. As we walk through the steps of designing a data warehouse, we will discuss how our design decisions can affect any potential SSAS solutions we may develop in the future.
I’m very excited about presenting this material and I hope you’ll register for SQL Saturday 442 in Orlando, FL and come visit with me! This will be a great opportunity to receive free training from some of the sharpest experts in the industry on everything from database administration and SSIS development to Power BI, SQL Server 2016, Big Data, Azure, and more.
3 SSAS Dimension Design Best Practices to Live By
After spending the last five years or so designing and performance tuning SSAS cubes and dimensions for the clients of Pragmatic Works, I’ve noticed there seems to be a recurring pattern with poorly designed dimensions. Because of this, I wanted to take a few minutes to point out three easy things you can do to improve the performance and usability of your solution’s dimensions.
Remember Less is More
One of the most common mistakes I see with poorly performing dimensions is large, bloated dimensions with attributes that are duplicated, unnecessary, and/or unused. These extra dimension attributes can seriously increase the amount of time required to process the dimension. It’s important to remember that for each attribute in a dimension, a SELECT DISTINCT query is executed against the data source. Depending on the number of rows in the table, the number of unique values in the field, and other factors, processing time will be negatively impacted by each attribute added to the dimension.
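To make that concrete, during dimension processing SSAS issues a query along these lines for each attribute (the table and column names here are illustrative, not from any particular model):

```sql
-- Roughly the kind of query SSAS generates to process a single attribute;
-- one query like this runs for every attribute in the dimension
SELECT DISTINCT EmployeeName
FROM dbo.DimEmployee;
```

Multiply that by dozens of attributes on a multi-million-row dimension table and it’s easy to see where the processing time goes.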
I once worked for a client that had over 100 SSAS databases and a very mature BI environment, but they had allowed the users to talk their team into some poor dimension design decisions. One of their largest cubes featured an Employee dimension with hundreds of thousands of members. The Employee dimension contained an unnecessary number of attributes, many of which contained the same data as other attributes or went unused. For example:
- Employee Name – Employee Number (ex. John Smith – 0004567)
- Employee Number – Employee Name (ex. 0004567 – John Smith)
- Employee Number Name (ex. 0004567/John Smith)
- Employee Name Number (ex. John Smith/0004567)
- Employee Number (ex. 0004567)
- Employee Name (ex. John Smith)
The reason for the duplication was that the dimension was leveraged by many different business groups. These users wanted the data formatted in ways that mirrored their previously used legacy systems. The same problem persisted in other dimensions in the cube as well. As the data volume grew, more attributes were created and more measure groups were added, causing processing time to increase to unacceptable levels. Additional steps were necessary to performance tune the dimensions by eliminating the unused and unnecessary attribute hierarchies.
A better design strategy would have been to call a meeting of the invested business groups in order to come to a consensus on how the Employee Name and Employee Number attributes should be displayed, with the purpose of cutting back on the number of attributes per dimension. Doing so would decrease processing time for the dimension and also provide better query performance, since the query cache can be better utilized with a smaller number of attributes.
Use Member Properties
My Top Four Books for the MS Business Intelligence Professional
As a Business Intelligence Consultant, I do a decent amount of speaking and interacting with the community, and I have written and contributed to a few SQL Server books. A question I’m often asked is whether I can recommend any good books, which brings me to this blog post. I wanted to make you aware of four books for learning data warehousing and other MS BI technologies that I’ve found incredibly helpful over the years I’ve spent designing and implementing enterprise data warehouse and business intelligence solutions.
Ralph Kimball & Margy Ross
In my opinion this should be required reading for everyone working in the business intelligence field. This book is money! The first five chapters are amazing if you need to learn how to design a dimensional model from scratch. The book is a (surprisingly) very easy read, and the concepts are simple to follow because the examples used by the authors are simple. For example, the first case study is a retail business, which most everyone can wrap their head around with little effort. Another really nice thing about this book is that it’s written from a technology-agnostic standpoint, so you don’t have to worry about getting lost in the minutiae of any particular technology you may or may not be familiar with. Although this list is in no particular order, The Data Warehouse Toolkit would be number one. :)
Irina Gorbach, Alexander Berger & Edward Melomed
Whether you’re a beginner or an SSAS pro, this book is superb. If you’re new to SSAS and need help getting your head wrapped around the basics of SSAS cube space and the difference between sets and tuples,
Excel Services in Power BI
**Information current as of 7/22/2015**
There’s lots of exciting news this week with Power BI so expect several short blog posts detailing each. Here’s one you can try now that I was pretty excited to see.
So here’s the news. You can now expose standard Excel workbooks as another asset in your reports library in Power BI. When you select your workbook, a link is created that launches Excel Services to visualize the workbook.
- The workbook must be stored in OneDrive for Business, which of course not everyone has.
- Doesn’t integrate with other Power BI visualizations. I’d love to see this in the future. PerformancePoint allowed visualizations from Excel Services and its own proprietary visualizations to work together, and I would love to see that again here. Currently it just launches Excel Services to the workbook.
How it Works
- If you haven’t already, you’ll need to sign up at PowerBI.com.
- Once you’re signed up, sign in at PowerBI.com.
- To get started select Get Data.
- Select that you want to import data from the Files option.
- Select OneDrive – Business. Again, OneDrive for Business is currently the only way to do this.
- Select the file you want to import and then click Connect.
- Next you’ll choose whether you want your workbook to be a data source for a Power BI dashboard or whether you want to bring your Excel workbook in as an Excel Services report. This blog is all about demoing t
Learning T-SQL day 2
Day 2 of losing my SQL sanity…
Okay, I might be exaggerating a tiny bit. To be completely honest, it still isn’t too bad. My teacher, Chris, says we still haven’t gotten to the fun part: joining and modifying tables.
Before I truly begin, you need to know a few things: “Select”, “From”, “Top”, “Order By”, and “Distinct”. I know you’re asking, “Water Thooose?”, and the answer is: Google it. These are really simple keywords that I’ll actually explain so you don’t have to do any work :) You’re welcome! “Select” defines which columns one wants to call upon. “From” defines which table you want your query to run against. “Top” limits the results to the first so many rows; it may be the top 100, the top 1000, etc. “Order By” sorts the data based on the user’s requirements. “Distinct” takes data that is shown multiple times and smashes it into one result so that there are no repeats. Now we can start with today’s lesson.
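Here’s a tiny sketch of those keywords working together (the table and column names are made up for illustration):

```sql
-- The first 10 distinct cities from a hypothetical Customers table,
-- sorted alphabetically
SELECT DISTINCT TOP 10
    City
FROM dbo.Customers
ORDER BY City;
```

SELECT picks the column, FROM picks the table, DISTINCT removes the repeats, TOP keeps only the first 10 rows, and ORDER BY sorts them.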
Today, we learned of things such as “Where” Clauses (Hohoho!), Complex “Where” Clauses and Sub-Queries. Now that you have a basic idea of what I learned, let’s get to know more about these SQL subjects.
“Where” Clause: What, When, Where, Who, Why? What is the “where” clause? It is a way of specifying the results one is trying to call upon. When do I use the “where” clause? One would use it when trying to find a specific result, such as “Where Kittens = ‘Too many’”, which would look in a column called “Kittens” and return the rows whose value is “Too many”. Where does this clause go? After the “From” clause. Who actually uses this clause? Anyone who has ever been or will ever be involved with T-SQL. Why should I use this clause? I would tell you, but I feel that it’s relative. Trust me, you will use this clause.
There are Conditions, and Operators, and Wildcards, and all kinds of fun stuff used alongside the “Where” clause. If you want to specify your “Where” even more, you are going to have to use operators such as ‘=’, ‘!=’, ‘<’, ‘<=’, ‘>’, ‘>=’, ‘<>’, and ‘BETWEEN’. For the operators, you’re actually going to have to look them up if you want to know what they all mean and do. “Wildcards” are fun and let you match patterns instead of exact values. One would usually use them to find a value that starts with a certain character when they don’t care how it ends. There is also a “Not” keyword, which does exactly what you think: it does the opposite of what you ask. For example, “where NOT (FirstName = ‘John’)”.
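A small illustration pulling operators, NOT, and a wildcard together (again, the table and columns are made up):

```sql
-- Customers whose first name starts with 'J' (the % wildcard matches
-- any ending), excluding anyone named John, with an ID in a given range
SELECT FirstName, LastName
FROM dbo.Customers
WHERE FirstName LIKE 'J%'
  AND NOT (FirstName = 'John')
  AND CustomerID BETWEEN 100 AND 200;
```

The ‘%’ wildcard in the LIKE pattern is what makes “starts with J, ends with anything” possible.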
“Null”… This statement could probably have its own blog. Null is not zero. “Zero” is a value. Null is not an empty cell. Empty cells have a value. Null is literally nothing. Null has no value. Don’t confuse it with a zero or an empty cell.
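Because NULL has no value, you can’t compare it with ‘=’. A quick sketch of the right and wrong way (table made up):

```sql
-- Find customers with no recorded phone number.
-- WHERE Phone = NULL returns nothing, because NULL is never equal to
-- anything, not even another NULL. Use IS NULL / IS NOT NULL instead.
SELECT FirstName, LastName
FROM dbo.Customers
WHERE Phone IS NULL;
```

That IS NULL / IS NOT NULL distinction trips up nearly everyone at least once.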
Power BI and Big Data
If you’ve worked in the wide and diverse field of information technology for almost any amount of time, it probably hasn’t taken you long to discover that the one thing constant about IT is that the technologies and strategies involved change faster than you can learn them. And if you work in business intelligence like I do, you don’t have to look very far at all to see change. The Microsoft Power BI team rolls out a software update every month! If I want to stay learned up on the technology, I have to really be on top of things.
About ten years ago, when Hadoop was first being developed at Yahoo, I don’t think anyone could have anticipated the size of the ripples (more like cannonball-sized splashes) that access to Big Data could and would have on the IT industry. Hadoop (and other advances in hardware and software technologies) gave us something we never had before: the ability to access and report on data in real time on a scale never previously imagined, allowing an organization to identify and understand trends and patterns in its data and gain previously unknown insights. The organizations that are able to leverage big data will be the organizations that leave their competition in the dust.
Set Up and Configure the Hortonworks Sandbox in Azure
Not only does Power BI Desktop give us the ability to connect to the Hadoop Distributed File System (HDFS) for reporting, but we can also mash that data up with other, more traditional and structured data sources with minimal effort. But that’s not what this blog post is all about. This post is about setting up a virtual machine in Azure running Hadoop and connecting to our Hortonworks Sandbox with Power BI Desktop :).
The first thing to do, if you don’t have access to a Hadoop cluster, is to set up the Hortonworks Sandbox on Azure. The good news is it’s free (for the duration of the trial) and it’s super easy. Just follow the instructions at this link to set up the Hortonworks Sandbox.
Once that’s set up, you’ll need to add a mapping for the IP address and host name to your hosts file. Devin Knight has a blog post on this that you’ll find helpful.
Connecting to Hadoop with Power BI Desktop
Once your Hortonworks Sandbox is set up, you’re ready to set up your connection to Hadoop with Power BI Desktop. Start up Power BI Desktop and click Get Data. Scroll down, select Hadoop File (HDFS), and click Connect.
From there you can follow the rest of the wizard to load the data into the semantic model.
Once the data is loaded, you’ll need to modify the query to navigate to the data you wish to use in your model.
In Power BI Desktop, go to the Home ribbon and click Edit Queries.