Business Intelligence Blogs

View blogs by industry experts on topics such as SSAS, SSIS, SSRS, Power BI, Performance Tuning, Azure, Big Data and much more! You can also sign up to post your own business intelligence blog.

«October 2015»

Data Warehouse from the Ground Up at SQL Saturday Orlando, FL on Oct. 10th

SQL Saturday #442SQL Saturday #442 is upon us and yours truly will be presenting in Orlando, Florida on October 10th alongside Mitchell Pearson (b|t). The session is scheduled at 10:35 AM and will last until 11:35 AM. I’m very excited to be presenting at SQL Saturday Orlando this year as it’ll be my first presenting this session in person and my first time speaking at SQL Saturday Orlando! If you haven’t registered yet for this event, you need to do that. This event will be top notch!

My session is called Designing a Data Warehouse from the Ground Up. What if you could approach any business process in your organization and quickly design an effective and optimal dimensional model using a standardized step-by-step method? In this session I’ll discuss the steps required to design a unified dimensional model that is optimized for reporting and follows widely accepted best practices. We’ll also discuss how the design of our dimensional model affects a SQL Server Analysis Services solution and how the choices we make during the data warehouse design phase can make or break our SSAS cubes. You may remember that I did this session a while back for Pragmatic Works via webinar. I’ll be doing the same session at SQL Saturday Orlando but on-prem! ;)

So get signed up for this event now! It’s only 11 days away!

Read more

Create Date Dimension with Fiscal and Time

Here are three scripts that create and Date and Time Dimension and can add the fiscal columns too. First run the Dim Date script first to create the DimDate table. Make sure you change the start date and end date on the script to your preference. Then run the add Fiscal Dates scripts to add the fiscal columns. Make sure you alter the Fiscal script to set the date offset amount. The comments in the script will help you with this.

This zip file contains three SQL scripts.

Create Dim Date

Create Dim Time

Add Fiscal Dates

These will create a Date Dimension table and allow you to run the add fiscal script to add the fiscal columns if you desire. The Create Dim Time will create a time dimension with every second of the day for those that need actual time analysis of your data.

Make sure you set the start date and end date in the create dim date script. Set the dateoffset in the fiscal script.

Download the script here:


Read more

Hello Operator: Key Lookup

  • 11 June 2013
  • Author: Kathi Kellenberger
  • Number of views: 5088

The T-SQL Tuesday topic this month, “Hello Operator”, is about execution plan operators. I would like to talk about the Key Lookup Operator.

So, the first thing you may be wondering about is if it is good or bad to see the Key Lookup operator in your plan. Well, that is one of those “it depends” kind of questions. Sometimes it is OK, and sometimes it is very, very bad.

Let’s take a look at why the Key Lookup operator would be used. Say you have a query with a WHERE clause.  In this case there is a nonclustered index the optimizer determines can be used to locate the rows. One of the benefits of nonclustered indexes is that, if all the columns required for the query results are part of the nonclustered index, the underlying table does not have to be accessed.  This is also called a "covering" index.

I’ll steal a great analogy (or is it metaphor?) from my friend, Jason Strate. Travel back in time when the public library catalog was paper based. To find a book, you looked at cards in cabinets. If you wanted to find the names of all the books written by Stephen King, you could get that information from the catalog. If you wanted to actually read one of those books, you would have to walk to the appropriate shelf to get the book.

SQL Server works the same way. If all the information required for the query is contained within the nonclustered index, it is not necessary to access the actual table. If some of the required columns are not found in the nonclustered index, SQL Server will have to pull data from some of the pages of the table. To see this in the execution plan, look for the Key Lookup operators.  If the nonclustered index is very selective and there are just a handful of rows returned, this is not so bad. But if the index is not selective or there are hundreds of thousands of rows, then this is bad. Sometimes it would be more efficient to just scan the clustered index or heap instead of using the index.

Here is an example using the AdventureWorks database. The Sales.SalesOrderDetail table has a nonclustered index on ProductID, but the index doesn’t contain the OrderQty column.

SELECT OrderQty, ProductID
FROM Sales.SalesOrderDetail
WHERE ProductID = 897;

This query returns two rows, but still 98% of the effort is expended by the Key Lookup. If I change to a ProductID that returns many more rows, the optimizer might do a  clustered index scan instead along with an index suggestion.

SELECT OrderQty, ProductID
FROM Sales.SalesOrderDetail
WHERE ProductID = 707;

The two queries are ad-hoc queries. If the queries were instead parameterized, such as within a stored procedure, we now have a problem called “parameter sniffing”. The optimizer will store the plan for the first time the proc is called and use that plan even if it would be better to use a different plan. The next example demonstrates this.

 --create the proc
CREATE PROC usp_GetQty @ProductID INT AS
 SELECT OrderQty
 FROM Sales.SalesOrderDetail
 WHERE ProductID = @ProductID;
--Call with 897
EXEC usp_GetQty @ProductID = 897;

The call to the stored procedure uses the Key Lookup. That is what we expect since the Key Lookup is efficient when only two rows are returned. Now call the proc with 707.

EXEC usp_GetQty @ProductID = 707;
In this case, the optimizer reuses the original plan, which is not the best option for a ProductID of 707. Using the SET STATISTICS IO option, we can compare the IO required to return the rows with the two different methods.

SELECT OrderQty, ProductID
FROM Sales.SalesOrderDetail
WHERE ProductID = 707;
EXEC usp_GetQty @ProductID = 707;

How can we solve this problem? There are a few ways to solve it involving query hints, but in this case, I would just add the OrderQty column to the ProductID index as an included column, especially if this query runs frequently. This will eliminate the need to get the OrderQty values from the table. The index suggested  with the clustered index scan should not be used. This suggestion will actually create a duplicate index, but that is a story for another day...

NOTE: There is also an operator call RID Lookup that is used in the same way with HEAPS.





Categories: SQL Server
Rate this article:
No rating

Please login or register to post comments.