It’s not you; it’s me.
Sometimes dealing with data, especially data you are not familiar with, feels like detective work. I ran into two situations lately that took a bit of digging to figure out what was going on. I often see people blaming SQL Server for something that isn’t going right. Probably 99.999% of the time (let’s face it, no software is perfect) there is something that the DBA or developer is doing wrong or looking in the wrong place. Often they think a view has a different definition than it really does or a connection string is really pointing to a different server than they think it is.
As a consultant, I work for many different companies and many different industries on mostly short term projects. When I was a DBA at one place for eight years, I knew the databases I supported very well. In the consulting arena, I go in not understanding the customer’s databases, processes or business rules.
Recently a colleague asked me to take a look at an execution plan for a query that joined two tables (heaps) and to figure out why SQL Server was doing a table scan on both of the tables in the join when there was a covering index for the query on each table. The query was taking several minutes to run with no results before it was killed.
The tables were in a staging database as part of loading a data warehouse. This reason is often cited for using a heap since it might load a large amount of data faster than a clustered index. I could understand why a table scan would be used if one of the columns were missing from the index because of the number of rows to be returned, and RID lookups would be extremely expensive.
Since we were working with a local copy of the data, we decided to change the indexes in question to clustered indexes. Surprisingly, the execution plan now chose to scan one of the non-clustered indexes that did not even include the columns involved in the queries. Again, we killed the query after several minutes.
Digging a bit deeper, we realized that the natural primary key, also the keys in the join predicate, was NULL in every row. There was a step in the process that needed to run before attempting this query. So, the index we were looking at was just a big mess of NULL values – useless! After running the process to fill in the NULL values, the query behavior and execution plan lined up to what we expected to see.
A second example came a couple of days later. A new column was added to one of the tables, and I wrote an update statement to populate it. I ran the statement and saw a large number of rows were affected. Thinking everything was fine, I sent the statement to my colleague. He messaged me back saying that the column we were trying to update was still filled with NULL values! After reviewing my code and deciding that it was correct, I ran it again with the same results. What could keep a table from being updated? A policy? A trigger? Realizing that a trigger was the simplest and most likely answer I looked. Sure enough, an “instead of” trigger was in place that was keeping the new column from being updated.
So, even when you think SQL Server is doing something odd, it’s probably not SQL Server. The moral of the story is to keep digging or get a second pair of eyes to take a look with you.