Date Dimension and Keys

Who is online?  0 guests and 0 members
Home  »  Forums   »  microsoft business intelligence   »  data warehouse design   » Date Dimension and Keys

Date Dimension and Keys

Topic RSS Feed

Posts under the topic: Date Dimension and Keys

Posted: 5/9/2011

Jedi Youngling 98  points  Jedi Youngling
  • Joined on: 5/24/2010
  • Posts: 39

According to Kimball the date dimension should have a smart date key which is an integer formatted like so: YYYYMMDD. It supports easy partitioning and only requires 4 bytes of storage. Date Time Datatype requires 8 bytes. 

 

However, SQL 2008 supports a Date Datatype which requires 3 bytes of storage. Some Kimball folks indicate that you should still use a INT key because it makes it easier to designate erroneous or missing date values from the source. 

 

Has anyone tested the join performance of INT vs Date datatype keys? 

 

What are you folks using right now? 


Posted: 5/9/2011

Padawan 317  points  Padawan
  • Joined on: 12/11/2009
  • Posts: 63

Actually, it's R2 that supports a Date datatype, that feature was ripped out of SQL 2008. I haven't tested it yet, but for most people the INT type will still make sense until they're up on R2.  I haven't tested the performance, but it seems like it would be more efficient in the long run, if only by a small amount. Really wish we could take advantage of that change, but it's just not to be for some time over here. :(

As for why you would still use an INT, I don't quite get that one. It's still a sequential set of dates, but using DATE would actually be constrained to real dates as opposed to someone chucking an extra digit into the INT range, thus making it an invalid date.


Posted: 5/9/2011

Jedi Knight 2333  points  Jedi Knight
  • Joined on: 2/19/2010
  • Posts: 312

All my dimensions have INT data types for their keys.  I have a key in all my dimensions (including my Date dimension) that represents an unknown value.  I find this useful for a variety of reasons.  How would I do this if I used the Date data type as my Date Dimension Key?  This is why I think I will stick with INT.


Posted: 5/9/2011

Jedi Youngling 98  points  Jedi Youngling
  • Joined on: 5/24/2010
  • Posts: 39

What value are you using for your unknown value anyway? I typically use 19000101 which is also acceptable as a Date datatype value as well. 


Posted: 5/9/2011

Padawan 317  points  Padawan
  • Joined on: 12/11/2009
  • Posts: 63

OK - quick note. I'm apparently wrong about DATE and TIME not being in SQL 2008, but I really could have sworn they were not in the RTM version. Anyone know if these were added in a service pack or CU? I'll definitely be checking these out going forward now that I know they exist for 2008.  In the meantime, we're using INT values because that's our current standard and I don't know that we'll refactor this anytime soon.


Posted: 5/9/2011

Jedi Knight 2333  points  Jedi Knight
  • Joined on: 2/19/2010
  • Posts: 312

I use an INT value of zero for my Unknown Date ID.


Posted: 6/13/2011

Jedi Youngling 10  points  Jedi Youngling
  • Joined on: 3/25/2010
  • Posts: 5

Haven't used Date Datatype Keys. I use INT for all my Keys and Zero to represent the unknown value. INT keys for date (YYYYMMDD) have worked great till now. They are also helpful when writing ad-hoc SQL queries on the DWH and trying to filter across date dimension because of the simplicity in which they are stored.

On a side note for unknown values - For some other dimensions, we have used multiple unknown rows. For example, we want to clearly distinguish between unknown values, blank values, pending values and so forth in the dim tables. For some business these values mean very different things and while reporting they help us identify the incoming data and the transformed data very easily.


Page 1 of 1 (7 items)