T-SQL Tuesday #001 – Dates and Time

Adam Machanic, who blogs at SQLBlog.com (lots of great sql bloggers over there) started a neat tradition: T-SQL Tuesday. Read his post here to see the rules and information but it’s basically a weekly (2nd Tuesday of each month) blog topic where we can all have a post about the same topic. I like this because it’s a great way to go in depth on a topic of interest to the community at large. I am not sure if the focus needs to remain on T-SQL or if we can bring up DBA topics, etc but I’ll find out if I get a chance to host the “event”.

I also really Like the numbering convention Adam used (“00n”), it shows he really wants to make this continue for awhile.

Dates and Times 101

 

This month’s topic is about dates and times. From Adam’s original post: “the topic for this month is Date/Time Tricks. Write a blog post that talks about dates and times–this can be based around T-SQL programming, data modeling, ETL, reporting, or whatever else you’re using dates and times for (and who isn’t?)”

Check back at his initial post to see all of the responses that come in on Tuesday. I will be visiting and expecting to learn a lot. I wanted to keep my response simple here so some of the true T-SQL gurus can have more of your time (and my time) when reading the posts.

So I am going to cover a couple basic points that have been on my mind for some reason or another recently…

We Are The World

Working for a global company with developers on three continents writing procedures and processes for users on at least 5 continents, dates and times can sometimes be confused. A lot of our databases are centrally hosted here in the states and tend to be in the default collation. So we don’t have a lot of language/special character issues but there are times when Dates can be confused.

Today is 8/12/2009. No, Today is 12/8/2009.

Who is right? Well really both answers are right depending on where you are and where you are from. To someone in the United States the second answer seems more right (Unless you are in the military and when using your pen, ball-point, black; you may be used to writing a date on a report as 08December2009). To someone in a lot of Europe the first may seem more right.

Keep this in mind when querying with dates.

The following date format will always be interpreted correctly by SQL Server regardless of your regional settings and collation. It is the ISO format for dates: YYYYMMDD. Today, by that standard, is 20091208.  When looking at the Cast/Convert books online topic and the chart, you’ll see this is format 112.

Format 126 (ISO 8601) is also recognized globally and it includes time. Check out that books online article and get to know and love these date formats, especially if you are working with folks across the globe.

You Don’t Have to Cluster the Primary Key!

 

This is something that should be common knowledge but through interviewing folks, working with developers and seeing newsgroup/forum questions, it still isn’t. The default behavior in SQL Server is to make all Primary Keys the clustered index. There is some merit there but it isn’t a requirement.

Sometimes a date column, even if the date is not very unique (perhaps not tracking times) or even in some cases where the “date” is really just an integer of YYYYMM. Can be a good clustered index in my experience.

Where? Mostly in data warehousing situations in my experience. Especially if folks are typically querying for a range of time, for transactions on a particular day or wanting to sort results by date. I find that in Fact tables a clustered index on something to do with the date can be helpful.

Why? How often do you query a Fact Table by a surrogate key? Yes some could argue well indexed dimensions and non clustered indexes on all the dimension keys in the fact table (perhaps on some commonly used measures) is enough. I find that the dates help for the reasons describe above on querying the data. It also helps when loading the warehouse if the dates come in in date order. You still have much of the benefit of an ever increasing clustered key reducing insert overhead from page splits,  and having to look all over. Yeah there will likely be some dupes, so there is a 4byte uniqueifier overhead in the clustered index. I would also imagine that when partitioning by date this clustered index strategy would make life a bit easier for you.

Of course, you should test in your environment and use what works for you. Empirical Evidence speaks volumes when trying scenarios out. Experiment with your indexes and see what works best for your work load, your insert behavior, your query behavior, etc.

Off To Read The Other Posts

 

Ok well I’m not really off to read them at this moment. I am writing this post on 20091204 and will, on 20091208, go and read other posts to learn a lot about dates and times. Check the original post for all the trackbacks of interesting date/time information.

 

Tags: , , , , ,

2 Comments

Leave a comment
  1. Fatherjack December 10, 2009 at 09:00 #

    Ummmm, if its the 2nd Tuesday of each month doesnt that make it monthly? ;P

    My preference when dealing with dates has been to fire them into a database in the format dd-mmm-yyyy (eg 08-Dec-2009, 12-Aug-2008) and leave the database engine decide how it wants them to be stored. Whenever I am working with developers I get them to code the applications to always display in this format and only accept user input in that format. Noone does the conversion to a ‘digital date’.

    Does this conflict with your advice and the use of the ISO standard? I have never experienced any problems with my solution but would happily revise my ideas if it is a risk..?

    Thanks for the post and the reminder about the Primary Keys, they so often are clustered its easy to fall in to the routine by default.

    Cheers

    Jonathan

  2. Mike Walsh December 10, 2009 at 13:06 #

    Hey Jonathan – Isn’t that funny. In a post dedicated to dates and times for a "blog party" (I don’t know what you call it in proper Blog etiquette) all about dates/times, I messed up that simple date period logic! Thanks for the catch and I’ll even leave it so folks reading the comments don’t think you are strange.

    The only problem I could see is that date format is affected by location/regional settings. Today is DEZember in Germany, for instance. If the servers language setting is US English then you are okay with someone using 10-DEC-2009 but then you run into the question of how folks input/how the apps work/etc.

    Specifying the month name, as you suggest, eliminates much of the problem with the DD-MM-YYYY|MM-DD-YYYY differences that divide the world. The ISO format (or the ODBC Canonical format) are both internationally liked and accepted regardless of code page on the server, regional or localization settings. But then there may need to be some converting/etc.

    So while I use it as a best practice, I look at the bigger picture… Is this app international? Will It be? If it is how will local settings/regional settings and multiple data sources affect it? I try to play to the widest audience with the dates and play it safe. That being said, I have also been involved in projects where we used the format you specify or something similar.

    The good news is that either approach is still stored as a datetime in the database so either format should have no impact on the use (or not) of an index.

Leave a Reply