Friday, February 8, 2013

Database handling for TimeZones

Today I’ve been working with Postgresql and MySQL trying to figure out how they handle date-time values and timezones. This is actually quite tricky, so I wanted to write it down for later. First, both databases have two different types of columns:
  • One that stores the date-time without any time zone information
  • One that stores the date-time with time zone information
For Postgresql, these are:
  • timestamp without time zone
  • timestamp with time zone
  • respectively.
    For MySQL, these are:
    • datetime
    • timestamp
    There are a number of things to consider when dealing with date-time values:
    • What if the timezone of the server changes?
    • What if the server moves physical locations thereby indicating a new time zone?
    • What if the timezone of the database server changes (different than the timezone of the server)?
    • How does the JDBC driver handle timezones?
    • How does the database handle timezones?
    To figure all of this out, it is important to understand how the date-time value goes from the application, to the database, out of the database and back to the application (for inserts and later selects). Here is how this works for values without timezone information:
    Insert
    1. Java creates a java.sql.Timestamp instance. This is stored as UTC
    2. Java sends this value to the JDBC driver in UTC
    3. The JDBC driver sends the value to the database in UTC
    4. The database converts the date-time from UTC to its current timezone setting
    5. The database inserts the value into the column in the current timezone (NOT UTC)
    Select
    1. The database selects the value from the column in the system timezone
    2. The database converts the value to UTC
    3. The database sends the UTC value to the JDBC driver
    4. The JDBC driver creates a new java.sql.Timestamp instance with the UTC value
    For this scenario, you will run into major issues if the the server or database timezone changes between Insert #5 and Select #1. In this case, the value will not be correct. The only way to make things work correctly for this setup is to ensure no timezone settings ever change or to set all timezones to UTC for everything.
    For the types that store timezone information, here is how the data is passed around:
    Insert
    1. Java creates a java.sql.Timestamp instance. This is stored as UTC
    2. Java sends this value to the JDBC driver in UTC
    3. The JDBC driver sends the value to the database in UTC
    4. The database calculates the offset between the current timezone and UTC
    5. The database inserts the value as UTC into the column along with the offset it calculated
    Select
    1. The database selects the value from the column in UTC along with the offset
    2. The database sends the UTC value to the JDBC driver
    3. The JDBC driver creates a new java.sql.Timestamp instance with the UTC value
    Since the JDBC driver only handles UTC values, there is no potential for data mangling here since everything is stored in UTC inside the database. Although the database stores the timezone information along with the date-time value in UTC, it isn’t ever used because the JDBC driver doesn’t care what the original timezone was and ignores that value.

No comments:

Post a Comment