Tuesday, January 10, 2006

Time based UUID Generation Algorithm

We had a requirement recently that we should map files to UUIDs. This gives us the flexibility to refer to a file without using a name thereby enabling us to rename it. So we dug a lil bit on UUIDs. java.util.UUID is the UUID implementation in java and this is the RFC its linked to.

So basically a UUID (java.util.UUID) represents a 128-bit value. These bits are split as

32 bits time_low
16 bits time_mid
16 bits time_hi_and_version
16 bits clock sequence
48 bits node

Timestamp is a 60 bit value of the UTC as a count of 100-nanosecond intervals since 00:00:00.00, 15 October 1582.

Clock Sequence is used to help avoid duplicates that could arise when the clock is set backwards in time or if the node ID changes.The clock sequence MUST be originally (i.e., once in the lifetime of a system) initialized to a random number to minimize the correlation across systems. If the previous value of the clock sequence is known, it can just be incremented; otherwise it should be set to a random or high-quality pseudo-random value.

Node For UUID version 1, the node field consists of an IEEE 802 MAC address, usually the host address.For UUID version 3 or 5, the node field is a 48-bit value constructed from a name. For UUID version 4, the node field is a randomly or pseudo-randomly generated 48-bit value.

There are four different basic types of UUIDs: time-based, DCE security, name-based, and randomly generated UUIDs. These types have a version value of 1, 2, 3 and 4, respectively. Lets look at the time based UUID generation algo.

Time based UUID creation Algorithm

These are the steps as present in the RFC. All italics are my comments..

1) Obtain a system-wide global lock - How ? Simple use a java.nio.channels.FileLock ! This is what the java.util.logging framework uses to ensure log entries are not overwritten when used from multiple JVMs.
2) From a system-wide shared stable store (e.g., a file), read the UUID generator state: the values of the timestamp, clock sequence, and node ID used to generate the last UUID.
3) Get the current time as a 60-bit count of 100-nanosecond intervals since 00:00:00.00, 15 October 1582.
4) Get the current node ID.
5) If the state was unavailable (e.g., non-existent or corrupted), or the saved node ID is different than the current node ID, generate a random clock sequence value - If someone deletes the state store file, then since we start off with a random number we can still get a unique UUID.
6) If the state was available, but the saved timestamp is later than the current timestamp, increment the clock sequence value. - Ingenious !! This means if u revert back ur clock the UUID will still remain unique.
7) Save the state (current timestamp, clock sequence, and node ID) back to the stable store.
8) Release the global lock.
9) Format a UUID from the current timestamp, clock sequence, and node ID values.

The algorithm looks very simple and elegant, and though there are other ways of getting UUIDs this is the one that is easy to understand.

Subscribe to comments for this post

Friday, January 06, 2006

Effective Developer Productivity

Just came across this article on productive coding and moving forward

The gist of the article is all about how productive a developer is each day. Everyone of us knows we dont code for 8 hrs a day. When we are in the mood (or in the zone) we can get close to 5 hours of productive coding. And when we're not, ah well, let's admit it we generate a lot of internet traffic, go write some documents .. you get the idea !

And when we see that we're behind on the plan and are going to be in trouble pretty soon, we create a big imaginary boot to kick us out of the stupor and get back to work.

This piece of truth generates a lot of insights

1) Pair Programming - Each pair would generate more quality code per day than what they would have done individually. Why? Coz if one of the developer enters his zone and starts programming, he enthuses the other enuf to get him productive. Left alone the second developer would be well twiddling his thums, picking his nose and waiting for something interesting (like evening !) to happen.

2) Task Swaps - When I am in the zone, working furiously, and i get a small task to do in addition to what im doing, i lose it. Face it, you're in the zone, your manager walks in and asks you to monitor the build every hour or create some quality (CMM ??) documentation before the end of the day, and that task is enuf to break me out of the zone. Period. So Ensure when you're in the zone to a) disable mail alerts b) close all IM's c) not respond to any requests for squeezing in one teeny weeny bit of work into ur day.

3) Getting into the zone - Many a time, i've come to office and did nothing productive for the whole day and left at the end of the day. How do you ensure that you can get in atleast a few hrs of produtive work every day just to make sure you can satisfy the daily plan tracking manic manager ? Getting started on something is the hard part. Start on something small and easy so that a) on completing it you feel better b) more importantly, if u try anything big/hard in which you cant get something to work in the first half-hour, will end up getting u further away from the zone. Have a lot of small wins and a few hours later, you can bring on your big guns and start blazing away at those nasty bugs/functionalities.

Happy Coding !

Subscribe to comments for this post

Clicky Web Analytics