activerecord - Is COMB GUID a good idea with Rails 3.1 if I use GUIDs for primary keys? -
i'm using rails 3.1 postgresql 8.4. let's assume want/need use guid primary keys. 1 potential drawback index fragmentation. in ms sql, recommended solution use special sequential guids. 1 approach sequential guids combination guid substitutes 6-byte timestamp mac address portion @ end of guid. has mainstream adoption: combs available natively in nhibernate (nhibernate/id/guidcombgenerator.cs).
i think i've figured out how create comb guids in rails (with of uuidtools 2.1.2 gem), leaves unanswered questions:
- does postgresql suffer index fragmentation when primary key type uuid?
- is fragmentation avoided if low-order 6 bytes of guid sequential?
- is comb guid implemented below acceptable, reliable way create sequential guids in rails?
thanks thoughts.
create_contacts.rb
migration
class createcontacts < activerecord::migration def create_table :contacts, :id => false |t| t.column :id, :uuid, :null => false # manually create :id underlying db type uuid t.string :first_name t.string :last_name t.string :email t.timestamps end execute "alter table contacts add primary key (id);" end # can't use reversible migration because try run 'execute' again def down drop_table :contacts # drops primary key end end
/app/models/contact.rb
class contact < activerecord::base require 'uuid_helper' #rails 3 not autoload lib/* include uuidhelper set_primary_key :id end
/lib/uuid_tools.rb
require 'uuidtools' module uuidhelper def self.included(base) base.class_eval include instancemethods attr_readonly :id # writable on new record before_create :set_uuid end end module instancemethods private def set_uuid # ms sql syntax: cast(cast(newid() binary(10)) + cast(getdate() binary(6)) uniqueidentifier) # current time object utc_timestamp = time.now.utc # convert integer milliseconds: (seconds since epoch * 1000) + (6-digit microsecond fraction / 1000) utc_timestamp_with_ms_int = (utc_timestamp.tv_sec * 1000) + (utc_timestamp.tv_usec / 1000) # format hex, minimum of 12 digits, leading zero. note 12 hex digits handles year 10889 (*). utc_timestamp_with_ms_hexstring = "%012x" % utc_timestamp_with_ms_int # if supply uuidtools mac address, use rather retrieving system. # use regular expression split array, insert ":" characters "looks" mac address. uuidtools::uuid.mac_address = (utc_timestamp_with_ms_hexstring.scan /.{2}/).join(":") # generate version 1 uuid (see rfc 4122). comb_guid = uuidtools::uuid.timestamp_create().to_s # assign generted combination guid .id self.id = comb_guid # (*) note on maximum time handled 6-byte timestamp includes milliseconds: # if utc_timestamp_with_ms_hexstring = "ffffffffffff" (12 f's), # time.at(float(utc_timestamp_with_ms_hexstring.hex)/1000).utc.iso8601(10) = "10889-08-02t05:31:50.6550292968z". end end end
- does postgresql suffer index fragmentation when primary key type uuid?
yes, it's expected. if you're going use comb strategy won't happen. rows in order(that's not entirely true, bear me).
also, performance between native pgsql uuid vs varchar not different. point consider.
- is fragmentation avoided if low-order 6 bytes of guid sequential?
in test i've found uuid1(rfc 4122) sequential, there's timestamp added in generated uuid. yes, adding timestamp in last 6 bytes reassure ordering. that's did anyway, because apparently timestamp present not guarantee of order. more comb here
- is comb guid implemented below acceptable, reliable way create sequential guids in rails?
i'm not using rails, i'll show how did in django:
import uuid, time def uuid1_comb(obj): return uuid.uuid1(node=int(time.time() * 1000))
where node
48-bit positive integer identifying hardware address.
about implementation, 1 of main advantages of using uuid's can safely generate them outside database, so, using helper class 1 valid way it. can use external service uuid generation snowflake, may premature optimizacion @ point.
Comments
Post a Comment