Ruby on Rails - ActiveRecord: Handling DB races among workers


I have a Rails 3 project running on top of PostgreSQL 9.0.

Use case: Users can request to follow artists by name. To do this, they submit a list of names to a REST resource. If I can't find an artist by name in the local collection, I consult last.fm for information about them and cache that information locally. This process can take some time, so it is delegated to a background job called IndexArtistJob.

Problem: IndexArtistJob will be run in parallel. Thus, it is possible that two users may request to add the same artist at the same time. Both users should have the artist added to their collection, but only one artist should end up in the local database.

The relevant portions of the Artist model are:

    require 'services/lastfm'

    class Artist < ActiveRecord::Base
      validates_presence_of :name
      validates_uniqueness_of :name, :case_sensitive => false

      def self.lookup(name)
        artist = Artist.find_by_name(name)
        return artist if not artist.nil?

        info = LastFM.get_artist_info(name)
        return if info.nil?

        # Check the local DB again for the corrected name.
        if name.downcase != info.name.downcase
          artist = Artist.find_by_name(info.name)
          return artist if not artist.nil?
        end

        Artist.new(
          :name => info.name,
          :image_url => info.image_url,
          :bio => info.bio
        )
      end
    end

The IndexArtistJob class is defined as:

    class IndexArtistJob < Struct.new(:user_id, :artist_name)
      def perform
        user = User.find(user_id)

        # May return a new, uncommitted Artist model, or an existing, committed one.
        artist = Artist.lookup(artist_name)
        return if artist.nil?

        # Presume the thread is pre-empted here for a long enough time such that
        # the work done by another worker violates the DB's unique constraint.
        user.artists << artist

      rescue ActiveRecord::RecordNotUnique # Lost the race, defer to the winning model
        user.artists << Artist.lookup(artist_name)
      end
    end

What I'm trying to do here is let each worker commit the new artist it finds, hoping for the best. If a conflict does occur, I want the slower worker(s) to abandon the work they did in favor of the artist that was just inserted, and add that artist to the specified user.

I'm aware of the fact that Rails validators are no substitute for actual data integrity checking at the level of the database. To handle this, I added a unique index on the artists table's lowercased name field (which I also use for searching). Now, if I understand the documentation correctly, an AR association collection commits changes to the item being added (the Artist in this case) and to the underlying collection in a transaction. But I can't be guaranteed that the Artist will be added.
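To make the race concrete, here is a minimal sketch in plain Ruby with no database involved. `FakeArtistTable`, its `RecordNotUnique` error, and `find_or_insert` are hypothetical names invented for this illustration; the Mutex only stands in for the database enforcing the unique index, so this is a model of the pattern rather than of ActiveRecord itself.

```ruby
require 'thread'

# Hypothetical stand-in for the artists table plus its unique index on the
# lowercased name. Not ActiveRecord; it only models the uniqueness check.
class FakeArtistTable
  class RecordNotUnique < StandardError; end

  def initialize
    @rows = {}          # lowercased name => stored name
    @lock = Mutex.new   # plays the role of the database enforcing the index
  end

  def find_by_name(name)
    @lock.synchronize { @rows[name.downcase] }
  end

  # Raises RecordNotUnique when another worker already inserted the name.
  def insert(name)
    @lock.synchronize do
      raise RecordNotUnique, name if @rows.key?(name.downcase)
      @rows[name.downcase] = name
    end
  end

  def count
    @lock.synchronize { @rows.size }
  end
end

# Look up, try to insert, and fall back to the winner's row if the insert loses.
def find_or_insert(table, name)
  existing = table.find_by_name(name)
  return existing if existing
  table.insert(name)
rescue FakeArtistTable::RecordNotUnique
  table.find_by_name(name)
end

table = FakeArtistTable.new
results = 8.times.map { |i|
  Thread.new { find_or_insert(table, i.even? ? 'Radiohead' : 'radiohead') }
}.map(&:value)
```

Whichever thread inserts first wins, and every other thread ends up holding the winner's row, which is the behavior the job's rescue clause is aiming for.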

Am I doing this correctly? If so, is there a nicer way to do it? I feel that structuring it around exceptions accentuates the fact that the problem is one of concurrency, and is thus a bit subtle.

It sounds like you could use a simple queuing mechanism. You could do this using a database table:

  1. When the "front-end" thread discovers a missing artist, have it write the artist name to a table with a status of "waiting" (put a unique index on the artist name so this can only happen once).

  2. Meanwhile, a background thread/process sits in a loop and queries the table for new jobs:
    a) start a transaction
    b) find the first artist with status="waiting"
    c) update the artist's status to "processing"
    d) end the transaction

  3. The background thread then indexes the artist. No one else will try, because they can see that the status is "processing".

  4. When finished, the background thread deletes the artist from the table.

Using this method, you can run multiple background threads to increase concurrency on artist indexing.

Also look at beanstalk to manage this process. See http://railscasts.com/episodes/243-beanstalkd-and-stalker.
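The steps above can be sketched in plain Ruby as follows. This is only an in-memory model: `ArtistQueue` and its methods are hypothetical names, and the Mutex stands in for the database transaction. Against a real PostgreSQL table, step 2 would presumably also need a row lock (for example `SELECT ... FOR UPDATE`) or an `UPDATE ... WHERE status = 'waiting'` whose affected-row count is checked, so that two workers cannot claim the same row.

```ruby
require 'thread'

# Hypothetical in-memory stand-in for the queue table described above.
class ArtistQueue
  def initialize
    @rows = {}         # artist name => status ("waiting" or "processing")
    @lock = Mutex.new  # stands in for the database transaction
  end

  # Step 1: enqueue once; the hash key acts as the unique index on the name.
  def enqueue(name)
    @lock.synchronize do
      return false if @rows.key?(name)
      @rows[name] = 'waiting'
      true
    end
  end

  # Step 2: find the first waiting row and mark it as processing, atomically.
  def claim
    @lock.synchronize do
      name, _status = @rows.find { |_, status| status == 'waiting' }
      @rows[name] = 'processing' if name
      name
    end
  end

  # Step 4: delete the row once indexing has finished.
  def finish(name)
    @lock.synchronize { @rows.delete(name) }
  end

  def size
    @lock.synchronize { @rows.size }
  end
end

queue = ArtistQueue.new
%w[Radiohead Portishead Radiohead Bjork].each { |n| queue.enqueue(n) }

indexed = Queue.new   # thread-safe collector from Ruby's standard library
workers = 3.times.map do
  Thread.new do
    while (name = queue.claim)
      indexed << name      # step 3: the real job would call last.fm here
      queue.finish(name)
    end
  end
end
workers.each(&:join)

# Drain the collector into a plain array for inspection.
indexed_names = []
indexed_names << indexed.pop until indexed.empty?
```

Because the duplicate "Radiohead" request is rejected at enqueue time and each row is claimed by exactly one worker, every artist is indexed once regardless of how many workers run.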

