Backgrounding Work – How to Write Good Background Jobs in Ruby on Rails


In this article, I will go through guidelines and best practices for writing performant background jobs in your Rails application.

First, when should we move work out of the web request and into the background? A background job is the better choice, instead of doing the work immediately and making the user wait, when any of the following three criteria applies:

  1. The transaction always takes longer than your average response time to complete
  2. The transaction contacts an external service over the network
  3. The user does not care whether the transaction is completed immediately

Now, let’s talk about how we can write safe, performant and reliable background jobs. There are a few characteristics and best practices that we can follow:

Idempotency

In programming, the term “idempotent” describes an operation that produces the same result whether it is executed once or many times. In our case, a background job must leave the system in the same state, with no additional side effects, regardless of how many times we run it.

Below is an example of a non-idempotent background job of the kind found in many Rails applications:

class RegistrationMailJob < ActiveJob::Base
  queue_as :default

  def perform(user)
    # Sends an email on every run; nothing records that it was already sent.
    UserMailer.signup_email(to: user).deliver_now
  end
end

If we run the job above twice, we will send two welcome emails, which is bad from the user’s perspective. When writing a background job, we must always assume that any given job may run more than once, because most background job processors cannot guarantee that a job runs exactly once. One way to handle this is to record that the email has been sent and to guard that check with a row-level database lock, as shown below:

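# app/models/user.rb
class User < ApplicationRecord
  # Enqueue the job only after the record is committed, so the worker can load it.
  after_commit :send_signup_email, on: :create

  def send_signup_email
    RegistrationMailJob.perform_later(self)
  end
end

# app/jobs/registration_mail_job.rb
class RegistrationMailJob < ActiveJob::Base
  queue_as :default

  def perform(user)
    UserMailer.signup_email(to: user).deliver_now
  end

  around_perform do |job, block|
    user = job.arguments.first
    # with_lock opens a transaction and takes a row-level lock on the user.
    user.with_lock do
      # Use `next`, not `return`, to leave the block when the email was already sent.
      next if user.signup_email_sent

      # Runs perform. If delivery raises, the flag below is never set,
      # so the job can be retried by the queue backend.
      block.call
      user.update!(signup_email_sent: true)
    end
  end
end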

In the example above, the around_perform block handles the following scenarios:

  • If the RegistrationMailJob job is enqueued more than once, the email is still sent only once. The first job sets the user.signup_email_sent attribute to true, and the second job exits after checking user.signup_email_sent.
  • In the rare case where two jobs for the same user are executed at the same time, the with_lock block makes the second worker wait until the first job is completed. Once the first job is completed, user.signup_email_sent is true and the second job exits, as per the first point.
  • If the delivery fails, user.signup_email_sent is never set to true, so the job can safely be retried.
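
The lock-then-check-then-flag pattern can also be tried outside Rails. The following is a minimal plain-Ruby sketch, not the Rails implementation: FakeUser and deliver_signup_email are hypothetical names, a Mutex stands in for the database row lock, and an instance variable stands in for the signup_email_sent column.

```ruby
# Plain-Ruby simulation. FakeUser and deliver_signup_email are hypothetical stand-ins.
class FakeUser
  attr_reader :emails_delivered

  def initialize
    @lock = Mutex.new            # stands in for the database row lock
    @signup_email_sent = false   # stands in for the signup_email_sent column
    @emails_delivered = 0
  end

  # Mirrors the around_perform guard: lock, check the flag, do the work, set the flag.
  def deliver_signup_email
    @lock.synchronize do
      next if @signup_email_sent

      @emails_delivered += 1     # the side effect that must happen only once
      @signup_email_sent = true
    end
  end
end

user = FakeUser.new
# Five "workers" pick up the same job at the same time.
5.times.map { Thread.new { user.deliver_signup_email } }.each(&:join)
puts user.emails_delivered
```

However many threads run the method, the counter should only ever be incremented once, because the check and the update happen under the same lock.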

Writing the smallest job possible

You should write each job to be as small as possible, both in lines of code and in execution time. Where possible, instead of bulk processing a large set of objects in a single job, split the work into multiple jobs. The rule of thumb is that every job in a queue should have roughly the same average execution time. No job should take significantly longer than the others.
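
As an illustration of splitting bulk work, the sketch below fans a large list of IDs out into one small job per batch. It is plain Ruby under stated assumptions: InMemoryQueue, fan_out_recalculation, and the :recalculate_accounts job name are all hypothetical, and the in-memory array stands in for a real queue backend.

```ruby
# InMemoryQueue is a hypothetical stand-in for a real queue backend.
class InMemoryQueue
  attr_reader :jobs

  def initialize
    @jobs = []
  end

  def enqueue(job_name, *args)
    @jobs << [job_name, args]
  end
end

# Instead of one job looping over every account, the parent only
# enqueues one small job per batch of IDs (hypothetical job name).
def fan_out_recalculation(queue, account_ids, batch_size: 100)
  account_ids.each_slice(batch_size) do |ids|
    queue.enqueue(:recalculate_accounts, ids)
  end
end

queue = InMemoryQueue.new
fan_out_recalculation(queue, (1..250).to_a)
puts queue.jobs.size
```

In a Rails application the enqueue call would typically be a perform_later on a dedicated job class. Passing IDs rather than loaded objects keeps each job's payload small.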

Set timeout aggressively

To prevent a worker from getting stuck on a job with an extraordinarily long response time, we can set an aggressive timeout. Since we are writing idempotent jobs, as mentioned earlier, there is little reason to have a long timeout: we can always retry the job without any drawback.
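
For jobs that call an external service, one way to do this is to set short timeouts on the HTTP client itself. The sketch below uses Ruby's standard Net::HTTP; the helper name build_http_client, the host, and the timeout values are assumptions for illustration and should be tuned to the service being called.

```ruby
require "net/http"

# Hypothetical helper: the host and the values below are illustrative only.
def build_http_client(host, port: 80)
  http = Net::HTTP.new(host, port) # no connection is opened yet
  http.open_timeout = 2    # seconds to wait for the connection to open
  http.read_timeout = 5    # seconds to wait for each read from the socket
  http.write_timeout = 5   # seconds to wait for each write (Ruby 2.6+)
  http
end

client = build_http_client("api.example.com")
puts client.read_timeout
```

When a limit is exceeded, Net::HTTP raises Net::OpenTimeout or Net::ReadTimeout, which the job can let propagate so that the queue backend retries it.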

Say no to job uniqueness

If you design your jobs to be idempotent, you do not need to worry about job uniqueness, because an idempotent job can be executed any number of times without changing the outcome. If the real goal is to limit how often a job runs, throttling is a better fit than uniqueness.

Proper error handling

You should always handle the case where an exception occurs inside a job. Often it is enough to run the job's work inside a database transaction or a row-level lock. The point is that we do not want to leave work half-done: a job should either fail completely and change nothing, or succeed with all of its work completed.

Raise a red flag on problematic jobs

Some jobs will keep failing and never complete. For these, you need a flag that notifies you once a certain number of failures has occurred, so that further action can be taken.

For example, Sidekiq keeps failed jobs in a “retry” set and, by default, retries a job up to 25 times before moving it to the “dead” set.
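
The idea of counting failures and raising a flag once a limit is reached can be sketched in plain Ruby. This is not how Sidekiq implements it: run_with_retries and the notifier lambda are hypothetical, and a real job processor keeps the retry count and schedules the retries for you.

```ruby
# run_with_retries and the notifier are hypothetical names for this sketch.
def run_with_retries(max_attempts:, notifier:)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue StandardError => e
    retry if attempts < max_attempts
    # Retries are exhausted: raise the red flag and give up on the job.
    notifier.call("job failed #{attempts} times: #{e.message}")
    :dead
  end
end

alerts = []
result = run_with_retries(max_attempts: 3, notifier: ->(msg) { alerts << msg }) do
  raise "payment gateway unavailable"
end
puts alerts.first
```

In a real application the notifier would typically post to an error tracker or a chat channel instead of appending to an array.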

Summary

We should use a background job when the work calls an external service over the network, when it does not need to be completed immediately, or when it takes a long time to complete. Background jobs should always be idempotent, so that we can run a job multiple times without breaking anything. We should break our background jobs down into pieces that are as small and simple as possible instead of packing everything into one job. For timeouts, we should take an aggressive approach, as it is better to fail fast than to wait for a very slow background job to respond. Last but not least, we should have a red flag that notifies us about problematic jobs that have failed more than a certain number of times.
