A Better Way To Import Legacy Data In Rails
Do you have to run a Rails app alongside the legacy app it’s replacing? The legacy app remains the authority for some data, such as users, while the Rails app is the authority for other data.
I’m in this situation on one project. My Rails app is replacing a Mambo CMS piece by piece. Once the Mambo app is fully replaced, the Rails app will be the authority for all data in the system. Until then, the Rails app must keep itself synchronised with the Mambo app’s data for those models where the Mambo app remains the authority.
In my Rails app, data for eight models must be imported from five legacy tables. Other models refer to those eight models, so I need to maintain referential integrity.
Acts As Importable
My first effort at writing the import code was alright. It worked. Recently, though, I came across Tim Riley’s excellent acts_as_importable plugin and this has made all the difference. Before his plugin my code was a little ugly and its intent was obscure. Now, using his plugin, my code is clean, clear, and trivial to maintain.
Tim wrote a clear article explaining how to import legacy data into Rails. Two of his design decisions stood out for me:
- The legacy models know how to turn themselves into non-legacy models, not vice-versa.
- The legacy models don’t export themselves; a dedicated class has that responsibility.
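Those two decisions can be sketched in plain Ruby. This is only an illustration, not acts_as_importable's actual API: `LegacyUser`, `User`, `to_model` and `Importer` are hypothetical names, the column names are made up, and Structs stand in for ActiveRecord models.

```ruby
# Hypothetical stand-ins for ActiveRecord models; not the plugin's API.
LegacyUser = Struct.new(:id, :usr_name, :usr_email) do
  # The legacy model knows how to turn itself into the new model.
  def to_model
    User.new(id, usr_name.strip, usr_email.downcase)
  end
end

User = Struct.new(:legacy_id, :name, :email)

# A dedicated class drives the import; the legacy model does not import itself.
class Importer
  def initialize(store = {})
    # Keyed by legacy id, so re-running the import updates records
    # instead of duplicating them.
    @store = store
  end

  def run(legacy_records)
    legacy_records.each do |legacy|
      user = legacy.to_model
      @store[user.legacy_id] = user
    end
    @store
  end
end

legacy = [LegacyUser.new(1, " Alice ", "ALICE@EXAMPLE.COM")]
users = Importer.new.run(legacy)
```

Keeping the legacy id on the new record is one way to let other models keep pointing at the right row across repeated imports; the real plugin may handle this differently.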
I made three additions to Tim’s recipe.
First, I made my legacy models read-only, using Jason Frame’s read_only_model plugin, just to be on the safe side. My client would be displeased if I accidentally wrote over any of his data.
Second, I turned off Thinking Sphinx’s delta indexing before importing any data, then turned it on again afterwards and re-indexed. Leaving delta indexing on during the import would waste resources and slow the import down significantly.
Third, I turned off auditing (provided by the acts_as_audited plugin) during the import, for a similar reason. The audit trail of who updated which record would become next to useless if swamped with tens of thousands of entries each night.
So my importer looks like this:
class Legacy::Importer
  def self.run
    disable_auditing
    disable_delta_indexing
    # The import (see Tim Riley's article)
    # ...
  ensure
    # Restore normal behaviour even if the import raises.
    enable_auditing
    enable_delta_indexing
  end

  def self.disable_delta_indexing
    ThinkingSphinx.deltas_enabled = false
  end

  def self.enable_delta_indexing
    ThinkingSphinx.deltas_enabled = true
  end

  @@audited_models = [ Member, Report ] # etc

  def self.disable_auditing
    @@audited_models.each { |m| m.disable_auditing }
  end

  def self.enable_auditing
    @@audited_models.each { |m| m.enable_auditing }
  end

  # A bare `private` has no effect on methods defined with `def self.`,
  # so the helpers are hidden explicitly.
  private_class_method :disable_delta_indexing, :enable_delta_indexing,
                       :disable_auditing, :enable_auditing
end
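The disable/enable pairing in the importer above could also be wrapped in a block helper, so the setting is always restored even if the import raises. A minimal sketch under assumptions: `Switch` and `without` are hypothetical names standing in for something like the delta-indexing flag; nothing here is part of Thinking Sphinx or acts_as_audited.

```ruby
# Hypothetical toggle standing in for a global setting such as delta indexing.
class Switch
  attr_accessor :enabled

  def initialize
    @enabled = true
  end

  # Turn the switch off for the duration of the block, then restore
  # its previous value, even if the block raises.
  def without
    previous = enabled
    self.enabled = false
    yield
  ensure
    self.enabled = previous
  end
end

deltas = Switch.new
seen_during_import = nil

begin
  deltas.without do
    seen_during_import = deltas.enabled
    raise "import failed"
  end
rescue RuntimeError
  # The failure still propagates; the switch has already been restored.
end
```

Restoring the previous value, rather than forcing `true`, means the helper does not accidentally switch on something that was already off before the import started.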
Anyway, if you are in the same boat you should try out acts_as_importable. It certainly improved my import process.
