I’ve been using acts_as_list since the very first release of my app. Lately I’ve noticed that it occasionally creates duplicate position values (a position of a record in a sorted collection should be unique). Besides, the algorithm for reordering items is not exactly optimal. For example, moving a record from the last position to the first will update every record in the list. Good reasons to migrate to ranked-model which uses a more sophisticated algorithm to minimize DB writes.

Migrating data

My app works with galleries that have many images ranked by their position value.

While things should just work out of the box, it is a good idea to migrate data to work well with ranked-model.

Firstly, to get rid of duplicate ranks.

Secondly, if we deploy this while ranks are sequential, every user will trigger a rebalance almost instantly.

To prevent this extra load on the database, let’s space the records further apart:

WITH cte AS (
  SELECT id, ROW_NUMBER() OVER (PARTITION BY gallery_id ORDER BY position, updated_at) as new_position
  FROM images
)
UPDATE images
SET position = 100000 * cte.new_position
FROM cte
WHERE images.id = cte.id;

ROW_NUMBER is a window function that returns the number of the current row within its partition, counting from 1. We partition by galleries and we’re taking care of any duplicate values by sorting by postion and also updated_at values.

Multiplying the row number will ensure images are ranked in increments of 100,000. This number is arbitrary but also guarantees no record exceeds the upper bound of the integer type (2147483647). My app’s biggest gallery has less than 2,000 images so this will work ok.

Safety first

Updating every record in production database feels intimidating but it doesn’t have to be.

First of all, create a backup in case anything goes wrong. This is self-explanatory.

backup

Another tool in our tool belt is pulling production database to a local copy to test the update. Heroku has a handy command for this:

heroku pg:pull DATABASE_NAME local-database-name -a app-name

Run the SQL command above locally using psql local-database-name and verify it updates data correctly. Then we’re ready to update live data.

To give you an idea of how long this might take, here’s what I got:

update

380k records in about 7 seconds, not bad.