Skip to content

Cursor-based Rails/ActiveRecord pagination with multi-column sorting and custom cursor token format support.

License

Notifications You must be signed in to change notification settings

jsonb-uy/rotulus

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

58 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Rotulus

Gem Version CI codecov Maintainability

Cursor-based pagination for apps built on Rails/ActiveRecord

Cursor-based pagination is an alternative to OFFSET-based pagination that provides a more stable and predictable pagination behavior as records are added, updated, and removed in the database through the use of an encoded cursor token.

Some advantages of this approach are:

  • Reduces inaccuracies such as duplicate or skipped records as records are being manipulated.
  • Can significantly improve performance(with proper DB indexing on ordered columns) especially as you move forward on large datasets.

TL;DR See sample usage for Rails here .

Features

  • Paginate records sorted by any number of columns
  • Paginate records sorted by columns from joined tables
  • NULLS FIRST/NULLS LAST handling
  • Allows custom cursor format
  • Built-in cursor token expiration
  • Built-in cursor integrity checking
  • Supports MySQL, PostgreSQL, and SQLite
  • Supports Rails 4.2 and above

Installation

Add this line to your application's Gemfile:

gem 'rotulus'

And then execute:

bundle install

Or install it yourself as:

gem install rotulus

Configuration

Setting the environment variable ROTULUS_SECRET to a random string value(e.g. generate via rails secret) is the minimum required setup needed.

More configuration options

Create an initializer config/initializers/rotulus.rb:

Rotulus.configure do |config|
  config.page_default_limit = 5
  config.page_max_limit = 50
  config.secret = ENV["MY_ENV_VAR"]
  config.token_expires_in = 10800
  config.cursor_class = MyCursor
  config.restrict_order_change = false
  config.restrict_query_change = false
end
Configuration Description
page_default_limit Default: 5
Default record limit per page in case the :limit is not given when initializing a page Rotulus::Page.new(...)
page_max_limit Default: 50
Maximum :limit value allowed when initializing a page.
secret Default: ENV['ROTULUS_SECRET']
Key needed to generate the cursor state needed for cursor integrity checking.
token_expires_in Default: 259200(3 days)
Validity period of a cursor token (in seconds). Set to nil to disable token expiration.
restrict_order_change Default: false
When true, raise an OrderChanged error when paginating with a token that was generated from a page instance with a different :order.
When false, no error is raised and pagination is based on the new :order definition.
restrict_query_change Default: false
When true, raise a QueryChanged error when paginating with a token that was generated from a page instance with a different :ar_relation filter/query.
When false, no error is raised and pagination will query based on the new :ar_relation.
cursor_class Default: Rotulus::Cursor
Cursor class responsible for encoding/decoding cursor data. Default uses Base64 encoding. see Custom Token Format.

Usage

Basic Usage

Initialize a page

users = User.where('age > ?', 16)

page = Rotulus::Page.new(users, order: { id: :asc })
# OR just
page = Rotulus::Page.new(users)
Example when sorting with multiple columns and :limit:
page = Rotulus::Page.new(users, order: { first_name: :asc, last_name: :desc }, limit: 3)

The gem will automatically add the table's PK(users.id) in the ORDER BY as the tie-breaker column to ensure stable sorting and pagination.

Access the page records

page.records
=> [#<User id: 11, first_name: 'John'...]

Check if a next page exists

page.next?
=> true

Check if a previous page exists

page.prev?
=> false

Get the cursor to access the next page

page.next_token
=> "eyI6ZiI6eyJebyI6..."

In case there is no next page, nil is returned

Get the cursor to access the previous page

page.prev_token
=> "eyI6ZiI6eyJebyI6..."

In case there is no previous page(i.e., first page), nil is returned

Navigate to the page given a cursor

Return a new page instance pointed at the given cursor
another_page = page.at('eyI6ZiI6eyJebyI6...')
=> #<Rotulus::Page ..>

Or to immediately get the records:

page.at(next_page_token).records
Return the same page instance pointed at the given cursor
page.at!('eyI6ZiI6eyJebyI6...')
=> #<Rotulus::Page ..>

Get the next page

next_page = page.next

This is the same as page.at(page.next_token). Returns nil if there is no next page.

Get the previous page

previous_page = page.prev

This is the same as page.at(page.prev_token). Returns nil if there is no previous page.

Extras

Reload page

page.reload

# reload then return records
page.reload.records

Cursor tokens hash

page.links

=> { previous: "eyI6ZiI6efQ...", next: "eyI6ZiI6eyJ...."}

If the token is nil, the corresponding key(previous/next) isn't included in the hash.

Print page in table format for debugging

Currently, only the columns included in ORDER BY are shown:

puts page.as_table

+------------------------------------------------------------+
|   users.first_name   |   users.last_name   |   users.id    |
+------------------------------------------------------------+
|        George        |       <NULL>        |      1        |
|         Jane         |        Smith        |      3        |
|         Jane         |         Doe         |      2        |
+------------------------------------------------------------+

Advanced Usage

Expanded order definition

Instead of just specifying the column sorting such as { first_name: :asc } in the :order param, one can use the expanded order config in Hash format for more sorting options that would help the library to generate the optimal query:

Column Configuration Description
direction Default: :asc. :asc or :desc
nullable Default: true if the column is defined as nullable in its table, false otherwise.

Whether a null value is expected for this column in the result set.

Note:
- Not setting this to true when there are possible rows with NULL values for the specific column in the DB won't return those records.
- In queries with table (outer)JOINs, a column in the result could have a NULL value even if the column doesn't allow nulls in its table. So set nullable to true for such cases.
nulls Default:
- MySQL and SQLite: :first if direction is :asc, otherwise :last
- PostgreSQL: :last if direction is :asc, otherwise :first

Tells whether rows with NULL column values comes before/after the records with non-null values. Applicable only if column is nullable.
distinct Default: true if the column is the primary key of its table, false otherwise.

Tells whether rows in the result are expected to have unique values for this column.

Note:
- In queries with table JOINs, multiple rows could have the same column value even if the column has a unique index in its table. So set distinct to false for such cases.
model Default:
- the model of the base AR relation passed to Rotulus::Page.new(<ar_relation>) if column name has no prefix(e.g. first_name) and the AR relation model has a column matching the column name.
- the model of the base AR relation passed to Rotulus::Page.new(<ar_relation>) if column name has a prefix(e.g. users.first_name) and the prefix matches the AR relation's table name and the table has a column matching the column name.

Model where this column belongs. This allows the gem to infer the nullability and uniqueness from the column definition in its table instead of manually setting the nullable or distinct options and to also automatically prefix the column name with the table name.
Example:
order = {
  first_name: :asc,
  last_name: {
    direction: :desc,
    nullable: true,
    nulls: :last
  },
  email: {
    distinct: true
  }
}
page = Rotulus::Page.new(users, order: order, limit: 3)

Queries with JOINed tables

Example:

Suppose the requirement is to:

  1. Get all Item records.
  2. If an Item record has associated OrderItem records, get the order ids.
  3. Item records with OrderItems should come first.
  4. Item records with OrderItems should be sorted by item_count in descending order.
  5. If multiple rows have the same item_count value, sort them by item name in ascending order.
  6. If multiple rows have the same item_count value and the same name, sort them by OrderItem id.
  7. Sort Item records with no OrderItem, based on the item name in ascending order (tie-breaker).
  8. Sort Item records with no OrderItem and having the same name by the item id (also tie-breaker).
Our solution would be:
items = Item.all      # Requirement 1
            .joins("LEFT JOIN order_items oi ON oi.item_id = items.id")  # Requirement 2
            .select('oi.order_id', 'items.*')                            # Requirement 2

order_by = { 
  'oi.item_count' => { 
    direction: :desc,        # Requirement 4
    nulls: :last,            # Requirement 3
    nullable: true,          # Requirement 1
    model: OrderItem 
  }, 
  name: :asc,                  # Requirement 5, 7
  'oi.id' => {
    direction: :asc,         # Requirement 6
    distinct: true,          # Requirement 6
    nullable: true,          # Requirement 1
    model: OrderItem
  },
  id: :asc                    # Requirement 8
}
page = Rotulus::Page.new(items, order: order_by, limit: 2)

Some notes for the example above:

  1. oi.id is needed to uniquely identify and serve as the tie-breaker for Items that have OrderItems having the same item_count and name. The combination of oi.item_count, items.name, and oi.id makes those records unique in the dataset.
  2. id is translated to items.id and is needed to uniquely identify and serve as the tie-breaker for Items that have NO OrderItems. The combination of oi.item_count(NULL), items.name, oi.id(NULL), and items.id makes those record unique in the dataset. Although, this can be removed in the configuration above as the Item table's primary key will be automatically added as the last ORDER BY column if it isn't included yet.
  3. Explicitly setting the model: OrderItem in joined table columns is required for now.

An alternate solution that would also avoid N+1 if the OrderItem instances are to be accessed:

items = Item.all                       # Requirement 1
            .eager_load(:order_items)  # Requirement 2

order_by = { 
  item_count: { 
    direction: :desc,        # Requirement 4
    nulls: :last,            # Requirement 3
    nullable: true,          # Requirement 1
    model: OrderItem 
  }, 
  name: :asc,                # Requirement 5, 7
  'order_items.id' => {
    direction: :asc,         # Requirement 6
    distinct: true,          # Requirement 6
    nullable: true,          # Requirement 1
    model: OrderItem
  }
}
page = Rotulus::Page.new(items, order: order_by, limit: 2)

Rails Usage

Controller example 1:
def index
  page = Rotulus::Page.new(User.all, order: index_order, limit: params.dig(:page, :limit))
                      .at!(params[:cursor]) 
  render json: { data: page.records }.merge!(page.links)      # `page.links` contain the `cursor` value for next/prev pages.               
end

private

def index_order
  { first_name: :asc, 
    last_name: { direction: :desc, nulls: :last },
    email: { direction: :asc, distinct: true } }
end

APIs usually allow clients to specify which columns to sort through a parameter. You may use the sort_param gem to support this:

Controller example 2:
def index
  page = Rotulus::Page.new(User.all, order: index_order, limit: params.dig(:page, :limit))
                      .at!(params[:cursor])
  render json: { data: page.records }.merge!(page.links)                     
end

private

# Allow clients to sort by first_name, last_name, and/or email.
# example sort values:
# a. params[:sort] = +last_name,-email
# b. params[:sort] = -first_name
def index_order
  SortParam.define do
    field 'first_name'
    field 'last_name', nulls: :last, nullable: true
    field 'email', distinct: true
  end.load!(params[:sort].presence || 'first_name')
end

Errors

Class Description
Rotulus::InvalidCursor Cursor token received is invalid e.g., unrecognized token, token data has been tampered/updated.
Rotulus::Expired Cursor token received has expired based on the configured token_expires_in
Rotulus::InvalidLimit Limit set to Rotulus::Page is not valid. e.g., exceeds the configured limit. see config.page_max_limit
Rotulus::CursorError Generic error for cursor related validations
Rotulus::InvalidColumn Column provided in the :order param can't be found.
Rotulus::MissingTiebreaker There is no non-nullable and distinct column in the configured order definition.
Rotulus::ConfigurationError Generic error for missing/invalid configuration.
Rotulus::OrderChanged Raised when passing a token to Page#at or Page#at! methods of a page instance, and the token was generated from a page instance with a different :order definition. Can be enabled by setting the restrict_order_change to true.
Rotulus::QueryChanged Raised when passing a token to Page#at or Page#at! methods of a page instance, and the token was generated from a page instance with a different :ar_relation filter/query. Can be enabled by setting the restrict_query_change to true.

How it works

Cursor-based pagination uses a reference record to fetch the relative set of previous or next records. This gem takes care of the SQL query and cursor generation needed for the pagination. To ensure that the pagination results are stable, it requires that:

  • Records are sorted (ORDER BY) by columns.
  • In case multiple records with the same column value(s) exists in the result, a unique non-nullable column is needed as tie-breaker. Usually, the table PK suffices for this but for complex queries(e.g. with table joins and with nullable columns, etc.), combining and using multiple columns that would uniquely identify the row in the result is needed.
  • Columns used in ORDER BY would need to be indexed as they will be used in filtering.

Sample SQL-generated snippets to fetch the next set of records

Example 1: With order by id only
Ruby
page = Rotulus::Page.new(User.all, limit: 3)
SQL:
WHERE 
  users.id > ?
ORDER BY
  users.id asc LIMIT 3
Example 2: With non-distinct and not nullable column first_name
Ruby
page = Rotulus::Page.new(User.all, order: { first_name: :asc }, limit: 3)
SQL:
WHERE
  users.first_name >= ? AND
  (users.first_name > ? OR
    (users.first_name = ? AND
     users.id > ?))
ORDER BY
  users.first_name asc,
  users.id asc LIMIT 3
Example 3: With non-distinct and nullable(nulls last) column last_name
Ruby
page = Rotulus::Page.new(User.all, order: { first_name: { direction: :asc, nulls: :last }}, limit: 3)
SQL:
-- if last_name value of the current page's last record  is not null:
WHERE ((users.last_name >= ? OR users.last_name IS NULL) AND
  ((users.last_name > ? OR users.last_name IS NULL) 
  OR (users.last_name = ? AND users.id > ?)))
ORDER BY users.last_name asc nulls last, users.id asc LIMIT 3

-- if last_name value of the current page's last record is null:
WHERE users.last_name IS NULL AND users.id > ?
ORDER BY users.last_name asc nulls last, users.id asc LIMIT 3

Cursor

To navigate between pages, a cursor is used. The cursor token is a Base64 encoded string containing the data on how to filter the next/previous page's records. A decoded cursor to access the next page would look like:

Decoded Cursor

{
  "f": { "users.first_name": "Jane", "users.id": 2 }, 
  "d": "next",
  "c": 1672502400,
  "cs": "fe6ac1a1d6a1fc1b7f842b388639f63b",
  "os": "62186497a8073f9c7072389b73c6c60c",
  "qs": "7a5053198709df924dd5ec1752ee4e6b"
}
  1. f - contains the record values from the last record of the current page. Only the columns included in the ORDER BY are included. Note also that the unique column users.id is included as a tie-breaker.
  2. d - the pagination direction. next or prev set of records from the reference values in "f".
  3. cs - the cursor state needed for integrity checking, restricting clients/third-parties from generating their own (unsafe)tokens, or from tampering with the data of an existing token.
  4. os - the order state needed to detect whether the order definition changed.
  5. qs - the base AR relation state needed to detect whether the ar_relation has changed (e.g. filter/query changed due to API params).
  6. c - cursor token issuance time.

A condition generated from the cursor above would look like this:

WHERE users.first_name >= 'Jane' AND (
  users.first_name > 'Jane' OR (
    users.first_name = 'Jane' AND (users.id > 2)
  )
) LIMIT N

Custom Token Format

By default, the cursor is encoded as a Base64 token. To customize how the cursor is encoded and decoded, you may just need to subclass Rotulus::Cursor with the .decode and .encode methods implemented.

Example:

The implementation below would generate tokens in UUID format where the actual cursor data is stored in memory(in production, you would use a distributed data store such as Redis):

class MyCustomCursor < Rotulus::Cursor
  def self.decode(token)
    data = storage[token]
    return data if data.present?

    raise Rotulus::InvalidCursor
  end

  def self.encode(data)
    storage_key = SecureRandom.uuid

    storage[storage_key] = data
    storage_key
  end

  def self.storage
    @storage ||= {}
  end
end
config/initializers/rotulus.rb
Rotulus.configure do |config|
  ...
  config.cursor_class = MyCustomCursor
end

Limitations

  1. Custom SQL in ORDER BY expression other than sorting by table column values aren't supported to leverage the index usage.
  2. ORDER BY column names with characters other than alphanumeric and underscores are not supported.

Considerations

  1. Although adding indexes improves DB read performance, it can impact write performance. Only expose/whitelist the columns that are really needed in sorting.
  2. Depending on your use case, a disadvantage is that cursor-based pagination does not allow jumping to a specific page (no page numbers).

Development

  1. If testing/developing for MySQL or PG, create the database first:
MySQL
mysql> CREATE DATABASE rotulus;
PostgreSQL
$ createdb rotulus
  1. After checking out the repo, run bin/setup to install dependencies.
  2. Run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment. Use the environment variables below to target the database

By default, SQLite and the latest stable Rails version are used in tests and console. Refer to the environment variables below to change this:

Environment Variable Values Example
DB_ADAPTER Default: :sqlite. sqlite,mysql2, or postgresql DB_ADAPTER=postgresql bundle exec rspec

DB_ADAPTER=postgresql ./bin/console
RAILS_VERSION Default: 8-0

4-2,5-0,5-1,5-2,6-0,6-1,7-0, 7-1, 7-2, 8-0
RAILS_VERSION=5-2 ./bin/setup

RAILS_VERSION=7-2 bundle exec rspec

RAILS_VERSION=7-2 ./bin/console



To install this gem onto your local machine, run bundle exec rake install.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/jsonb-uy/rotulus.

License

The gem is available as open source under the terms of the MIT License.

About

Cursor-based Rails/ActiveRecord pagination with multi-column sorting and custom cursor token format support.

Topics

Resources

License

Stars

Watchers

Forks

Languages