Recently, I've been building a full-stack web application from scratch (with help of the Yesod framework).
As my web development knowledge is close to zero, I've been learning a lot of new things (e.g. the Post/Redirect/Get pattern, cookies, gotchas in i18n). One new thing I've learned as well are a method of pagination queries known as keyset paging or the seek method.
The word "pagination" predates computing; it is a general term describing the process of dividing a document into discrete pages, such as the pages of a physical book. However, in the modern world, pagination often refers to splitting up data into multiple pages of a web browser: the page numbers after a Google search, or the infinite scrolling of Facebook.
So how do we create the backend queries responsible for loading data on demand?
We can think of using LIMIT
and OFFSET
:
SELECT * FROM my_table ORDER BY id LIMIT 100 OFFSET 100000;
While this method is simple, it can be inefficient. Since many RDBMS's use B-trees as their indices, trying to select the 100001th row may require you to traverse the entire tree to find where the 100001th row starts.
An alternative method -- the keyset paging method -- is to use the value returned by the previous query:
SELECT * FROM my_table WHERE id > ? ORDER BY id LIMIT 100;
This way, when the RDBMS is traversing its B-tree, it knows where to branch and can easily discard searching through a large portion of its index tree.