@iKlsR

Competition Ranking with PostgreSQL

I recently had to work with some ranking data and had some time to look at ranking models used for competitions. It seems rather dull but presents a few different ways to think about ranking as opposed to the traditional first, second and third place.

As per ranking

It is not always possible to assign rankings uniquely. For example, in a race or competition two (or more) entrants might tie for a place in the ranking. … A common shorthand way to distinguish these ranking strategies is by the ranking numbers that would be produced for four items, with the first item ranked ahead of the second and third (which compare equal) which are both ranked ahead of the fourth.

We will look at how to do Standard competition ranking, Dense ranking and Ordinal ranking using Postgres' window functions since these deal with whole numbers.

Let's create a table of scores and users, btw I recommend pgcli if you're getting into postgresql.

CREATE TABLE scores (
    id int4,
    score int4
);

Let's add some random scores and hopefully tie a few

INSERT INTO scores
    (SELECT generate_series(1, 10) AS id, round(random() * 10) AS score);

Looking at our table now, we have a good aggregate of random scores

    SELECT * FROM scores;
 id | score
----+-------
  1 |     3
  2 |     5
  3 |     4
  4 |     0
  5 |     3
  6 |     7
  7 |     8
  8 |     5
  9 |     8
 10 |    10
(10 rows)

Ordinal ranking

This is the generic ranking model and is not very useful or fair in some circumstances and the score that is given precedence is arbitrarily determined by the scorer. We will use row_number which returns the number of the current row within its partition, counting from 1.

    SELECT id, score, row_number() OVER (order by score) AS rank;
 id | score | rank
----+-------+------
  4 |     0 |    1
  5 |     3 |    2
  1 |     3 |    3
  3 |     4 |    4
  8 |     5 |    5
  2 |     5 |    6
  6 |     7 |    7
  7 |     8 |    8
  9 |     8 |    9
 10 |    10 |   10
(10 rows)

Here we have ranks 1 to 10 where scores tied at 3 and 8 are ranked in turn instead of being scored as one.

Standard competition ranking

Here, we will make use of postgres' rank(), rank returns the rank of the current row with gaps; same as row_number().

    SELECT id, score, rank() OVER (order BY score)
    FROM scores ORDER BY rank;
 id | score | rank
----+-------+------
  4 |     0 |    1
  5 |     3 |    2
  1 |     3 |    2
  3 |     4 |    4
  8 |     5 |    5
  2 |     5 |    5
  6 |     7 |    7
  7 |     8 |    8
  9 |     8 |    8
 10 |    10 |   10
(10 rows)

Notice the order of the rank column, scores tied get the same rank and the score after is offset by the number of scores tied for that rank. To further show, [1, 2, 2, 2, 2, 5] would rank as [1, 2, 6].

Dense ranking

Dense ranking is slightly more useful and my favourite as it relates to normal ranking (having consitent numering).

    SELECT id, score, dense_rank() OVER (order by score) AS rank
    FROM scores ORDER BY rank;
 id | score | rank
----+-------+------
  4 |     0 |    1
  5 |     3 |    2
  1 |     3 |    2
  3 |     4 |    3
  8 |     5 |    4
  2 |     5 |    4
  6 |     7 |    5
  7 |     8 |    6
  9 |     8 |    6
 10 |    10 |    7

The behaviour is similar to rank except we don't have gaps.

In summary, we have looked at a very simple case of how to accurately rank table data. If you are storing rank on disk in your db, it might be worth considering usage of one of these instead of rolling your own rank function and doing inserts/updates. There are other interesting models that I'd love to explore in the future such as fractional and percentage rank and maybe even roll my own query/algorithm to test these.