Sunday, November 11, 2007

The Glicko System

By now, it will probably come as little surprise to anyone who has been keeping up with this blog that a key point of discussion has been the interplay of accuracy and Glicko rating on the CTS. I introduced this topic a few postings ago. Since then, I have spent some time carefully studying Professor Glickman's original publication on his rating system, and, though I have yet to complete my analysis of his paper, I am coming to the conclusion that the Glicko system is ill-suited to rating tacticians by the time they spend solving problems. Several reasons exist, but the most fundamental is that the Glicko system rates players based on three possible outcomes: win, loss, and draw. The CTS attempts to approximate these outcomes by using a time component to create fractional wins and losses, just as the Glicko system scores a draw as half a win and half a loss. At first glance, this time-based approximation seems reasonable. However, for solving times longer than 30 seconds, the CTS method does not distinguish between correctly solving a problem and failing it outright. But, as anyone who has attempted greater than 70% accuracy on CTS can attest, correctly solving a problem after 30 seconds and failing it outright are two fundamentally different phenomena. To rate players solving problems accurately, a rating system should therefore treat these two phenomena differently.
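To make the objection concrete, here is a minimal Python sketch of a time-based fractional score. The exact CTS formula is not reproduced in this post, so the function name `time_score`, the linear decay, and the 3-second full-credit window are all assumptions made purely for illustration. The only feature taken from the discussion above is that the score reaches zero at 30 seconds.

```python
def time_score(solved, seconds, full_until=3.0, zero_at=30.0):
    """Hypothetical fractional score in [0, 1] for one problem attempt.

    Assumptions (not the documented CTS formula): full credit up to
    `full_until` seconds, then a linear decay to zero at `zero_at` seconds.
    """
    if not solved:
        return 0.0                      # a failure always scores as a loss
    if seconds <= full_until:
        return 1.0                      # fast correct solve: full win
    if seconds >= zero_at:
        return 0.0                      # slow correct solve: scored as a loss
    # Linear interpolation between a full win and a full loss.
    return (zero_at - seconds) / (zero_at - full_until)


# A correct solve after 45 seconds and an outright failure get the same score.
slow_solve = time_score(True, 45.0)
failure = time_score(False, 45.0)
print(slow_solve, failure)
```

Under these assumptions, any correct solve past the 30-second mark is indistinguishable from a failure as far as the rating update is concerned, which is exactly the loss of information described above.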

Bear in mind, however, that the Glicko system is well suited to the purpose for which it was designed, namely rating players who compete in head-to-head competitions, like chess tournaments.

Strangely enough, though, the Glicko system also seems to rate the problems satisfactorily. (I will discuss why in the near future.) A cursory inspection of the rating distributions of the problems and the tacticians reveals that the problems have a nearly Gaussian (normal) distribution across all rating categories, while the rating distribution of the tacticians shows significant skew towards the lower ratings. Thus, the shapes of the two rating distributions suggest that the CTS method rates problems accurately but not tacticians. While I have not yet confirmed the source of this skew, my theory right now is that the excess of tacticians below 1500 arises from a natural tendency in tacticians to work with accuracy in mind. Working for accuracy costs time, which lowers their ratings in general and skews the distribution.

2 comments:

Unknown said...

It seems to me that the problem could be solved, or at least helped, by making the rating change a non-linear function of the time taken to complete the problem.
Maybe something along the lines of k/t^n - c, where t is time (possibly in 3-second blocks, or capped at a minimum of 3 seconds), k - c is the maximum increase in rating, c is the maximum penalty, and n is a number based on the difference between the user's rating and the problem's rating (maybe the ratio of them?).

This could be tweaked to very closely match a straight line above 0 rating change, but if you take longer, the rating penalty would approach, but never reach, the penalty for failing (i.e. if the change for failing is -10, you get -5 for 20 seconds, -7.5 for 40, -8.75 for a minute, and so on).

This way you are rewarded (or at least not penalised as much) for getting a problem out regardless of how long you take, but past 30 seconds the reward is negligible.
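The k/t^n - c idea from this comment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rating_change` and the sample values of k, c and n are hypothetical, time is measured in 3-second blocks with a floor of one block (one of the two options the comment mentions), and n is passed in directly rather than derived from the ratings.

```python
def rating_change(seconds, k, c, n, block=3.0):
    """Rating change k / t**n - c for a correctly solved problem.

    t is the solving time in `block`-second units, floored at one block,
    so the largest possible gain is k - c. As t grows the result
    approaches, but never reaches, the failure penalty -c.
    """
    t = max(seconds / block, 1.0)   # floor at one block (3 seconds)
    return k / t ** n - c


# Illustrative parameters only: maximum gain k - c = +10, failure penalty -10.
k, c, n = 20.0, 10.0, 1.0
for seconds in (3, 6, 30, 60):
    print(seconds, rating_change(seconds, k, c, n))
# 3 s -> +10.0, 6 s -> 0.0, 30 s -> -8.0, 60 s -> -9.0
```

With these sample values a slow correct solve still loses points, but always fewer than the 10 points lost for failing outright. Raising n makes the curve fall off faster, and a larger k moves the break-even time later.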

Unknown said...

I'd imagine something like this
[URL=http://img406.imageshack.us/my.php?image=screenshothx3.png][IMG]http://img406.imageshack.us/img406/4269/screenshothx3.th.png[/IMG][/URL]
(hopefully that worked, I don't know if images are allowed)
Also, higher values for k seem to give more useful curves, so I'd imagine using a larger k, capping it, and using n and c to adjust for difficulty (n could be based on the rating ratio, with c moving the curve until the minimum value is right).