Friday, November 16, 2007

The Sweet Spot

I need to reevaluate my sanity for starting a blog entry at 3:00 am. But I have been promising a post on my data mining of CTS tactician stats. I've done a lot of calculations, but tonight I'll focus on my most provocative findings. In the following graph, rating is plotted against accuracy for two groups of tacticians: those who have done between 1000 and 10,000 problems (blue circles) and those who have done over 10,000 problems (red circles).

The idea behind having two groups differing in number of problems attempted is that the red group is representative of tacticians who have done a lot of problems and the blue group is representative of tacticians who have done only a few. So, the way I interpret this plot is that solving at just above 80% accuracy gives one the greatest score improvement over time. Interestingly, this interpretation is consistent with observations made by wormwood on the Message Board several months ago. His observations led me to experiment with accuracy, though I have taken it somewhat to an extreme--working for 98% lately, but achieving just over 96% in actuality.

I should note a couple of caveats about this graph. First, the two red points on the extremes (97.5% bin and 52.5% bin) should be more-or-less ignored because too few tacticians fall into these bins to give reliable statistics. Second, the error bars do not necessarily represent uncertainty about the actual rating of the group. These "errors" arise from the natural score distribution of the groups and their magnitudes are related to the robustness of the statistics and not related to their uncertainty. I have yet to perform proper error analysis on these numbers, but my feeling is that the numbers are robust.
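For anyone who wants to try this kind of binning themselves, here is a minimal sketch. It is not the code behind the graph. It assumes 5%-wide accuracy bins (which matches bin centers like 52.5%, 92.5% and 97.5%), and it assumes the spread shown for each bin is the sample standard deviation of the ratings in that bin. The function names and the sample numbers are made up for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev

def bin_center(accuracy_pct, width=5.0):
    """Return the center of the accuracy bin, e.g. 91.3 -> 92.5."""
    # Clamp 100% into the top bin so it reports as 97.5 rather than 102.5.
    index = min(int(accuracy_pct // width), int(100 // width) - 1)
    return index * width + width / 2

def summarize(tacticians, width=5.0):
    """Map bin center -> (count, mean rating, spread of ratings).

    `tacticians` is an iterable of (accuracy_pct, rating) pairs.
    The spread is the sample standard deviation (an assumption about
    what the error bars show); it is NaN for bins with one member.
    """
    bins = defaultdict(list)
    for accuracy_pct, rating in tacticians:
        bins[bin_center(accuracy_pct, width)].append(rating)
    summary = {}
    for center, ratings in sorted(bins.items()):
        spread = stdev(ratings) if len(ratings) > 1 else float("nan")
        summary[center] = (len(ratings), mean(ratings), spread)
    return summary

# Hypothetical (accuracy %, rating) pairs, NOT real CTS data.
sample = [(81.0, 1500), (83.0, 1600), (96.0, 1400)]
for center, (count, avg, spread) in summarize(sample).items():
    print(center, count, avg, spread)
```

The count per bin is reported alongside the mean because a bin with only a handful of tacticians, like the two extreme red points, is exactly the case where the statistics should not be trusted.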

One conclusion from the above graph may be that solving problems at greater than 90% accuracy results in little progress, given the information in the 92.5% bin. It's difficult at this point for me to argue with that conclusion. And anyone who has taken a gander at the message board lately will know I am eating my words right now.

But, despite the obvious conclusions one might make from the above graph, my belief is still quite the contrary. I believe that tacticians solving at greater than 90% accuracy are getting much more out of their training than tacticians solving at less than 90% accuracy. Difficult (and perhaps embarrassing) for me, however, is that this benefit is not revealed by analysis of rating alone as I have done thus far. So I am still short of proof on my theory.

Sunday, November 11, 2007

The Glicko System

By now, it would probably be of little surprise to anyone who has been keeping up with this blog that a key point of discussion has been the interplay of accuracy and Glicko rating on the CTS. I introduced this topic a few postings ago. Since then, I have spent some time carefully studying Professor Glickman's original publication on his rating system, and, though I have yet to complete my analysis of his paper, I am coming to the conclusion that the Glicko system is ill-suited to rate tacticians by the time they spend to solve problems. Several reasons exist, but the most fundamental is that the Glicko system rates players based on three possible outcomes: win, loss, and draw. The CTS attempts to approximate these outcomes by using a time component to create fractional wins and losses, just as the Glicko system estimates a draw as half a win and half a loss. At first glance, this time-based approximation seems reasonable. However, at times > 30 sec, the CTS method does not distinguish between correctly solving a problem and failing it outright. But, as anyone who has attempted greater than 70% accuracy on CTS can attest, correctly solving a problem after 30 seconds and failing it outright are two fundamentally different phenomena. To accurately rate players solving problems, a rating system should therefore treat these two phenomena differently.
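To make the win/loss/draw point concrete, here is a small sketch of the expected-score formula from Glickman's paper. In Glicko, each game produces an outcome of 1 (win), 0.5 (draw) or 0 (loss), and the rating update compares that outcome with the expected score below. The function names are my own. How the CTS turns solving time into a fractional outcome is not modeled here, since I am not reproducing its exact formula.

```python
import math

Q = math.log(10) / 400  # Glicko scaling constant q

def g(rd):
    """Glicko attenuation factor for an opponent's rating deviation RD."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)

def expected_score(rating, opp_rating, opp_rd):
    """Expected outcome (between 0 and 1) against an opponent
    with rating `opp_rating` and rating deviation `opp_rd`."""
    return 1 / (1 + 10 ** (-g(opp_rd) * (rating - opp_rating) / 400))

# Equal ratings give an expected score of exactly one half.
print(expected_score(1500, 1500, 50))
```

The formula only ever sees a single number between 0 and 1 per game. Any two events that are mapped to the same number, such as a slow correct solution and an outright failure, are indistinguishable to the rating update.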

Bear in mind, however, that the Glicko system is well suited to what it was designed for, namely rating players who compete in head-to-head competitions, like chess tournaments.

Strangely enough, though, the Glicko system seems also to satisfactorily rate the problems. (I will discuss why in the near future.) A cursory inspection of the rating distributions of the problems and the tacticians reveals that the problems have a nearly Gaussian (normal) distribution across all rating categories while the rating distribution of the tacticians shows significant skew towards the lower ratings. Thus, the shape of the rating distribution peaks suggests that the CTS method rates problems accurately but not tacticians. While I have not confirmed the source of this skew yet, my theory right now is that the excess of tacticians below 1500 arises from a natural tendency in tacticians to work with accuracy in mind, lowering their ratings in general and skewing the distribution.

Saturday, November 10, 2007

On Perfection

Prelude: Focus on the Tactics

I blogged Chess Vortex for over a week pretty intensely and completely ran myself down, getting far too little sleep for my own good. I did learn a lot from the experience and gleaned excellent comments from fellow tacticians. However, I spent about five days traveling and experienced a mental crash of sorts, manifesting itself most noticeably on 11/6/07 as a 42p/5f day (89.36%). It has taken me a few days to get back on track. I have focused on sleep primarily and on re-training my thought processes for careful problem solving. Tonight I reaped the rewards--after over 100 sessions training for accuracy and almost 15,000 problems, I finally had a perfect 100 problem session:

100p
100% @ 1394 ± 104 ; 1371


Of course I hemorrhaged a few points at the end of the session, but the last 50 problems of any session are always more difficult than the first 50, mostly because of fatigue. Moreover, tonight I was dropping points at the expense of perfection because I felt like I could get that 100p/0f.

So what's the moral of this story? Focus on solving chess problems first! Killing one's self blogging does not get chess problems solved correctly, so my new resolution is to blog only when it will not tire me for chess problem solving.

More Fun with Binomials

Because of my perfect session, I'm going to have a little fun using VassarStats to get a rough idea of what tonight's performance means in terms of my accuracy as a 1394 ± 104 Tactician:

If I am a...        Probability of 100 Correct
99.5% Tactician     0.606
99% Tactician       0.366
98% Tactician       0.133
96% Tactician       0.017

So this table means that a 96% Tactician would go 100 for 100 less than 2% of the time. Read loosely, that makes it very likely that I am at least a 96% Tactician for problems rated 1394 ± 104. I guess I can live with that.
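The numbers in the table can be checked without VassarStats. Assuming each problem is an independent attempt with the same chance of success, the probability of getting all 100 correct is just that chance raised to the 100th power. A minimal sketch (the function name is my own):

```python
def prob_all_correct(accuracy, n=100):
    """Probability of n correct answers in a row, assuming independent
    attempts that each succeed with probability `accuracy`."""
    return accuracy ** n

for acc in (0.995, 0.99, 0.98, 0.96):
    print(f"{acc:.1%} tactician: {prob_all_correct(acc):.3f}")
```

The independence assumption is the weak point, since fatigue makes the last 50 problems of a session harder than the first 50.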

Conclusion

Given that my focus on sleep has helped my tactical vision, it should come as no surprise that I saw a lot deeper into my problems today than I usually do. This was an intensely rewarding experience and, in my opinion, is reason enough to focus exclusively on accuracy. A lot of my point loss tonight actually came from my looking at alternative lines. Mostly, I saw the main line immediately, which is not entirely remarkable given that my problem set tonight was rated less than 1400 on average. However, most of the pleasure was in the variations--and tonight the problem of the day has its beauty buried in the notes:

Chess Tactics Server Problem of the Day
p58952

Black to Move

Here's the solution and why I like it (start selecting text following the colon): 2...Bxg2. Now 3.Kxg2? leads to a problem-like mating net that can be difficult to see: 3...Qh2+ 4. Kf3 Qh3++.