Chess Tempo

Author Topic: Bug in choosing problems to serve?  (Read 285 times)
drahacikfm
« on: May 25, 2008, 06:49:08 pm »

I noticed that in my last 100 standard-rated problems, I was given only 2 problems rated over 1883.  That's 2%.  But scrolling through the View Problem list, sorted by standard rating, I found that 1054 of the 17016 problems are rated higher than 1883.  That's more than 6%.  So I was served far fewer tough problems than if the problems had been chosen randomly from the whole problem set!  But with my high rating, I should supposedly be given tough problems more often than a random sampling would give them, not at a third of that rate.
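The comparison above can be checked with a few lines of arithmetic. This is only a sketch using the counts quoted in the post; the variable names are made up for illustration.

```python
# Counts quoted in the post: problems rated over 1883.
hard_served, served = 2, 100            # in the last 100 problems served
hard_in_pool, pool_size = 1054, 17016   # in the whole problem set

served_share = hard_served / served
pool_share = hard_in_pool / pool_size
ratio = pool_share / served_share       # how under-represented the hard problems were

print(f"{served_share:.1%} served vs {pool_share:.1%} in pool, ratio {ratio:.1f}")
```

The pool share comes out a little over 6% and the ratio a little over 3, which matches the "3 times less" reading.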

FIDE Master Drahacik
richard
« Reply #1 on: May 25, 2008, 07:06:29 pm »

Hi,

This is a known (at least to me :-) ) problem with the current selection procedure.  The current selection model skews to the left.  I'm a little surprised you are getting such a small number of those upper problems, but this is not uncommon when a number of new problems are added to the set (for reasons that will only serve to embarrass me should I go into details :-) ).

I'm hoping to move to the new selection algorithm within the next week (it might be slightly over a week).  It will be able to produce a much better normal distribution of problems around the player's current average than the (inadequate) approximation in use at the moment.  Under the new method (which will probably also utilize your "don't repeat in last N" idea) there may still be skews to the left (but much smaller ones) for very highly rated players, but only in situations where there are not enough problems at the high end of the range to avoid repeating problem selections too often.

Regards,
Richard.
drahacikfm
« Reply #2 on: May 25, 2008, 10:22:47 pm »

All right, did you spoil my day by giving me what I wanted?!  Now there are 12 problems rated over 1883 in my last 100, instead of 2.  And my rating is down, grrr :-)

richard
« Reply #3 on: May 25, 2008, 11:17:42 pm »

No changes yet :-)

Hopefully that means the previous 100 was an unrepresentative sample and things are not quite as bad as the 2 out of 100 suggested.  I'm still planning on releasing the improved problem selector as soon as I have time to do tests and add the non-repeating aspect.
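How much a 100-problem window can swing by chance alone can be sketched with a binomial model. The baseline here is an assumption made purely for illustration: each served problem is treated as an independent uniform draw from the 17016-problem set, which is not how the site actually selects problems.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_hard = 1054 / 17016   # share of problems rated over 1883 (from the thread)
n = 100                 # size of the window being looked at

# Chance of seeing 2 or fewer hard problems in 100 uniform draws.
p_at_most_2 = sum(binom_pmf(k, n, p_hard) for k in range(0, 3))
# Chance of seeing 12 or more hard problems in 100 uniform draws.
p_at_least_12 = 1 - sum(binom_pmf(k, n, p_hard) for k in range(0, 12))

print(f"P(<=2) = {p_at_most_2:.3f}, P(>=12) = {p_at_least_12:.3f}")
```

Both tails come out at a few percent under this baseline, so counts as far apart as 2 and 12 are unusual but not impossible in windows of only 100 problems.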

Richard.

tmr
« Reply #4 on: May 26, 2008, 12:54:59 am »

It seems as if the selection algorithm is roughly centered between 1400 and 1600, depending on the player's rating.  1500 is the median rating for my last hundred problems, which is about 300 points less than my current rating.  Looking at some of the top rated players (~2100), the median is around 1575 to 1625.  The median for some lower rated players is around 1400.  Assuming these 100-problem views are a sufficient representation of the selection algorithm, they indicate that the center of the selection algorithm only shifts right about 0.2 points for every 1 point increase in a player's rating.  I'm not a statistician, but it would seem that this would tend to inflate the ratings at the higher end (albeit at the cost of doing a lot of problems to rack up those fractional rating increases).
richard
« Reply #5 on: May 26, 2008, 05:24:58 am »

tmr: I'm not a statistician either and it shows :-)

The main problem with the current selection algorithm is that it doesn't ignore the distribution of the current set.  If the current set had an even distribution of problems across all ratings, then it would produce a rating distribution centered around the user's current rating.  Unfortunately the current set's ratings are themselves (roughly) normally distributed (with an average rating of 1400), so the current set drags the selection algorithm towards its own mean.  I've been aware of this for a while and have had the new selection algorithm in mind for some time.  However, I've been cautious about rolling it out, as one advantage of the current selection code is that it doesn't present high rated users with a very small pool of problems (due to the normal distribution of problem ratings, there are far fewer at the high end).  For this reason I was waiting to get a larger problem set before going with the new selection procedure (which would by itself produce a near perfect distribution around the user's current rating).  I'm hoping that by coupling the new selection process with a "recently served this problem" detector I'll be able to go with the new selector without waiting for more problems.  Using the duplicate detector will probably push the selected problem mean a little to the left if the user does a lot of problems in a short period of time, but not as far left as is currently the case.

I imagine people might be in for a bit of a rollercoaster when the new selection process arrives.  I'm not sure what the underlying statistical theory suggests should happen here.  In an ideal world a rating system should tolerate situations where some players mostly play weaker players instead of players at their own or greater strength, but I'm not sure how robust glicko is in this respect.  Obviously you get fewer points for winning and lose more for losing, but whether you lose often enough at the low end to even things out is unclear to me.  It is possible that the rollercoaster will not arise and glicko will just smooth things out when users at the high end suddenly start to get harder problems (and users at the low end start to get easier problems).
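The drag towards the pool mean described above can be reproduced in a small simulation. This is a sketch under stated assumptions, not the site's code: both selectors are hypothetical stand-ins, the pool spread (`POOL_SD`) and selector width (`WIDTH`) are invented, and only the 1400 average and the 17016 problem count come from the thread.

```python
import bisect
import math
import random
import statistics

POOL_MEAN = 1400.0   # average problem rating quoted in the post
POOL_SD = 250.0      # assumption: spread of problem ratings
POOL_SIZE = 17016    # problem count quoted earlier in the thread
WIDTH = 250.0        # assumption: spread of the selector's bell curve

rng = random.Random(2008)
pool = sorted(rng.gauss(POOL_MEAN, POOL_SD) for _ in range(POOL_SIZE))

def weighted_mean_served(user_rating, draws=5000):
    """Hypothetical 'current' selector: weight every problem by a bell
    curve around the user's rating. Dense parts of the pool still win
    more often, so the pool's own distribution leaks into the result."""
    weights = [math.exp(-((r - user_rating) ** 2) / (2 * WIDTH**2)) for r in pool]
    return statistics.mean(rng.choices(pool, weights=weights, k=draws))

def targeted_mean_served(user_rating, draws=5000):
    """Hypothetical 'new' selector: draw a target rating around the user
    first, then serve the problem whose rating is nearest to it."""
    picks = []
    for _ in range(draws):
        target = rng.gauss(user_rating, WIDTH)
        i = bisect.bisect_left(pool, target)
        neighbours = pool[max(i - 1, 0): i + 1]   # never empty
        picks.append(min(neighbours, key=lambda r: abs(r - target)))
    return statistics.mean(picks)

user = 2100.0
weighted = weighted_mean_served(user)
targeted = targeted_mean_served(user)
print(round(weighted), round(targeted))
```

With these invented widths the weighted selector lands roughly midway between the pool mean and the user's rating, since multiplying two bell curves of equal width gives one centered halfway between them. The target-first selector stays much closer to the user's rating, but it has to keep reusing the thin set of high-rated problems, which is the small-pool concern raised in the post.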
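The asymmetry between winning and losing against a much weaker opponent can be made concrete with the published Glicko-1 formulas. This is a sketch under assumptions: the thread does not say which Glicko variant or which rating deviations (RD) the site uses, so the RD of 50 for both the player and the problem is invented, and the function names are illustrative.

```python
import math

Q = math.log(10) / 400   # Glicko-1 scaling constant

def g(rd):
    """Glicko-1 attenuation factor for an opponent's rating deviation."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)

def expected_score(r, r_opp, rd_opp):
    """Expected score of a player rated r against an opponent rated r_opp."""
    return 1 / (1 + 10 ** (-g(rd_opp) * (r - r_opp) / 400))

def rating_change(r, rd, r_opp, rd_opp, score):
    """Glicko-1 rating change after a single game (score is 1 or 0)."""
    e = expected_score(r, r_opp, rd_opp)
    d2 = 1 / (Q**2 * g(rd_opp) ** 2 * e * (1 - e))
    return Q / (1 / rd**2 + 1 / d2) * g(rd_opp) * (score - e)

player, problem = 2170, 1600   # ratings mentioned in the thread
rd = 50                        # assumption: RD for both player and problem

e = expected_score(player, problem, rd)
gain = rating_change(player, rd, problem, rd, 1)
loss = rating_change(player, rd, problem, rd, 0)
print(f"expected {e:.2f}, win {gain:+.2f}, loss {loss:+.2f}")
```

Under these assumptions a solve earns a fraction of a point while a miss costs more than ten, so the player would need to solve well over 90% of such problems just to hold steady.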

Regards,
Richard.
drahacikfm
« Reply #6 on: May 26, 2008, 09:52:06 am »

Maybe you don't want to center the problems served exactly on the player's rating.  For example, I'm about 2170 standard, and I sure would not want half of the problems I get to be over 2170, because it seems any problem over 2000 standard is very hard and takes a long time to solve.  But I would like a higher center-rating than I get now, which seems to be around 1600.  A center around 1900 or 2000 would be good.  So at least in standard, maybe some formula like this:  If you are X points above the average rating of the problem set, the center of problems served will be 80% of X above the average.
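The proposed formula can be written out as a short function. This is a sketch: the 1400 pool average is the figure quoted elsewhere in the thread, the function and parameter names are made up, and the handling of players at or below the average is an assumption, since the proposal only covers players above it.

```python
def damped_center(user_rating, pool_average=1400.0, damping=0.8):
    """Center of served problems: 80% of the way from the pool average
    up to the user's rating, as proposed in the post."""
    x = user_rating - pool_average
    if x <= 0:
        # Assumption: the proposal says nothing about this case,
        # so the center is left on the user's own rating.
        return user_rating
    return pool_average + damping * x

print(damped_center(2170))
```

For a 2170 player this gives a center of about 2016, in line with the "1900 or 2000" asked for in the post.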

richard
« Reply #7 on: May 26, 2008, 10:08:10 am »


It is unclear to me at this stage, but I suspect there will be a rating drop, which means you'll not end up getting 50% of your problems above your current level.  I'll do some testing before rolling this one out, but I may start out by centering on the user's rating and quickly move to something closer to what you are suggesting if ratings don't fall.

In theory, centering on the user's rating should be the right thing to do, since glicko should normally make your most closely matched opponent one with around the same rating as you.  Unfortunately the current selection system has created an artificial situation.  The problem is that things may take a while to settle down again as users and problems settle at new rating levels, so I might be forced to have a kludge at the start and move towards a "centered on user rating" approach over time.

Regards,
Richard.