Chess Tempo

December 02, 2008, 11:53:02 pm *
Author Topic: rating of problems  (Read 605 times)
uri blass
Full Member
***
Posts: 149


« on: September 25, 2008, 05:56:08 am »

I think that the rating of problems is not correct.

Some problems where you need to calculate many moves ahead are rated below 2000, while the last problem I solved at the time of this post had a rating of more than 2300, in spite of the fact that I needed to calculate only a line of 3 plies (check, escape, capture).

The reason that not many people solved the >2300 problem is simply that there was a simple tactic that wins a piece.

Fortunately, before playing it I decided to count material and found that Black would be only a pawn up, which may not be enough to win, so I did not play it and searched for a better move.

I am talking about the following problem:

http://chesstempo.com/chess-problems/43654

Qb6+ is an easy move to find if you only take care to count material, and I guess that many chose Qxg5, which wins a piece.
 
« Last Edit: September 25, 2008, 05:58:57 am by uri blass » Logged
andreacoda
Full Member
***
Posts: 113


« Reply #1 on: September 25, 2008, 05:59:12 am »

Uri,

that typically happens when a problem has been attempted only a few times: in these cases, the rating of the problem is only temporary, and it will swing a lot in the first attempts, settling on the "correct" rating once the number of people having attempted it becomes high enough.

Cheers,

Andrea
Logged
uri blass
Full Member
***
Posts: 149


« Reply #2 on: September 25, 2008, 08:53:50 am »

Quote from: andreacoda on September 25, 2008, 05:59:12 am
Uri,

that typically happens when a problem has been attempted only a few times: in these cases, the rating of the problem is only temporary, and it will swing a lot in the first attempts, settling on the "correct" rating once the number of people having attempted it becomes high enough.

Cheers,

Andrea

I do not think that this is the reason.
Here are the statistics of the problem.

I think that there is no justification for a blitz rating above 2300.

Problem Blitz Rating: 2321.3
Blitz Average seconds: 17.9
Blitz attempts: 64
Blitz success rate: 32.81%

--------------------------------------------------------------------------------
Problem Standard Rating: 1975.4
Standard Average seconds: 51.7
Standard attempts: 88
Standard success rate: 28.41%
Logged
revenant
Full Member
***
Posts: 167


« Reply #3 on: September 25, 2008, 09:18:02 am »

This one has had time to settle.  So the question is, why do many problems show a blitz rating so much higher than their standard rating?  I'm not sure the difference in this case (350 points) is all that much higher than average, but even if it is, it's an interesting psychological question and not an issue with the Chess Tempo system, which is simply doing its work.

I ran into 43654 yesterday too and I happened to get it right but I've gotten many similar ones wrong and the reason is usually the same.  In blitz you're in a hurry and you can get too focused on the left or right half of the board (or a corner where you're trying to shuffle pieces around to deliver mate).  If you see a good combination the impulse is to fire it off (and you could argue that in the long run you net more points that way, *in blitz*).

Whereas in standard you can slow down and do the ritual at the beginning (Whose move is it?  What's the initial material count so we know what standard we have to meet?) and follow the traditional advice "If you see a good move look for a better one" and focus on the other side of the board for a moment where there may be a check or zwischenzug.
Logged
richard
Administrator
Hero Member
*****
Posts: 991



« Reply #4 on: September 25, 2008, 10:33:00 pm »

Hi Uri,

It is not really an issue of the problem's rating being "wrong". Ratings are just a reflection of how users perform on the problem (or rather how the problem performs against users :-) ).  Some problems may be objectively easier than their rating, but the fact is that lots of people are getting that problem wrong. Premium users who can view the wrong-move details can see that 106 mistakes were made on that problem (across blitz and standard) and that 96 of those were the 1...Qxg5 move you mentioned; the average blitz rating of those making that mistake was 2168.
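As a quick cross-check of these figures, the attempt counts and success rates Uri posted earlier can be turned back into failure counts. This is only a sketch: it assumes the statistics had not changed between the two posts, and the helper name `failures` is made up for illustration.

```python
# Statistics for problem 43654 as quoted earlier in the thread.
blitz_attempts, blitz_success = 64, 0.3281
standard_attempts, standard_success = 88, 0.2841


def failures(attempts: int, success_rate: float) -> int:
    """Failed attempts implied by an attempt count and a success rate."""
    return round(attempts * (1.0 - success_rate))


blitz_failures = failures(blitz_attempts, blitz_success)
standard_failures = failures(standard_attempts, standard_success)
total_failures = blitz_failures + standard_failures

# Share of all mistakes that were the tempting 1...Qxg5 capture.
qxg5_mistakes = 96
qxg5_share = qxg5_mistakes / total_failures

print(blitz_failures, standard_failures, total_failures)  # 43 63 106
print(f"{qxg5_share:.1%}")                                # 90.6%
```

The total of 106 matches the number of mistakes quoted above, and roughly nine out of ten of them were the same wrong move.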

Blitz ratings tend to be higher than standard ratings because the users' (i.e. the problem's opponents') blitz ratings are relatively higher, due to the time bonuses involved in finishing problems quickly. However, some problems are also especially tricky under time pressure but much easier when the clock is not ticking, so the gap between blitz and standard rating is sometimes greater than would otherwise be predicted.  In this example I'd say it is mostly the former rather than the latter, especially given that the standard success rate is actually lower (something I'd expect to see change in the longer term).
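To make the "former" effect concrete, here is a rough sketch that assumes a plain Elo expected-score model. That model is an assumption for illustration only (the thread does not say which formula Chess Tempo uses), and it treats all opponents as having a single average rating. Given a problem's rating and the users' success rate, it backs out the opponent rating at which that success rate would be the expected result.

```python
import math


def rating_gap(problem_score: float) -> float:
    """Elo gap (problem minus opponent) at which the problem is expected
    to score `problem_score`, i.e. the fraction of attempts users fail."""
    return 400.0 * math.log10(problem_score / (1.0 - problem_score))


def implied_opponent_rating(problem_rating: float, user_success_rate: float) -> float:
    """Average opponent rating consistent with the problem's rating,
    under the simplified Elo assumption described above."""
    problem_score = 1.0 - user_success_rate
    return problem_rating - rating_gap(problem_score)


# Figures for problem 43654 quoted earlier in the thread.
blitz_opponents = implied_opponent_rating(2321.3, 0.3281)
standard_opponents = implied_opponent_rating(1975.4, 0.2841)

print(round(blitz_opponents), round(standard_opponents))
```

Under these assumptions the implied opponents are rated roughly 2197 in blitz and 1815 in standard, so most of the gap between the two problem ratings is already explained by the users' blitz ratings being higher, not by the problem being harder in blitz.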

Regards,
Richard.
Logged
uri blass
Full Member
***
Posts: 149


« Reply #5 on: September 29, 2008, 02:35:47 pm »

My opinion is that problems should have a maximum rating based on complexity, regardless of how often users fail to solve them.

The same goes for problems on the other site:

http://chess.emrald.net/psolution.php?Pos=1937

This problem got a high rating of 1854.3 on that site only because many users missed the fact that it is a check, but I think that the rating is simply too high.

My opinion is that the users' results should be one of the components in the rating of a problem, but not the only component.

My opinion is that there should be some maximum rating for problems based on the depth that the computer needs to solve them when you disable extensions.

Uri
Logged
richard
Administrator
Hero Member
*****
Posts: 991



« Reply #6 on: September 29, 2008, 05:41:03 pm »

Hi Uri,

By depth I assume you mean the number of moves in at which the outcome of the tactic becomes clear. For mates this is obviously the same as the length of the mate. For non-mates this is hopefully the number of moves in the tactic, although this is not always the case, as sometimes the tactic gets pruned too early.

I don't think depth by itself is a great measure of problem difficulty. I currently use depth as a means of "guessing" (via a regression equation) what the starting rating of a new problem will be.  Overall I would say this number is considerably less accurate than the rating derived from user performance.  Numbers such as nodes examined divided by the depth searched for a position can give a measurement of complexity, with a higher number showing more branching at each ply. However, there are lots of reasons why humans find some problems more complex than others, and these kinds of fairly crude measures are in my opinion less accurate than seeing how users actually perform on a position (especially as the opponent's rating is taken into account).
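The two crude measures mentioned here can be sketched as follows. Everything in this block is hypothetical: the sample (depth, rating) pairs are invented toy numbers, not Chess Tempo data, and the actual regression equation is not given in the thread. It only shows the shape of the idea: fit a line from depth to settled rating, use it to seed a new problem, and compute nodes per depth as a branching measure.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def nodes_per_depth(nodes: int, depth: int) -> float:
    """Crude complexity measure: search nodes examined per ply of depth."""
    return nodes / depth


# Invented toy sample of (depth, settled rating) pairs, for illustration only.
samples = [(1, 1100.0), (2, 1300.0), (3, 1500.0), (4, 1700.0)]
intercept, slope = fit_line([d for d, _ in samples], [r for _, r in samples])

# Starting-rating guess for a new problem of depth 5 under this toy fit.
seed_rating = intercept + slope * 5
print(intercept, slope, seed_rating)

# Two positions of equal depth; the second branches more at each ply.
print(nodes_per_depth(12000, 6), nodes_per_depth(90000, 6))
```

Such a seed is only a starting point; as the post says, the rating derived from user results is expected to be more accurate once enough attempts have been made.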

Regards,
Richard.
Logged
uri blass
Full Member
***
Posts: 149


« Reply #7 on: September 30, 2008, 10:55:46 am »

Hi Richard,
I think that it is better if people learn simple things first, and this is the reason I think that the rating should not be based only on users' failure to solve the problem.

After more thought about it,
I think that it is better if people first learn to calculate 1-2 moves ahead before they get problems with significantly more moves.

I suggest giving problems more than one rating:

1) a rating based on complexity
2) a rating based on the ability of humans to solve it.

I think that weak players should often get problems with a low complexity-based rating, including problems whose default rating (based on ability to solve) is high.

Uri
Logged
drahacikfm
Sr. Member
****
Posts: 418


« Reply #8 on: September 30, 2008, 12:11:13 pm »

Uri,

I agree that beginning players or low-rated players should solve easier problems first.  Those are the building blocks for solving harder problems.  But this is not a question of some deficiency in the current rating system.  The current system works fine, and there is no need for two separate ratings for each problem.

With a silver or gold membership, users can choose exactly which problems they want to solve.  For example, only 1-move problems, or problems with 3 moves or less, or mate-in-2 problems, etc.  Users can also choose to solve only the problems with ratings less than some amount, such as 1200.  All these things are very easy now on Chess Tempo, and there is no need for any new ratings based on some vague concept of "complexity".

Who would decide if a problem is complex?  Number of moves is not always a good measure of complexity.  Some 1-move problems on here are very hard to solve.  They are one move because the pruning algorithm decided to stop the problem after one move, not because the problem is easy.  It would be impossible to measure accurately the "complexity".  The best way to measure that is obviously how people did against that problem, and that's the way it works already.

If some players want easy problems, the best way is to make a personal problem set with Standard rating less than 1200, and solve all of those.  Very simple to do.
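The kind of personal problem set described here amounts to a simple filter. The sketch below is not the Chess Tempo feature itself: the `Problem` record, its field names and the sample entries are all hypothetical, chosen only to show the rating and move-count cut-offs mentioned in this post.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    problem_id: int         # hypothetical identifier
    standard_rating: float  # rating earned from user results
    moves: int              # number of moves in the solution


def easy_set(problems, max_rating=1200.0, max_moves=None):
    """Keep problems rated below max_rating and, if max_moves is given,
    with at most that many moves."""
    return [
        p for p in problems
        if p.standard_rating < max_rating
        and (max_moves is None or p.moves <= max_moves)
    ]


# Invented sample pool, for illustration only.
pool = [Problem(1, 950.0, 1), Problem(2, 1150.0, 4), Problem(3, 1975.4, 2)]

easy = easy_set(pool)                       # rating below 1200
short_and_easy = easy_set(pool, max_moves=3)  # also at most 3 moves

print([p.problem_id for p in easy])            # [1, 2]
print([p.problem_id for p in short_and_easy])  # [1]
```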
Logged

FIDE Master Drahacik
richard
Administrator
Hero Member
*****
Posts: 991



« Reply #9 on: September 30, 2008, 11:19:33 pm »

Hi Uri,

In addition to Drahacik's comments, I think that there is also likely to be a very high correlation between complexity and "human ability to solve". There is occasionally a problem where users tend to overthink and miss the obvious, but generally speaking problems with high complexity have high ratings and problems with low complexity have low ratings.  The rating system already matches up lower rated users with lower rated problems. The default behaviour of the rating system is that users will be given problems such that they should on average solve about 50% of them.  Higher rated users tend to have higher success rates than the rating algorithm would normally give, especially in standard mode, because there are not enough very difficult problems to give the highest rated users, so they are more often given problems that are slightly easier for their ability.
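The 50% target can be illustrated with a small sketch. It again assumes a plain Elo expected-score formula, which is an assumption and not necessarily what Chess Tempo uses, and the function names and the sample pool are made up. Under that model a user is expected to solve half of the problems rated equal to their own rating, so matching means picking the problem whose expected success is closest to the target.

```python
def expected_success(user_rating: float, problem_rating: float) -> float:
    """Elo expected score of the user against the problem."""
    return 1.0 / (1.0 + 10.0 ** ((problem_rating - user_rating) / 400.0))


def pick_problem(user_rating: float, problem_ratings, target: float = 0.5) -> float:
    """Return the problem rating whose expected success is closest to target."""
    return min(
        problem_ratings,
        key=lambda r: abs(expected_success(user_rating, r) - target),
    )


# Invented pool of problem ratings, for illustration only.
pool = [1200.0, 1480.0, 1900.0]

mid_user_choice = pick_problem(1500.0, pool)
top_user_choice = pick_problem(2400.0, pool)

print(mid_user_choice, round(expected_success(1500.0, mid_user_choice), 2))
print(top_user_choice, round(expected_success(2400.0, top_user_choice), 2))
```

The second call shows the effect described above: when the pool has nothing near a very high rating, the closest available problem is still much too easy, so the top user's expected success is well above 50%.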

Drahacik and others have made good arguments that there is benefit in doing a larger number of easier problems, which, as Drahacik points out, can be done with the silver and gold membership features.

Regards,
Richard.
Logged