Chess Tempo

December 02, 2008, 01:32:10 pm *
Author Topic: incremental solving and deviating from the "solution"  (Read 566 times)
mindbreaker
Newbie
*
Posts: 7


« on: July 17, 2008, 03:13:52 am »

Punishment for "incremental solving" as well as deviation should be modified.  Often I find a clear winning line before I start, but the computer plays a silly defense that is crap but slows me down.  I also hate it when I find a way to win the queen deep into the line, only to be flagged for not seeing the mate in 2 from there instead of from the start.  In any real game winning the queen would be sufficient even if the mate were missed, and certainly sufficient to justify going into the line.
Here is what I propose.  The computer should play along so long as the result would not likely be compromised.  A good guess at this can be made from the position evaluation and the rating of the player.  Consider that at different levels of strength different advantages are sufficient to ensure a win.  For master and above +2 should be sufficient.  For expert +3.5, unless it is a simple position, say an endgame, where +2 should suffice.  For A +5, B +6, C +7, D +8, E and lower +9.  If a player of any strength does not deviate from the solution, the rule would not come into play in that puzzle.  The +2 to +9 scale can be linear with rating rather than tiered.
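A minimal sketch of the tiered version of this proposal. The pawn thresholds are the ones listed in the post; the rating floor attached to each class is an assumption (conventional 200-point class bands), and the function names are hypothetical.

```python
# Hypothetical sketch of the tiered thresholds proposed in the post.
# Rating floors per class are an assumption; the pawn values come from the post.
TIERS = [
    (2200, 2.0),  # master and above
    (2000, 3.5),  # expert
    (1800, 5.0),  # class A
    (1600, 6.0),  # class B
    (1400, 7.0),  # class C
    (1200, 8.0),  # class D
]
LOWEST_TIER = 9.0  # class E and lower


def required_advantage(rating, simple_position=False):
    """Return the pawn advantage assumed sufficient to win at this rating."""
    for floor, threshold in TIERS:
        if rating >= floor:
            # The post lets experts off with +2 in simple positions (e.g. endgames).
            if floor == 2000 and simple_position:
                return 2.0
            return threshold
    return LOWEST_TIER


def computer_plays_along(rating, evaluation, simple_position=False):
    """True if a deviating move still keeps enough advantage to continue."""
    return evaluation >= required_advantage(rating, simple_position)
```

Under these assumptions a class A player who deviates into a +5.2 position would be allowed to continue, while a class B player in the same position would not.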
Logged
richard
Administrator
Hero Member
*****
Posts: 990



« Reply #1 on: July 17, 2008, 03:47:45 am »

The computer defenses may look crap on the surface but they always lose the least amount of material in the long term (well within the horizon of what the computer engine can see anyway). 

You should not be marked wrong for missing a queen take versus a mate in 2; unfortunately there was a bug in the last generator which ignored material winning alternatives in some rare situations. I've fixed this problem and I'm currently running the generator over the current set to remove/fix this issue. If you remember the problem number (or can see it in your history) it would be useful, as I can make sure the bugfix applies correctly to that particular problem.  The generator does reject ALL material winning alternatives if you missed a mate in 1.  This is by design and is based on the idea that if you missed a mate in 1 then "good move try again" is probably being a little too generous.

Having different thresholds of winning for different rating levels is an interesting idea. One of the issues with the old generator was that it had a 2.75 pawn threshold for alternative moves which in hindsight was way too high.  The new generator has lowered this to 1.75.  I'm going to wait and see how this pans out in practice before going to something more complicated involving differentiated evaluation thresholds.
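A rough sketch of how the alternative-move rule described here could be expressed. The generator's internals are not shown in this thread, so the function name, its parameters and the exact meaning of the threshold (a non-solution move that still keeps at least this many pawns of advantage earns a "try again" rather than a fail) are assumptions.

```python
ALTERNATIVE_THRESHOLD = 1.75  # pawns; the old generator used 2.75 per the post


def classify_move(is_best, evaluation, missed_mate_in_one=False,
                  threshold=ALTERNATIVE_THRESHOLD):
    """Classify a user's move as 'correct', 'try_again' or 'wrong'.

    evaluation is the advantage (in pawns) the user keeps after the move.
    """
    if is_best:
        return "correct"
    # By design, no alternative is accepted when a mate in 1 was available.
    if missed_mate_in_one:
        return "wrong"
    if evaluation >= threshold:
        return "try_again"
    return "wrong"
```

With these assumptions, a +2.0 alternative is accepted at the new 1.75 threshold but would have been marked wrong at the old 2.75 one.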

Regards,
Richard.
Logged
tama
Full Member
***
Posts: 135



« Reply #2 on: July 17, 2008, 03:48:54 am »

Your method of getting points for problems makes no sense because the RD factor is not involved. Also, I don't think we should classify by letters; we already have a rating system that works well.

I really think that the site is lacking in extremely difficult problems, like 2700 standard level and up. I want some crazy variations that make the user calculate 20 moves deep and have a barrage of tactical madness!
Logged

tmr
Jr. Member
**
Posts: 58


« Reply #3 on: July 17, 2008, 05:47:50 am »

I'm not sure a variable winning threshold that depends on user ratings makes much sense.  The goal here has been to find the best move, not to find a move that is enough for a particular user to win.

Say there was a problem where an obvious exchange yields +2 while a much less obvious line yields a queen.  According to the suggested method a highly rated player only has to take the obvious exchange while a lower rated player would get marked wrong for this (or get "there is a better move").  Also the lower rated player would have to use more time to find the queen take versus the higher rated player with the easy exchange.  This means that the blitz rating statistics would have to be tied to rating level.

I think a better method of accommodating this idea would be to allow the user to select an option where a problem attempt would be marked successful, at less than full rating credit, for moves other than the best line.  They could even set their own threshold.  To maintain the integrity of the problem set, ratings on the problem side could remain as normal, that is, the best line would reduce the problem rating and all others would increase it, each at full value as normal.  This way the user could choose to avoid the distraction of "there is a better move" but pay for it with less rating credit.  The difficulty here would be to determine the proper credit so as not to create a separate class of player.
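The asymmetric update suggested here can be sketched with a plain Elo formula. This is only an illustration: the site's actual rating system involves an RD factor, as noted earlier in the thread, and the K-factor, the 0.5 partial credit and the function names are all assumptions.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expected score for `rating` against `opponent_rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def rate_attempt(user, problem, outcome, partial_credit=0.5, k=32.0):
    """Return (new_user_rating, new_problem_rating).

    outcome is 'best', 'alternative' or 'wrong'.
    """
    # User side: an accepted alternative earns only partial credit.
    user_score = {"best": 1.0, "alternative": partial_credit, "wrong": 0.0}[outcome]
    # Problem side stays at full value: only the best line counts against it.
    problem_score = 0.0 if outcome == "best" else 1.0
    new_user = user + k * (user_score - expected_score(user, problem))
    new_problem = problem + k * (problem_score - expected_score(problem, user))
    return new_user, new_problem
```

Picking the value of `partial_credit` is exactly the open difficulty the post mentions; the sketch does not attempt to solve it.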
Logged
boybawang
Newbie
*
Posts: 19


« Reply #4 on: July 17, 2008, 07:34:45 am »

Richard,
How about treating points as per move rather than per problem?  Meaning we still gain points even if we miss the last move.

The per problem rating is more applicable to CTS because their problem solutions are very short. Chesstempo solutions are quite a bit longer, allowing many ways to go wrong.

I know OTB chess doesn't award Elo points on a per-good-move basis.  But nobody is declared lost just because he fails to see a good move.

regards..
Logged
richard
Administrator
Hero Member
*****
Posts: 990



« Reply #5 on: July 17, 2008, 08:24:47 am »

I agree there are a few issues with the user level differentiated idea and I think there are better generator threshold based solutions to the issues mindbreaker mentioned (some of which will be seen in the next problem set update, i.e. better threshold tuning and a couple of bug fixes).

The easiest way to deal with partial credits for alternatives is probably just to create a new rating type, that way you don't have the 'different class of player' issue. I prefer the partial credits for alternative lines over the partial credits for getting part of the move sequence right.

CTS solutions are so often short because too often they don't require the user to play the actual tactical point of the position. CT has a few of these also, but not as many. I don't think users should be rewarded for only getting half the problem, as often the second half is the difference between winning and losing.  If players decide on an OTB move sequence, play the first 2 moves (which were correct) and then blunder on the third move then they will indeed be punished. This is the reason why I'm not really in favour of partial credits just for getting the first few moves right: getting the next few moves correct is often the difference between winning and losing games.

Regards,
Richard.
Logged
boybawang
Newbie
*
Posts: 19


« Reply #6 on: July 17, 2008, 10:07:29 am »

"play the first 2 moves (which were correct) and then blunder on the third move then they will indeed be punished."

But not all of them are blunders. Most of the time they are just non-optimal moves.
What about generating all the blunders in each position? That way you can dynamically control the degree of punishment instead of treating blunders and non-optimal moves as equal.
Logged
richard
Administrator
Hero Member
*****
Posts: 990



« Reply #7 on: July 17, 2008, 11:01:32 am »

Boybawang: If I understand you correctly, I think you are suggesting a distinction between neutral moves and "bad moves", and that neutral moves should not be marked wrong and should get partial credit?  I agree that in a game you don't get punished for moves that keep the position equal. However, given that the goal of doing the tactics is to find a move with a positive tactical outcome, I think it is slightly different from a game situation: here you KNOW there must be a tactic so if you find a move that keeps you neutral then really you've made a mistake and it is reasonable to apply some "negative feedback".

Regards,
Richard.
Logged
mindbreaker
Newbie
*
Posts: 7


« Reply #8 on: July 17, 2008, 11:16:50 pm »

Replies:

“The computer defenses may look crap on the surface but they always lose the least amount of material in the long term (well within the horizon of what the computer engine can see anyway).”

In a real game with live humans, if you are going to try to foil the guy who has started a combination, you don’t hand over the goods right away; you play on the assumption that he will get tripped up in the variations, and if he does, you can recover.  The computer has already decided it is lost and assumes that the opponent sees the solution.  A player in that frame of mind would simply resign; he or she would never play that way.

A smart algorithm would leave the greatest chance to recover if the opponent misses a move.  It would also favor moves requiring more variations and the greatest change in the set of legal moves, and maximize moves that are psychologically more difficult to see (usually moves that physically move away but actually come closer, lines that have a piece touch the same square more than once, and moves that change, in a way relevant to the lines, the legal options of pieces other than the moving one).  A little more difficult to program would be favoring Bishop or Queen moves that go toward the opponent’s camp but attack coming back, and Rooks or Queens sliding long distances from one side to the other through the center of the board.  It would also go for length to a clear position, but I think this is the least important.

This approach is better.  This is one case where the min-max approach is suboptimal.  I think it is better in all cases where a computer plays a human and when humans are playing humans.  Only in computer vs. computer would min-max be better and even that is not certain.  A strong engine may see that the opponent has a combination so it sacs material now to avoid greater loss in some obscure deep variation.  It has now given the opponent the game even though the opponent may have been too weak to see the diabolical variation and may not stumble into it.
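The contrast between the min-max defence and the "practical" defence argued for here can be sketched as follows. Everything in it is hypothetical: the candidate defences, their evaluations, and above all `find_probability`, since estimating how likely a human is to find the refutation is the hard part and is not addressed.

```python
from dataclasses import dataclass


@dataclass
class Defence:
    name: str
    eval_if_refuted: float   # attacker's advantage (pawns) if the best reply is found
    eval_if_missed: float    # attacker's advantage if the best reply is missed
    find_probability: float  # assumed chance that a human finds the best reply


def minimax_choice(defences):
    """Assume the attacker always finds the best reply."""
    return min(defences, key=lambda d: d.eval_if_refuted)


def practical_choice(defences):
    """Minimise the attacker's expected advantage instead."""
    def expected(d):
        p = d.find_probability
        return p * d.eval_if_refuted + (1.0 - p) * d.eval_if_missed
    return min(defences, key=expected)


# Made-up example: a safe material concession versus a risky trap.
candidates = [
    Defence("give up the exchange", 2.0, 2.0, 1.0),
    Defence("set a trap", 9.0, -1.0, 0.2),
]
```

With these made-up numbers the min-max rule concedes the exchange, while the expected-value rule prefers the trap because the attacker is assumed to find the refutation only one time in five.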

If the site was conceived, at least in part, to help players get stronger, it is better to use the variations that would actually be played rather than some defeatist line.  It helps when playing real players.  Those who emulate the practical play of a computer rather than the defeatist play of a computer are going to win more by turning the tables on opponents in sharp positions.

In short: it is almost always truly better to play psychologically in lost positions, as that is your only real chance to change the outcome.

“The generator does reject ALL material winning alternatives if you missed a mate in 1.  This is by design and is based on the idea that if you missed a mate in 1 then "good move try again" is probably being a little too generous.”

A player really has not missed a mate in one if they solved the problem from the start and are just entering the moves quickly to avoid being penalized.  You can’t have it both ways…either you punish for going incrementally or you punish for not.  To do both is not reasonable.  If I saw a mate in 2 or I grab a Queen instead of a mate in one it really makes no practical difference.  If that was the start position well I’ll grant that, but deep into a combo…no.

“Say there was a problem where an obvious exchange yields +2 while a much less obvious line yields a queen.  According to the suggested method a highly rated player only has to take the obvious exchange while a lower rated player would get marked wrong for this (or get "there is a better move").  Also the lower rated player would have to use more time to find the queen take versus the higher rated player with the easy exchange.  This means that the blitz rating statistics would have to be tied to rating level.”

The assumption is that such problems would be included in the set.  I think a good set would not have both an easy-to-see win and a difficult-to-see one, and if it did, the easiest one would be the favored one.

The math part is more interesting.  If, as I suggested, the amount of advantage required were tied to rating linearly rather than tiered, then the system works out just fine and more realistically; it may seem unfair, but in reality the master does not have to be as flashy to win.  By linear I mean that a player rated 1700, midway between B class and A class, would not have to maintain an advantage in the problem of +6 but rather +5.5, because that is between +5 (A level) and +6 (B level).  The required amount of advantage would be more precise: 1733 would require +5.33 etc.  Because it is linear there should be no humps to try to get over and then easy movement higher.  The problems get harder...if you solve harder ones then you go up...it is that simple.  It really would not be significantly different...certainly not compromising the rating system.
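A small sketch of the linear variant, done as interpolation between anchor points. The anchor ratings are an assumption, chosen so that 1700 lands on +5.5 as in the post; because the expert and master steps are 1.5 pawns rather than 1, the result is piecewise linear rather than a single straight line.

```python
# (rating, required advantage in pawns). The ratings are assumed anchors;
# the pawn values are the tier values from the original proposal.
ANCHORS = [
    (1000, 9.0), (1200, 8.0), (1400, 7.0), (1600, 6.0),
    (1800, 5.0), (2000, 3.5), (2200, 2.0),
]


def required_advantage_linear(rating):
    """Interpolate the required advantage between neighbouring anchors."""
    if rating <= ANCHORS[0][0]:
        return ANCHORS[0][1]
    if rating >= ANCHORS[-1][0]:
        return ANCHORS[-1][1]
    for (r0, a0), (r1, a1) in zip(ANCHORS, ANCHORS[1:]):
        if r0 <= rating <= r1:
            fraction = (rating - r0) / (r1 - r0)
            return a0 + fraction * (a1 - a0)
```

Under these anchors a 1700 player needs +5.5, and a 1733 player needs about +5.335, in line with the roughly +5.33 quoted in the post.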

“here you KNOW there must be a tactic so if you find a move that keeps you neutral then really you've made a mistake and it is reasonable to apply some "negative feedback".”

The thing is, neutral is maintaining the won game…even every step in the solution is neutral…but yes, I still want to know there was something “better”, assuming it is shorter or there is a bigger prize to be had.  I think punishment should be minimal though.  It is not like there is anything saying exactly how long a problem is…it is not always obvious that there is a shorter route by one move or two.  In one problem I found a forcing line to mate that was one move longer than the solution, which was by definition “forcing” but allowed any one of a dozen replies; it is just that none of them worked.  In a real game, a human would not dream of allowing the opponent a dozen possible defenses when you can be 100% sure they are forced into a mate with an easy-to-calculate line.  I have played a couple hundred problems without being registered, so I don’t know which problem this was.
Logged
drahacikfm
Sr. Member
****
Posts: 417


« Reply #9 on: July 18, 2008, 12:06:20 am »

“The generator does reject ALL material winning alternatives if you missed a mate in 1.  This is by design and is based on the idea that if you missed a mate in 1 then "good move try again" is probably being a little too generous.”

A player really has not missed a mate in one if they solved the problem from the start and are just entering the moves quickly to avoid being penalized.  You can’t have it both ways…either you punish for going incrementally or you punish for not.  To do both is not reasonable.  If I saw a mate in 2 or I grab a Queen instead of a mate in one it really makes no practical difference.  If that was the start position well I’ll grant that, but deep into a combo…no.

I can't really agree with this one.  Strong players have an unwritten rule that you never just play through a line that you have calculated.  After each move in the line is played, you look at the resulting position to see if you missed anything, and to see if you have something better than what you were planning at the start.  You never just blindly play what you calculated at the beginning.  Every move in the line produces a position you have not seen on the board, but only in your mind.  You should look at it objectively and find the best move.  If there's a mate in one, and the player doesn't see it, then he is playing without thinking, and that should not be rewarded.

I repeat this rule to my son every time it comes up in one of his games.  And I'm guilty of doing this too here, missing a mate in one because I'm just playing what I calculated from the starting position.  I don't take it as a deficiency of the site, but as a wake up call that I was getting sloppy and not following the basic rule of looking at each position with a fresh eye, even if only for a few seconds.
« Last Edit: July 18, 2008, 12:10:57 am by drahacikfm » Logged

FIDE Master Drahacik
richard
Administrator
Hero Member
*****
Posts: 990



« Reply #10 on: July 18, 2008, 12:58:53 am »

Hi mindbreaker,

I completely agree with your discussion of the computer defense line. I only mentioned that the computer line loses the least material because some users see some of the computer defenses as outright blunders.  They may be blunders from a human psychology point of view, but the computer should always have a method in its madness. I would like to be able to choose more human-like lines in some of these situations, but it is not a particularly easy problem to solve.  I have a few ideas, and some users have contributed ideas worth looking at here, and I hope to be able to improve the generator in this area in the future.  It is a fairly interesting area in computer chess in general. I'm sure that in computer vs human games the computer could improve its performance by following some of the heuristics you and others have suggested. Does anyone know of any computer engines that can be configured to try to play the less optimal line from a material point of view in the hope that the human will miss a tricky combination?

Some of your other points are more of a philosophical nature and I don't think the 'right' approach is completely clear.  I do agree with drahacikfm's response that there isn't much difference between missing a mate in 1 at the start or at the end; either way it indicates there was something fairly important in the position at that point that was overlooked.  As I mentioned, at the moment this is a stylistic issue; it would be easy for me to treat mate in 1s the same as other alternative situations, however for now I'm reasonably comfortable with the current treatment.

I don't think the different problems for different ratings is an idea without merit, however it is more complicated and thus more difficult to understand and reason about.  I can imagine complexities arising that were not foreseen before rolling out such a system.  At the moment the benefits of moving to such a system are not (in my mind) large enough to justify the development time and, more importantly, the user-perceived complexity of the problem selection/rating/solving process.  While not perfect, the existing rating/problem selection system does a reasonable job of serving up problems appropriate for the user's current rating. This is a lot coarser than tuning the path through the problem based on user rating but is obviously a lot easier for users (and me :-) ) to understand.

Regards,
Richard.
Logged
mindbreaker
Newbie
*
Posts: 7


« Reply #11 on: July 18, 2008, 01:36:52 am »

Perhaps if I knew more about how one is docked for incrementally solving problems, I could propose a better solution.  If I knew that for each move I was making I had 5 seconds before deductions kick in (like a Fischer clock), I would be OK with the mate-in-one issue.  If the total time after the first move is what is being scored, then I think that is a poor solution.

A sort of "clock" on the side that counts down how much you can gain/lose at that moment, in tenths or hundredths of an Elo point (if it were solved right then), would be nice.
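One possible reading of the per-move allowance suggested here, as a sketch: only the time beyond a 5-second grace on each follow-up move counts against the solver. The function name and the decision to leave the first move's thinking time untouched are assumptions.

```python
GRACE_SECONDS = 5.0  # per-move allowance suggested in the post


def penalised_time(first_move_time, later_move_times, grace=GRACE_SECONDS):
    """Time that counts against the solver under a per-move grace period."""
    # Only the overrun beyond the grace on each follow-up move is added.
    overrun = sum(max(0.0, t - grace) for t in later_move_times)
    return first_move_time + overrun
```

For example, 20 seconds on the first move followed by moves taking 2, 7 and 5 seconds would count as 22 seconds, since only the 7-second move overruns the grace.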

It may give away the difficulty somewhat but without the actual rating of the problem displayed, that damage should be minimal.
Logged
richard
Administrator
Hero Member
*****
Posts: 990



« Reply #12 on: July 18, 2008, 05:47:02 am »

Hi mindbreaker,

A few users have requested to see the time allocation, but I do think that it exposes a fair bit about the difficulty of the problem.  Time punishment after the first move is very simple at the moment. Any time you spend after the first move is simply added on to your total time; in other words, time after the first move is essentially counted twice.  The FAQ gives a high level overview of how elapsed time is treated in relation to the problem average solve time.  I should probably review the FAQ here, as I've recently made a few tweaks in this area (although the statements are still generally true, some of the details may need updating).
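The time rule described here amounts to the following arithmetic. This is a sketch of the description only, with hypothetical names; how the resulting figure is then compared with the problem's average solve time is covered by the FAQ and not modelled.

```python
def effective_time(first_move_time, later_move_times):
    """Total elapsed time with everything after the first move counted twice."""
    after_first = sum(later_move_times)
    total_elapsed = first_move_time + after_first
    # Time spent after the first move is added on to the total a second time.
    return total_elapsed + after_first
```

For example, 20 seconds on the first move and 14 seconds spread over the remaining moves would be treated as 48 seconds.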

Regards,
Richard.
Logged
slacker00
Jr. Member
**
Posts: 63


« Reply #13 on: July 26, 2008, 09:42:33 am »

mindbreaker, can you give an example problem #?  I'm having difficulty understanding.  I think the current problem set seems pretty good in terms of forcing the user to find a "winning" move.  I know sometimes a problem may allow one to win an exchange, but there is a different variation which wins much more.  I think it's fair to ask the user to find the much better solution.  (for example, 38250)

As for the computer making "bogus" moves, I think it's hilarious.  But often it makes the problem easier.  Also, when playing in a real chess game, the opponent often will not play the moves that one expects.
Logged