Chess Tempo

Author Topic: Feedback  (Read 1745 times)
chesseric
Newbie
*
Posts: 15


« on: December 20, 2007, 06:53:20 pm »


Hi and thx for a great chess problem site! :) I thought I would give some feedback. Some of it might have been brought up before, I don't know, but here goes...

The primary issue I've been thinking about is how well solving the problems here corresponds to a realistic situation OTB (over the board). This is usually a problem with chess exercises, because in an exercise it is self-evident that you are expected to find some clever move, which of course tremendously increases the probability that you will find it compared with an OTB situation. (There are exceptions: I hear some exercise collections contain examples where no clever move can be found, to keep the solver "honest".) So one way to make the exercises harder would be to have more variation in them. Specifically, I was thinking of adding positions where a defensive move is required instead of an offensive one. (From my experience with the exercises here I think the solution is an offensive move 100% of the time, with the consequence that I know what to search for and can often guess the first move even if I don't see the whole solution. This narrow selection of problem types might be the reason I get rated 1900+ here, while in real life I'm probably 1200-1400 :o ) Would it be hard to have the exercise constructor construct/find other types of positions?

There is also some explicit information shown for each problem that gives clues about what to search for. For instance the rating, sometimes with funny effects, like a problem involving a hanging piece getting rated 1900+ :D (I'm guessing people peek at the rating and then exclude the possibility of a trivial solution).

Another giveaway that I think is far too explicit is the tag(s). I suggest that this information is kept hidden by default. This problem will increase as people add tag information.

About the Quality parameter: It seems to me that people rate the quality of problems completely subjectively. I have seen many completely logical problems that have been voted bottom quality, and I wonder why? :-/ Because the player didn't find the solution perhaps? If this is how it is used, I think the quality information serves no purpose and could be removed.

A more reasonable case for giving a problem a low quality, and this brings me to the last subject, is the problem with "missing moves". I think you are aware of this; I think it was even discussed in some thread? The problem is that sometimes it's hard to see why a certain move is good. This might be hard to fix? Maybe it's tricky to get the exercise constructor to actually pick the opponent's less strong, but more educational, variation? Maybe it would be a good idea to comment all such positions, with information as to why the move works (= telling why the opponent's ostensible counter-reply does not work).

Thx
/ chesseric

richard
Administrator
Hero Member
*****
Posts: 990
« Reply #1 on: December 21, 2007, 05:15:31 am »

Hi Chesseric,

Thanks for the feedback.  Removing rating and tag information from the problem whilst it is in play is already on the todo list (although it still has a few things above it).  The quality setting can definitely be problematic. As you point out, some people mark the problem as low quality because their solution was rejected, but have not seen that their solution was not just not the best, it was outright wrong.  The hope is that the longer the site is around for the more people will have rated problems thus diluting the erroneous ratings.  "Quality" will always be somewhat difficult in that different users have different ideas on what constitutes a quality problem. At the moment it helps me identify issues in the problem generator, i.e. if several people rate a problem as low quality then it's worth looking at trying to avoid generating that type of problem (after verifying that the raters haven't just missed the point).

The missing moves issue is a difficult one. At the moment it's a compromise between allowing move sequences to reach their natural conclusion and stopping them from going on beyond the tactical point. It's been hard for me to be 100% successful at both goals, so the current situation is a compromise. I try to err on the side of too short over too long, as it's fairly frustrating to be told you're wrong after getting the tactic correct but then having to produce one (or more) further moves in the tactical sequence. Whilst I don't have the information to tell users explicitly why a move may be wrong, there is the "analysis" button (only shows up for non-mate tactics) to show what the computer thinks is the line showing best play for both players from the last position in the tactic. This is often useful in seeing where the position was heading (although it doesn't show the threats you avoid, which is often the point).

Showing defensive as well as offensive problems would require fairly fundamental changes to the problem generator. Currently it works by looking for large changes in position evaluation in lots of real games. When it finds one, it decides a tactical possibility has been created and begins to look for it. Defensive moves work on the notion of preventing negative evaluations arising, so my current approach will not find these types of moves. I can think about how I could augment the current generator or create a new one to produce some of these types of positions, but at the moment there is a lot of work to be done in other areas.
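
To make the "large changes in position evaluation" idea concrete, here is a minimal sketch in Python. It is only an illustration, not the site's actual generator: the function name, the 3-pawn threshold and the sample evaluations are all assumptions, and the evaluations are taken to be per-ply engine scores in pawns from White's point of view.

```python
from typing import List

def find_eval_swings(evals: List[float], threshold: float = 3.0) -> List[int]:
    """Return the ply indices where the evaluation jumps by at least `threshold`.

    evals[i] is the (hypothetical) engine evaluation after ply i, in pawns,
    always from White's point of view. A large jump between two consecutive
    plies suggests the last move created a tactical possibility.
    """
    return [
        i for i in range(1, len(evals))
        if abs(evals[i] - evals[i - 1]) >= threshold
    ]

# Made-up evaluations for a short stretch of a game: roughly level,
# then a sudden jump after ply 4.
game_evals = [0.2, 0.1, 0.3, 0.2, 4.8, 4.9, 5.1]
print(find_eval_swings(game_evals))
```

Each flagged ply would only be a starting point; the search for the actual tactic still has to happen from that position.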

Anyways, thanks again for the feedback, hopefully I can get to some of these things in the new year.

Regards,
Richard.
 
chesseric
« Reply #2 on: December 30, 2007, 01:49:40 pm »

Just to discuss the problem in the hope of clarifying it. The kind of problems that "need more moves" are those where, for instance, the tactic includes taking a piece that is ostensibly defended. So the user's reaction might be to ask: why can't he just take back and equalize? In this case it is not about getting the moves right or wrong after you find the tactic; rather, the user never really saw (as in understood) the tactic but was guessing. The technical problem is, as I understand it, that your generator looks for the opponent's best reply, and that's why the example ends (the opponent doesn't take back). The generator would have to let the opponent respond with a less than optimal move (i.e. let it take back) in order for the tactic to become explicit in the problem. Another problem with this reasoning, though, is that what counts as an "explicit" tactic is relative to how good you are at identifying tactics. For instance, in the case of an overload attack most would say it is completely obvious why taking the piece works. But there are many problems where it is really hard to see why it works (i.e. the idea of the tactic only becomes explicit some moves later).

The other problem is finding defensive moves. Just an idea: as I understand it, your generator basically searches for blunders. What if, when you find a blunder, instead of just considering how to exploit it, you back up one move and check the scores for the two best moves the player had (at the point of the blunder)? If the difference between those two scores is large, then you know that there was a unique way to avoid the blunder, and hence the position might be a good defensive problem. What do you think?

Btw, it would be interesting to know what chess engine you use and how you program it (what language?). I thought it would be cool to make a private problem collection out of my own games only ;) Perhaps you could also get help with the work from good programmers? As long as the site is free, I think there would be ppl willing to help out.
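
The "back up one move" idea can be sketched as a small check. This is a hedged illustration only: the function name, the 3-pawn gap and the scores are assumptions, and all scores are taken from the point of view of the player who blundered, in pawns. Note that it only detects that one move stood out; it cannot tell whether that move was defensive or a missed attacking chance.

```python
def defensive_candidate(best: float, second: float, played: float,
                        min_gap: float = 3.0) -> bool:
    """Flag the position *before* a blunder as a possible problem.

    best / second: engine scores of the two best moves available.
    played:        engine score of the move actually played.
    Returns True when the played move lost at least `min_gap` compared with
    the best move AND the best move beat the runner-up by at least `min_gap`,
    i.e. there was exactly one way to avoid the blunder.
    """
    blundered = (best - played) >= min_gap
    unique_best = (best - second) >= min_gap
    return blundered and unique_best

# One saving move (0.0); everything else, including the move played, loses.
print(defensive_candidate(best=0.0, second=-5.0, played=-5.0))
# Two moves hold the balance, so there is no unique solution.
print(defensive_candidate(best=0.0, second=-0.5, played=-5.0))
```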

richard
« Reply #3 on: December 30, 2007, 06:25:59 pm »

Hi Chesseric,

You are correct on the technical problems involved. It would be possible to try and look also at the "wrong moves", but it's a trade-off between more information and CPU time to find more problems. Once the problem set gets to a size and quality I'm happy with, it becomes more feasible to revisit the problem set and analyse it further to provide more information to the problem solver.

The blunder idea sounds interesting, but I think your approach doesn't really differentiate between defensive moves and alternative non-defensive tactics by the person who made the blunder. That is, the reason for the large difference between best and second best may simply be that there was an important offensive tactic that the "blunderer" missed, rather than actually avoiding the blunder as such. I think it's actually quite hard to work out when a move is a good defensive move; from the computer's point of view it just keeps play even. It is hard for a non-human to tell if the move was obvious in the situation, or an unusual move which saved a sticky position. I think eventually allowing human-produced problems is the best way of dealing with this; this is still a while off though.

The tactics finder is written in C. At the moment it talks to any UCI compatible chess engine to get position evaluations and n-best move suggestions (I currently use Toga 1.3, but have tried a couple of others). I had experimented with modifications to the chess engine I was using to try and improve problem generation, but decided it was better to have chess engine independence and do all of the post-processing in my own code. Currently it's about 4.5K lines of fairly messy code. The problem generation was much harder than I'd first thought, and hence the code wasn't really designed upfront to deal with the complexities that arose whilst trying to improve the generator. I'd like to be able to distribute the code at some point, if for no other reason than to get others to contribute CPU time. (The code is distributed, in that multiple clients running on different machines analyse positions given to them by a central server; when results are ready they are sent back to the server and a new position is given out. I usually have about 4 machines working on problem generation at any time.) However, the code is in such a state that I wouldn't be happy distributing it as is, so this is very much a long term thought.

Regards,
Richard.


« Last Edit: December 30, 2007, 06:45:48 pm by richard »

chesseric
« Reply #4 on: December 31, 2007, 03:00:39 am »

Hi Richard,

Thanks for the insight into your programmer's issues and technicalities. :) The way the work is distributed from a server sounds impressive. 8) I recognize that programming problems almost always turn out to be more complicated than one first thinks. I wrote a Yahoo-to-PGN chess notation converter and thought it would be easy, but found out that the things that seem so obvious when you look at the board can get really complicated when formalising them in a program :D It took more than 400 lines in the end.

As for the "more moves" problem, I understand the issue with having to spend more CPU time. Actually, when thinking about it, I'm not even sure how to find / identify the instructive continuation. I mean, it's not like a chess engine thinks like a human ("why doesn't he do that obvious move??"). Hmm, maybe one could identify them if one could force the engine to calculate a score at only a very shallow depth, so that if the score when looking only 2 ply ahead is much higher than when looking 6 or more, then the 2 ply variation might be the "instructive" one.
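
A rough sketch of that shallow-versus-deep comparison, under stated assumptions: the scores for each candidate reply are supplied as two dictionaries (one from a 2 ply search, one from a deeper search), scored from the replying side's point of view in pawns. The function name, the 2-pawn margin and the sample moves are all hypothetical.

```python
from typing import Dict, List

def tempting_replies(shallow: Dict[str, float], deep: Dict[str, float],
                     margin: float = 2.0) -> List[str]:
    """Return replies that look fine at shallow depth but fail at deeper depth.

    shallow[m] / deep[m]: score of reply m from the replying side's point of
    view at a shallow and a deep search respectively. A reply whose shallow
    score exceeds its deep score by at least `margin` is the kind of
    "obvious" move a human would want to see refuted.
    """
    return sorted(
        move for move in shallow
        if move in deep and shallow[move] - deep[move] >= margin
    )

# Made-up numbers: the recapture bxc3 looks equal at 2 ply but loses
# heavily once the engine looks further; Kh8 is bad at any depth.
shallow_scores = {"bxc3": 0.0, "Kh8": -3.0}
deep_scores = {"bxc3": -5.0, "Kh8": -3.2}
print(tempting_replies(shallow_scores, deep_scores))
```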

I was thinking the same as you, that a problem found by "backing up" might actually be a forfeited offensive tactic. This might be the case if several blunders were made after each other, but if it is the first in a series, say at move nr N, I feel that a simple inductive argument gives: if that blunder was a forfeited offensive exploit, then the exploit must have existed because of a blunder at move N-1 or even earlier, contradicting that it was the first in the series; hence it couldn't have been a forfeited offensive. But I haven't thought this through 100%!

Anyway, I was also wondering, isn't it really a sufficient criterion for having found a possible chess problem position, if the score difference between the two best moves is large? But I agree that in general that criterion does not only give offensive tactics, but all kinds I guess. It would be nice if one could tell the chess engine to show the scores for the two best moves. That might not be a common feature. I don't think Crafty has a command for that.

Happy New Year folks! 8)

richard
« Reply #5 on: December 31, 2007, 06:02:05 am »

Hi Chesseric,

The difference in evaluation between the best and second best moves is part of the story, but you also need to look at it relative to the previous positions' evaluations. For example, if the evals are 0 and -10 then this simply means that there is only one good move and the second best move would be a blunder. Also, if they were +10 and 0 then the best move might be a nice tactical play, but you need to know what the evaluation was before the opponent's last move. So if the eval was already +10 before the opponent's last move, then the +10 just keeps the position in its current status, with the second best move (0 evaluation) being a big blunder.

So the key is really changes in evaluation between moves. If the position is even and then suddenly +10 (assuming no material has been given up yet), then you can assume there is a tactical possibility in play that makes the +10 evaluation possible (remembering that these are not static valuations of the current piece value, but the value that could be achieved down the track given best play by both sides).
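
The three cases above can be written out as a small classifier. This is a sketch under assumptions rather than the generator's real logic: the function name, the labels and the 3-pawn threshold are invented for illustration, and every evaluation is taken from the point of view of the side to move, in pawns.

```python
def classify(prev_eval: float, best: float, second: float,
             swing: float = 3.0) -> str:
    """Classify a position from three evaluations.

    prev_eval: evaluation before the opponent's last move.
    best:      evaluation of the best move now.
    second:    evaluation of the second best move now.
    """
    gained = (best - prev_eval) >= swing   # did the opponent's move hand us something?
    unique = (best - second) >= swing      # does one move clearly stand out?
    if gained and unique:
        return "tactic"
    if gained:
        return "gain, but no unique best move"
    if unique:
        return "only move"
    return "quiet"

print(classify(prev_eval=0, best=10, second=0))    # even, then suddenly +10
print(classify(prev_eval=10, best=10, second=0))   # was already +10: just holding on
print(classify(prev_eval=0, best=0, second=-10))   # one good move, the rest blunder
```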

Actually finding when a tactic starts turned out to be relatively easy; the tricky bit was knowing the best point to cut off a tactic. This is not really perfect as it relies on a number of heuristics, but I think this is one of the areas where CT is a little stronger than CTS. I find CTS problems are cut short much more often than CT problems (CT also has weaknesses compared to CTS in other areas; in particular CT has more sequences that look like they would never be played by a human, a very long mating sequence for example).

Currently I only use the best move versus second best move evaluation to work out if a position is ambiguous. If the best and second best moves are too close then even if I know there is a tactic in play I discard that position. UCI enabled chess engines have a "multipv" option which allows seeing the evaluation for the n-best lines. Not all engines support this; as you point out, Crafty unfortunately does not (Toga is considered stronger than Crafty in any case).
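
For anyone wanting to try this with their own games, here is a hedged Python sketch of the ambiguity check on top of UCI output. The `info ... multipv N score cp X ... pv ...` line format is what the UCI protocol specifies once the engine has been sent `setoption name MultiPV value 2`; everything else (the function names, the 150 centipawn threshold, the stand-in value for mate scores, the sample lines) is an assumption for illustration.

```python
from typing import Dict, List, Optional, Tuple

MATE_CP = 100_000  # assumed stand-in so "mate in N" scores compare as huge values

def parse_info(line: str) -> Optional[Tuple[int, int, str]]:
    """Parse one UCI "info" line into (multipv rank, score in centipawns, first move)."""
    tokens = line.split()
    if (not tokens or tokens[0] != "info" or "multipv" not in tokens
            or "score" not in tokens or "pv" not in tokens):
        return None
    rank = int(tokens[tokens.index("multipv") + 1])
    s = tokens.index("score")
    kind, value = tokens[s + 1], int(tokens[s + 2])
    if kind == "mate":
        cp = MATE_CP if value > 0 else -MATE_CP
    else:
        cp = value
    first_move = tokens[tokens.index("pv") + 1]
    return rank, cp, first_move

def is_ambiguous(info_lines: List[str], min_gap_cp: int = 150) -> bool:
    """True when the best and second best lines are closer than `min_gap_cp`."""
    scores: Dict[int, int] = {}
    for line in info_lines:
        parsed = parse_info(line)
        if parsed is not None:
            rank, cp, _ = parsed
            scores[rank] = cp  # later (deeper) lines overwrite earlier ones
    if 1 not in scores or 2 not in scores:
        return False  # no second line reported, e.g. only one legal move
    return scores[1] - scores[2] < min_gap_cp

sample = [
    "info depth 10 seldepth 14 multipv 1 score cp 320 nodes 51000 time 40 pv d4e5 f6e4",
    "info depth 10 seldepth 14 multipv 2 score cp 290 nodes 51000 time 40 pv c3d5 f6d5",
]
print(is_ambiguous(sample))
```

Feeding it real engine output would mean starting the engine as a subprocess and collecting the `info` lines printed before `bestmove`; that part is left out here.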


Regards,
Richard.

roq
Jr. Member
**
Posts: 50
« Reply #6 on: January 02, 2008, 03:55:47 pm »

"From my experience with the exercises here I think the solution is an offensive move 100% of the time, with the consequence that I know what to search for and can often guess the first move even if I don't see the whole solution. This narrow selection of problem types might be the reason I get rated 1900+ here, while in real life I'm probably 1200-1400"

The accepted wisdom seems to be that there is no connection between your rating on Chess Tempo / Chess Tactics Server and your actual playing strength. Personally, I think this is rubbish, since as soon as you admit that stronger players more often than not have higher Tempo ratings than weaker ones, you are admitting that there is some kind of correlation between the two systems you are comparing.

I think that the correlation between Tempo and, say, ICC blitz rating would be quite strong, although I can’t prove it since it’s too hard to obtain accurate comparable data on people’s ratings. It’s my contention that tactics/analysis strength is by far the major constituent of your chess rating when you are in the 1000-2000 ELO range, and that a model that ignores the other factors would still give a reasonable correlation. Computers, for instance, are lousy at strategy, but they consistently beat all but the strongest players using their tactical strength. Some time ago I wrote a simple chess program that pretty much just used material and space as the two factors in its position evaluation function and it wasn’t that bad... What chess programmers tend to find is that, given a time constraint, searching for tactics at an extra ply of depth usually results in a stronger program than adding a more sophisticated evaluation function that takes longer to apply to each node of the move tree.

Note that, of course, you can’t just say: “I have a Glicko of 2000 on Tempo therefore my ICC blitz rating will be about the same”, since there are other factors that a model would need to consider, such as accuracy and number of tries. However, a naïve linear model would be a good place to start, and might be a reasonable approximation over a narrow rating range. (It ignores number of tries for now; also, the effect of accuracy is certainly not linear, and the model does not take account of differing levels of inflation over the rating curve.) It might be:

ICC_Rating = (Glicko_On_Tempo + X) - Y * (100 - Success%_On_Tempo)
       
In my opinion X would be negative for Chess Tempo, because Tempo tends to overrate compared to other systems (Richard is such a nice guy that he likes to flatter his users), whilst Chess Tactics Server tends to underrate its users so X would be positive on a comparable formula for CTS.

e.g. if for chess tempo we set X = -200 and Y = 10, then for  ChessEric with Glicko 1950 and success 65% we get:

ICC_Rating = (1950 - 200) - 10 * (100 - 65) = 1400

OK so it’s a bit of a fudge, but I think the approach is reasonable.
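
For anyone who wants to play with different values of X and Y, the formula translates directly into a few lines of Python. The function and parameter names are made up for this sketch, and X = -200, Y = 10 are just the illustrative values from the example above, not fitted constants.

```python
def estimate_icc(glicko: float, success_pct: float, x: float, y: float) -> float:
    """Naive linear estimate: (Glicko + X) - Y * (100 - success%).

    x shifts the whole scale (negative if the site overrates),
    y is the penalty per percentage point of failed problems.
    """
    return (glicko + x) - y * (100 - success_pct)

# The worked example: Glicko 1950, 65% success, X = -200, Y = 10.
print(estimate_icc(1950, 65, x=-200, y=10))  # 1400
```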

Note that Tempo credits success better than chess tactics server (which massively favours solving speed) and Y would be much higher for CTS.

However, I think that even on Tempo increasing the penalty for inaccuracy would be a good way of reducing the guessing that Chesseric mentions. IMHO guessing is the major thing that reduces the benefit you get from solving tactics problems.

chesseric
« Reply #7 on: January 03, 2008, 02:53:02 am »

Hi,

I still think using the best vs second best score would yield good problems. The only thing I would fear is that some of the positions so found would be trivial problems, but on the other hand, your collection already contains many trivial problems (taking a hanging piece). I don't think looking at the change in evaluation would then be required: having a position where there is a unique move that avoids a blunder is in itself a guarantee that some tactic is involved, offensive or defensive or otherwise, the only risk being (as I mentioned) that the tactic involved is trivial. Using this criterion might even solve the problem of how many moves of the variation you should include: include moves as long as there is a unique best move (but then there was that tricky problem with "more moves needed", but that's another issue). The defensive problems so created will probably in most cases be only one move long.
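
The "include moves as long as there is a unique best move" rule could look something like this in Python. It is only a sketch of the suggestion: the function name, the 2-pawn gap and the sample line are assumptions, and each step is taken to carry the solver's best move together with the scores of the best and second best moves at that point.

```python
from typing import List, Tuple

def cut_line(steps: List[Tuple[str, float, float]],
             min_gap: float = 2.0) -> List[str]:
    """Keep solver moves only while the best move is clearly unique.

    steps: (best_move, best_score, second_score) for each solver move, in
    order, scored from the solver's point of view in pawns. The line is cut
    at the first move where the runner-up is within `min_gap` of the best.
    """
    kept: List[str] = []
    for move, best, second in steps:
        if best - second < min_gap:
            break
        kept.append(move)
    return kept

# Made-up line: two forced moves, then several roughly equal continuations.
line = [("Nxe5", 5.0, 0.0), ("Qh5", 5.2, 1.0), ("Rd1", 5.1, 4.9)]
print(cut_line(line))
```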

Or put it another way: Looking for a blunder will only generate problems out of moves where someone really made a blunder, but looking for unique best moves will find the positions where tactics are involved, independent of whether the player blundered or actually found the move.

Roq, yes I see that the CT ratings and chess playing ratings shouldn't be directly compared, for several reasons, including others than those you mentioned (in fact I'm not even 1200 at blitz on FICS). I agree that tactics are very important for improving at chess at my level. My point was that doing these CT problems is somewhat unrealistic, because OTB you can't have an attitude every move that "there's gotta be some offensive tactic to exploit here", because not considering defense would leave you vulnerable (not seeing the defensive considerations is the reason for most of my blunders, I think). So all I'm saying is that the problem set would be even better (= more realistic) if there were also problems where a defensive move is called for. That way you would be challenged to think more carefully about the position.

Cheers!
 
roq
« Reply #8 on: January 05, 2008, 09:21:59 pm »

Yes I agree with you that defensive problems would be a great addition and they are often very tricky. Unfortunately I know that Richard's enhancement list is really long (I've also been annoying him with tricky enhancement requests :) ) and I don't think adding defensive problems would be very easy at all.

Over on CTS some guys have developed a solving technique where instead of concentrating on your rating you try and get your accuracy as high as possible. What you find is that if you really push your accuracy you get to a stage where you *must* analyze the problems to maintain your accuracy level and guessing won't work. On CTS I have accuracy at about 92%, but I find that even at that level one still guesses on many problems and I'm attempting to push it up over 95%. On Chess Tempo getting high accuracy is much harder than on CTS, partly due to the wider range of problems you get at a particular rating, but also I think because Tempo has more unusual problems than CTS. BTW - If you do adopt this approach, expect your rating to plummet!

chesseric
« Reply #9 on: January 12, 2008, 11:48:39 pm »

Hi Roq,

Problem nr 18087 illustrates my point well. I could do NxR without hesitation, because I "knew" the problem would not include defensive issues. But in order to learn something from it I afterwards also analysed why black's counter by pushing e2 doesn't work out. In a real game, I'm not sure I would have dared NxR. At least it would have taken much longer for me to be sure.

About the defensive problems issue. I'm not sure if it is that hard. My suggestion above was that instead of searching for moves where the position score changes a lot, you search using the "there is a unique best move" criterion (the score difference between the best and the second best move is big). I haven't got a proof, but it seems to me that that would generate both offensive and defensive problems.

roq
« Reply #10 on: January 13, 2008, 01:35:16 am »

Hi again

That's an interesting one and I agree with you that it is not a very good problem as it stands. I think the reason the computer doesn't give more moves is that after the knight takes there are several different possible next moves that the computer regards as being close to equal with each other in evaluation. It can't really distinguish between them so it has to end the line. Presumably then your "unique best move" algorithm would reject this type of problem in some way?

Of course what we want here is for the computer to go 1...f2, but unfortunately that leads to black getting mated in 7 moves, so in fact if the solution was extended the computer would play an apparently nutty move such as 1...Rxc7.
     
       
richard
« Reply #11 on: January 13, 2008, 02:28:24 pm »

Hi,

Looking at unique best moves is certainly something that can be looked at. However, by itself, without some other heuristics, it has the potential to lead to a large number of relatively uninteresting defensive positions. So for example, after black's move at move 20 you have an evaluation of +2 (from black's point of view).

After white's move
21. random_move_a

the  evaluations for best and second best moves for black at this point might be:
Best Move: +2
Second Best Move: -7
 
So the best move here has a big gap to the second best move, however this may be as simple as your queen is threatened, move it away or lose it on the next move. I see it as similar in nature to a tactic which takes a hanging piece (which we have a few of in the set, but hopefully not enough to be annoying).  One argument might be that we could throw a constrained number of these into the set so that users can't automatically assume they are dealing with an offensive move.  What gets more difficult is finding defensive moves that are also "interesting".


Regards,
Richard.

chesseric
« Reply #12 on: February 04, 2008, 08:33:34 pm »

Yes, that was exactly the issue I was thinking about: what criterion to use to get more interesting moves. That might be a tricky problem to solve. Somehow a problem of the trivial defensive sort has already slipped into the problem set; actually it's so absurd it's even funny (at least I laughed :D). I'm referring to problem nr. 15805, where you get to make the one and only legal move before you get mated! LoL!

roq
« Reply #13 on: February 04, 2008, 11:45:05 pm »

Chesseric

Ah yes, you’ve found that one I see. Of course you saw the joke and did appreciate that this is in fact a signature problem intentionally inserted into the set as a memento mori, in the same way that Holbein hid a skull in the corner of his painting “The Ambassadors”? In case you didn’t, you can easily see the allegory: The constrained king has only a single move and that runs into the final quick blow by the bishop that represents the swift fall of an axe head. Whilst in the final position the cluster of pieces is reminiscent of a staring skull with the black pieces as sockets. Meanwhile, the white queen and rook are helpless onlookers, poised forever on the brink of their own execution of the black king, never to achieve it although winning near their goal...

richard
« Reply #14 on: February 05, 2008, 12:08:41 pm »

.. Of course you saw the joke and did appreciate that this in fact a signature problem intentionally inserted into the set as a memento mori in the same way that Holbein hid a skull in the corner of his painting “The Ambassadors”? In case you didn’t you can easily see the allegory: The constrained king has only a single move and that runs into the final quick blow by the bishop that represents the swift fall of an axe head. Whilst in the final position the cluster of pieces is reminiscent of a staring skull with the black pieces as sockets. Meanwhile, the white queen and rook are helpless onlookers, poised forever on the brink of their own execution of the black king never to achieve it although winning near their goal...       


<smile>

Yes, a clever display of "oblique anamorphosis" that's what it was! (cough - bug - cough).

Richard.