Chess Tempo

Author Topic: Problem problems!?  (Read 1003 times)
roq
Jr. Member
**
Posts: 50


« on: January 21, 2008, 03:27:28 pm »

Firstly, I’m noticing a higher percentage of suspect problems since the update. I suspect that is because they haven’t yet been pushed up to a high Glicko rating, so they appear more frequently now than they will once the ratings settle down. I might also mention that I’ve rarely seen a mate over six moves long that doesn’t have at least one of: 1) an ambiguity problem, 2) an early win of material versus a long mate, or 3) the computer sacrificing material too early. Most of the very long mates (i.e. 8 moves and over) are probably unsolvable by humans, and even Fritz has difficulties with many of them.

One type of problem I’m seeing more frequently is where the solution does not justify the first move. In #15923, for example, a human would probably play the solution move 1...Rxf8 anyway, just because it seems more natural than 1...Qxf8, but the computer then plays the bemusing 2.Qd2?!, losing immediately. The natural human response for White is 2.Kxg2, but the computer has seen that this loses to the following complex line, which most humans are not going to see:

1...Rxf8 2.Kxg2 Qf6 3.Qe1 Nf3 4.Qe2 Nh4+ 5.Kh3 Qf5+ 6.Kg3 Qf4+ 7.Kh3 g4+ 8.Qxg4 Qxe3+

In problem #22247 the solution ends just when things are getting interesting. Black has the counter-tactic 3...Rxc2+, winning back the queen, and it’s only because of the counter-counter-tactic 7.Rhd1!, pinning and winning the knight, that White comes out on top.

1.Nxe6 fxe6 2.Bxf6 Rxc4 3.Bxd8 (Solution ends) Rxc2+ 4.Qxc2 Bxc2 5.Kxc2 exd5 6.Rxd5 Kxd8 7.Rhd1

A couple of problems I’m not sure about:

#18925 This is a very hard six-move mate with no ambiguous mates, where the correct move is 1.Rb6+!!. However, after 1.Ra6+!? instead, White wins the knight after 3 moves as shown and later ends up a queen up. You could argue this both ways. I didn’t know how to rate this one! What do you think?
   
1.Rb6+ [1.Ra6+ Kc4 2.Ra4+ Nb4 3.Qxb4+ Kd5 4.Qb3+ Ke4 5.Qc2+ Kf3 6.Qxc3+ Kg4 7.Rxd4+ Bxd4 8.Qxd4+] 1...Kc4 2.Qa6+ Kd5 3.Qa8+ Kc4 4.Qa2+ Kc5 5.Qa5+ Kc4 6.Qb5#

#29593 Here you have a choice of taking the queen with a winning position or a two move mate. Would it really be wrong to take the queen or not? I’m not sure. I know that some people get frustrated by these. At the moment I’m not rating these.

Some of the dubious problems just need a simple bit of editing rather than being discarded (discarding problems worries me since I’m suspicious that tightening the generator’s parameters will reduce the diversity that I think is one of the main attractions of CT over CTS).
 
P.S. There are a lot of great problems in the new update as well! 
richard
Administrator
Hero Member
*****
Posts: 589


« Reply #1 on: January 26, 2008, 01:26:33 pm »

Hi Roq,

Apologies for the delayed response. I've been away, so I haven't had much time to catch up on the forum. One of the drawbacks of introducing so many problems at once is certainly the increased exposure of the more problematic positions to a wider range of users. Of course, the most highly ranked players see these positions reasonably often, and I suspect this reduces the usefulness of the site to the highly proficient, so I'm keen to move forward with problem set improvements.

Problems of the type seen in #15923 are amongst the trickiest to deal with. Any position where the computer sees something that a human finds hard to notice is problematic. I think these are prime candidates for human filtering rather than trying to tweak the generator.

#22247 looks bad enough to be a bug.  I'd have to check the source code to be sure, but I suspect the generator decided an overwhelming advantage had been achieved after the first queen take so did not proceed, despite the action that unfolds just after that.  This check was added to the code to avoid the feeling some users had that "I've taken the queen now, why would I bother trying to look at any more moves?".  Unfortunately it seems I'm not checking properly for counter play.  I'll look into this later.
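As a rough illustration of the counterplay check described above, the sketch below only ends a solution once a large material advantage has survived a few further plies. Everything in it is an assumption made for illustration: the function name truncation_index, the centipawn threshold, the settle_plies window and the idea of feeding it a list of material balances are hypothetical and are not taken from the Chess Tempo generator.

```python
from typing import Sequence


def truncation_index(material: Sequence[int],
                     threshold: int = 500,
                     settle_plies: int = 2) -> int:
    """Return the index of the last ply to include in the solution.

    material[i] is the solver's material balance in centipawns after
    ply i of the engine's main line (hypothetical input format).
    """
    for i in range(len(material)):
        # Look at the balance now and over the next settle_plies plies.
        window = material[i:i + 1 + settle_plies]
        # Stop here only if the advantage is large and is not won back.
        if min(window) >= threshold:
            return i
    # The advantage never settled, so keep the whole line.
    return len(material) - 1
```

With settle_plies=0 this behaves like a naive "stop at the first big material win" rule; a larger window keeps the solution going when the opponent can immediately win the material back, which is the situation reported for #22247.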

For #18925, I also struggle to decide about these. Sometimes it's clear that the material win is a more sensible move, but in this case it's not really clear cut. The knight take alone might not be enough to justify missing the mate, and the queen take occurs a bit further out... some might argue the 6 move mate is just as clear, if not clearer.

#29593 is another gray area.  It gets more obviously annoying as the length of the mating sequence increases.  My personal opinion is that for mates this short I feel comfortable preferring the mate over the Queen take.

I'm also concerned about getting too harsh at the generator stage; diversity is certainly at risk as the generator is made to throw away more potentially good tactics. I've just finished the latest test run and it would prune the current set back to just over 11K problems and leave only 13% of the problem set as mates. This was after getting rid of all mates which had longer mates visible within the search time used. It also used a new heuristic to try and get rid of positions that had high material winning alternatives to the mate. I don't have hard data on this, but it seemed the latter heuristic discarded as much as the longer mates change, which surprised me a little.

I haven't had time to load up the new set for testing.  I'll probably do that this week (relax - not on the live site :-)  ), to look at how diversity has stood up under this fairly harsh pruning.

Anyway, thanks for the ongoing feedback, it's really useful to have these concrete problem numbers to allow me to look at whether test runs cull these or not. Hopefully the test runs will indicate things are progressing in the right direction, although the length of time required for each run means it will still be a few weeks before a new generator can start to generate an official update (which itself will take some time, as I'll have to generate enough problems with the new generator to avoid making an update where the total number of problems goes backwards).


Regards,
Richard.


roq
« Reply #2 on: January 26, 2008, 08:27:51 pm »

Quote from: richard
I'm also concerned about getting too harsh at the generator stage; diversity is certainly at risk as the generator is made to throw away more potentially good tactics. I've just finished the latest test run and it would prune the current set back to just over 11K problems and leave only 13% of the problem set as mates. This was after getting rid of all mates which had longer mates visible within the search time used. It also used a new heuristic to try and get rid of positions that had high material winning alternatives to the mate. I don't have hard data on this, but it seemed the latter heuristic discarded as much as the longer mates change, which surprised me a little.

That sounds like very radical pruning; it would remove over half the problems. I don’t think the issues are that serious. A finger-in-the-air estimate from my experience, including the new set, would be that about one in four to one in five problems have issues, some of which aren’t that serious. A reasonably conservative, seat-of-the-pants filter that might reduce the mating issues a bit (that may or may not be difficult to do!?) without removing many sound problems might be: 1) remove all mates that happen in greater than six moves, and 2) remove all mates with ambiguous mates one move longer that are two moves or greater.
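The two rules above could be sketched roughly as follows. This is only one reading of them: rule 2 is interpreted here as "the intended mate is two moves or longer and some other first move mates exactly one move later", and the MateProblem record, its field names and keep_problem are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MateProblem:
    mate_in: int                       # length of the intended mate, in moves
    alt_mate_in: Optional[int] = None  # shortest mate after a different first move, if any


def keep_problem(problem: MateProblem, max_len: int = 6) -> bool:
    """Apply the two seat-of-the-pants rules to a single mate problem."""
    # Rule 1: drop mates longer than max_len moves.
    if problem.mate_in > max_len:
        return False
    # Rule 2 (assumed reading): drop mates of two or more moves that
    # also allow a mate exactly one move longer.
    if (problem.alt_mate_in is not None
            and problem.mate_in >= 2
            and problem.alt_mate_in == problem.mate_in + 1):
        return False
    return True
```

For example, a mate in 3 with an alternative mate in 4 would be dropped, while a mate in 3 whose only alternative mate takes 6 moves would be kept.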
 
I think, perhaps, in positions with an exposed king, there can be more ways to win material, because 1) if you eliminate material defending the king then mates become more likely and 2) tactics against the king, such as pins, skewers etc., are absolute, often forcing the loss of pieces with check.
 
Quote from: richard
Anyway, thanks for the ongoing feedback, it's really useful to have these concrete problem numbers to allow me to look at whether test runs cull these or not.

Thanks, that’s reassuring. It’s hard to judge whether information is useful or not. In general, I only put things in the forum problem section if I think they highlight some new kind of issue or extend information on a known issue. If you need some more specific problems with known issues for your analysis, I tend to rate most of the problems I do (and comment on all the problems that I think have issues), making the following use of the star system:

1* - Problems that I think are suspect and may need to be removed or edited.
2* - Dubious with some issues that don’t appear as serious as 1*.
3* - Bread and butter. Problems I think are technically OK, but that don’t inspire me to the heights of eulogy.
4* - Good to very good.
5* - Excellent to brilliancy prize.

The last two months’ ratings/comments are probably quite reasonable, as I now have some grasp of the issues thanks to your explanations.

P.S. I hope you had a good break. I understand the snow is quite good at the moment (at least in Europe), but unfortunately I can’t get away right at the moment...
richard
« Reply #3 on: January 28, 2008, 10:44:32 am »

Hi Roq,

Snow might be ok in Europe, but not so great in Sydney which is where I was holidaying :-)

I think your seat of the pants method would be more than adequate and would avoid the bulk of the annoying mate versus long mate positions.

The reason I'm experimenting with removing all of the mates that have longer mates is that the second best move will then always give me information on the forward projected material balance, which allows me to remove a number of other annoying mates that would still remain in the system after the "seat of the pants" heuristics. For example queen sack in 1 move versus mate in 5. Currently my feeling is that these types of alternative moves are much more annoying than "mate in 4 found but mate in 3 was required". Removing ALL ambiguous mates lets me have a go at removing both.
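A minimal sketch of the two-stage idea described above, under stated assumptions: the Candidate record, its field names, keep_mate and the 500 centipawn cut-off are all hypothetical, and the real generator may measure the second-best move quite differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    second_is_mate: bool                        # does the second best move also mate?
    second_material_gain: Optional[int] = None  # projected gain in centipawns, if not a mate


def keep_mate(candidate: Candidate, max_alt_gain: int = 500) -> bool:
    """Decide whether a mate problem survives both pruning stages."""
    # Stage 1: discard every mate that has an alternative (longer) mate.
    if candidate.second_is_mate:
        return False
    # Stage 2: the second best move now always carries a material score,
    # so a heavy material win as the alternative can be detected.
    gain = candidate.second_material_gain or 0
    return gain < max_alt_gain
```

The design point is that stage 1 is what makes stage 2 possible: once no surviving problem has a mate as its second best move, the second best score can always be read as a material balance.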

I'm a little concerned about removing the mate in > 6 moves as I like the idea that there are still legitimate but very difficult problems for the very skillful players.

I'm not that scared of throwing out a large number of problems if it can give a significant improvement in problem set quality without decreasing diversity too much.   I'll hopefully get a chance to assess the diversity of the last test run in the next day or so. In terms of the total problem set size at completion, I still haven't got through more than 10% of the total games I have available, and I currently only look at one potential tactic per game (the tactic closest to the end).  So even without looking further through existing games I could still end up with well over 100K problems.

Regards,
Richard.
roq
« Reply #4 on: January 28, 2008, 05:15:00 pm »

Quote from: richard
I'm a little concerned about removing the mate in > 6 moves as I like the idea that there are still legitimate but very difficult problems for the very skillful players.

Yes, you are right of course :-( I found a sound and original 7 move mate, but unfortunately lost the problem # (for some reason I’d logged off by mistake), so I have had to revise my opinion on removing all mates of over six moves.

However, note that IMHO *many* mates over this length are "unsolvable", by which I mean not that I can’t solve them, but that Fritz can’t see them in a reasonably short time. What I think this means is that the problems are of the type referred to by Kotov (in Think Like a Grandmaster) as dense coppices of variations. Fritz (my Fritz is running at about 2.5 million nodes/sec) has to look at many millions of nodes on its move tree to find the shortest solution, rather than following a relatively linear sequence of forced moves, something no human can emulate of course. One way of filtering out these problems might be to find out how long an engine takes to find the solution and mark as dubious the ones where the engine takes over some certain time (that's what I've done when I use the word "unsolvable" in my comments). I think that as the number of solution moves gets greater, the number of unsolvable problems compared to solvable ones will increase exponentially. However, I think I’m in egg sucking territory again, so I’d better stop here...

P.S. 100,000 problems would be awesome!           
« Last Edit: January 28, 2008, 09:42:22 pm by roq »
roq
« Reply #5 on: January 29, 2008, 12:52:19 am »

Quote from: richard
The reason I'm experimenting with removing all of the mates that have longer mates is that the second best move will then always give me information on the forward projected material balance, which allows me to remove a number of other annoying mates that would still remain in the system after the "seat of the pants" heuristics. For example queen sack in 1 move versus mate in 5. Currently my feeling is that these types of alternative moves are much more annoying than "mate in 4 found but mate in 3 was required". Removing ALL ambiguous mates lets me have a go at removing both.

I’m not too sympathetic with this approach :-) It seems a bit like bathing a pair of twins and throwing one of the babies out with the bath water in order to detect whether the other baby should be thrown out too. :-D



richard
« Reply #6 on: January 30, 2008, 03:37:27 pm »

I like your suggestion of looking at the time the computer has spent.  Something like nodes examined compared to depth searched might be a reasonable measurement of how much branching the tree contained.
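One common way to turn "nodes examined compared to depth searched" into a single number is an effective branching factor, the depth-th root of the node count. The sketch below uses that measure as an assumption; the function names and the cut-off of 4.0 are hypothetical and are not figures from this thread.

```python
def effective_branching(nodes: int, depth: int) -> float:
    """Return b such that b ** depth == nodes (average branching per ply)."""
    if nodes < 1 or depth < 1:
        raise ValueError("nodes and depth must both be at least 1")
    return nodes ** (1.0 / depth)


def looks_unsolvable(nodes: int, depth: int, max_branching: float = 4.0) -> bool:
    """Flag a problem whose solution needed an unusually bushy search."""
    # A forced, fairly linear line gives a factor close to 1; a dense
    # thicket of variations pushes the factor up.
    return effective_branching(nodes, depth) > max_branching
```

Unlike raw search time, this ratio does not depend directly on how many nodes per second a particular machine manages, although different engines prune differently and so would still give different node counts.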

I agree with the sentiments on baby and bathwater; however, I think the "eggs/omelette" view is also relevant here. To get a great omelette instead of just a good one, we have to break a few eggs. As eggs are cheap we can afford to use a lot of them. Of course this only stands up as long as we haven't thrown out so many eggs that we only have mediocre ones left. I still have to verify that the eggs we have left are still of sufficient quality (and range of tastes - not everyone likes their omelettes the same :-) ).

Regards,
Richard.
roq
« Reply #7 on: February 01, 2008, 08:28:15 pm »

I think your idea of using the number of nodes and depth to approximate the complexity is better than mine of using the time taken, and also much more comparable over different engines (I wonder how an ancient program such as Belle would compare to, say, Fritz? It would be an interesting experiment).

P.S. I'd hate to see the state of your kitchen after you've been cooking omelettes :-)
« Last Edit: February 02, 2008, 12:07:24 pm by roq »
richard
« Reply #8 on: February 02, 2008, 02:03:18 pm »

Quote from: roq
P.S. I'd hate to see the state of your kitchen after you've been cooking omelettes :-)

:-)
