Chess Tempo

Topic: Problem 9357: glockenspiel in need of service
revenant
Full Member
***
Posts: 167


« on: September 17, 2008, 11:00:40 pm »

http://chesstempo.com/chess-problems/9357

Several commenters at the problem correctly point out that after 1. Re4 b6, although 2. Qa3 is best, white's try 2. Qb4 should be accepted as an alt.  The variations sidebar shows Toga evaluating 2. Qb4 as only +0.89 at depth 16.  If you give it some time and let it search outward an additional 2 or 3 ply, you should see the eval leap up over the alt threshold to +4 or more, much closer to the value of 2. Qa3.  At least, that's what happened when I fed it to Crafty 22.0.
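To make the point concrete, here is a rough Python sketch of the alt check as I understand it. The threshold value, the function names, and the choice of depth 18 for the deeper figure are my own assumptions for illustration; only the +0.89 at depth 16 and the "roughly +4 a few ply deeper" come from the analysis described above.

```python
# Hypothetical alt threshold in pawns. The real value used by the site
# is not stated in this thread; 3.0 is only a placeholder.
ALT_THRESHOLD = 3.0


def is_alt(eval_pawns: float, threshold: float = ALT_THRESHOLD) -> bool:
    """Return True if a non-best move scores high enough to count as an alt."""
    return eval_pawns >= threshold


def alt_status_by_depth(evals_by_depth: dict, threshold: float = ALT_THRESHOLD) -> dict:
    """Map each search depth to whether the move would pass the alt check there."""
    return {
        depth: is_alt(score, threshold)
        for depth, score in sorted(evals_by_depth.items())
    }


# Evals for 2. Qb4: +0.89 at depth 16 (sidebar), about +4 after a few
# more ply (my Crafty run). Depth 18 is an illustrative stand-in for
# "2 or 3 ply deeper".
qb4_evals = {16: 0.89, 18: 4.0}

print(alt_status_by_depth(qb4_evals))
```

With any threshold between +0.89 and +4, the same move is rejected at the shallower depth and accepted at the deeper one, which is the whole complaint.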

It would be a shame if the values converge so closely that the problem becomes ambiguous and gets rejected, but I guess sometimes you have to throw the glockenspiel out with the bathwater.  After all, when the crows come home to roost under the carnival awning, a nod's as good as a wink to the deaf post on which they sit.

Might there be a number of unfairly rejected alts lurking in other problems?  In the "Extra analysis for near alternatives" thread a couple of weeks ago, cyanfish proposed allocating more CPU time to check for them.  If you can't afford to do this for the whole problem set, maybe you could pare it down to just the ones already rated over 1800 or thereabouts, as those will be likeliest to produce head-scratching complexities.
richard
Administrator
Hero Member
*****
Posts: 991



« Reply #1 on: September 17, 2008, 11:58:32 pm »

I could be wrong, but I don't think these situations are necessarily an issue of complexity.  The horizon problem is pretty common in search problems (where you stop looking just before you get to the next interesting item over the current horizon), and there will also be a certain percentage of problems that exhibit the "search a bit longer and see a big jump in eval" behaviour.  I'm not sure that high rated problems will definitely have a lot more of these than lower rated problems.

Sometimes humans get lucky and find a problem where they can get the evaluation to agree with them if they ask the computer to search longer, but the time spent on each position at the moment should be enough to look further ahead than most humans will bother to look when solving problems here. How many users are actually looking at this problem and seeing that after 20 ply they are in a better position than if they only looked for 17 or 18 ply?  There is also the question of who is to say that another branch would not open up at 21 or 22 ply and swing the eval back in the other direction.  I think this is one of those situations where a line has to be drawn at some point, and the time allowed at the moment (on currently available CPUs) means that line is around 16-18 ply (+- a couple of ply depending on the complexity of the position).  When the line ends up being inadequate, as it probably is here, then I can manually kill the problem.

Regards,
Richard.
revenant
Full Member
***
Posts: 167


« Reply #2 on: September 18, 2008, 02:37:54 am »

Good points all.  If we're looking for heuristics to help identify problems that can benefit from extra CPU, the problem's rating by itself may not be much of a clue.  Maybe we could look for certain aspects of the position.  Problem 9357 has an open board with queens and rooks still on.  I find that to be a common source of "reversals of fortune" or "good news, bad news" tactics.  (You know those stories schoolkids make up and tell each other...  "The good news was, we were going to Acapulco.  The bad news was, the plane's engines blew.  The good news was, we bailed out with parachutes.  The bad news was, we were falling toward cactus.  The good news was, we were able to steer toward water.  The bad news was, there were alligators in the water!...")  Other such types are positions with promotions in the air (especially promos for both sides in a race), or where all pawns are gone or locked in place so that stalemate becomes a viable tactic for the side down material.

It may be unusual for an eval to leap from +1 to +4 within 2 extra ply when the search is already at depth 16 as in 9357, but I doubt we really need to worry very often that the eval might then plummet back in another 3 ply.   Once an eval goes up, doesn't it generally stay up?  Otherwise the quiescence heuristic would be useless.  If weird late leaps happen in only 1/n cases, then wouldn't we expect weird late leaps followed by weirder later plummets to happen in only 1/(n-squared) cases?
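Here is the arithmetic behind that guess as a tiny Python sketch. It assumes that a late plummet after a leap is about as rare as the leap itself (1 in n) and that the two events are independent, which is my own simplification and may not hold for real engine searches. The function name and the example value of n are made up for illustration.

```python
from fractions import Fraction


def double_reversal_probability(n: int) -> Fraction:
    """Chance of a late leap followed by a later plummet.

    Assumes each event has probability 1/n and that the two are
    independent, so the joint probability is (1/n) * (1/n).
    """
    p_leap = Fraction(1, n)
    p_plummet_after_leap = Fraction(1, n)  # assumed equal to p_leap
    return p_leap * p_plummet_after_leap


# Illustrative only: n = 20 is not a measured figure.
print(double_reversal_probability(20))
```

If the two events are positively correlated (say, wild positions produce both), the true figure would be higher than 1/(n-squared), so treat this as a rough floor for the estimate and not a measurement.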
richard
Administrator
Hero Member
*****
Posts: 991



« Reply #3 on: September 18, 2008, 07:51:49 am »

Hi Revenant,

Good point on the 1/n versus 1/(n^2). I agree that double reversals should be much less likely.  Intuitively it also seems to make sense that the more material on the board, the more likely a reversal is going to be.  I'm not sure an open or closed board would make as much difference as the amount of material, but I guess there may be some bias towards open positions being more likely to have reversals.  I'll add an item to the todo list to look at spending more time in material heavy positions, but this will probably be a "wait until rewrite of generator" change.
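For illustration only, a rule like that might look something like the Python sketch below. This is not the generator's code: the piece values are the usual textbook ones, and the base time, the material cutoff, the multiplier, and the function names are all hypothetical placeholders.

```python
# Conventional piece values in pawns; kings are not counted.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}


def total_material(fen: str) -> int:
    """Sum the material for both sides from the piece-placement field of a FEN string."""
    placement = fen.split()[0]  # first FEN field holds the board layout
    # Digits, '/' and kings are not in PIECE_VALUES, so they add 0.
    return sum(PIECE_VALUES.get(ch.lower(), 0) for ch in placement)


def analysis_seconds(fen: str,
                     base_seconds: float = 60.0,   # hypothetical default budget
                     heavy_cutoff: int = 40,       # hypothetical "material heavy" cutoff
                     multiplier: float = 2.0) -> float:
    """Give material-heavy positions a longer analysis budget."""
    if total_material(fen) >= heavy_cutoff:
        return base_seconds * multiplier
    return base_seconds


start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
pawn_ending = "8/8/8/4k3/8/4P3/4K3/8 w - - 0 1"

print(total_material(start), analysis_seconds(start))
print(total_material(pawn_ending), analysis_seconds(pawn_ending))
```

A single cutoff is the simplest possible version; scaling the time smoothly with material, or adding a bonus for open positions with queens and rooks still on, would be natural variations.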

Regards,
Richard.