Chess Tempo

Author Topic: New problem set  (Read 425 times)
drahacikfm
Sr. Member
****
Posts: 417


« on: May 21, 2008, 06:16:08 pm »

The new problem set is much more fun.  For example, when I now see after 20 seconds that I can win the queen for a rook, I just do it.  Previously, I had to spend an extra minute or two looking for weird mates-in-5 that might or might not be there.  Now I skip that part and just win the queen :)

FIDE Master Drahacik
richard
Administrator
Hero Member
*****
Posts: 988



« Reply #1 on: May 21, 2008, 06:38:12 pm »

I'm very happy to hear that :-)  Mind you, there are still some mates lurking in the system where alternative material-winning lines exist.  However, my guess is that these are now so uncommon that you can almost always enjoy taking the queen without getting burnt.

I had an unrelated question that came to mind after looking through some of the recent problem comments.  A number of non-mating problems have been rated low quality because the position requires the user to find a rook take when a (perhaps easier) knight take is available.  This is related to what my generator considers to be winning.  Generally speaking, I don't allow positions where two moves lead to a "won" position (this is easier to enforce for non-mates than for mates, as the mates hide the material evaluation).  My definition of "won" is a material advantage of slightly more than a minor piece.  This means that going a knight up is not considered winning, so positions with both a rook take and a knight take are allowed, and the rook take must be found.
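The rule described above can be sketched roughly as follows. This is only an illustration, not the generator's actual code: the threshold value `WON_THRESHOLD = 3.25` and both function names are assumptions, chosen to stand in for "slightly more than a minor piece".

```python
from typing import Sequence

# Assumed value for "slightly more than a minor piece", in pawns (hypothetical).
WON_THRESHOLD = 3.25


def is_won(evaluation: float) -> bool:
    """Return True if the evaluation counts as a "won" position."""
    return evaluation > WON_THRESHOLD


def has_unique_winning_move(evals: Sequence[float]) -> bool:
    """Accept a position only if exactly one candidate move leads to a won position."""
    return sum(1 for e in evals if is_won(e)) == 1


# Rook take (+5.0) next to a knight take (+3.0): only the rook take is "won",
# so the position is allowed and the knight take would be marked wrong.
rook_vs_knight = has_unique_winning_move([5.0, 3.0, 0.1])

# Two moves above the threshold: the position is rejected as ambiguous.
two_winners = has_unique_winning_move([5.2, 4.8])
```

With these assumed numbers, `rook_vs_knight` is `True` and `two_winners` is `False`, which is exactly the behaviour the low-quality ratings are complaining about.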

How reasonable do you think this is?  I think part of the issue is that my definition of "won" here is a bit naive.  Whether going a knight up is winning or not is probably highly dependent on the nature of the position at that point, and the computer doesn't do a great job of assessing that.  I could possibly be a bit stricter in this area when working on the next version of the problem generator.  What do you (and anyone else reading) think about positions like this?

Regards,
Richard.
drahacikfm
Sr. Member
****
Posts: 417


« Reply #2 on: May 21, 2008, 08:33:30 pm »

When considering a rook-take and a knight-take, are you evaluating those based purely on the material remaining on the board, or on the computer evaluation?  Purely by material, one is +5 and the other is +3.  But the computer might say the rook-take position is worth +4.5 and the knight-take position is +3.5, a difference of only 1.0, which makes the problem ambiguous.  It is better to use the computer evaluation, not just the material.  I think computers are pretty good at adding the other considerations, besides material, into the evaluation.

I think there are two goals in the problem generator:

1) To find problems where there is clearly a "best" move.  This doesn't have to be a clearly winning move!  It could be a drawing move in a position where all other moves lose.  Or a move that leaves you one pawn up, while all other moves leave you a piece down.  Problems where the best move is not clearly winning are actually very practical and good for training.

2) To reject problems with more than one "winning" line.

So a solution could be like this:

To satisfy Goal 1, the best line has to be at least 3.0 better than the second-best line, according to the computer evaluations.

To satisfy Goal 2, the second-best line cannot have a computer evaluation higher than 2.0.  Yes, I think it's reasonable to lower your "winning" level from being a full minor piece up, because we have included Goal 1, a minimum 3.0 difference between best and second-best.
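As a rough sketch, the two goals as proposed in this post could be checked like this. The constant and function names are made up for illustration, and the evaluations are assumed to be engine scores in pawns from the solver's point of view.

```python
from typing import Sequence

BEST_MARGIN = 3.0      # Goal 1: best line must beat the second best by this much
SECOND_BEST_CAP = 2.0  # Goal 2: second-best line may not be "winning"


def passes_goals(evals: Sequence[float]) -> bool:
    """Check Goals 1 and 2 against the engine evaluations of the candidate moves.

    Assumes at least two candidate moves are supplied.
    """
    ranked = sorted(evals, reverse=True)
    best, second = ranked[0], ranked[1]
    clear_best = best - second >= BEST_MARGIN   # Goal 1
    no_second_win = second <= SECOND_BEST_CAP   # Goal 2
    return clear_best and no_second_win


# Rook take at +4.5 against knight take at +3.5: rejected as ambiguous.
ambiguous = passes_goals([4.5, 3.5])

# A drawing move (0.0) where everything else loses: accepted, even though
# the best move is not winning.
saving_draw = passes_goals([0.0, -3.5, -5.0])
```

Here `ambiguous` comes out `False` and `saving_draw` comes out `True`, matching the point that the best move does not have to be a winning one.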

FIDE Master Drahacik
richard
Administrator
Hero Member
*****
Posts: 988



« Reply #3 on: May 22, 2008, 04:01:04 am »

I mostly use computer evaluations throughout the generator, although there are some situations where I use a straight material count (such as working out the material balance in mating positions).

In terms of goals 1 and 2, I do have code that attempts to achieve both; however, I'm currently less strict than your guidelines.  One of the issues with requiring the best move to score 3.0 better than the second best is that you will never have exchange-winning moves in the problem set.  For this reason I've set the best-to-second-best threshold closer to 2.0.
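The arithmetic behind this can be shown with the conventional piece values (pawn 1, knight and bishop 3, rook 5, queen 9). The sketch below is a simplification: it uses plain material rather than engine evaluations and assumes the second-best line leaves the position roughly level. The function names are hypothetical.

```python
# Conventional material values in pawns.
PIECE_VALUES = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}


def exchange_gain(captured: str, given_up: str) -> int:
    """Material gained by winning `captured` at the cost of `given_up`."""
    return PIECE_VALUES[captured] - PIECE_VALUES[given_up]


def clears_margin(gain: float, margin: float) -> bool:
    """True if the gain over a level second-best line meets the required margin."""
    return gain >= margin


gain = exchange_gain("rook", "knight")   # winning the exchange: 5 - 3 = 2
strict = clears_margin(gain, 3.0)        # False: a 3.0 margin excludes it
relaxed = clears_margin(gain, 2.0)       # True: a 2.0 margin keeps it
```

In practice engine scores are not whole numbers, so an exchange win will land somewhere around 2.0 rather than exactly on it.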

If I understand your criteria correctly, one outcome would be that all positions with a choice between a rook take and a knight take would be deemed ambiguous?  I'm not currently this strict, but I would not have a big problem with becoming stricter here.  I would be a little concerned about throwing away tactics that win the exchange, though.

I'm probably going to do a fairly large rewrite for the next generator.  It has grown in a way that makes it hard to add further features (when I started writing it I expected a rather small program that let the chess engine do most of the work, but as I've had to add heuristic on top of heuristic, things have gotten a bit out of control).  I'll keep all the heuristics I've developed so far, but I'll implement them in a somewhat cleaner way.

I also need to keep the evaluation data around in a more convenient format.  At the moment I don't have a convenient way of investigating why the generator liked a problem.  There is a lot of debugging output, but I don't keep it in an easily accessible format.  Doing so will not only allow better analysis of why the generator is making certain decisions, but will also allow me to expose these decisions more directly to users in the UI.
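One possible shape for such a record, purely as an illustration: the post does not say what format will be used, so every class name, field name, and sample value below is an assumption. The idea is simply to store the per-move evaluations and the accept/reject reason as structured data instead of free-form debug output.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class CandidateEval:
    move: str          # move in algebraic notation
    evaluation: float  # engine score in pawns


@dataclass
class ProblemRecord:
    position: str                       # e.g. a FEN string
    candidates: List[CandidateEval] = field(default_factory=list)
    accepted: bool = False
    reason: str = ""                    # which heuristic decided the outcome


# Hypothetical sample data, not taken from the real problem set.
record = ProblemRecord(
    position="<FEN of the position>",
    candidates=[CandidateEval("Rxd8", 5.1), CandidateEval("Nxe5", 3.2)],
    accepted=False,
    reason="second-best line is also winning",
)

# asdict() converts nested dataclasses too, so the result serialises directly.
serialised = json.dumps(asdict(record), indent=2)
restored = json.loads(serialised)
```

A record like this could be queried later to see why a problem was accepted, and the `reason` field is the kind of thing that could be surfaced in the UI.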



Regards,
Richard.
drahacikfm
Sr. Member
****
Posts: 417


« Reply #4 on: May 22, 2008, 04:31:34 pm »

Yes, I see your point that the 3.0 difference between first and second lines for Goal 1 is too much.  It eliminates lines that win the exchange.  Keeping that at 2.0 should be fine.

With Goal 2 set at 2.0, problems will be eliminated where the position is otherwise completely equal, but you can grab a free knight (+3) or a free rook (+5).  That's good, because it shouldn't be marked wrong to take the knight.  I think 3.0 is too high for Goal 2, because it would allow these positions.

A position where you are down 2 pawns, but you can grab a free knight (resulting in +1) or a free rook (resulting in +3) would still be allowed, because only the rook-take is over the threshold of 2.0.  That's a good generator result.  Because if you have a knight for 2 pawns, you are not clearly winning.

I left out Goal 3, which is obvious and you probably already include it:  the best move has to be at least drawing.  You don't want problems where the position is lost :)  So Goal 3 is:  the best move must have an evaluation of at least 0.00.
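Putting the three goals together with the revised 2.0 margin, the worked examples from this post can be checked with a small sketch. The names are hypothetical and the evaluations are assumed to be engine scores in pawns.

```python
from typing import Sequence

BEST_MARGIN = 2.0      # Goal 1: gap between best and second-best line
SECOND_BEST_CAP = 2.0  # Goal 2: second-best line must not exceed this
MIN_BEST = 0.0         # Goal 3: best move must be at least drawing


def accept_problem(evals: Sequence[float]) -> bool:
    """Apply Goals 1-3 to the evaluations of the candidate moves (needs >= 2)."""
    ranked = sorted(evals, reverse=True)
    best, second = ranked[0], ranked[1]
    return (
        best - second >= BEST_MARGIN
        and second <= SECOND_BEST_CAP
        and best >= MIN_BEST
    )


# Equal position, free knight (+3) or free rook (+5): rejected by Goal 2,
# so nobody is marked wrong for taking the knight.
equal_position = accept_problem([5.0, 3.0])

# Two pawns down, free knight (+1) or free rook (+3): accepted,
# because only the rook take is over 2.0.
two_pawns_down = accept_problem([3.0, 1.0])

# Best move still leaves a lost position: rejected by Goal 3.
still_lost = accept_problem([-1.0, -4.0])
```

The results are `False`, `True`, and `False` respectively, in line with the cases described above.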

FIDE Master Drahacik
richard
Administrator
Hero Member
*****
Posts: 988



« Reply #5 on: May 22, 2008, 05:03:44 pm »

Thanks Drahacik,

I think one of the issues with the current generator is probably that I've focussed too much on relative scores rather than absolute scores, i.e. many heuristics examine the best evaluation in relation to the pre-tactic evaluation and the second-best move evaluation, but don't put enough weight on the absolute evaluations of the non-best lines.  This has allowed a number of rook-versus-knight situations into the set where the user will be marked wrong if they choose the knight over the rook.  I'll have to think a bit harder about improving this area for the next verification run.

Regards,
Richard.
slacker00
Jr. Member
**
Posts: 63


« Reply #6 on: May 25, 2008, 10:03:42 am »

Your discussion has got me thinking.  Why not have more problems where the user is losing but can mitigate his losses and get to a playable or drawn position?  For example, getting a draw in a losing position, or finding a variation that "only" loses the exchange where other variations are much worse.  I'm not sure what others think about these kinds of problems, but I thought I'd throw it out there for discussion.
richard
Administrator
Hero Member
*****
Posts: 988



« Reply #7 on: May 25, 2008, 06:53:14 pm »

Hi Slacker00,

The current generator can find tactics where a losing position can be turned into a winning or even position by winning material.  I suspect there are not that many in the set, as opportunities for tactics probably start to dry up once you are at a material disadvantage.  Unfortunately, the generator doesn't deal with drawn positions that well at the moment.  Drawn positions are treated the same as a completely even position (the engine sends 0.00 as the evaluation when a draw is found, and I don't check for this).  I'd like to be able to deal with drawn positions at some point, as I think it is an area where there are a number of interesting tactics that would be nice to add to the problem set.
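One way a check for this might look, as a heuristic sketch only: it is not what the generator does, and the function name and the -200 centipawn cutoff are assumptions. It flags positions where the best move scores exactly 0.00 while every alternative is clearly losing. A score of exactly 0 can still come from an ordinary level position, so this would only produce candidates for further verification.

```python
from typing import Sequence


def looks_like_saving_draw(evals_cp: Sequence[int], losing_cp: int = -200) -> bool:
    """Flag a possible drawing tactic from engine scores in centipawns.

    Centipawn integers are used so the comparison with 0 is exact.
    """
    ranked = sorted(evals_cp, reverse=True)
    if len(ranked) < 2:
        return False
    best, second = ranked[0], ranked[1]
    return best == 0 and second <= losing_cp


# Best move holds the draw, everything else loses heavily: flagged.
candidate = looks_like_saving_draw([0, -350, -500])

# Merely level position with a near-equal alternative: not flagged.
level = looks_like_saving_draw([0, -20])
```

With these inputs `candidate` is `True` and `level` is `False`.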

The idea of the "defensive" tactic, such as your example where you have to find the move that only loses the exchange, is an interesting area that another user brought up a while ago.  There are a few things that make it tricky for the problem generator in its current form to deal with these, but it is something I might look at in the future.

Regards,
Richard.
tmr
Jr. Member
**
Posts: 58


« Reply #8 on: May 26, 2008, 01:15:47 am »

I suppose stalemates and perpetual checks would be draw situations that would be easy to find (or even create by having the opponent make the "wrong" move), though I don't have a clue how prevalent these are in real-world games.  Still I think they would enhance the character of the problem set.
richard
Administrator
Hero Member
*****
Posts: 988



« Reply #9 on: May 26, 2008, 10:12:24 am »

Quote from: tmr on May 26, 2008, 01:15:47 am
I suppose stalemates and perpetual checks would be draw situations that would be easy to find (or even create by having the opponent make the "wrong" move), though I don't have a clue how prevalent these are in real-world games.  Still I think they would enhance the character of the problem set.

I think so also.

Regards,
Richard.