PDN parsing issues
This section gives an overview of some issues with existing PDN definitions, in particular those that matter when writing a PDN parser.
Game Separator (1)
The * symbol is used both as a game terminator/separator and as a move strength indicator to denote a forced move. This introduces an ambiguity. For example, the string 1-6* 32-28 can be interpreted as one game containing two moves, or as two games separated by a *. Since * is commonly used in draughts publications to denote forced moves, the preferred solution would be to disallow * as a game separator and to use a different symbol such as #. However, this would completely break backward compatibility. A less intrusive solution is to disallow * as a move strength indicator. Note that there is an alternative available in the form of the $7 numeric annotation glyph. Yet another solution is to require that there be no space between a move and its corresponding move strength. Then a move and its move strength can be defined as one token.
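The last option can be illustrated with a small tokenizer. This is only a minimal sketch under stated assumptions: the token set is heavily simplified (numeric moves, a short illustrative list of strength suffixes, and a standalone * as game separator) and is not the official PDN grammar. The names TOKEN_RE and tokenize are made up for this example.

```python
import re

# Assumed, simplified token set; not the official PDN grammar.
# A move carries its optional strength suffix, with no space in between.
TOKEN_RE = re.compile(r"""
    (?P<MOVE>\d+(?:[-x]\d+)+(?:[!?]{1,2}|\*)?)   # 1-6, 32x23, 1-6*, 31-27?!
  | (?P<GAME_SEP>\*)                             # a standalone * ends a game
  | (?P<WS>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Return a list of (kind, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Under these assumptions, tokenize("1-6* 32-28") gives two MOVE tokens (one game, the first move marked as forced), while tokenize("1-6 * 32-28") gives a MOVE, a GAME_SEP and another MOVE (two games). The whitespace alone decides the reading, so the ambiguity disappears at the token level.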
Game Separator (2)
It is common practice to terminate games with their result. In PGN this is no problem, since chess results differ substantially from chess moves. But in draughts some results, like 1-0 and 1-1, are very similar to normal draughts moves. This complicates parsing. For example, if the result 1-1 is defined as a token, then parsers may easily get confused by a move like 1-18. Several parsers insist on parsing this as 1-1 followed by an 8. This problem is likely to occur when a move is split up into separate tokens.
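The 1-18 problem can be reproduced with two small regex-based lexers. This is a sketch, not a description of any particular parser: the result list (1-1, 1-0, 0-1) is an assumed illustrative subset, and the names NAIVE_RE, WHOLE_RE and lex are hypothetical. The first lexer splits moves into separate tokens and tries the result first; the second treats a move as one token and only accepts a result that is not followed by another digit.

```python
import re

# Split-token lexer: results, numbers and hyphens are all separate tokens,
# and the result alternative is tried first.
NAIVE_RE = re.compile(
    r"(?P<RESULT>1-1|1-0|0-1)|(?P<NUMBER>\d+)|(?P<HYPHEN>-)|(?P<WS>\s+)"
)

# Whole-move lexer: a result must not be followed by a digit, and a move
# is matched as a single token.
WHOLE_RE = re.compile(
    r"(?P<RESULT>(?:1-1|1-0|0-1)(?!\d))"
    r"|(?P<MOVE>\d+(?:[-x]\d+)+)"
    r"|(?P<WS>\s+)"
)


def lex(pattern, text):
    """Return (kind, lexeme) pairs using the given compiled pattern."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = pattern.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

With these definitions, lex(NAIVE_RE, "1-18") produces a RESULT 1-1 followed by a NUMBER 8, which is the misreading described above, whereas lex(WHOLE_RE, "1-18 1-1") produces a MOVE 1-18 followed by a RESULT 1-1.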
Since the result of a game can already be specified in PDN using the Result tag, there is no need to use a game result as a game separator. It can even be considered bad style to have two different ways to specify the result of a game. It therefore seems logical to forbid using the result of a game as a game terminator (or separator).
Capture Separator
The squares of a capture are separated using the symbol x, for example in the move 32x23. If one defines a capture as a production
CaptureMove = Square “x” Square
then there can easily be conflicts with identifier tokens. Tokenizers are often greedy, which means that they may insist on parsing x23 as an identifier token, instead of a capture separator x followed by a square 23. Some parsers offer solutions to this type of problem, but not all of them. Note that this problem can be avoided by defining a move as a single token.
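The conflict can be shown with a hand-written longest-match ("maximal munch") lexer. This is a minimal sketch: the rule names and the identifier pattern (a letter or underscore followed by word characters) are assumptions chosen for illustration, not taken from a specific PDN grammar or parser generator.

```python
import re


def longest_match_lex(rules, text):
    """Maximal-munch lexer: at each position the longest matching rule wins."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_kind, best_end = None, pos
        for kind, pattern in rules:
            m = pattern.match(text, pos)
            if m and m.end() > best_end:
                best_kind, best_end = kind, m.end()
        if best_kind is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if best_kind != "WS":
            tokens.append((best_kind, text[pos:best_end]))
        pos = best_end
    return tokens


# A capture built from separate tokens, next to an assumed identifier rule.
SPLIT_RULES = [
    ("SQUARE", re.compile(r"\d+")),
    ("CAPTURE_SEP", re.compile(r"x")),
    ("IDENT", re.compile(r"[A-Za-z_]\w*")),
    ("WS", re.compile(r"\s+")),
]

# The same rules plus a single-token move, which wins by being longer.
MOVE_RULES = [("MOVE", re.compile(r"\d+(?:[-x]\d+)+"))] + SPLIT_RULES
```

Here longest_match_lex(SPLIT_RULES, "32x23") returns a SQUARE 32 followed by an IDENT x23, because the identifier match is longer than the one-character separator. With MOVE_RULES the whole string 32x23 is returned as one MOVE token.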
Move token
It is an important question whether a move should be defined as a single token (by means of a regular expression), or as a production consisting of multiple elements. A production has the benefit that the structure of a move can be represented more clearly. But, as explained above, a production requires a more powerful parser. If a move is defined as a token, then a simple LL(1) parser is enough to parse PDN.
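To make the LL(1) claim concrete, the following sketch parses a move sequence with nested variations using one token of lookahead. It covers only an assumed, simplified subset of PDN (move numbers, moves and parenthesized variations); tags, comments, strengths and results are left out, and the names TOKEN_RE, tokenize and parse_sequence are hypothetical.

```python
import re

# Assumed, simplified token set; the real PDN grammar has more token kinds.
TOKEN_RE = re.compile(r"""
    (?P<MOVE_NUMBER>\d+\.)
  | (?P<MOVE>\d+(?:[-x]\d+)+)
  | (?P<LPAREN>\()
  | (?P<RPAREN>\))
  | (?P<WS>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Return a list of (kind, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def parse_sequence(tokens, pos=0, nested=False):
    """Sequence = { MOVE_NUMBER | MOVE | "(" Sequence ")" }

    Every decision inspects only the single token at `pos`.
    Returns (items, next_pos); variations become nested lists.
    """
    items = []
    while pos < len(tokens):
        kind, lexeme = tokens[pos]
        if kind == "MOVE":
            items.append(lexeme)
            pos += 1
        elif kind == "MOVE_NUMBER":
            pos += 1  # move numbers are skipped in this sketch
        elif kind == "LPAREN":
            variation, pos = parse_sequence(tokens, pos + 1, nested=True)
            if pos >= len(tokens) or tokens[pos][0] != "RPAREN":
                raise ValueError("missing ')'")
            items.append(variation)
            pos += 1
        else:  # RPAREN
            if not nested:
                raise ValueError("unexpected ')'")
            break
    return items, pos
```

Because each alternative of the grammar starts with a different token kind, the parser never needs to backtrack or look further ahead. For example, parse_sequence(tokenize("1.32-28 18-23 2.38-32 (2.37-32 23-29) 12-18")) returns the moves of the main line with the variation as a nested list.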
Move strength
In draughts publications a move strength can be wrapped in parentheses, as in 31-27(?). Parentheses are also used to define variations in an analysis, for example 1.32-28 18-23 2.38-32 ( 2.37-32? 23-29! ) 12-18. This introduces an ambiguity, but in most parsers this can be resolved by defining a move strength as a single token.
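A sketch of this resolution: the strength token below accepts both the bare and the parenthesized form and is tried before the plain opening parenthesis. The set of strength symbols (one or two of ! and ?) and the names TOKEN_RE and tokenize are assumptions made for this example.

```python
import re

# Assumed, simplified token set. STRENGTH is listed before LPAREN so that
# a parenthesized strength is matched as one token.
TOKEN_RE = re.compile(r"""
    (?P<MOVE_NUMBER>\d+\.)
  | (?P<MOVE>\d+(?:[-x]\d+)+)
  | (?P<STRENGTH>\([!?]{1,2}\)|[!?]{1,2})
  | (?P<LPAREN>\()
  | (?P<RPAREN>\))
  | (?P<WS>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Return a list of (kind, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

With this token set, tokenize("31-27(?)") yields a MOVE followed by a single STRENGTH token (?), while the parentheses in "2.38-32 ( 2.37-32? 23-29! ) 12-18" are still returned as LPAREN and RPAREN, so the parser only ever sees parentheses that open or close a variation.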