page_util_norm_peg(3tcl) Parser generator tools page_util_norm_peg(3tcl)
______________________________________________________________________________
NAME
page_util_norm_peg - page AST normalization, PEG
SYNOPSIS
package require page::util::norm_peg ?0.1?
package require snit
::page::util::norm::peg tree
______________________________________________________________________________
DESCRIPTION
This package provides a single utility command which takes an AST for a
parsing expression grammar and normalizes it in various ways. The re-
sult is called a Normalized PE Grammar Tree.
Note that this package can only be used from within a plugin managed by
the package page::pluginmgr.
API
::page::util::norm::peg tree
This command assumes the tree object contains for a parsing ex-
pression grammar. It normalizes this tree in place. The result
is called a Normalized PE Grammar Tree.
The following operations are performd
[1] The data for all terminals is stored in their grand-
parental nodes. The terminal nodes and their parents are
removed. Type information is dropped.
[2] All nodes which have exactly one child are irrelevant and
are removed, with the exception of the root node. The im-
mediate child of the root is irrelevant as well, and re-
moved as well.
[3] The name of the grammar is moved from the tree node it is
stored in to an attribute of the root node, and the tree
node removed.
The node keeping the start expression separate is removed
as irrelevant and the root node of the start expression
tagged with a marker attribute, and its handle saved in
an attribute of the root node for quick access.
[4] Nonterminal hint information is moved from nodes into at-
tributes, and the now irrelevant nodes are deleted.
Note: This transformation is dependent on the removal of
all nodes with exactly one child, as it removes the all
'Attribute' nodes already. Otherwise this transformation
would have to put the information into the grandparental
node.
The default mode given to the nonterminals is value.
Like with the global metadata definition specific infor-
mation is moved out out of nodes into attributes, the now
irrelevant nodes are deleted, and the root nodes of all
definitions are tagged with marker attributes. This pro-
vides us with a mapping from nonterminal names to their
defining nodes as well, which is saved in an attribute of
the root node for quick reference.
At last the range in the input covered by a definition is
computed. The left extent comes from the terminal for the
nonterminal symbol it defines. The right extent comes
from the rightmost child under the definition. While this
not an expression tree yet the location data is sound al-
ready.
[5] The remaining nodes under all definitions are transformed
into proper expression trees. First character ranges,
followed by unary operations, characters, and nontermi-
nals. At last the tree is flattened by the removal of su-
perfluous inner nodes.
The order matters, to shed as much nodes as possible
early, and to avoid unnecessary work later.
BUGS, IDEAS, FEEDBACK
This document, and the package it describes, will undoubtedly contain
bugs and other problems. Please report such in the category page of
the Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist]. Please
also report any ideas for enhancements you may have for either package
and/or documentation.
When proposing code changes, please provide unified diffs, i.e the out-
put of diff -u.
Note further that attachments are strongly preferred over inlined
patches. Attachments can be made by going to the Edit form of the
ticket immediately after its creation, and then using the left-most
button in the secondary navigation bar.
KEYWORDS
PEG, graph walking, normalization, page, parser generator, text pro-
cessing, tree walking
CATEGORY
Page Parser Generator
COPYRIGHT
Copyright (c) 2007 Andreas Kupries <andreas_kupries@users.sourceforge.net>
tcllib 1.0 page_util_norm_peg(3tcl)