doctools::toc::parse(3tcl) Documentation tools doctools::toc::parse(3tcl)
______________________________________________________________________________
NAME
doctools::toc::parse - Parsing text in doctoc format
SYNOPSIS
package require doctools::toc::parse ?0.1?
package require Tcl 8.4
package require doctools::toc::structure
package require doctools::msgcat
package require doctools::tcl::parse
package require fileutil
package require logger
package require snit
package require struct::list
package require struct::stack
::doctools::toc::parse text text
::doctools::toc::parse file path
::doctools::toc::parse includes
::doctools::toc::parse include add path
::doctools::toc::parse include remove path
::doctools::toc::parse include clear
::doctools::toc::parse vars
::doctools::toc::parse var set name value
::doctools::toc::parse var unset name
::doctools::toc::parse var clear ?pattern?
______________________________________________________________________________
DESCRIPTION
This package provides commands to parse text written in the doctoc
markup language and convert it into the canonical serialization of the
table of contents encoded in the text. See the section ToC serialization
format for the specification of that format.
This is an internal package of doctools, for use by the higher level
packages handling doctoc documents.
API
::doctools::toc::parse text text
The command takes the string contained in text and parses it un-
der the assumption that it contains a document written using the
doctoc markup language. An error is thrown if this assumption is
found to be false. The format of these errors is described in
section Parse errors.
When successful the command returns the canonical serialization
of the table of contents which was encoded in the text. See the
section ToC serialization format for specification of that for-
mat.
::doctools::toc::parse file path
The same as text, except that the text to parse is read from the
file specified by path.
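A minimal sketch of the text command, assuming tcllib is installed and
Tcl 8.5 or later is used for the dict command. The doctoc input is made
up for illustration, and the toc_begin, item, and toc_end markup
commands come from the doctoc language command reference, not from this
page.

```tcl
package require doctools::toc::parse

# Hypothetical input: a table of contents holding a single entry.
set doctoc {[toc_begin Example {Table of Contents}]
[item intro.man Introduction {A short overview}]
[toc_end]}

# Parse the string; the result is the canonical serialization.
set toc [doctools::toc::parse text $doctoc]

# The serialization is a nested dictionary, see the section
# ToC serialization format.
set contents [dict get $toc doctools::toc]
puts "title: [dict get $contents title]"
puts "items: [llength [dict get $contents items]]"

# The file command takes a path instead of the text itself, e.g.
#   doctools::toc::parse file /path/to/example.toc
```

Both commands throw an error for invalid input, so callers handling
untrusted documents may want to wrap the call in catch.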
::doctools::toc::parse includes
This method returns the current list of search paths used when
looking for include files.
::doctools::toc::parse include add path
This method adds the path to the list of paths searched when
looking for an include file. The call is ignored if the path is
already in the list of paths. The method returns the empty
string as its result.
::doctools::toc::parse include remove path
This method removes the path from the list of paths searched
when looking for an include file. The call is ignored if the
path is not contained in the list of paths. The method returns
the empty string as its result.
::doctools::toc::parse include clear
This method clears the list of search paths for include files.
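The include path commands above can be combined as in the following
sketch. It assumes tcllib is installed; the use of the current working
directory as a search path is only an illustration.

```tcl
package require doctools::toc::parse

# Start from a known state.
doctools::toc::parse include clear

# Add a search path; adding the same path again is ignored.
set dir [pwd]
doctools::toc::parse include add $dir
doctools::toc::parse include add $dir
set afterAdd [doctools::toc::parse includes]
puts "search paths: $afterAdd"

# Remove it again; removing an unknown path is ignored as well.
doctools::toc::parse include remove $dir
doctools::toc::parse include remove $dir
set afterRemove [doctools::toc::parse includes]
```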
::doctools::toc::parse vars
This method returns a dictionary containing the current set of
predefined variables known to the vset markup command during
processing.
::doctools::toc::parse var set name value
This method adds the variable name to the set of predefined
variables known to the vset markup command during processing,
and gives it the specified value. The method returns the empty
string as its result.
::doctools::toc::parse var unset name
This method removes the variable name from the set of predefined
variables known to the vset markup command during processing.
The method returns the empty string as its result.
::doctools::toc::parse var clear ?pattern?
This method removes all variables matching the pattern from the
set of predefined variables known to the vset markup command
during processing. The method returns the empty string as its
result.
The pattern matching is done with string match. When no pattern is
specified, the default pattern * is used.
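A short sketch of the variable commands, assuming tcllib is installed
and Tcl 8.5 or later for the dict command. The variable names and
values are made up for illustration.

```tcl
package require doctools::toc::parse

# Start from a known state: the default pattern * removes everything.
doctools::toc::parse var clear

# Predefine two variables for the vset markup command.
doctools::toc::parse var set project_name Example
doctools::toc::parse var set project_version 1.0
doctools::toc::parse var set author Anonymous

# vars returns the current set as a dictionary.
set before [doctools::toc::parse vars]
puts "project_name = [dict get $before project_name]"

# Remove one variable by name, then all matching a glob pattern.
doctools::toc::parse var unset author
doctools::toc::parse var clear project_*
set after [doctools::toc::parse vars]
```

The pattern follows the rules of string match, so project_* removes
both project_name and project_version here.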
PARSE ERRORS
The format of the parse error messages thrown when encountering viola-
tions of the doctoc markup syntax is human readable and not intended
for processing by machines. As such it is not documented.
However, the errorCode attached to the message is machine-readable and
has the following format:
[1] The error code will be a list, each element describing a single
error found in the input. The list has at least one element,
possibly more.
[2] Each error element will be a list containing six strings de-
scribing an error in detail. The strings will be
[1] The path of the file the error occurred in. This may be
empty.
[2] The range of the token the error was found at. This range
is a two-element list containing the offset of the first
and last character in the range, counted from the begin-
ning of the input (file). Offsets are counted from zero.
[3] The line the first character after the error is on.
Lines are counted from one.
[4] The column the first character after the error is at.
Columns are counted from zero.
[5] The message code of the error. This value can be used as
the argument to msgcat::mc to obtain a localized error
message, assuming that the application has made a suitable
call of doctools::msgcat::init to initialize the necessary
message catalogs (see package doctools::msgcat).
[6] A list of details for the error, like the markup command
involved. In the case of message code doctoc/include/syn-
tax this value is the set of errors found in the included
file, using the format described here.
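The structure above can be walked as in the following sketch. It relies
only on the format as documented in this section and does not call the
parser. The procedure name formatTocErrors, the sample value, and the
message code doctoc/hypothetical/code are assumptions made for
illustration; real error codes come from the errorCode of a failed
parse.

```tcl
# Turn a documented errorCode value into a list of readable lines.
# Errors of included files (doctoc/include/syntax) are nested in the
# details element and are reported recursively, indented.
proc formatTocErrors {errors {indent ""}} {
    set lines {}
    foreach err $errors {
        # Six elements per error: path, range, line, column, code, details.
        foreach {path range line col code details} $err break
        if {$path eq ""} { set path "(input)" }
        lappend lines "${indent}$path:$line:$col: $code (offsets [join $range -])"
        if {$code eq "doctoc/include/syntax"} {
            foreach l [formatTocErrors $details "$indent  "] {
                lappend lines $l
            }
        }
    }
    return $lines
}

# Hand-written sample value, not actual parser output.
set sample [list \
    [list main.toc {10 25} 2 0 doctoc/include/syntax \
        [list [list inc.toc {0 4} 1 5 doctoc/hypothetical/code {}]]]]

set report [formatTocErrors $sample]
puts [join $report \n]
```

In an application the argument would typically be the value of the
global errorCode variable after catching a failed call of the text or
file command.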
[DOCTOC] NOTATION OF TABLES OF CONTENTS
The doctoc format for tables of contents, also called the doctoc markup
language, is too large to be covered in a single section. The interested
reader should start with the document
[1] doctoc language introduction
and then proceed from there to the formal specifications, i.e. the doc-
uments
[1] doctoc language syntax and
[2] doctoc language command reference.
to get a thorough understanding of the language.
TOC SERIALIZATION FORMAT
Here we specify the format used by the doctools v2 packages to serial-
ize tables of contents as immutable values for transport, comparison,
etc.
We distinguish between regular and canonical serializations. While a
table of contents may have more than one regular serialization, exactly
one of them will be canonical.
regular serialization
[1] The serialization of any table of contents is a nested
Tcl dictionary.
[2] This dictionary holds a single key, doctools::toc, and
its value. This value holds the contents of the table of
contents.
[3] The contents of the table of contents are a Tcl dictio-
nary holding the title of the table of contents, a label,
and its elements. The relevant keys and their values are
title The value is a string containing the title of the
table of contents.
label The value is a string containing a label for the
table of contents.
items The value is a Tcl list holding the elements of
the table, in the order they are to be shown.
Each element is a Tcl list holding the type of the
item, and its description, in this order. An al-
ternative description would be that it is a Tcl
dictionary holding a single key, the item type,
mapped to the item description.
The two legal item types and their descriptions
are
reference
This item describes a single entry in the
table of contents, referencing a single
document. To this end its value is a Tcl
dictionary containing an id for the refer-
enced document, a label, and a longer tex-
tual description which can be associated
with the entry. The relevant keys and
their values are
id The value is a string containing the
id of the document associated with
the entry.
label The value is a string containing a
label for this entry. This string
also identifies the entry, and no
two entries (references and divi-
sions) in the containing list are
allowed to have the same label.
desc The value is a string containing a
longer description for this entry.
division
This item describes a group of entries in
the table of contents, inducing a hierarchy
of entries. To this end its value is a Tcl
dictionary containing a label for the
group, an optional id to a document for the
whole group, and the list of entries in the
group. The relevant keys and their values
are
id The value is a string containing the
id of the document associated with
the whole group. This key is op-
tional.
label The value is a string containing a
label for the group. This string
also identifies the entry, and no
two entries (references and divi-
sions) in the containing list are
allowed to have the same label.
items The value is a Tcl list holding the
elements of the group, in the order
they are to be shown. This list has
the same structure as the value for
the keyword items used to describe
the whole table of contents, see
above. This closes the recursive
definition of the structure, with
divisions holding the same type of
elements as the whole table of con-
tents, including other divisions.
canonical serialization
The canonical serialization of a table of contents has the format
specified in the previous item, and additionally satisfies the
constraints below, which make it unique among all the possible
serializations of this table of contents.
[1] The keys found in all the nested Tcl dictionaries are
sorted in ascending dictionary order, as generated by
Tcl's builtin command lsort -increasing -dict.
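To make the format concrete, the following sketch builds a small
serialization by hand and walks it recursively. It assumes Tcl 8.5 or
later for the dict command. The document ids, labels, and the procedure
name tocLabels are made up for illustration. The value is written with
extra whitespace for readability, with the keys of every dictionary in
the sorted order described above.

```tcl
# Hand-written serialization: one reference and one division which
# holds a further reference.
set toc {doctools::toc {
    items {
        {reference {desc {A short overview} id intro.man label Introduction}}
        {division {id guide.man items {
            {reference {desc {First steps} id start.man label {Getting started}}}
        } label Guide}}
    }
    label Example
    title {Table of Contents}
}}

# Collect the labels of all entries, indented by nesting depth.
proc tocLabels {items {depth 0}} {
    set out {}
    foreach item $items {
        # Each item is a two-element list: type and description.
        foreach {type desc} $item break
        lappend out "[string repeat {  } $depth][dict get $desc label]"
        if {$type eq "division"} {
            foreach l [tocLabels [dict get $desc items] [expr {$depth + 1}]] {
                lappend out $l
            }
        }
    }
    return $out
}

set contents [dict get $toc doctools::toc]
set labels   [tocLabels [dict get $contents items]]
puts [join $labels \n]

# Check the key order of the top-level contents dictionary.
set keys   [dict keys $contents]
set sorted [expr {$keys eq [lsort -increasing -dict $keys]}]
```

A full check of the canonical constraint would apply the same key-order
test to every nested dictionary, not only the top-level one.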
BUGS, IDEAS, FEEDBACK
This document, and the package it describes, will undoubtedly contain
bugs and other problems. Please report such in the category doctools
of the Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist]. Please
also report any ideas for enhancements you may have for either package
and/or documentation.
When proposing code changes, please provide unified diffs, i.e. the
output of diff -u.
Note further that attachments are strongly preferred over inlined
patches. Attachments can be made by going to the Edit form of the
ticket immediately after its creation, and then using the left-most
button in the secondary navigation bar.
KEYWORDS
doctoc, doctools, lexer, parser
CATEGORY
Documentation tools
COPYRIGHT
Copyright (c) 2009 Andreas Kupries <andreas_kupries@users.sourceforge.net>
tcllib 1 doctools::toc::parse(3tcl)