Statement: lexicon
The lexicon statement defines a lexicon which can later be incorporated into a grammar definition. It describes a series of text tokens and their equivalent numerical values.
The lexicon must be given a code name, unique among all lexicons, by which it can be referred to. The code name is then followed by the sub-statements listed below, enclosed in '{' and '}' braces.
lexicon Sub-Statement List

Sub-Statement | Use | Example |
---|---|---|
name | Give the lexicon a text description. | name "Month names"; |
inherit | Inherit the settings and tokens from a previously defined lexicon. | inherit hm; |
fieldname | The field name normally associated with this lexicon. | fieldname month; |
lang | The language used. (Not in use yet.) | lang en; |
pseudo | The text used by the pseudo date to describe the lexicon. | pseudo Month, Mon; |
tokens | A list of values, tokens and an optional abbreviation. | tokens {1, January, Jan; ...} |
Sub-Statement: name
Gives a descriptive name to the lexicon. Used to provide a list of available lexicons to the host program.
Sub-Statement: inherit
This allows an existing lexicon to be extended with additional tokens, or to have its tokens selectively replaced. Other details, such as the fieldname sub-statement, may also be replaced.
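As a hedged illustration, the sketch below derives a new lexicon from the weekday lexicon w defined in the example script of this section. The code name ws, its description, the changed abbreviation and the `//` comment style are all assumptions made for illustration. It also assumes that a token entry whose value matches an inherited one replaces that entry, which the description above does not spell out.

```text
// Hypothetical lexicon derived from 'w'; all names here are illustrative only.
lexicon ws {
    name "Weekday names (alternative)";
    inherit w;
    // Assumed: an entry with an existing value (4) replaces the inherited one.
    tokens {
        4, Thursday, Thu;
    }
}
```

Under these assumptions, the fieldname, lang and pseudo settings are taken from w unchanged, and only the abbreviation for value 4 differs.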
Sub-Statement: fieldname
This statement links the lexicon to a particular field name for a given scheme. This can be used when deciphering text formats as the position of these elements does not need to be known in advance.
Sub-Statement: lang
This statement sets the language for the lexicon. Multi-lingual versions of the program have not yet been developed, so this is available for the future.
Sub-Statement: pseudo
The pseudo text is used by a format definition when creating the pseudo date used to show the intended output. The pseudo date is the text shown when selecting the output format, as in the string "dd Mon yyyy" for the dmy format. The 'Mon' in that string indicates that output will use the abbreviated value from the lexicon.
Sub-Statement: tokens
A list of values and their token names. A second token may be given for a value, to be used as its abbreviation. Further tokens may be added after the abbreviation to provide alternative names or spellings which will be recognised.
Each entry takes the form of a comma-separated list of the value, the token and the optional abbreviation, terminated with a ';' semicolon. The list of entries is enclosed within '{' and '}' braces.
If a token does not conform to a script name token, it must be enclosed in '"' double quotes.
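The rules above can be sketched with a small, entirely hypothetical lexicon. The code name mp, the field name phase, the token spellings and the `//` comment style are assumptions for illustration. It further assumes that tokens containing spaces, or starting with a digit, do not conform to a script name token and so need quoting.

```text
// Hypothetical lexicon; code name, field name and tokens are illustrative only.
lexicon mp {
    name "Moon phases";
    fieldname phase;
    lang en;
    pseudo Phase, Ph;
    tokens {
        // value, token, abbreviation
        1, "New Moon", New;
        // value, token, abbreviation, then an alternative spelling
        2, "First Quarter", "1st Qtr", "1st Quarter";
        3, "Full Moon", Full;
        4, "Last Quarter", "Last Qtr", "Third Quarter";
    }
}
```

Each entry ends with a semicolon, and the second item in each entry is the one that would be used as the abbreviation.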
Example Script

    lexicon w {
        name "Weekday names";
        fieldname wday;
        lang en;
        pseudo Weekday, WDay;
        tokens {
            1, Monday, Mon;
            2, Tuesday, Tue;
            3, Wednesday, Wed;
            4, Thursday, Thur;
            5, Friday, Fri;
            6, Saturday, Sat;
            7, Sunday, Sun;
        }
    }