README.md 4.5 KB

ll1

ll1 is a tool to parse and check LL(1) specifications, and to generate code or reports using Go templates based on these specifications. ll1 specifications must contain a definition for an ll1 grammar, and may optionally also specify a lexer for that grammar.

Usage

ll1 [options] input_file.ll1 [template_file.ext*]

The [options] are:

-append file
    Name of output file to append. Takes precedence over -out.
-define definition
    Add a definition for the template, in the form of key:value or
    []key:value. Keys that start with a [] are arrays and can be
    concatenated to by specifying the same definition key again.
    Non array keys will be overwoitten if they are specified again.
-help
    Shows the help page.
-out file
    Name of output file to overwrite.
-template file
    Template file to expand. This may be repeated to make use
    of several templates to generate one output file.
-verbose
    Be more verbose. Shows the scanned tokens as well.

The names of template files may be given with the -t option, or after the ll1 input file.

Syntax

The syntax of an LL1 grammar itself is:

Specification -> Grammar OptLexer.
Grammar -> Rules.
Rules -> Rule OptRules .
OptRules -> dot Rules | epsilon.
Rule -> Name arrow Definition Template.
Name -> ruleName .
Template -> rawString | epsilon .
// Alternates consist of sequences.
Definition -> Alternates .
Alternates -> Sequence OptSequences .
OptSequences -> or Alternates | epsilon.
Sequence -> Element OptElements .
OptElements -> Element OptElements | epsilon .
Element -> Parenthesis .
Element -> Name .
Element -> literal .
Parenthesis -> '(' Definition ')' .
OptLexer -> LexerTerminal OptLexerTerminals | epsilon .
LexerTerminal -> terminalName arrow LexerDefinition Template .
LexerDefinition -> LexerAlternates .
LexerAlternates -> LexerPattern OptLexerMatches .
OptLexerMatches -> or LexerPattern | epsilon.
LexerPattern -> literal .
OptElements -> Element OptElements | epsilon .
Element -> Parenthesis .
Element -> Name .
Element -> literal /*| Rule */.
// Lexer specification starts here:
dot          -> '.'
or           -> '|'
literal      -> characterLiteral | stringLiteral
ruleName     -> "re:[[:isUpper:]][[:isAlNum]]"
terminalName -> "re:[[:isLower:]][[:isAlNum]]"
epsilon      -> "epsilon" | 'ε'
arrow        -> "->" | '→'

The syntax of an ll1 grammar has the following elements:

  • //comment : Line comments start with //, /block comments/ are C-like
  • RuleName : names that start with an upper case letter are rule names or nonterminals defined by the grammar.
  • terminal : names that start with a lower case letter are names of teminals that the lexer produces.
  • 'l' : single quoted strings are rune literals that the lexer produces.
  • "literal" : double quoted strings are rune literals that the lexer produces.
  • arrow : a literal -> → as a separator.
  • epsion : a literal "epsilon" or 'ε', which indicates the empty rule. this is used in conjunction with alternates to make a rule optional.

Templates

If no templates are given, ll1 simply checks the grammar and outputs a simple text report to the output file.

If a template is given, it will be expanded and output to the output file.

Inside the template the following variables are available:

  • .Grammar: contains the .Rules of the grammar.
  • .InName: contains the name of the ll1 input file.
  • .OutName: contains the name of the output file specified with -a or -o.
  • .Templates: contains the names of the templates read.
  • .Definitions: contains the keys of the available definitions.
  • All other variables defined with -d

Inside the ll1 templates, the following template functions are available:

  • Most functions from the strings package (see go doc strings).
  • CompileRegexp compiles a regexp package regexp which can be used as such.
  • ToString to convert anything anything that isn't a string to a string.
  • NewMap creates a map based it's argumens wich have string keys and interface{} values This is handly to pass multiple aruments to a sub-template
  • NewList creates a list from the given arguments.

Follow conflict example:

A -> B | C .

B -> D e . C -> e . D -> f | epsilon .

Since D is optional and can be empty, this parser cannot decide between B and C, because in case D is empty, both sides match.