Hoa central
Hoa\Compiler\Llk\Llk Class Reference

Static Public Member Functions

static load (Stream\IStream\In $stream)
 
static parsePP ($pp, &$tokens, &$rules)
 

Detailed Description

Class \Hoa\Compiler\Llk\Llk.

Provides a generic LL(k) compiler compiler using the PP language. Supported constructs: skip (%skip), token (%token), token namespace (%token ns1:name value -> ns2), rule (rule:), disjunction (|), capturing (operators ( and )), quantifiers (?, +, * and {n,m}), node (#node) with options (#node:options), skipped token (::token::), kept token (<token>), token unification (token[i]) and rule unification (rule()[j]).

Definition at line 56 of file Llk.php.

Member Function Documentation

static Hoa\Compiler\Llk\Llk::load (Stream\IStream\In $stream)

Load parser from a file that contains the grammar. Example:

    %skip  space     \s

    %token word      [a-zA-Z]+
    %token number    [0-9]+(\.[0-9]+)?
    %token open_par  \(
    %token close_par \)
    %token equal     =
    %token plus      \+
    %token minus     \-
    %token divide    \/
    %token times     \*

    #equation:
        formula() ::equal:: <number>

    formula:
        factor()
        (
            ::plus::  formula() #addition
          | ::minus:: formula() #substraction
        )?

    factor:
        operand()
        (
            ::times::  factor() #product
          | ::divide:: factor() #division
        )?

    operand:
          <word>
        | ::minus::? <number> #number
        | ::open_par:: formula() ::close_par::

Use tabs or spaces, it does not matter. Instructions follow the form: %<instruction>. Only %skip and %token are supported. Rules follow the form: <rule name>:<new line>[<space><rule><new line>]*. Contexts are useful to set specific skips and tokens. The following full example uses a context and unification (for fun) to parse <a>b</a>:

    %skip   space          \s
    %token  lt             <       -> in_tag
    %token  inner          [^<]*

    %skip   in_tag:space   \s
    %token  in_tag:slash   /
    %token  in_tag:tagname [^>]+
    %token  in_tag:gt      >       -> default

    #foo:
        ::lt:: <tagname[0]> ::gt::
        <inner>
        ::lt:: ::slash:: ::tagname[0]:: ::gt::
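
To make the role of contexts more concrete, here is a minimal sketch of how a lexer could walk the string <a>b</a> with the two namespaces of the context example. It is not Hoa's lexer: the function name sketchLex, the first-match-wins strategy and the hand-written token table are assumptions made for illustration, and \s is assumed as the skip pattern of both namespaces. The table follows the storage convention visible in parsePP, where a token declared with "-> ns" is stored under the key "name:ns".

```php
<?php
// Hypothetical mini-lexer for illustration only; not part of Hoa.
function sketchLex(string $text, array $tokens): array
{
    $namespace = 'default';
    $offset    = 0;
    $length    = strlen($text);
    $out       = [];

    while ($offset < $length) {
        $matched = false;

        foreach ($tokens[$namespace] as $name => $regex) {
            // \G anchors the match at $offset; empty matches are ignored.
            $found = preg_match('~\G(?:' . $regex . ')~', $text, $m, 0, $offset);

            if (1 !== $found || '' === $m[0]) {
                continue;
            }

            // A key "name:next" switches to namespace "next" after the token.
            $parts = explode(':', $name, 2);

            if ('skip' !== $parts[0]) {
                $out[] = $parts[0];
            }

            $namespace = $parts[1] ?? $namespace;
            $offset   += strlen($m[0]);
            $matched   = true;

            break;
        }

        if (!$matched) {
            throw new RuntimeException('Unrecognized input at offset ' . $offset);
        }
    }

    return $out;
}

// Token table assumed for the context example (declaration order preserved).
$tokens = [
    'default' => ['skip' => '\s', 'lt:in_tag' => '<', 'inner' => '[^<]*'],
    'in_tag'  => ['skip' => '\s', 'slash' => '/', 'tagname' => '[^>]+', 'gt:default' => '>'],
];

$names = sketchLex('<a>b</a>', $tokens);
```

In this sketch the declaration order matters: slash is tried before tagname, otherwise [^>]+ would also consume the slash of the closing tag.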

Parameters
\Hoa\Stream\IStream\In  $stream  Stream that contains the grammar.
Returns
\Hoa\Compiler\Llk\Parser
Exceptions
\Hoa\Compiler\Exception

Definition at line 120 of file Llk.php.

public static function load(Stream\IStream\In $stream)
{
    $pp = $stream->readAll();

    if (empty($pp)) {
        $message = 'The grammar is empty';

        if ($stream instanceof Stream\IStream\Pointable) {
            if (0 < $stream->tell()) {
                $message .=
                    ': the stream ' . $stream->getStreamName() .
                    ' is pointable and not rewinded, maybe it ' .
                    'could be the reason';
            } else {
                $message .=
                    ': nothing to read on the stream ' .
                    $stream->getStreamName();
            }
        }

        throw new Compiler\Exception($message . '.', 0);
    }

    static::parsePP($pp, $tokens, $rawRules);

    $ruleAnalyzer = new Rule\Analyzer($tokens);
    $rules = $ruleAnalyzer->analyzeRules($rawRules);

    return new Parser($tokens, $rules);
}

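
The "pointable and not rewinded" hint in the error message covers the case where the stream cursor already sits past the grammar, so that reading returns nothing. The following sketch reproduces that situation with a native PHP memory stream standing in for a Hoa stream. This is an assumption made to keep the example self-contained: stream_get_contents() and ftell() play the roles of readAll() and tell().

```php
<?php
// Native PHP stream used as a stand-in for a pointable Hoa stream.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "%token word [a-z]+\n");

// After writing, the cursor sits at the end, so reading yields nothing.
$beforeRewind = stream_get_contents($handle);
$position     = ftell($handle);

// Rewinding moves the cursor back and makes the grammar readable again.
rewind($handle);
$afterRewind = stream_get_contents($handle);

fclose($handle);
```

An empty read combined with a non-zero position is the condition under which load() suggests that the stream was not rewound.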
static Hoa\Compiler\Llk\Llk::parsePP ($pp, &$tokens, &$rules)

Parse PP.

Parameters
string  $pp  PP grammar source.
array  $tokens  Extracted tokens (filled by reference).
array  $rules  Extracted raw rules (filled by reference).
Returns
void
Exceptions
\Hoa\Compiler\Exception

Definition at line 160 of file Llk.php.

public static function parsePP($pp, &$tokens, &$rules)
{
    $lines = explode("\n", $pp);
    $tokens = ['default' => []];
    $rules = [];

    for ($i = 0, $m = count($lines); $i < $m; ++$i) {
        $line = rtrim($lines[$i]);

        // Ignore blank lines and // comments.
        if (0 === strlen($line) || '//' == substr($line, 0, 2)) {
            continue;
        }

        // Instructions: %skip and %token.
        if ('%' == $line[0]) {
            if (0 !== preg_match('#^%skip\s+(?:([^:]+):)?([^\s]+)\s+(.*)$#u', $line, $matches)) {
                if (empty($matches[1])) {
                    $matches[1] = 'default';
                }

                if (!isset($tokens[$matches[1]])) {
                    $tokens[$matches[1]] = [];
                }

                // Several skips in one namespace are merged into one alternation.
                if (!isset($tokens[$matches[1]]['skip'])) {
                    $tokens[$matches[1]]['skip'] = $matches[3];
                } else {
                    $tokens[$matches[1]]['skip'] =
                        '(?:' . $matches[3] . ')|' .
                        $tokens[$matches[1]]['skip'];
                }
            } elseif (0 !== preg_match('#^%token\s+(?:([^:]+):)?([^\s]+)\s+(.*?)(?:\s+->\s+(.*))?$#u', $line, $matches)) {
                if (empty($matches[1])) {
                    $matches[1] = 'default';
                }

                // A target namespace is stored in the key as "name:namespace".
                if (isset($matches[4]) && !empty($matches[4])) {
                    $matches[2] = $matches[2] . ':' . $matches[4];
                }

                if (!isset($tokens[$matches[1]])) {
                    $tokens[$matches[1]] = [];
                }

                $tokens[$matches[1]][$matches[2]] = $matches[3];
            } else {
                // Note: $stream is not a parameter of parsePP(), so it is
                // undefined in this scope.
                throw new Compiler\Exception(
                    'Unrecognized instructions:' . "\n" .
                    ' %s' . "\n" . 'in file %s at line %d.',
                    1,
                    [
                        $line,
                        $stream->getStreamName(),
                        $i + 1
                    ]
                );
            }

            continue;
        }

        // Rule: the name line ends with ":", the body is the indented lines.
        $ruleName = substr($line, 0, -1);
        $rule = null;
        ++$i;

        while ($i < $m &&
               isset($lines[$i][0]) &&
               (' ' === $lines[$i][0] ||
                "\t" === $lines[$i][0] ||
                '//' === substr($lines[$i], 0, 2))) {
            if ('//' === substr($lines[$i], 0, 2)) {
                ++$i;

                continue;
            }

            $rule .= ' ' . trim($lines[$i++]);
        }

        if (isset($lines[$i][0])) {
            --$i;
        }

        $rules[$ruleName] = $rule;
    }

    return;
}
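
To see how a single %token line is decomposed, the pattern from the listing above can be applied in isolation. The helper name sketchParseTokenLine and the shape of its return value are assumptions made for this sketch; only the regular expression is taken from the source.

```php
<?php
// Hypothetical helper: applies the %token pattern from parsePP to one line.
function sketchParseTokenLine(string $line): ?array
{
    $pattern = '#^%token\s+(?:([^:]+):)?([^\s]+)\s+(.*?)(?:\s+->\s+(.*))?$#u';

    if (1 !== preg_match($pattern, $line, $matches)) {
        return null;
    }

    return [
        // An unmatched namespace group is reported as an empty string.
        'namespace' => '' === $matches[1] ? 'default' : $matches[1],
        'name'      => $matches[2],
        'regex'     => $matches[3],
        // A trailing optional group that does not match is absent from $matches.
        'next'      => $matches[4] ?? null,
    ];
}

$plain   = sketchParseTokenLine('%token number [0-9]+');
$context = sketchParseTokenLine('%token in_tag:gt > -> default');
$invalid = sketchParseTokenLine('%unknown foo bar');
```

For the second line, parsePP would store the value > under the key gt:default in the in_tag namespace, which is how the target namespace travels with the token name.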

The documentation for this class was generated from the following file: