Regular expressions and grammars¶

Module tock.regexps¶

class tock.regexps.RegularExpression(op, args, start=None, end=None)¶

A (abstract syntax tree of a) regular expression.

Parameters:

op (str) – Possible values are: ‘union’, ‘concatenation’, ‘star’, ‘symbol’.
args – tuple of RegularExpression or Symbol objects.

The empty string is represented as RegularExpression(‘concatenation’, ()).

The empty set is represented as RegularExpression(‘union’, ()).

classmethod from_str(s)¶

Constructs a RegularExpression from a str.

Regular expressions allow the following operations (precedence from lowest to highest):

Symbols
Empty string (&)
Empty set (∅)
Union (|)
Concatenation: Two concatenated symbols must be separated by a space. For example, a b is a concatenated with b, but ab is a single symbol.
Kleene star (*)

tock.regexps.from_regexp(e, display_steps=False)¶

Convert a regular expression to a NFA.

Parameters:

e (RegularExpression or str) – the regular expression to convert.
display_steps (bool) – if True and if run inside a Jupyter notebook, displays all steps of the conversion.

tock.regexps.to_regexp(m, display_steps=False)¶

Convert a finite automaton to a regular expression.

Parameters:

m (Machine) – the automaton to convert, which must be a finite automaton.
display_steps (bool) – if True and if run inside a Jupyter notebook, displays all steps of the conversion.

Module tock.grammars¶

class tock.grammars.Grammar¶

A string-rewriting grammar.

add_nonterminal(x)¶: Add x to the nonterminal alphabet.

add_rule(lhs, rhs)¶: Add rule with left-hand side lhs and right-hand side rhs, where lhs and rhs are both Strings.

compute_first(nullable=None)¶: Compute, for every terminal, nonterminal, and rhs suffix α, the set of terminals b where α ⇒* b γ for some γ.

compute_follow(nullable=None, first=None)¶: Compute, for every nonterminal A, the set of terminals b where S →* γ A b δ for some γ, δ.

compute_nullable()¶: Compute, for every nonterminal and rhs suffix α, whether α ⇒* ε.

classmethod from_file(filename)¶

Read a grammar from a file.

Parameters:: filename (str) – name of file to read from
Returns:: a CFG
Return type:: Grammar

The file should contain one rule per line, for example:

S -> a S b
S -> &

Currently the grammar must be a context-free grammar. The nonterminal symbols are exactly those that appear on a left-hand side. The left-hand side of the first rule is the start symbol.

classmethod from_lines(lines)¶

Read a grammar from a list of strings (see from_file).

Parameters:: lines – a list of strings
Returns:: a CFG
Return type:: Grammar

has_strict_start()¶: Returns True iff the start nonterminal does not appear in the rhs of any rule. I don’t know what the correct terminology for this is.

is_contextfree()¶: Returns True iff the grammar is context-free.

is_contextsensitive()¶

Returns True iff the grammar is context-sensitive, that is, each rule is of the form α A β → α B β where one of the following is true:

A is a nonterminal and len(B) > 0
A = S, α = β = B = ε, and S does not occur on the rhs of any rule

is_leftlinear()¶: Returns True iff the grammar is left-linear, that is, it is context-free and every rule is of the form A → B w or A → w where w contains only terminals.

is_noncontracting()¶

Returns True iff the grammar is essentially noncontracting, that is, each rule is of the form α → β where one of the following is true:

len(β) ≥ len(α)
α = S, β = ε, and S does not occur on the rhs of any rule

is_rightlinear()¶: Returns True iff the grammar is left-linear, that is, it is context-free and every rule is of the form A → w B or A → w where w contains only terminals.

remove_useless()¶: Returns a new grammar containing just useful rules.

set_start_nonterminal(x)¶: Set the start symbol to x. If x is not already a nonterminal, it is added to the nonterminal alphabet.

tock.grammars.from_grammar(g, mode='topdown')¶

Convert a CFG to a PDA.

Parameters:

g (Grammar) – the grammar to convert, which must be a CFG.
mode (str) –
selects which algorithm to use. Possible values are:
- "topdown": nondeterministic top-down, as in Sipser (3e) Lemma 2.21.
- "bottomup": nondeterministic bottom-up.
- "ll1": LL(1) deterministic top-down.
- "lr0": LR(0) deterministic bottom-up.
- "lr1": LR(1) deterministic bottom-up.

Returns:

a PDA equivalent to g.

Return type:

Machine

Regular expressions and grammars¶

Module tock.regexps¶

Module tock.grammars¶

tock

Navigation

Related Topics