Definition: A regular expression (over alphabet Σ) is:
Note how the first three cases are base cases, and the last five are recursively defined. (Also note that rules 5 and 2 are redundant: each can be expressed as syntactic sugar for the other rules.)
If you give me a regexp, I can create its parse tree:
ab*(aa|bc+)

          concat
          /    \
         a    concat
              /    \
             *      paren
             |     /  |  \
             b    (   ∪   )
                     / \
               concat   concat
                /  \     /  \
               a    a   b    +
                             |
                             c

Note: | is low-precedence.
Although we won't worry about a regexp's parse tree in this class, it will be the basis of any proofs-by-induction: we can induct on height-of-tree, or (more generally) "Structural Induction": E.g. show P(αβ) holds, given the inductive hypothesis P(α) and P(β) hold. (So a proof-by-structural induction for regexps will have 3 base cases, and 5 recursive cases.) The Principle of Structural Induction is equivalent to Mathematical Induction (see structural def'n of ℕ); it dispenses with the need to artificially shoe-horn tree processing into height-of-tree processing.
Proof by Structural induction on γ (8 cases; only some are shown):
By the inductive hypothesis, there are FSMs Mα, Mβ where L(α)=L(Mα) and L(β)=L(Mβ).
We construct an NDFSM Mγ out of Mα and Mβ: [SEE BOARD]. Mγ = ⟨Kγ, Σγ, sγ, Δγ, Aγ⟩
Now we need to argue that our construction guarantees L(γ) = L(Mγ):
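See board. Sketch: an accepting computation from ⟨sγ, w⟩ in Mγ must first reach an accepting state aα ∈ Aα, then ε-transition to sβ, then reach some aβ ∈ Aβ. Break the computation down into the sections u, ε, v. So we have w = uεv = uv where u ∈ L(α) and v ∈ L(β) respectively, hence w ∈ L(α)L(β) by the definition of language concatenation. (This shows L(Mγ) ⊆ L(γ); for the other direction, glue an accepting computation of Mα on u to one of Mβ on v with the ε-transition.)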
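The board construction is not reproduced in these notes, so the following is only a sketch of one standard way to build Mγ, consistent with the proof sketch above: keep both machines, and add an ε-transition from every accepting state of Mα to the start state of Mβ. The function name `concat_of`, the sentinel `EPS`, and the requirement that the two state sets be disjoint are assumptions of this sketch. The tuple order (K, Σ, Δ, s, A) follows the Python code for Kleene star later in these notes, not the ⟨K, Σ, s, Δ, A⟩ order of the math.

```python
EPS = ""  # assumed sentinel value standing for the empty (ε) transition

def concat_of(Ma, Mb):
    """Return an NDFSM accepting L(Ma)L(Mb).

    Assumes the two machines have no state names in common.
    """
    (Ka, sigma_a, delta_a, sa, Aa) = Ma
    (Kb, sigma_b, delta_b, sb, Ab) = Mb
    assert not (set(Ka) & set(Kb)), "rename states so the machines share none"

    delta = set(delta_a) | set(delta_b)
    for a in Aa:
        delta.add((a, EPS, sb))      # accepting state of Ma hands off to Mb's start

    return (frozenset(Ka) | frozenset(Kb),
            frozenset(sigma_a) | frozenset(sigma_b),
            frozenset(delta),
            sa,                      # start where Ma starts
            frozenset(Ab))           # accept only where Mb accepts

# Two tiny machines: Ma accepts {"a"}, Mb accepts {"b"}.
Ma = ({"a0", "a1"}, {"a"}, {("a0", "a", "a1")}, "a0", {"a1"})
Mb = ({"b0", "b1"}, {"b"}, {("b0", "b", "b1")}, "b0", {"b1"})
Mc = concat_of(Ma, Mb)
```

Note that the accepting states of Mα are no longer accepting in the result; this is what forces every accepted string to split as w = uv.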
If α is regular, then α* is regular. Again, we can construct M' based on M. We can express this either in math, or in code:
Given a machine M1 = ⟨K1, Σ, s1, Δ1, A1 ⟩,
construct M0 from it:
[see slide 53]

    ε = ""   # just a sentinel value representing the empty transition

    def KleeneStarOf(M1):
        """Return an NDFSM M0 which accepts L1*, given an NDFSM M1 which accepts L1."""
        # First, extract & name the elements inside the tuple M1 (Python's tuple unpacking).
        (K1, Σ, Δ1, s1, A1) = M1
        # Choose a new state, not in K1: append primes until the name is unused.
        # (Could also just concat all state names of K1 and then add a letter.)
        s0 = "s0"
        while s0 in K1:
            s0 += "'"
        K0 = set(K1) | {s0}      # note: {s0}, not set(s0), which would split the string
        A0 = set(A1) | {s0}      # s0 is accepting, so the empty string is accepted
        Δ0 = set(Δ1)             # a *non*frozen set, for the moment
        Δ0.add((s0, ε, s1))
        for s in A1:
            Δ0.add((s, ε, s0))   # after accepting one piece, loop back for another
        M0 = (K0, Σ, frozenset(Δ0), s0, A0)
        return M0

    # This function accepts and returns a FSM, where
    # type "FSM" is: tuple<set<state>, set<char>, set<tuple<state,char,state>>, state, set<state>>
    #
    # where: type `state` is (say) a string,
    # and by "char" we mean a string-of-length-one *or* ε.
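One way to sanity-check a construction like this is to simulate the resulting machine on a few strings. The simulator below is not from the notes: `eps_closure`, `accepts`, and `EPS` are hypothetical helpers, and `M0` is built by hand from `M1` following the construction above (new accepting start state, an ε-transition into the old start state, and ε-transitions from the old accepting states back to the new one). It uses the same tuple order (K, Σ, Δ, s, A) as the code above.

```python
EPS = ""  # assumed sentinel value standing for the empty (ε) transition

def eps_closure(states, delta):
    """All states reachable from `states` using only ε-transitions."""
    closure, todo = set(states), list(states)
    while todo:
        q = todo.pop()
        for (p, c, r) in delta:
            if p == q and c == EPS and r not in closure:
                closure.add(r)
                todo.append(r)
    return closure

def accepts(M, w):
    """Does the NDFSM M accept the string w? Tracks the set of possible states."""
    (_K, _sigma, delta, s, A) = M
    current = eps_closure({s}, delta)
    for ch in w:
        step = {r for (p, c, r) in delta if p in current and c == ch}
        current = eps_closure(step, delta)
    return bool(current & set(A))

# M1 accepts exactly {"ab"}.
M1 = ({"q0", "q1", "q2"}, {"a", "b"},
      {("q0", "a", "q1"), ("q1", "b", "q2")},
      "q0", {"q2"})

# M0: the Kleene-star construction applied to M1 by hand, with new state "new".
M0 = ({"q0", "q1", "q2", "new"}, {"a", "b"},
      {("q0", "a", "q1"), ("q1", "b", "q2"),
       ("new", EPS, "q0"), ("q2", EPS, "new")},
      "new", {"q2", "new"})

for w in ["", "ab", "abab", "a", "aba"]:
    print(repr(w), accepts(M1, w), accepts(M0, w))
```

The new start state matters: making the old start state `q0` accepting instead would also accept the empty string here, but in a machine with transitions back into its start state it could accept strings that are not in L1*.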
This page licensed CC-BY 4.0 Ian Barland. Please mail any suggestions (incl. typos, broken links) to ibarland.