CS 420
2024fall
ibarland

Definitions
sets; FSMs

READING: Chpt.03

Previously, on itec420…

Examples of regular-expressions: For b(ba*b)* :

def'n: A Language is: a (possibly infinite) set of (finite) strings.

Draw FSM for b(ba*b)* (and minimize it, informally).

Defining Finite State Machines

Generalize: A FSM = ⟨K, Σ, δ, s, A⟩. In math:
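One standard way to spell out the five components (a reconstruction that mirrors the preconditions in the Java class, not necessarily the exact wording used in lecture):

$$
\begin{aligned}
K &\text{ is a finite set of states} \\
\Sigma &\text{ is the alphabet (a finite set of characters)} \\
\delta &: K \times \Sigma \to K \text{ is the transition function} \\
s &\in K \text{ is the start state} \\
A &\subseteq K \text{ is the set of accepting states}
\end{aligned}
$$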

In Java:
typedef State String; // okay, this isn't legal Java, but fine: we're representing States by Strings (their name); using `int` is common too. (Really: we should have `class FSM<StateType>`.)

class FSM {
   Set<State> K;
   CharSet Σ;
   State s;      // @pre: K.contains(s)
   Set<State> A; // @pre: K.containsAll(A)
   Map< Pair<State,Character>, State> δ; // @pre States are all in K; Characters are all in Σ.
   // We could also use java.util.function.BiFunction
}
The math people sure have more succinct notation, to convey the same info!

Example: All equivalent:

Def'n of FSM "accepts" a string: (informal for now): Feed a string w into the FSM M (starting at s); if after transitioning on all characters of w, M is in a state in A, then we say "M accepts w". (A more formal def'n will use the terms "configuration of M" and a sequence-of-configurations, "a computation".)
Def'n: For a FSM M, we say "L(M)" is the language it *accepts*; we also say M "recognizes" L. E.g. We just gave a FSM which recognized b(ba*b)* (and even: that machine computes ).
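To make the informal "accepts" definition concrete, here is a minimal sketch in Java (9 or later, for Set.of and Map.of). Everything named here is an assumption for illustration: the class name FsmSketch, the state names start/acc/mid/dead, and the choice of a nested Map for δ instead of a Pair key. The machine built in forExample() is one possible FSM for b(ba*b)*, not necessarily the one drawn in lecture.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: names are illustrative, not from the course code.
class FsmSketch {
    final Set<String> K;          // states, represented by their names
    final Set<Character> sigma;   // the alphabet Σ
    final String s;               // start state; must be in K
    final Set<String> A;          // accepting states; must be a subset of K
    final Map<String, Map<Character, String>> delta; // δ as state -> (char -> state)

    FsmSketch(Set<String> K, Set<Character> sigma, String s, Set<String> A,
              Map<String, Map<Character, String>> delta) {
        if (!K.contains(s) || !K.containsAll(A)) {
            throw new IllegalArgumentException("s and A must lie inside K");
        }
        this.K = K;
        this.sigma = sigma;
        this.s = s;
        this.A = A;
        this.delta = delta;
    }

    /** Feed w into the machine starting at s; accept iff the final state is in A. */
    boolean accepts(String w) {
        String current = s;
        for (char c : w.toCharArray()) {
            // Assumption: a string using characters outside Σ is simply rejected.
            if (!sigma.contains(c)) {
                return false;
            }
            current = delta.get(current).get(c);
        }
        return A.contains(current);
    }

    /** One possible machine for b(ba*b)*, with an explicit dead state. */
    static FsmSketch forExample() {
        Map<String, Map<Character, String>> delta = Map.of(
            "start", Map.of('a', "dead", 'b', "acc"),
            "acc",   Map.of('a', "dead", 'b', "mid"),
            "mid",   Map.of('a', "mid",  'b', "acc"),
            "dead",  Map.of('a', "dead", 'b', "dead"));
        return new FsmSketch(Set.of("start", "acc", "mid", "dead"),
                             Set.of('a', 'b'), "start", Set.of("acc"), delta);
    }

    public static void main(String[] args) {
        FsmSketch m = forExample();
        for (String w : new String[] {"", "b", "bb", "bbb", "bbaab", "ab"}) {
            System.out.println("\"" + w + "\" accepted? " + m.accepts(w));
        }
    }
}
```

Here δ is total (every state has an entry for both 'a' and 'b'), which is why the sketch needs the extra "dead" state; drawings of FSMs often leave such a state implicit.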

Sets

Common Sets: ∅, {false,true}, ℕ, ℤ, ℚ, ℝ, ℂ; Σ*

Sample sets to work with:

L₀ = {}
L₁ = {17}
L₂ = {vw, saab, bmw}
L₃ = {1,2,3,4,5,6}
P  = prime numbers
L₄ = state-abbreviations

L₅ = strings over {a,b} where every 'a' is followed by a 'b'
L₆ = b*aa* = {a, aa, ba, aaa, baa, bba, …, bbbaa, … }
L₇ = b*a*b* = { ε,   a, b,    aa, ab, ba, bb,    aaa, aab, abb, baa, bab, bba, bbb,       …}
L₈ = b(ba*b)*

How to make new sets out of old ones:

Let's practice a bit:

Okay, some more ways to make new sets out of old ones:

  1. function-space: A → B means:
    the set of all possible functions with domain A, and codomain B.

    For example, Σ* → ℕ is the set of all functions taking a string and returning a number; string-length is one member of this set.
    We say that Σ* → ℕ is string-length's signature. (When you write a function in Java, you must give its signature — albeit in Java, not math.)

    What about the signature of substring?

    Σ* × ℕ × ℕ → Σ*
    (hint: we can say that its input-type is one thing: a tuple of size three.)

  2. concatenation (for sets-of-strings): AB, where w∈AB means w=ab for some a∈A and b∈B
    examples: L₄L₂ = { vavw, vasaab, utbmw, … }; there are 50*3 elements in this case; could be fewer in general!
    L₆L₇ = { a, aa, ba, ab, aaa, … }
  3. powers (for sets-of-strings¹): A² = AA, and A³ = AAA = AA², and Aⁿ = AAA…A = AAⁿ⁻¹.
  4. Kleene Star, A*: $$A^{*} = \bigcup_{i \in \mathbb{N}} A^{i}$$
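The string-set operations in items 2 through 4 can be tried out on small finite sets. The following is a rough sketch, not course code: the class name LanguageOps and its method names are made up for illustration. It takes A⁰ = {ε} (the usual convention, which the union over ℕ needs but which is not spelled out above), and since A* is usually infinite, starUpTo only builds the union of A⁰ through A^maxPower.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class; names are illustrative only.
class LanguageOps {
    /** AB = { ab : a in A, b in B }. */
    static Set<String> concat(Set<String> A, Set<String> B) {
        Set<String> result = new TreeSet<>();
        for (String a : A) {
            for (String b : B) {
                result.add(a + b);  // duplicates collapse, so |AB| can be less than |A|*|B|
            }
        }
        return result;
    }

    /** A^n, built as A·A^(n-1), with A^0 = {ε} (assumed convention). */
    static Set<String> power(Set<String> A, int n) {
        Set<String> result = new TreeSet<>(Set.of(""));
        for (int i = 0; i < n; i++) {
            result = concat(A, result);
        }
        return result;
    }

    /** Finite approximation of A*: the union of A^0, A^1, ..., A^maxPower. */
    static Set<String> starUpTo(Set<String> A, int maxPower) {
        Set<String> result = new TreeSet<>();
        for (int i = 0; i <= maxPower; i++) {
            result.addAll(power(A, i));
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> cars = Set.of("vw", "saab", "bmw");
        System.out.println(concat(Set.of("va", "ut"), cars).size());    // 2*3 = 6 distinct strings
        System.out.println(concat(Set.of("a", "aa"), Set.of("a", "aa"))); // [aa, aaa, aaaa]
        System.out.println(power(Set.of("a", "b"), 2));                  // [aa, ab, ba, bb]
        System.out.println(starUpTo(Set.of("ab"), 2).size());            // 3: ε, ab, abab
    }
}
```

The second call shows the "could be fewer in general" case: {a, aa}{a, aa} has only three elements, because a·aa and aa·a are the same string.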

Questions, for various sets created from the ones listed above: No. of elements? Contains ε (the empty-string)?

Exercise (challenge): Find two sets-of-strings A,B such that |AB| < |A| * |B| Answers:

Define: lexicographic order
Sort by length, and within length sort alphabetically. E.g. {a,b}* = { ε, a, b, aa, ab, ba, bb, aaa, aab, aba, … }

We use Lexicographic order because, unlike alphabetical order, the enumeration actually hits every string. E.g. aba is the 10th string above. But if we'd tried to order the set as ε, a, aa, aaa, aaaa, … then aba wouldn't show up as the 10th or the 99th or the 10-billionth entry — it wouldn't be covered at any finite index!
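The length-first ordering described above (often called shortlex order elsewhere) is easy to generate level by level. This is a small sketch under assumptions: the class name ShortlexSketch and the method firstStrings are hypothetical, and the alphabet is passed already sorted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative only.
class ShortlexSketch {
    /**
     * Returns the first `count` strings over `alphabet`, ordered by length
     * and alphabetically within each length.
     * Assumption: `alphabet` is given in alphabetical order.
     */
    static List<String> firstStrings(char[] alphabet, int count) {
        List<String> out = new ArrayList<>();
        List<String> level = List.of("");   // all strings of the current length, in order
        while (out.size() < count && !level.isEmpty()) {
            for (String w : level) {
                if (out.size() == count) {
                    break;
                }
                out.add(w);
            }
            // Build the next level: extend each string by each character, in order.
            List<String> next = new ArrayList<>();
            for (String w : level) {
                for (char c : alphabet) {
                    next.add(w + c);
                }
            }
            level = next;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> first10 = firstStrings(new char[] {'a', 'b'}, 10);
        System.out.println(first10);          // the empty string ε prints as nothing
        System.out.println(first10.get(9));   // aba, the 10th string
    }
}
```

Because each level is finite, every string of length n appears after finitely many shorter strings, which is exactly why this enumeration reaches every string at some finite index.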


¹ Note that for sets-of-non-strings, we'll take “multiplication” to be the cartesian cross-product (⨯), not string-concatenation. So ℝ² = ℝ⨯ℝ (think: points on the plane), and ℝⁿ = ℝ⨯ℝ⨯…⨯ℝ.

This page licensed CC-BY 4.0 Ian Barland
Rendered by Racket.