![]() |
![]() |
|
home—lectures—exams—hws—breeze (snow day)
Specifying Languages: syntax vs semantics Syntax: a grammar G1 (mongo the caveman's grammar): S -> N V N . N -> cat | ufo | apple | fox | dog | hotdog | ipod | itec V -> bite | run | jump | hit | love | spin | hear What is a sentence we can derive from S? What's one that we can't? G1 is a bit mongo-ish; how to add ... articles, plurals, and maybe some verb conjugation? Here's G2: S -> NP CV NP . NP -> ART N ART -> a | an | the CV -> V s | V ed N -> cat | ufo | apple | fox | dog | hotdog | ipod | mongo | itec V -> bite | run | jump | hit | love | spin | hear [think "Sentence", "Verb Phrase", "Conjugated Verb", etc.] [Technically, I'm leaving out spaces from the grammar. How to fix?] syntax allows "colorless green ideas sleep furiously"; the semantics is a different issue. [example by Chomsky] Some terms: "nonterminal", "terminal", | is 'or', -> is 'is' Book also uses the terms: token (nonterminal 'words'/punctuation of the grammar), lexeme (a concrete 'word'). Example: "DoubleLiteral" and "BinaryOp" are a tokens; "173.4" and "*" are lexemes. (I've always heard 'token' used for what the book calls lexemes, myself.) Derivations and trees (equivalent ways, essentially): (a) Here is a derivation of a cat bites the ufo .S -> NP CV NP . -> ART N CV NP . -> a N CV NP . -> a N CV ART N . (observe: we replace *any* non-terminal) -> a N CV ART ufo . -> a N CV the ufo . ... -> a cat bites the ufo .The same thing, as a tree: [draw] What is an example of something which does *not* meet the above grammar? Def'n: parsing: reading in a stream of tokens, and figuring out if it meets the grammar (building up a parse tree). We'll deal with parsing later; right now we'll keep doing it instinctively. What if we add further rules? G3 will be G2, plus the rules: Exercise: add adverbs and adjectives. ADJ -> quick | lazy | brown | hot ADV -> ADJ ly and changing G2's rules for `S` to be: S -> NP CV NP . | NP V NP ADV. What is a grammar for... - declaring a Java identifier (w/o init'ing, but including access modifiers) - a Java identifier? - a Java assignment statement? (we'll leave "expr" not fully defined) See: Java syntax for 120 (which is incomplete and technically wrong in some places), and BNF for java Some limitations of CFG -- see ITEC420 (computability th'y)! Also, if we place limitations on the rules, we end up with "regular grammars", which capture exactly the same thing as regular expressions. (A separate issue, btw: By constraining our grammar we can get nicer guarantees of algorithmic behavior -- parsing etc.)
home—lectures—exams—hws—breeze (snow day)
©2011, Ian Barland, Radford University Last modified 2011.Oct.05 (Wed) |
Please mail any suggestions (incl. typos, broken links) to ibarland ![]() |
![]() |