A Propositional Expression Evaluator¶
Racket is good language for writing interpreters, compilers, and other programs that process programming languages. As long as we’re willing to stick languages with a Racket-like syntax, then Racket is often a great choice for such programs.
Lets write some functions that deal with literal boolean propositional
expressions like (t and (not f)). The symbols t and f stand for
true and false respectively, and there are no variables allowed in the
expressions. We are not using the built-in Racket boolean values (#t and
#f) since this is our own little language.
We’ll code two different solutions: one using small functions, and the second
way using the match form. The match approach results in shorter code
that it is easier to read for most programmers.
For the first implementation, we need to define some helper functions to check the top-level format of expressions:
#lang racket
(define (is-true? e) (equal? e 't))
(define (is-false? e) (equal? e 'f))
(define (is-literal? e)
(or (is-false? e)
(is-true? e)))
;; returns true iff lst is a list of length n
(define (nlist? n lst)
(and (list? lst)
(= n (length lst))))
;; (not expr)
(define (is-not? e)
(and (nlist? 2 e)
(equal? 'not (car e))))
;; (p op q)
(define (is-bin-op? e op)
(and (nlist? 3 e)
(equal? op (second e))))
(define (is-and? e) (is-bin-op? e 'and)) ;; (p and q)
(define (is-or? e) (is-bin-op? e 'or)) ;; (p or q)
These are all non-recursive functions that check only the top-level form of an expression; they don’t check if sub-expressions are valid.
Next, using these helper functions, lets write a recursive function that tests if an expression is a valid proposition:
(define (is-expr? e)
(cond
[(is-literal? e) #t]
[(is-not? e) (is-expr? (second e))]
[(is-and? e)
(and (is-expr? (first e))
(is-expr? (third e)))]
[(is-or? e)
(and (is-expr? (first e))
(is-expr? (third e)))]
[else #f]))
Next, lets write an evaluator that calculates the value of a valid proposition:
(define (eval-prop-bool e)
(cond
[(is-literal? e)
(is-true? e)]
[(is-not? e)
(not (eval-prop-bool (second e)))]
[(is-and? e)
(and (eval-prop-bool (first e))
(eval-prop-bool (third e)))]
[(is-or? e)
(or (eval-prop-bool (first e))
(eval-prop-bool (third e)))]
[else #f]))
(define (eval-prop e)
(if (eval-prop-bool e) 't 'f))
For example:
> (eval-prop '((not t) and (t or (not f))))
f
Note that eval-prop-bool returns Racket boolean values, i.e. #t and
#f. We do this so that we can use the built-in Racket functions and,
or, and not. If instead we returned 't and 'f, we’d have to
write special-purpose versions of and, or, and not that work with
't and 'f.
Now lets re-write eval? and eval-prop using the Racket match
form. match makes it easy to do pattern-matching lists, and so we don’t
use the helper functions:
(define (is-expr? e)
(match e
['t #t]
['f #t]
[`(not ,a) (is-expr? a)]
[`(,a or ,b) (and (is-expr? a)
(is-expr? b))]
[`(,a and ,b) (and (is-expr? a)
(is-expr? b))]
[_ #f]))
(define (eval-prop-bool expr)
(match expr
['t #t]
['f #f]
[`(not ,a) (not (eval-prop-bool a))]
[`(,a or ,b) (or (eval-prop-bool a)
(eval-prop-bool b))]
[`(,a and ,b) (and (eval-prop-bool a)
(eval-prop-bool b))]
[_ (error "eval-prop-bool: syntax error")]))
Once you get used to reading match expressions like this, they can be more
readable than the previous code since the expression formats are written
directly in the code.
Notice the use of quasi-quoting, i.e. the backwards ` at the start of
the lists. Recall that quasiquoting lets you use a , inside the list to
mark an expression that should be evaluated.
EBNF: Extended Backus-Naur Notation¶
is-expr? from above is a precise definition of what is, and isn’t, a valid
boolean literal expression. We can say that it defines a small language.
Another way to precisely specify a language is to use Extended Backus-Naur Formalism, or EBNF for short. EBNF is a language designed for precisely defining the syntax (not semantics!) of other languages.
An EBNF “program” consists of a series of named productions, i.e. rules of this general form:
production-name = body .
The idea is that whenever you see production-name in an EBNF rule, you
can replace that name with the body of the corresponding rule. A set of
productions is called a grammar, and grammars are very useful for creating
things like tokenizers and parsers.
We’ll use the Go EBNF notation,
which is designed specifically for being easy to use in text. Among other
details given below, ends production rules with a .. For example, here is
the production that Go uses to define an assignment operator:
assign_op = [ add_op | mul_op ] "=" .
Anything in “”-marks indicates a token or symbol that appears as-is in a Go
program. So in this production, the "=" at the end means a = character
appears in the Go program, while the = that appears near the start is
part of the EBNF language that separates a production’s name from its body.
Two other EBNF operators appear in this production. The | is the
alternation operator, and indicates a choices between two, or more,
alternatives. So add_op | mul_op means you can chose add_op or
mul_up (but not both). Here’s another example:
add_op = "+" | "-" | "|" | "^" .
This production, named add_op, says that an add_op is a +, -,
|, or ^. This defines add_op by simply listing all the possible
alternatives.
The [] operator indicates an optional expression that can occur 0, or
1, times. So the expression [ add_op | mul_op ] means that the entire
expression can appear, or not appear. If it does occur, then, because of the
|, either add_op or mul_op is chosen.
Another EBNF operator is {}, which indicates repetition of 0, or
more, times. For example:
identifier = letter { letter | unicode_digit } .
This describes legal identifiers in Go, i.e. the legal names for things like
variables, functions, structs, types, and packages. It says that an identifier
must start with a letter, and then be followed by 0 or more letters or
digits:
letter = unicode_letter | "_" .
decimal_digit = "0" ... "9" .
In the Go EBNF notation, ... is
an informal short-hand that indicates an obvious pattern. So the
decimal_digit production is understood to be equivalent to this:
decimal_digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" .
The identifier rule lets us check if a sequence of characters is a legal
identifier name. For example, up2 is a legal Go identifier. We can check
this by tracing the identifier production. The first character, u, is
a valid letter, and the next character is a valid letter. The 2 is a
valid digit, and so the entire identifier is valid.
Now consider 4xyz, which is not a valid Go identifier. It starts with
4, but according to the letter production 4 is not a valid
letter. So we know 4xyz cannot be an identifier, because a valid
identifier must start with a letter.
The unicode_letter production allows all characters marked as “Letter” by
Unicode. There are thousands of such characters, so they are not listed
explicitly.
Example: EBNF Rules for Literal Propositional Expressions¶
Now lets write an EBNF grammar, using the Go EBNF notation, to describe the valid boolean
literal expressions defined by (is-expr? e).
Here are the productions:
expr = bool-literal | not-expr | and-expr | or-expr .
bool-literal = "t" | "f" .
not-expr = "(" "not" expr ")" .
and-expr = "(" expr "and" expr ")" .
or-expr = "(" expr "or" expr ")" .
Note that EBNF does not require the productions to appear in any particular order. But, for humans, we usually group related productions, and often put them in the order either from most general to specific.
We should point out that the syntax of this little propositional language is specifically designed to be Racket-like, and so every expression is fully-bracketed. This means we don’t need to define the precedence of operators, since the brackets always make that explicit.
Consider the not-expr production. It is recursive because it expands into
an expression involving expr, which could expand into another
not-expr. This allows us to show that an expression like “(not (not t))”
is valid as follows:
expr = not-expr
= "(" "not" expr ")"
= "(" "not" not-expr ")"
= "(" "not" "(" "not" expr ")" ")"
= "(" "not" "(" "not" bool-literal ")" ")"
= "(" "not" "(" "not" "t" ")" ")"
Each step expands one part of the expression, and, by making the right choice of rules, eventually we get the exact expression we want. If you want to automate this process, you could write a parser to do this.
Comparing this EBNF grammar to the implementation of is-expr?, you can
see that it essentially mimics the EBNF grammar.
Example: EBNF for cond Expressions¶
Here’s a simple EBNF description for a Racket cond expression:
cond-expr = "(" "cond" { test-val } [else-val] ")" .
test-val = "(" bool value ")" .
else-val = ("else" | "#t") value .
bool = any Racket_ expression that evaluates to #t or #f .
value = any Racket_ expression that evaluates to a value .
Note that the brackets in the else-val production are not in “-marks.
Unquoted round brackets are used for grouping in Go’s EBNF, and in
else-val indicate that the choice is this:
else-val = ("else" | "#t") value . // right
else-val = "else" | ( "#t" value ) . // wrong
else-val = "else" | "#t" value . // ambiguous
This is not the official definition of a Racket cond. See the
documentation on if-statements if you want to know the
official syntax. You’ll see also that Racket uses a different notation for
describing syntax, but it is conceptually very similar to the notation we’re
using here.
Example: nand-only Expressions¶
In propositional logic, the nand operator is a standard logical operator.
The expression (p nand q) is true just when either p, q, or both
p and q, are false. In other words, (p nand q) is equivalent to
(not (p and q)).
Here are a couple of EBNF descriptions of this nand-only language. First, we could use two productions:
expr = "t" | "f" | nand .
nand = "(" expr "nand" expr ")" .
This is simple enough that to be combined into a single rule that is still quite readable:
expr = "t"
| "f"
| "(" expr "nand" expr ")" .
Both grammars are fine, and describe the same language. Note that the second one relies partly on formatting to help avoid ambiguity, i.e. each of the 3 lines in the production describe one kind of string.
Now let’s implement some helper functions based on this grammer. First, here
is a function that returns #t just when expr is a valid nand-only
expression:
(define (is-nand? expr)
(match expr
['t #t]
['f #t]
[`(,a nand ,b) (and (is-nand? a)
(is-nand? b))]
[_ #f]))
And here’s a function that evaluates a nand-only expression:
;; evaluate a literal propositional logic expression using nand
;; as the only connective
(define (eval-nand expr)
(match expr
['t #t]
['f #f]
[`(,a nand ,b) (not (and (eval-nand a)
(eval-nand b)))]
[_ (error "eval-nand: syntax error")]))
The advantage of this propositional language is that its processing functions
are simpler than the equivalent functions above for expressions that allow
and, or, and not.
Simplifying a Propositional Expression¶
Some propositional expressions can be simplified into logically equivalent
expressions. For example, an expression of the form (not (not p)) is
logically equivalent to p.
Lets write an expression simplifier that applies this one particular rule (double negation elimination) to expressions:
;; double negation elimination
;; (not (not expr)) <==> expr
(define (simplify expr)
(match expr
['t 't]
['f 'f]
[`(not (not ,a)) (simplify a)]
[`(not ,a) (list 'not (simplify a))]
[`(,a or ,b) (list (simplify a) 'or (simplify b))]
[`(,a and ,b) (list (simplify a) 'and (simplify b))]
[_ (error "simplify: syntax error")]))
> (simplify '((not (not t)) and (not (not t))))
(t and t)
The match form makes this kind of function relatively easy to write; in
particular, the expression `(not (not ,a)) is a simple way of specify a
double-negation pattern. Notice also that the order in which the matching is
done is important: we have to check for (not (not a)) before (not a).
Rewriting an Expression with Only nand¶
An interesting and useful fact about propositional logic is that any
expression can be re-written as a logically equivalent expression that uses
only the nand operator. As mentioned above, the expression (p nand q)
is logically equivalent to (not (p and q)).
Here are the rules for writing the other propositional operators in terms of
nand:
(not p)is logically equivalent to(p nand p)(p and q)is logically equivalent to((p nand q) nand (p nand q))(p or q)is logically equivalent to((p nand p) nand (q nand q))
By repeatedly apply these rules we can transform any propositional expression
into a logically equivalent one using only nand.
Here we will allow variables, like p or q, in the expressions we want
to simplify.
Here’s some code that does the transformation:
(define (make-nand a b) (list a 'nand b))
;; returns a propositional logic expression into a logically equivalent one
;; that uses only nand
(define (nand-rewrite expr)
(if (symbol? expr) expr
(match expr
['t 't]
['f 'f]
[`(not ,a) (let ([na (nand-rewrite a)])
(make-nand na na))]
[`(,a or ,b) (let* ([na (nand-rewrite a)]
[nb (nand-rewrite b)]
[nana (make-nand na na)]
[nbnb (make-nand nb nb)])
(make-nand nana nbnb))]
[`(,a and ,b) (let* ([na (nand-rewrite a)]
[nb (nand-rewrite b)]
[nanb (make-nand na nb)])
(make-nand nanb nanb))]
[_ (error "nand-rewrite syntax error")])))
> (nand-rewrite '((p and q) or ((not p) or q)))
'((((p nand q) nand (p nand q)) nand ((p nand q) nand (p nand q)))
nand
((((p nand p) nand (p nand p)) nand (q nand q)) nand (((p nand p) nand (p nand p)) nand (q nand q))))
Example: Rewriting Racket Code¶
After you’ve written a few Racket programs, you may have noticed that these two expressions are equivalent:
(cond (test val1)
(else val2))
;; same as
(if test val1 val2)
The if-expression is a little simpler and easier to read, so it’s often the
preferred form. But when writing code you don’t always know for sure if you
have only one test, and so it’s wise to use cond everywhere. After your
program is done you could go back and re-write any single-condition cond
expressions as an equivalent if-expression, but that is tedious and
error-prone.
Instead, lets write a Racket function that automatically replaces occurrences
of a simple cond with an equivalent if. Here’s a function that does
that:
;; Deeply change all occurrences of (cond (test val1) (else val2)) to
;; (if test val1 val2)
(define (rewrite-simple-cond expr)
(if (not (list? expr)) expr
(match expr
[`(cond (,test ,val1)
(else ,val2)) (map rewrite-simple-cond
(list 'if test val1 val2))]
[`(cond (,test ,val1)
(#t ,val2)) (map rewrite-simple-cond
(list 'if test val1 val2))]
[_ (map rewrite-simple-cond expr)])))
(rewrite-simple-cond expr) first checks if expr is a list; if it’s
not a list, then expr is returned unchanged. If expr is a list, then
match checks to see if has the form of a simple cond; there are two
cases depending on whether the else-clause uses else or #f. If
expr is not a simple cond, then we use map to call
rewrite-simple-cond on each value in expr.
When expr is a simple cond, it is returned as an if-statement. We call
rewrite-simple-cond on val1 and val2 using a small trick:
rewrite-simple-cond is called on every element of the resulting if-list.
We know from how rewrite-simple-cond works that the if symbol at the
start will be returned as 'if. We could have written it like this
instead:
(list 'if (rewrite-simple-cond test)
(rewrite-simple-cond val1)
(rewrite-simple-cond val2))
However, the map version is shorter and simpler.
Example: Binary Search Trees¶
The following is a sample Racket implementation of some basic binary search tree functions. The tr
#lang racket
;;; bst.rkt
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; An implementation of a binary search tree (BST). Nodes are lists of the
;; form (value left-tree right-tree).
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; '() is the empty tree
(define (is-empty? t) (null? t))
;; getters
(define (value node) (first node))
(define (left node) (second node))
(define (right node) (third node))
;; insert value x into tree t
(define (insert x t)
(if (is-empty? t)
(list x '() '())
(let ([V (value t])
[L (left t])
[R (right t]))
(cond
[(is-empty? t)
(list x '() '())]
[(= x V) t];; x already in the tree, so do nothing
[(< x V)
(list V (insert x L) R)]
[else
(list V L (insert x R))]
))))
;; convert a list of things into a tree
(define (make-tree lst)
(if (empty? lst)
'()
(insert (first lst) (make-tree (rest lst)))))
;; return a list of the values of tree in "inorder" order, i.e. for
;; a BST the list of values will be in ascending sorted order
(define (list-inorder t)
(cond
[(is-empty? t) '()]
[else (append (list-inorder (left t))
(list (value t))
(list-inorder (right t)))]))
(define (tree-sort lst) (list-inorder (make-tree lst)))
;; Returns the depth of a tree
(define (depth t)
(cond
[(is-empty? t) 0]
[else
(+ 1 (max (depth (left t)) (depth (right t))))]))
;; return the max value in a tree
(define (tree-max t)
(cond
[(is-empty? t) 'error]
[(is-empty? (right t))
(value t)]
[else
(tree-max (right t))]))
;; test if x is a value in tree t
(define (contains x t)
(cond
[(is-empty? t) #f]
[(= x (value t)) #t]
[(< x (value t))
(contains x (left t))]
[else
(contains x (right t))]))