A Propositional Expression Evaluator¶
Racket is good language for writing interpreters, compilers, and other programs that process programming languages. As long as we’re willing to stick languages with a Racket-like syntax, then Racket is often a great choice for such programs.
Lets write some functions that deal with literal boolean propositional
expressions like (t and (not f))
. The symbols t
and f
stand for
true and false respectively, and there are no variables allowed in the
expressions. We are not using the built-in Racket boolean values (#t
and
#f
) since this is our own little language.
We’ll code two different solutions: one using small functions, and the second
way using the match
form. The match
approach results in shorter code
that it is easier to read for most programmers.
For the first implementation, we need to define some helper functions to check the top-level format of expressions:
#lang racket
(define (is-true? e) (equal? e 't))
(define (is-false? e) (equal? e 'f))
(define (is-literal? e)
(or (is-false? e)
(is-true? e)))
;; returns true iff lst is a list of length n
(define (nlist? n lst)
(and (list? lst)
(= n (length lst))))
;; (not expr)
(define (is-not? e)
(and (nlist? 2 e)
(equal? 'not (car e))))
;; (p op q)
(define (is-bin-op? e op)
(and (nlist? 3 e)
(equal? op (second e))))
(define (is-and? e) (is-bin-op? e 'and)) ;; (p and q)
(define (is-or? e) (is-bin-op? e 'or)) ;; (p or q)
These are all non-recursive functions that check only the top-level form of an expression; they don’t check if sub-expressions are valid.
Next, using these helper functions, lets write a recursive function that tests if an expression is a valid proposition:
(define (is-expr? e)
(cond
[(is-literal? e) #t]
[(is-not? e) (is-expr? (second e))]
[(is-and? e)
(and (is-expr? (first e))
(is-expr? (third e)))]
[(is-or? e)
(and (is-expr? (first e))
(is-expr? (third e)))]
[else #f]))
Next, lets write an evaluator that calculates the value of a valid proposition:
(define (eval-prop-bool e)
(cond
[(is-literal? e)
(is-true? e)]
[(is-not? e)
(not (eval-prop-bool (second e)))]
[(is-and? e)
(and (eval-prop-bool (first e))
(eval-prop-bool (third e)))]
[(is-or? e)
(or (eval-prop-bool (first e))
(eval-prop-bool (third e)))]
[else #f]))
(define (eval-prop e)
(if (eval-prop-bool e) 't 'f))
For example:
> (eval-prop '((not t) and (t or (not f))))
f
Note that eval-prop-bool
returns Racket boolean values, i.e. #t
and
#f
. We do this so that we can use the built-in Racket functions and
,
or
, and not
. If instead we returned 't
and 'f
, we’d have to
write special-purpose versions of and
, or
, and not
that work with
't
and 'f
.
Now lets re-write eval?
and eval-prop
using the Racket match
form. match
makes it easy to do pattern-matching lists, and so we don’t
use the helper functions:
(define (is-expr? e)
(match e
['t #t]
['f #t]
[`(not ,a) (is-expr? a)]
[`(,a or ,b) (and (is-expr? a)
(is-expr? b))]
[`(,a and ,b) (and (is-expr? a)
(is-expr? b))]
[_ #f]))
(define (eval-prop-bool expr)
(match expr
['t #t]
['f #f]
[`(not ,a) (not (eval-prop-bool a))]
[`(,a or ,b) (or (eval-prop-bool a)
(eval-prop-bool b))]
[`(,a and ,b) (and (eval-prop-bool a)
(eval-prop-bool b))]
[_ (error "eval-prop-bool: syntax error")]))
Once you get used to reading match
expressions like this, they can be more
readable than the previous code since the expression formats are written
directly in the code.
Notice the use of quasi-quoting, i.e. the backwards `
at the start of
the lists. Recall that quasiquoting lets you use a ,
inside the list to
mark an expression that should be evaluated.
EBNF: Extended Backus-Naur Notation¶
is-expr?
from above is a precise definition of what is, and isn’t, a valid
boolean literal expression. We can say that it defines a small language.
Another way to precisely specify a language is to use Extended Backus-Naur Formalism, or EBNF for short. EBNF is a language designed for precisely defining the syntax (not semantics!) of other languages.
An EBNF “program” consists of a series of named productions, i.e. rules of this general form:
production-name = body .
The idea is that whenever you see production-name
in an EBNF rule, you
can replace that name with the body
of the corresponding rule. A set of
productions is called a grammar, and grammars are very useful for creating
things like tokenizers and parsers.
We’ll use the Go EBNF notation,
which is designed specifically for being easy to use in text. Among other
details given below, ends production rules with a .
. For example, here is
the production that Go uses to define an assignment operator:
assign_op = [ add_op | mul_op ] "=" .
Anything in “”-marks indicates a token or symbol that appears as-is in a Go
program. So in this production, the "="
at the end means a =
character
appears in the Go program, while the =
that appears near the start is
part of the EBNF language that separates a production’s name from its body.
Two other EBNF operators appear in this production. The |
is the
alternation operator, and indicates a choices between two, or more,
alternatives. So add_op | mul_op
means you can chose add_op
or
mul_up
(but not both). Here’s another example:
add_op = "+" | "-" | "|" | "^" .
This production, named add_op
, says that an add_op
is a +
, -
,
|
, or ^
. This defines add_op
by simply listing all the possible
alternatives.
The []
operator indicates an optional expression that can occur 0, or
1, times. So the expression [ add_op | mul_op ]
means that the entire
expression can appear, or not appear. If it does occur, then, because of the
|
, either add_op
or mul_op
is chosen.
Another EBNF operator is {}
, which indicates repetition of 0, or
more, times. For example:
identifier = letter { letter | unicode_digit } .
This describes legal identifiers in Go, i.e. the legal names for things like
variables, functions, structs, types, and packages. It says that an identifier
must start with a letter
, and then be followed by 0 or more letters or
digits:
letter = unicode_letter | "_" .
decimal_digit = "0" ... "9" .
In the Go EBNF notation, ...
is
an informal short-hand that indicates an obvious pattern. So the
decimal_digit
production is understood to be equivalent to this:
decimal_digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" .
The identifier
rule lets us check if a sequence of characters is a legal
identifier name. For example, up2
is a legal Go identifier. We can check
this by tracing the identifier
production. The first character, u
, is
a valid letter
, and the next character is a valid letter
. The 2 is a
valid digit, and so the entire identifier is valid.
Now consider 4xyz
, which is not a valid Go identifier. It starts with
4
, but according to the letter
production 4
is not a valid
letter. So we know 4xyz
cannot be an identifier
, because a valid
identifier
must start with a letter.
The unicode_letter
production allows all characters marked as “Letter” by
Unicode. There are thousands of such characters, so they are not listed
explicitly.
Example: EBNF Rules for Literal Propositional Expressions¶
Now lets write an EBNF grammar, using the Go EBNF notation, to describe the valid boolean
literal expressions defined by (is-expr? e)
.
Here are the productions:
expr = bool-literal | not-expr | and-expr | or-expr .
bool-literal = "t" | "f" .
not-expr = "(" "not" expr ")" .
and-expr = "(" expr "and" expr ")" .
or-expr = "(" expr "or" expr ")" .
Note that EBNF does not require the productions to appear in any particular order. But, for humans, we usually group related productions, and often put them in the order either from most general to specific.
We should point out that the syntax of this little propositional language is specifically designed to be Racket-like, and so every expression is fully-bracketed. This means we don’t need to define the precedence of operators, since the brackets always make that explicit.
Consider the not-expr
production. It is recursive because it expands into
an expression involving expr
, which could expand into another
not-expr
. This allows us to show that an expression like “(not (not t))”
is valid as follows:
expr = not-expr
= "(" "not" expr ")"
= "(" "not" not-expr ")"
= "(" "not" "(" "not" expr ")" ")"
= "(" "not" "(" "not" bool-literal ")" ")"
= "(" "not" "(" "not" "t" ")" ")"
Each step expands one part of the expression, and, by making the right choice of rules, eventually we get the exact expression we want. If you want to automate this process, you could write a parser to do this.
Comparing this EBNF grammar to the implementation of is-expr?
, you can
see that it essentially mimics the EBNF grammar.
Example: EBNF for cond Expressions¶
Here’s a simple EBNF description for a Racket cond expression:
cond-expr = "(" "cond" { test-val } [else-val] ")" .
test-val = "(" bool value ")" .
else-val = ("else" | "#t") value .
bool = any Racket_ expression that evaluates to #t or #f .
value = any Racket_ expression that evaluates to a value .
Note that the brackets in the else-val
production are not in “-marks.
Unquoted round brackets are used for grouping in Go’s EBNF, and in
else-val
indicate that the choice is this:
else-val = ("else" | "#t") value . // right
else-val = "else" | ( "#t" value ) . // wrong
else-val = "else" | "#t" value . // ambiguous
This is not the official definition of a Racket cond
. See the
documentation on if-statements if you want to know the
official syntax. You’ll see also that Racket uses a different notation for
describing syntax, but it is conceptually very similar to the notation we’re
using here.
Example: nand-only Expressions¶
In propositional logic, the nand
operator is a standard logical operator.
The expression (p nand q)
is true just when either p
, q
, or both
p
and q
, are false. In other words, (p nand q)
is equivalent to
(not (p and q))
.
Here are a couple of EBNF descriptions of this nand-only language. First, we could use two productions:
expr = "t" | "f" | nand .
nand = "(" expr "nand" expr ")" .
This is simple enough that to be combined into a single rule that is still quite readable:
expr = "t"
| "f"
| "(" expr "nand" expr ")" .
Both grammars are fine, and describe the same language. Note that the second one relies partly on formatting to help avoid ambiguity, i.e. each of the 3 lines in the production describe one kind of string.
Now let’s implement some helper functions based on this grammer. First, here
is a function that returns #t
just when expr
is a valid nand-only
expression:
(define (is-nand? expr)
(match expr
['t #t]
['f #t]
[`(,a nand ,b) (and (is-nand? a)
(is-nand? b))]
[_ #f]))
And here’s a function that evaluates a nand-only expression:
;; evaluate a literal propositional logic expression using nand
;; as the only connective
(define (eval-nand expr)
(match expr
['t #t]
['f #f]
[`(,a nand ,b) (not (and (eval-nand a)
(eval-nand b)))]
[_ (error "eval-nand: syntax error")]))
The advantage of this propositional language is that its processing functions
are simpler than the equivalent functions above for expressions that allow
and
, or
, and not
.
Simplifying a Propositional Expression¶
Some propositional expressions can be simplified into logically equivalent
expressions. For example, an expression of the form (not (not p))
is
logically equivalent to p
.
Lets write an expression simplifier that applies this one particular rule (double negation elimination) to expressions:
;; double negation elimination
;; (not (not expr)) <==> expr
(define (simplify expr)
(match expr
['t 't]
['f 'f]
[`(not (not ,a)) (simplify a)]
[`(not ,a) (list 'not (simplify a))]
[`(,a or ,b) (list (simplify a) 'or (simplify b))]
[`(,a and ,b) (list (simplify a) 'and (simplify b))]
[_ (error "simplify: syntax error")]))
> (simplify '((not (not t)) and (not (not t))))
(t and t)
The match
form makes this kind of function relatively easy to write; in
particular, the expression `(not (not ,a))
is a simple way of specify a
double-negation pattern. Notice also that the order in which the matching is
done is important: we have to check for (not (not a))
before (not a)
.
Rewriting an Expression with Only nand¶
An interesting and useful fact about propositional logic is that any
expression can be re-written as a logically equivalent expression that uses
only the nand
operator. As mentioned above, the expression (p nand q)
is logically equivalent to (not (p and q))
.
Here are the rules for writing the other propositional operators in terms of
nand
:
(not p)
is logically equivalent to(p nand p)
(p and q)
is logically equivalent to((p nand q) nand (p nand q))
(p or q)
is logically equivalent to((p nand p) nand (q nand q))
By repeatedly apply these rules we can transform any propositional expression
into a logically equivalent one using only nand
.
Here we will allow variables, like p
or q
, in the expressions we want
to simplify.
Here’s some code that does the transformation:
(define (make-nand a b) (list a 'nand b))
;; returns a propositional logic expression into a logically equivalent one
;; that uses only nand
(define (nand-rewrite expr)
(if (symbol? expr) expr
(match expr
['t 't]
['f 'f]
[`(not ,a) (let ([na (nand-rewrite a)])
(make-nand na na))]
[`(,a or ,b) (let* ([na (nand-rewrite a)]
[nb (nand-rewrite b)]
[nana (make-nand na na)]
[nbnb (make-nand nb nb)])
(make-nand nana nbnb))]
[`(,a and ,b) (let* ([na (nand-rewrite a)]
[nb (nand-rewrite b)]
[nanb (make-nand na nb)])
(make-nand nanb nanb))]
[_ (error "nand-rewrite syntax error")])))
> (nand-rewrite '((p and q) or ((not p) or q)))
'((((p nand q) nand (p nand q)) nand ((p nand q) nand (p nand q)))
nand
((((p nand p) nand (p nand p)) nand (q nand q)) nand (((p nand p) nand (p nand p)) nand (q nand q))))
Example: Rewriting Racket Code¶
After you’ve written a few Racket programs, you may have noticed that these two expressions are equivalent:
(cond (test val1)
(else val2))
;; same as
(if test val1 val2)
The if-expression is a little simpler and easier to read, so it’s often the
preferred form. But when writing code you don’t always know for sure if you
have only one test, and so it’s wise to use cond
everywhere. After your
program is done you could go back and re-write any single-condition cond
expressions as an equivalent if-expression, but that is tedious and
error-prone.
Instead, lets write a Racket function that automatically replaces occurrences
of a simple cond
with an equivalent if
. Here’s a function that does
that:
;; Deeply change all occurrences of (cond (test val1) (else val2)) to
;; (if test val1 val2)
(define (rewrite-simple-cond expr)
(if (not (list? expr)) expr
(match expr
[`(cond (,test ,val1)
(else ,val2)) (map rewrite-simple-cond
(list 'if test val1 val2))]
[`(cond (,test ,val1)
(#t ,val2)) (map rewrite-simple-cond
(list 'if test val1 val2))]
[_ (map rewrite-simple-cond expr)])))
(rewrite-simple-cond expr)
first checks if expr
is a list; if it’s
not a list, then expr
is returned unchanged. If expr
is a list, then
match
checks to see if has the form of a simple cond
; there are two
cases depending on whether the else-clause uses else
or #f
. If
expr
is not a simple cond
, then we use map
to call
rewrite-simple-cond
on each value in expr
.
When expr
is a simple cond
, it is returned as an if-statement. We call
rewrite-simple-cond
on val1
and val2
using a small trick:
rewrite-simple-cond
is called on every element of the resulting if-list.
We know from how rewrite-simple-cond
works that the if
symbol at the
start will be returned as 'if
. We could have written it like this
instead:
(list 'if (rewrite-simple-cond test)
(rewrite-simple-cond val1)
(rewrite-simple-cond val2))
However, the map
version is shorter and simpler.
Example: Binary Search Trees¶
The following is a sample Racket implementation of some basic binary search tree functions. The tr
#lang racket
;;; bst.rkt
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; An implementation of a binary search tree (BST). Nodes are lists of the
;; form (value left-tree right-tree).
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; '() is the empty tree
(define (is-empty? t) (null? t))
;; getters
(define (value node) (first node))
(define (left node) (second node))
(define (right node) (third node))
;; insert value x into tree t
(define (insert x t)
(if (is-empty? t)
(list x '() '())
(let ([V (value t])
[L (left t])
[R (right t]))
(cond
[(is-empty? t)
(list x '() '())]
[(= x V) t];; x already in the tree, so do nothing
[(< x V)
(list V (insert x L) R)]
[else
(list V L (insert x R))]
))))
;; convert a list of things into a tree
(define (make-tree lst)
(if (empty? lst)
'()
(insert (first lst) (make-tree (rest lst)))))
;; return a list of the values of tree in "inorder" order, i.e. for
;; a BST the list of values will be in ascending sorted order
(define (list-inorder t)
(cond
[(is-empty? t) '()]
[else (append (list-inorder (left t))
(list (value t))
(list-inorder (right t)))]))
(define (tree-sort lst) (list-inorder (make-tree lst)))
;; Returns the depth of a tree
(define (depth t)
(cond
[(is-empty? t) 0]
[else
(+ 1 (max (depth (left t)) (depth (right t))))]))
;; return the max value in a tree
(define (tree-max t)
(cond
[(is-empty? t) 'error]
[(is-empty? (right t))
(value t)]
[else
(tree-max (right t))]))
;; test if x is a value in tree t
(define (contains x t)
(cond
[(is-empty? t) #f]
[(= x (value t)) #t]
[(< x (value t))
(contains x (left t))]
[else
(contains x (right t))]))