A Propositional Expression Evaluator

Racket is good language for writing interpreters, compilers, and other programs that process programming languages. As long as we’re willing to stick languages with a Racket-like syntax, then Racket is often a great choice for such programs.

Lets write some functions that deal with literal boolean propositional expressions like (t and (not f)). The symbols t and f stand for true and false respectively, and there are no variables allowed in the expressions. We are not using the built-in Racket boolean values (#t and #f) since this is our own little language.

We’ll code two different solutions: one using small functions, and the second way using the match form. The match approach results in shorter code that it is easier to read for most programmers.

For the first implementation, we need to define some helper functions to check the top-level format of expressions:

#lang racket

(define (is-true? e) (equal? e 't))
(define (is-false? e) (equal? e 'f))

(define (is-literal? e)
  (or (is-false? e)
      (is-true? e)))

;; returns true iff lst is a list of length n
(define (nlist? n lst)
  (and (list? lst)
       (= n (length lst))))

;; (not expr)
(define (is-not? e)
  (and (nlist? 2 e)
       (equal? 'not (car e))))

;; (p op q)
(define (is-bin-op? e op)
  (and (nlist? 3 e)
       (equal? op (second e))))

(define (is-and? e) (is-bin-op? e 'and))    ;; (p and q)
(define (is-or? e) (is-bin-op? e 'or))      ;; (p or q)

These are all non-recursive functions that check only the top-level form of an expression; they don’t check if sub-expressions are valid.

Next, using these helper functions, lets write a recursive function that tests if an expression is a valid proposition:

(define (is-expr? e)
    (cond
      [(is-literal? e) #t]
      [(is-not? e) (is-expr? (second e))]
      [(is-and? e)
       (and (is-expr? (first e))
            (is-expr? (third e)))]
      [(is-or? e)
       (and (is-expr? (first e))
            (is-expr? (third e)))]
      [else #f]))

Next, lets write an evaluator that calculates the value of a valid proposition:

(define (eval-prop-bool e)
    (cond
      [(is-literal? e)
       (is-true? e)]
      [(is-not? e)
       (not (eval-prop-bool (second e)))]
      [(is-and? e)
       (and (eval-prop-bool (first e))
            (eval-prop-bool (third e)))]
      [(is-or? e)
       (or (eval-prop-bool (first e))
           (eval-prop-bool (third e)))]
      [else #f]))

    (define (eval-prop e)
            (if (eval-prop-bool e) 't 'f))

For example:

> (eval-prop '((not t) and (t or (not f))))
f

Note that eval-prop-bool returns Racket boolean values, i.e. #t and #f. We do this so that we can use the built-in Racket functions and, or, and not. If instead we returned 't and 'f, we’d have to write special-purpose versions of and, or, and not that work with 't and 'f.

Now lets re-write eval? and eval-prop using the Racket match form. match makes it easy to do pattern-matching lists, and so we don’t use the helper functions:

(define (is-expr? e)
  (match e
    ['t #t]
    ['f #t]
    [`(not ,a) (is-expr? a)]
    [`(,a or ,b) (and (is-expr? a)
                      (is-expr? b))]
    [`(,a and ,b) (and (is-expr? a)
                       (is-expr? b))]
    [_ #f]))

(define (eval-prop-bool expr)
  (match expr
    ['t #t]
    ['f #f]
    [`(not ,a) (not (eval-prop-bool a))]
    [`(,a or ,b) (or (eval-prop-bool a)
                     (eval-prop-bool b))]
    [`(,a and ,b) (and (eval-prop-bool a)
                       (eval-prop-bool b))]
    [_ (error "eval-prop-bool: syntax error")]))

Once you get used to reading match expressions like this, they can be more readable than the previous code since the expression formats are written directly in the code.

Notice the use of quasi-quoting, i.e. the backwards ` at the start of the lists. Recall that quasiquoting lets you use a , inside the list to mark an expression that should be evaluated.

EBNF: Extended Backus-Naur Notation

is-expr? from above is a precise definition of what is, and isn’t, a valid boolean literal expression. We can say that it defines a small language.

Another way to precisely specify a language is to use Extended Backus-Naur Formalism, or EBNF for short. EBNF is a language designed for precisely defining the syntax (not semantics!) of other languages.

An EBNF “program” consists of a series of named productions, i.e. rules of this general form:

production-name = body .

The idea is that whenever you see production-name in an EBNF rule, you can replace that name with the body of the corresponding rule. A set of productions is called a grammar, and grammars are very useful for creating things like tokenizers and parsers.

We’ll use the Go EBNF notation, which is designed specifically for being easy to use in text. Among other details given below, ends production rules with a .. For example, here is the production that Go uses to define an assignment operator:

assign_op = [ add_op | mul_op ] "=" .

Anything in “”-marks indicates a token or symbol that appears as-is in a Go program. So in this production, the "=" at the end means a = character appears in the Go program, while the = that appears near the start is part of the EBNF language that separates a production’s name from its body.

Two other EBNF operators appear in this production. The | is the alternation operator, and indicates a choices between two, or more, alternatives. So add_op | mul_op means you can chose add_op or mul_up (but not both). Here’s another example:

add_op = "+" | "-" | "|" | "^" .

This production, named add_op, says that an add_op is a +, -, |, or ^. This defines add_op by simply listing all the possible alternatives.

The [] operator indicates an optional expression that can occur 0, or 1, times. So the expression [ add_op | mul_op ] means that the entire expression can appear, or not appear. If it does occur, then, because of the |, either add_op or mul_op is chosen.

Another EBNF operator is {}, which indicates repetition of 0, or more, times. For example:

identifier = letter { letter | unicode_digit } .

This describes legal identifiers in Go, i.e. the legal names for things like variables, functions, structs, types, and packages. It says that an identifier must start with a letter, and then be followed by 0 or more letters or digits:

letter        = unicode_letter | "_" .
decimal_digit = "0" ... "9" .

In the Go EBNF notation, ... is an informal short-hand that indicates an obvious pattern. So the decimal_digit production is understood to be equivalent to this:

decimal_digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" .

The identifier rule lets us check if a sequence of characters is a legal identifier name. For example, up2 is a legal Go identifier. We can check this by tracing the identifier production. The first character, u, is a valid letter, and the next character is a valid letter. The 2 is a valid digit, and so the entire identifier is valid.

Now consider 4xyz, which is not a valid Go identifier. It starts with 4, but according to the letter production 4 is not a valid letter. So we know 4xyz cannot be an identifier, because a valid identifier must start with a letter.

The unicode_letter production allows all characters marked as “Letter” by Unicode. There are thousands of such characters, so they are not listed explicitly.

Example: EBNF Rules for Literal Propositional Expressions

Now lets write an EBNF grammar, using the Go EBNF notation, to describe the valid boolean literal expressions defined by (is-expr? e).

Here are the productions:

        expr = bool-literal | not-expr | and-expr | or-expr .

bool-literal = "t" | "f" .

    not-expr = "(" "not" expr ")" .

    and-expr = "(" expr "and" expr ")" .

     or-expr = "(" expr "or" expr ")" .

Note that EBNF does not require the productions to appear in any particular order. But, for humans, we usually group related productions, and often put them in the order either from most general to specific.

We should point out that the syntax of this little propositional language is specifically designed to be Racket-like, and so every expression is fully-bracketed. This means we don’t need to define the precedence of operators, since the brackets always make that explicit.

Consider the not-expr production. It is recursive because it expands into an expression involving expr, which could expand into another not-expr. This allows us to show that an expression like “(not (not t))” is valid as follows:

expr = not-expr
     = "(" "not" expr ")"
     = "(" "not" not-expr ")"
     = "(" "not" "(" "not" expr ")" ")"
     = "(" "not" "(" "not" bool-literal ")" ")"
     = "(" "not" "(" "not" "t" ")" ")"

Each step expands one part of the expression, and, by making the right choice of rules, eventually we get the exact expression we want. If you want to automate this process, you could write a parser to do this.

Comparing this EBNF grammar to the implementation of is-expr?, you can see that it essentially mimics the EBNF grammar.

Example: EBNF for cond Expressions

Here’s a simple EBNF description for a Racket cond expression:

cond-expr = "(" "cond" { test-val } [else-val] ")" .

test-val  = "(" bool value ")" .

else-val  = ("else" | "#t") value .

bool      = any Racket_ expression that evaluates to #t or #f .

value     = any Racket_ expression that evaluates to a value .

Note that the brackets in the else-val production are not in “-marks. Unquoted round brackets are used for grouping in Go’s EBNF, and in else-val indicate that the choice is this:

else-val  = ("else" | "#t") value .    // right

else-val  = "else" | ( "#t" value ) .  // wrong

else-val  = "else" | "#t" value .      // ambiguous

This is not the official definition of a Racket cond. See the documentation on if-statements if you want to know the official syntax. You’ll see also that Racket uses a different notation for describing syntax, but it is conceptually very similar to the notation we’re using here.

Example: nand-only Expressions

In propositional logic, the nand operator is a standard logical operator. The expression (p nand q) is true just when either p, q, or both p and q, are false. In other words, (p nand q) is equivalent to (not (p and q)).

Here are a couple of EBNF descriptions of this nand-only language. First, we could use two productions:

expr = "t" | "f" | nand .

nand = "(" expr "nand" expr ")" .

This is simple enough that to be combined into a single rule that is still quite readable:

expr =   "t"
       | "f"
       | "(" expr "nand" expr ")" .

Both grammars are fine, and describe the same language. Note that the second one relies partly on formatting to help avoid ambiguity, i.e. each of the 3 lines in the production describe one kind of string.

Now let’s implement some helper functions based on this grammer. First, here is a function that returns #t just when expr is a valid nand-only expression:

(define (is-nand? expr)
  (match expr
    ['t #t]
    ['f #t]
    [`(,a nand ,b) (and (is-nand? a)
                        (is-nand? b))]
    [_ #f]))

And here’s a function that evaluates a nand-only expression:

;; evaluate a literal propositional logic expression using nand
;; as the only connective
(define (eval-nand expr)
  (match expr
    ['t #t]
    ['f #f]
    [`(,a nand ,b) (not (and (eval-nand a)
                             (eval-nand b)))]
    [_ (error "eval-nand: syntax error")]))

The advantage of this propositional language is that its processing functions are simpler than the equivalent functions above for expressions that allow and, or, and not.

Simplifying a Propositional Expression

Some propositional expressions can be simplified into logically equivalent expressions. For example, an expression of the form (not (not p)) is logically equivalent to p.

Lets write an expression simplifier that applies this one particular rule (double negation elimination) to expressions:

;; double negation elimination
;; (not (not expr)) <==> expr
(define (simplify expr)
  (match expr
    ['t 't]
    ['f 'f]
    [`(not (not ,a)) (simplify a)]
    [`(not ,a) (list 'not (simplify a))]
    [`(,a or ,b) (list (simplify a) 'or (simplify b))]
    [`(,a and ,b) (list (simplify a) 'and (simplify b))]
    [_ (error "simplify: syntax error")]))

> (simplify '((not (not t)) and (not (not t))))
(t and t)

The match form makes this kind of function relatively easy to write; in particular, the expression `(not (not ,a)) is a simple way of specify a double-negation pattern. Notice also that the order in which the matching is done is important: we have to check for (not (not a)) before (not a).

Rewriting an Expression with Only nand

An interesting and useful fact about propositional logic is that any expression can be re-written as a logically equivalent expression that uses only the nand operator. As mentioned above, the expression (p nand q) is logically equivalent to (not (p and q)).

Here are the rules for writing the other propositional operators in terms of nand:

  • (not p) is logically equivalent to (p nand p)
  • (p and q) is logically equivalent to ((p nand q) nand (p nand q))
  • (p or q) is logically equivalent to ((p nand p) nand (q nand q))

By repeatedly apply these rules we can transform any propositional expression into a logically equivalent one using only nand.

Here we will allow variables, like p or q, in the expressions we want to simplify.

Here’s some code that does the transformation:

(define (make-nand a b) (list a 'nand b))

;; returns a propositional logic expression into a logically equivalent one
;; that uses only nand
(define (nand-rewrite expr)
  (if (symbol? expr) expr
      (match expr
        ['t 't]
        ['f 'f]
        [`(not ,a) (let ([na (nand-rewrite a)])
                     (make-nand na na))]
        [`(,a or ,b) (let* ([na (nand-rewrite a)]
                            [nb (nand-rewrite b)]
                            [nana (make-nand na na)]
                            [nbnb (make-nand nb nb)])
                       (make-nand nana nbnb))]
        [`(,a and ,b) (let* ([na (nand-rewrite a)]
                             [nb (nand-rewrite b)]
                             [nanb (make-nand na nb)])
                        (make-nand nanb nanb))]
        [_ (error "nand-rewrite syntax error")])))

> (nand-rewrite '((p and q) or ((not p) or q)))
'((((p nand q) nand (p nand q)) nand ((p nand q) nand (p nand q)))
  nand
  ((((p nand p) nand (p nand p)) nand (q nand q)) nand (((p nand p) nand (p nand p)) nand (q nand q))))

Example: Rewriting Racket Code

After you’ve written a few Racket programs, you may have noticed that these two expressions are equivalent:

(cond (test val1)
      (else val2))

;; same as

(if test val1 val2)

The if-expression is a little simpler and easier to read, so it’s often the preferred form. But when writing code you don’t always know for sure if you have only one test, and so it’s wise to use cond everywhere. After your program is done you could go back and re-write any single-condition cond expressions as an equivalent if-expression, but that is tedious and error-prone.

Instead, lets write a Racket function that automatically replaces occurrences of a simple cond with an equivalent if. Here’s a function that does that:

;; Deeply change all occurrences of (cond (test val1) (else val2)) to
;; (if test val1 val2)
(define (rewrite-simple-cond expr)
  (if (not (list? expr)) expr
      (match expr
        [`(cond (,test ,val1)
                (else ,val2)) (map rewrite-simple-cond
                                   (list 'if test val1 val2))]
        [`(cond (,test ,val1)
                (#t ,val2)) (map rewrite-simple-cond
                                 (list 'if test val1 val2))]
        [_ (map rewrite-simple-cond expr)])))

(rewrite-simple-cond expr) first checks if expr is a list; if it’s not a list, then expr is returned unchanged. If expr is a list, then match checks to see if has the form of a simple cond; there are two cases depending on whether the else-clause uses else or #f. If expr is not a simple cond, then we use map to call rewrite-simple-cond on each value in expr.

When expr is a simple cond, it is returned as an if-statement. We call rewrite-simple-cond on val1 and val2 using a small trick: rewrite-simple-cond is called on every element of the resulting if-list. We know from how rewrite-simple-cond works that the if symbol at the start will be returned as 'if. We could have written it like this instead:

(list 'if (rewrite-simple-cond test)
          (rewrite-simple-cond val1)
          (rewrite-simple-cond val2))

However, the map version is shorter and simpler.

Example: Binary Search Trees

The following is a sample Racket implementation of some basic binary search tree functions. The tr

#lang racket

;;; bst.rkt

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; An implementation of a binary search tree (BST). Nodes are lists of the
;; form (value left-tree right-tree).
;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; '() is the empty tree
(define (is-empty? t) (null? t))

;; getters
(define (value node) (first node))
(define (left node) (second node))
(define (right node) (third node))

;; insert value x into tree t
(define (insert x t)
    (if (is-empty? t)
        (list x '() '())
        (let ([V (value t])
              [L (left t])
              [R (right t]))
          (cond
            [(is-empty? t)
             (list x '() '())]
            [(= x V) t];; x already in the tree, so do nothing
            [(< x V)
             (list V (insert x L) R)]
            [else
             (list V L (insert x R))]
            ))))

;; convert a list of things into a tree
(define (make-tree lst)
    (if (empty? lst)
        '()
        (insert (first lst) (make-tree (rest lst)))))


;; return a list of the values of tree in "inorder" order, i.e. for
;; a BST the list of values will be in ascending sorted order
(define (list-inorder t)
    (cond
      [(is-empty? t) '()]
      [else (append (list-inorder (left t))
                    (list (value t))
                    (list-inorder (right t)))]))

(define (tree-sort lst) (list-inorder (make-tree lst)))

;; Returns the depth of a tree
(define (depth t)
    (cond
      [(is-empty? t) 0]
      [else
       (+ 1 (max (depth (left t)) (depth (right t))))]))

;; return the max value in a tree
(define (tree-max t)
    (cond
      [(is-empty? t) 'error]
      [(is-empty? (right t))
       (value t)]
      [else
       (tree-max (right t))]))

;; test if x is a value in tree t
(define (contains x t)
    (cond
      [(is-empty? t) #f]
      [(= x (value t)) #t]
      [(< x (value t))
       (contains x (left t))]
      [else
       (contains x (right t))]))