Logic Programming with core.logic
=================================

Logic programming is an interesting approach to programming that was pioneered
by Prolog_ in the 1970s and 1980s. While it never became as popular as its
supporters hoped it would, the ideas behind it are very interesting.


Constraint Problems
-------------------

Logic programming is an approach to programming that is often useful for
solving **constraint problems** where searching is required. For instance,
consider this constraint problem:

	Find all integer values of x and y such that::

	    x + y = 10

	x and y are constrained as follows::

	    0 <= x <= 7
	    2 <= y <= 11

One solution to this problem is x = 4 and y = 6, but there are others. 

You could find all solutions to this problem in Go_ like this::

	package main

	import "fmt"

	func main() {
		for x := 0; x <= 7; x++ {
			for y := 2; y <= 11; y++ {
				if x+y == 10 {
					fmt.Printf("x=%v y=%v\n", x, y)
				}
			}
		}
	}

This is a **brute force** solution. The two loops generate all pairs of ``x``
and ``y`` values that satisfy the variable constraints, and then tests if they
sum to 10. Since there are 8 possible ``x`` values and 10 possible ``y``
values, there are exactly :math:`8 \cdot 10 = 80` (x, y) pairs to test.

The great thing about brute force is that it works with any constraint problem
like this. Of course, the bad thing is that search spaces for even small
problems can be gigantic, thus making brute force it glacially slow.

Here's another constraint problem:

	Find all integer values of x and y such that::

	    x*x - 2*y*y = 1

	x and y are constrained as follows::

	    0 <= x <= 10000
	    0 <= y <= 10000

This is an instance of the `Pell equation
<https://en.wikipedia.org/wiki/Pell%27s_equation>`_.

This Go_ program prints all the answers::

	package main

	import "fmt"

	func main() {
		for x := 0; x <= 10000; x++ {
			for y := 0; y <= 10000; y++ {
				if x*x-2*y*y == 1 {
					fmt.Printf("x=%v y=%v\n", x, y)
				}
			}
		}
	}

Here they are::

	x=1 y=0
	x=3 y=2
	x=17 y=12
	x=99 y=70
	x=577 y=408
	x=3363 y=2378

Since ``x`` and ``y`` both range from 0 to 10000, exactly :math:`10001 \cdot
10001 = 100020001` (x, y) pairs are tested. That's over 100 million pairs of
numbers. On a fast modern computer there's not much delay in generate the
answer, but as the ranges for ``x`` and ``y`` increase, the program gets
slower and slower. You need to come up with more clever methods to reduce the
space of possible answers.

Here's a famous constraint problem with more variables:

	Assign a digit to each of the following letters that makes this equation
	true::

	   SEND + MORE = MONEY

	The letters are constrained as follows:

	- each letter stands for a digit from 0 to 9

	- different letters are different digits

	- a number can't start with 0 (so S and M can't be 0)

Hold onto your mouse --- here's a brute-force Go_ solution::

	package main

	import "fmt"

	func main() {
		method1()
	} // main

	func method1() {
		for S := 0; S <= 9; S++ {
			for E := 0; E <= 9; E++ {
				for N := 0; N <= 9; N++ {
					for D := 0; D <= 9; D++ {
						for M := 0; M <= 9; M++ {
							for O := 0; O <= 9; O++ {
								for R := 0; R <= 9; R++ {
									for Y := 0; Y <= 9; Y++ {
										if allDiff([]int{S, E, N, D, M, O, R, Y}) && S != 0 && M != 0 {
											SEND := 1000*S + 100*E + 10*N + D
											MORE := 1000*M + 100*O + 10*R + E
											MONEY := 10000*M + 1000*O + 100*N + 10*E + Y
											if SEND+MORE == MONEY {
												fmt.Printf("   %v\n + %v\n-------\n  %v\n\n", SEND, MORE, MONEY)
											}
										}
									} // Y
								} // R
							} // O
						} // M
					} // D
				} // N
			} // E
		} // S
	} // method1

	func allDiff(nums []int) bool {
		for i := 0; i < len(nums); i++ {
			for j := i + 1; j < len(nums); j++ {
				if nums[i] == nums[j] {
					return false
				}
			}
		}
		return true
	}

As before, the idea is to use nested loops, one loop per variable, to generate
all possible assignments of the variables ``S``, ``E``, ``N``, ``D``, ``M``,
``O``, ``R``, and ``Y``. Since there are 8 variables, there are
:math:`10^{8}`, i.e. a hundred million, possible assignments. It finds the one
and only solution::

	   9567
	 + 1085
	-------
	  10652

How fast is this program? On my computer::

	$ time go run sendMoreMoney.go
	   9567
	 + 1085
	-------
	  10652

	5.50user 0.06system 0:05.37elapsed 103%CPU (0avgtext+0avgdata 26300maxresident)k
	0inputs+2816outputs (0major+10863minor)pagefaults 0swaps

So this takes about 5.5 seconds to compile and run, which isn't bad.

But we can do a little better. An obvious improvement is to never assign 0 to
``S`` or ``M`` (because the problem says a number can't start with 0). So we
can change the ``S`` and ``M`` loops to start at 1 instead of 0::

	for S := 1; S <= 9; S++ {
		// ...
		for M := 1; M <= 9; M++ {
		// ...

The ``S == 0`` and ``M == 0`` tests in the ``allDiff`` if-statement are no
longer needed, giving this code::

	func method2() {
		for S := 1; S <= 9; S++ { // 0 changed to 1
			for E := 0; E <= 9; E++ {
				for N := 0; N <= 9; N++ {
					for D := 0; D <= 9; D++ {
						for M := 1; M <= 9; M++ { // 0 changed to 1
							for O := 0; O <= 9; O++ {
								for R := 0; R <= 9; R++ {
									for Y := 0; Y <= 9; Y++ {
										if allDiff([]int{S, E, N, D, M, O, R, Y}) {
											SEND := 1000*S + 100*E + 10*N + D
											MORE := 1000*M + 100*O + 10*R + E
											MONEY := 10000*M + 1000*O + 100*N + 10*E + Y
											if SEND+MORE == MONEY {
												fmt.Printf("   %v\n + %v\n-------\n  %v\n\n", SEND, MORE, MONEY)
											}
										}
									} // Y
								} // R
							} // O
						} // M
					} // D
				} // N
			} // E
		} // S
	} // method2

This runs a little faster than ``method1``::

	$ time go run sendMoreMoney.go
	   9567
	 + 1085
	-------
	  10652

	4.55user 0.08system 0:04.46elapsed 103%CPU (0avgtext+0avgdata 26176maxresident)k
	8inputs+2824outputs (0major+12215minor)pagefaults 0swaps

With some thought, you can infer that the only possible value for ``M`` is 1.
That's because ``MONEY`` has five digits, but ``SEND`` and ``MORE`` only have
four, and so the carry to get the ``M`` in ``MONEY`` can only be 0 or
1. Since 0 is disallowed by the problem statement, ``M`` must be 1.

Continuing reasoning in this fashion is how you could solve the problem by
hand. It's a fun puzzle that you should try. As you figure out more numbers,
you can cut down on the possible values for other variables, eventually 
zero-ing in on the solution.

Inferring values like this is a good approach, but its hard to implement. So
lets think if there are relatively simple changes we could make to the program
that will speed it up.

One idea is to spread out the testing for different variables. For example,
consider the first two loops::

	for S := 1; S <= 9; S++ { // 0 changed to 1
		for E := 0; E <= 9; E++ {
		// ...

Suppose ``S`` is 1 and ``E`` is 1. Then we know immediately that these can't
be in the solution, because two letters can't have the same value. So there's
no need to calculate any further values. Instead, we can immediately jump to
the next value of ``E`` (which is 2), and try that. This idea can applied to
all the loops: after a variable is assigned, check to make sure it is not
equal to any of the previously assigned variables. If it is, then immediately
jump to the next value.

Here's an implementation of that idea::

	func method3() {
		for S := 1; S <= 9; S++ {
			for E := 0; E <= 9; E++ {
				if E == S {
					continue
				}
				for N := 0; N <= 9; N++ {
					if N == S || N == E {
						continue
					}
					for D := 0; D <= 9; D++ {
						if D == S || D == E || D == N {
							continue
						}
						for M := 1; M <= 9; M++ {
							if M == S || M == E || M == N || M == D {
								continue
							}
							for O := 0; O <= 9; O++ {
								if O == S || O == E || O == N || O == D || O == M {
									continue
								}
								for R := 0; R <= 9; R++ {
									if R == S || R == E || R == N || R == D || R == M || R == O {
										continue
									}
									for Y := 0; Y <= 9; Y++ {
										if Y == S || Y == E || Y == D || Y == M || Y == O || Y == R || Y == N {
											continue
										}
										SEND := 1000*S + 100*E + 10*N + D
										MORE := 1000*M + 100*O + 10*R + E
										MONEY := 10000*M + 1000*O + 100*N + 10*E + Y
										if SEND+MORE == MONEY {
											fmt.Printf("   %v\n + %v\n-------\n  %v\n\n", SEND, MORE, MONEY)
										}
									} // Y
								} // R
							} // O
						} // M
					} // D
				} // N
			} // E
		} // S
	} // method3

The results in a significant performance increase::

	$ time ./sendMoreMoney
	   9567
	 + 1085
	-------
	  10652

	0.61user 0.06system 0:00.49elapsed 138%CPU (0avgtext+0avgdata 26032maxresident)k
	0inputs+2824outputs (0major+12074minor)pagefaults 0swaps

It takes 0.61 seconds, which is only a couple tenths of a second, or so, more
than the the time needed to compile the program!

Despite it's good performance, this program is big and ugly. It's also tedious
to write, and hard to change. Is there a better way?


Logic Programming
-----------------

`Logic programming <https://en.wikipedia.org/wiki/Logic_programming>`_ is a
style of programming based on formal logic. Prolog_ is the most famous example
of a logic programming language, and while it can be used for general-purpose
programming, it is probably most useful for a handful of applications, such as
grammar processing (e.g. creating parse trees), and solving constraint
problems. Prolog_ programs also have a reputation for begin very compact,
which is can be good and bad depending upon the situation. For the kinds of
constraint problems introduced above, Prolog_ programs that solve them tend to
be short and readable than the what we had to write for Go_.

Here's a sample Prolog_ program (copied from
https://en.wikipedia.org/wiki/Prolog)::

	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
	%
	% Facts.
	%
	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

	mother_child(trude, sally).   % trude is the mother of sally
	 
	father_child(tom, sally).     % tom is the father of sally
	father_child(tom, erica).     % tom is the father of erica
	father_child(mike, tom).      % mike is the father of tom

	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
	%
	% Rules.
	%
	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

	% X is a sibling of Y if they share a common parent Z
	sibling(X, Y)      :- parent_child(Z, X), parent_child(Z, Y).
	 
	% X is the parent of Y if either X is the father of Y, 
	% or X is the mother of Y
	parent_child(X, Y) :- father_child(X, Y).
	parent_child(X, Y) :- mother_child(X, Y).

This program is divided into two basic parts: a small database of facts, and a
number of rules that can be applied to those facts. You can see that ``trude``
is the mother of ``sally``, and ``mike`` is the father of ``tom``.

Prolog_ rules, in their pure form, are `horn clauses
<https://en.wikipedia.org/wiki/Horn_clause>`_ with this general form::

	H :- B_1, B_2, ..., B_n

You can read this as saying "to prove ``H``, prove ``B_1``, and prove ``B_2``,
... and prove ``B_n``". The program's sibling rule has this form::

	sibling(X, Y) :- parent_child(Z, X), parent_child(Z, Y).

This says to prove ``sibling(X, Y)``, you must prove both ``parent_child(Z,
X)`` and ``parent_child(Z, Y)``.

With these facts and rules in place, you can now write *queries* at the
interpreter like this::

	?- sibling(sally, erica).
	 Yes

Given these facts and rules, Prolog_ can prove that ``sally`` is a sibling of
``erica``. Using an elegant combination of backtracking and pattern matching,
Prolog_ is able to automatically infer this fact. Thus, Prolog_ is a popular
choice for mimicking symbolic reasoning.

One way to think about Prolog_ is as a database system that contains both
facts and rules, and when you query this database Prolog_ can apply the rules
to prove new true facts.

The power of this approach can be surprising. For example, consider this
simple Prolog_ program consisting of just these two facts::

	bit(0). 
	bit(1).

This says that 0 and 1 are bits. Now we can generate, say, all 3-bit bit
strings with this query::

	?- bit(A), bit(B), bit(C), Bits=[A, B, C].
	A = B, B = C, C = 0,
	Bits = [0, 0, 0] ;
	A = B, B = 0,
	C = 1,
	Bits = [0, 0, 1] ;
	A = C, C = 0,
	B = 1,
	Bits = [0, 1, 0] ;
	A = 0,
	B = C, C = 1,
	Bits = [0, 1, 1] ;
	A = 1,
	B = C, C = 0,
	Bits = [1, 0, 0] ;
	A = C, C = 1,
	B = 0,
	Bits = [1, 0, 1] ;
	A = B, B = 1,
	C = 0,
	Bits = [1, 1, 0] ;
	A = B, B = C, C = 1,
	Bits = [1, 1, 1].

This query finds all values of ``A``, ``B``, and ``C`` that are bits, which is
the same as asking for all 8 bit strings of length 3.

Despite its charms, Prolog_ is no longer a very popular language. While its
program can be compact, *writing* those programs can be quite tricky compared
to more mainstream languages. The syntax is quite unusual, and you must learn
a number of tricks specific to Prolog_ to solve many basic programming
problems that are trivial in procedural, or even functional, languages. Plus,
Prolog_ is usually not as fast at solving any particular constraint problem as
compared to a customized Java_ or C++ program doing the same thing.


Logic Programming in Clojure
----------------------------

So while Prolog_ never really caught on in the mainstream, the ideas behind it
are still useful and intriguing, and have been copied in different forms in
other languages. For example, Clojure_ has a library called `core.logic
<https://github.com/clojure/core.logic>`_ that implements a number of logic
programming ideas within Clojure_.

Lets see how to solve a few constraint problems in `core.logic
<https://github.com/clojure/core.logic>`_. 

First:

	Find all integer values of x and y such that::

	    x + y = 10

	x and y are constrained as follows::

	    0 <= x <= 7
	    2 <= y <= 11

This `core.logic <https://github.com/clojure/core.logic>`_ will solve it::

	(require '[clojure.core.logic :as logic]
	         '[clojure.core.logic.fd :as fd])

	(def csp1
	    (logic/run* [q]
	        (logic/fresh [x y]
	            (fd/in x (fd/interval 0 7))
	            (fd/in y (fd/interval 2 11))
	            (fd/eq
	                (= (+ x y) 10)
	            )
	            (logic/== q [x y])))
	)

	=> csp1
	([0 10] [1 9] [2 8] [3 7] [4 6] [5 5] [6 4] [7 3])


Second::

	Find all integer values of x and y such that::

	    x*x - 2*y*y = 1

	x and y are constrained as follows::

	    0 <= x <= 10000
	    0 <= y <= 10000

Here is a `core.logic <https://github.com/clojure/core.logic>`_ solution that
doesn't work::

	(def csp2
	    (logic/run* [q]
	        (logic/fresh [x y]
	            (fd/in x y (fd/interval 0 10000))  ;; 10000 is too big!
	            (fd/eq
	                (= 1 (- (* x x) (* 2 y y)))
	            )
	            (logic/== q [x y])))
	)

When you evaluate ``csp2``, you get this::

	; => csp2   ;; when x, y in interval 0 1000
	; StackOverflowError   clojure.lang.Symbol.hashCode (Symbol.java:84)

Presumably the implementation of `core.logic
<https://github.com/clojure/core.logic>`_ makes recursive calls, and a
``StackOverflowError`` implies that too many function calls have been made. So
for this solution to work, you must decrease the range of ``x`` and ``y``,
e.g.::

	(def csp2
	    (logic/run* [q]
	        (logic/fresh [x y]
	            (fd/in x y (fd/interval 0 600))  ;; 600 is ok (on my computer)
	            (fd/eq
	                (= 1 (- (* x x) (* 2 y y)))
	            )
	            (logic/== q [x y])))
	)

	=> csp2
	([1 0] [3 2] [17 12] [99 70] [577 408])

It takes a few seconds on my computer to find this solution.

And finally:

	Assign a digit to each of the following letters that makes this equation
	true::

	   SEND + MORE = MONEY

	The letters are constrained as follows:

	- each letter stands for a digit from 0 to 9

	- different letters are different digits

	- a number can't start with 0 (so S and M can't be 0)

Here is a `core.logic <https://github.com/clojure/core.logic>`_ solution that
works::

	(def csp3
	    (logic/run* [q]
	        (logic/fresh [S M E N D O R Y]
	            (fd/in S M (fd/interval 1 9))
	            (fd/in E N D O R Y (fd/interval 0 9))
	            (fd/distinct [S M E N D O R Y])
	            (fd/eq
	                (= (+ (* 10000 M) (* 1000 O) (* 100 N) (* 10 E) Y)
	                   (+ (+ (* 1000 S) (* 100 E) (* 10 N) D)
	                      (+ (* 1000 M) (* 100 O) (* 10 R) E)
	                   ) 
	                )
	            )
	            (logic/== q [[S E N D] [M O R E] [M O N E Y]])))
	)

	=> csp3
	([[9 5 6 7] [1 0 8 5] [1 0 6 5 2]])

This solution is calculated in the blink of an eye.

In practice, systems like `core.logic
<https://github.com/clojure/core.logic>`_ are quite flexible and easy to use
(once you get used to the syntax!), but tend to perform more slowly than
custom-made solvers (like the ones we wrote above in Go_). Plus, different
ways of writing the same constraints can result in wildly different run-times,
and so it can be hard to predict ahead of time how fast --- or slow --- a
`core.logic <https://github.com/clojure/core.logic>`_ might run.