Exercise 4: Formalism and Symbolic Execution

This exercise will help you practice and demonstrate your ability to apply different formal systems that are useful within the context of software engineering. You will also make use of symbolic execution, which is built on top of these formal foundations. Some portions of the exercise will require written answers that should be submitted as PDFs via CourSys. Other portions will invovle submitting code, which should also be submitted to CourSys but via the appropriate submission components for the exercise.

The template for the programming portions is available here.

Task 1 (Partial Orders)

Recall the definition of a partial order when answering these questions.

  1. Construct a partial order that is not a lattice.
  2. Construct a partial order that is not over numbers, subsets, or products thereof.
  3. Suppose you have a Lattice L that is a product lattice, as seen in class, L1 × L2 × … × Ln. What is the height of L, where height is the longest path from ⊥ to ⊤?

Task 2 (Hoare Logic)

  1. Compute the weakest preconditions for the following snippet of code:
if (x > y)
  x = x + y + 5
else
  y = y + x * 2
z = x + y

with the postcondition {z < 10}

Task 3 (Symex I)

Your colleague Jauer Bach has discovered a countdown device on a sensitive machine at your workplace. Unless you can discover a password that will let you in to disable it, it will destroy all files related to your company's top secret IP. You will use Angr and binary symbolic execution in order to reverse engineer a password and stop the device.

Before leaving on a poorly timed vacation, Jauer left you an example program for exploring all paths within a program using Angr. You can see the example here. This example loads the compiled binary in bin/simplepaths and traverses all paths using symbolic execution. It then prints out specific inputs that can drive the program along each of the paths. Note that there are more focused and powerful ways of controlling the simulation manager. Study the examples that Jauer left for you in order to better understand them. You may also find the Angr documentation helpful. The section on simulation managers is particularly relevant to the example. You may also find the provided Angr example exploring fauxware to be handy. It provides a detailed walkthrough of the APIs involved.

Conveniently, you have extracted the software controlling the device and determined that it is keyed to your SFU user ID (how odd!). You can find the extracted software in bin/ticktock of the provided template. If you pass in your user ID as the first command line argument and a secret eight character password as the second command line argument, it will stop the countdown and print out SAFE. You must use Angr to find your secret password. Note, each student has a different password depending on their user ID.

Write your script to find the password in password.py of the provided template. You must also write your resulting password in passresults.txt in the main directory for your project (se-symex-template/). Make sure your password is the only text in that file. If the text is read in by a script and passed to the program, the program should recognize it as safe. That is, when using a bash terminal, I should be able to run:

bin/ticktock <username> $(printf $(<./passresults.txt))

Note that in some cases, the password produced may result in unprintable characters. These cannot be copied and pasted conveniently from the terminal, but the above command line conveniently supports hex encoding, thus the username a1b2c3d4 could result in a password that looks like Z�\�\�^� on the terminal and has the hex encoding \x5a\x8b\x5c\x8d\x5c\x8d\x5e\x8f. The command line above allows you to use hex in passresults.txt as well as characters that do not print properly on the command line. This means that it could contain any one of the following lines in a correct submission:

Z�\�\�^�
\x5a\x8b\x5c\x8d\x5c\x8d\x5e\x8f
Z\x8b\x5c\x8d\x5c\x8d\x5e\x8f

Using Angr in CSIL or at Home

Angr has been made available to you in CSIL via a python virtual environment. In order to take advantage of it, you can activate the virtual environment on the command line via

source /usr/shared/CMPT/faculty/wsumner/base/env745/bin/activate

this will make the Angr packages available to you as you write your scripts. You can then deactivate the virtual environment when you are done via

deactivate

If you instead wish to work in Angr at home, you can simply install it via pip:

pip install angr

You may do this within a virtual environment of your own in order to ensure isolation of dependencies from other components on your system.

Task 4 (Symex II)

Oh no! You accidentally pushed a refactoring commit to master right before releasing a new version of your software, and some customers are reporting a regression! Even worse, because their workloads are proprietary, they won't tell you how to reproduce the regression and are threatening to cancel their contracts with you! Worse still, you don't presently have access to your own source code….

Use Angr to find a regression between bin/version1 and bin/version2. These programs take five strings as command line arguments, all at most eight character long. The programs then compute a handful of results based on these strings.

Write your script in regression.py of the provided template. You must also write the command line arguments that illustrate the regression in regressing.txt in the top level directory of the project (se-symex-template/). The programs should be able to run in bash as:

1
2
    bin/version1 $(< regressing.txt)
    bin/version2 $(< regressing.txt)

Hint: The regression is particularly amenable to symbolic execution. The approach you use to find it need not work for all possible regression bugs. However, exhaustive search over all inputs without symbolic execution is unlikely to find the bug. In fact, it would likely (and quickly) consume all available memory on your machine. Consider the directions that symbolic execution allow, as there are many ways to solve the prolem. SimulationManagers and constraints can oth provide reasonable directions.

Submitting

Submit a PDF containing your responses to tasks 1 and 2 questions via CourSys.

Your solutions to Angr problems should be submitted by archiving your template directory and uploading to CourSys. Create an archive of your solution by changing into the directory containing your project template and running:

tar zcvf e4.tar.gz se-symex-template/

You can then upload this archive to CourSys.