CMPT 125 Assignments

CMPT 125 Assignment 3 Solution

Question 1 – Mergesort: 8 marks

a) Using C code complete the mergesort function shown below, your function must be implemented iteratively (i.e. using loops instead of recursion). You can assume the existence of a merge function with the following header (i.e. you don't have to write the merge function, and can call it in your mergesort function). [4 marks]

void merge(int arr[], int start, int mid, int end);

// PARAM: arr = array to be sorted

// len = length of arr

// POST: arr is sorted in ascending order

void mergesort(int arr[], int len){

int n = 1; // length of sub-arrays to be merged

while(n < len){

// Set indexes of sub-arrays

int start = 0;

int mid = start + n - 1; //end of first subarray

int end = mid + n;

// Merge sub-arrays

while(start < len){

// Set index of last sub-array

if(end >= len){

end = len - 1;

}

merge(arr, start, mid, end);

// set indexes of next sub-arrays

start = end + 1;

mid = start + n - 1;

end = mid + n;

}

// Double size of sub-array

n = n * 2;

}

b) Using C code write a merge function that merges two sorted arrays, not sub-arrays of the same array, into a single sorted array and that returns a pointer to the resulting merged array. Your function should have parameters for the two input arrays and their lengths. [4 marks]

// PRE: arr[0..len1-1] and arr2[0..len2-1] are sorted

// POST: returns a pointer to an array of size len1+len2

// result array is sorted

int* merge1b(int arr1[], int len1, int arr2[], int len2)

{

// Create empty results array and set indexes

int len_result = len1 + len2;

int* result = malloc (sizeof(int) * len_result);

int arr1_index = 0;

int arr2_index = 0;

int result_index = 0;

// Merge input arrays

while(arr1_index < len1 && arr2_index < len2) {

if (arr1[arr1_index] < arr2[arr2_index]) {

result[result_index++] = arr1[arr1_index++];

} else {

result[result_index++] = arr2[arr2_index++];

}

// Copy remaining elements from unfinished array

while(arr1_index < len1){

result[result_index++] = arr1[arr1_index++];

}

while(arr2_index < len2){

result[result_index++] = arr2[arr2_index++];

}

return result;

}

Question 2 – Stack Overflow: 10 marks

The call stack is part of main memory that is reserved for function calling. Like all computer memory it is finite, so can be exhausted resulting in a stack overflow. Recursive functions allocate space on the stack for each recursive call; if there are many such recursive calls a stack overflow can result. The questions that follow ask you to investigate recursive functions and stack overflows.

Note that when running programs in the Linux terminal a stack overflow results in a segmentation fault.

a) Using C code write a recursive function that is equivalent to the iterative function shown below that prints the numbers from 1 to the largest possible integer value. Note that the constant INT_MAX is contained in the limits.h header file. [2 marks]

void overflow(){

for(int i = 1; i <= INT_MAX; i++){

printf("%d\n", i);

}

Your recursive function should have a single integer parameter, and you should initially call the function by passing 1 to its parameter.

// PRE:

// POST: Prints the values from n to INT_MAX (in theory)

void overflow(int n)

{

if(n <= INT_MAX){

printf("%d\n", n);

overflow(n+1);

}

b) Write a main function and run your solution to part (a) given above. The function will cause a stack overflow. When this has occurred take a screenshot of the terminal and include that in your submission. Your screenshot should show the segmentation fault and (at least) the last value that was printed. [2 marks]

Not shown.

c) Run each of the two functions shown below in turn, initially passing each function the value 1 to its parameter. Again, take screenshots of the terminal showing the segmentation fault and the last value that was printed. [2 marks]

void foo(int n)
{
    int* arr = malloc(100 * sizeof(int));
    assert(arr);
    for(int i=0; i < 100; i++){
        arr[i] = i + n;
    }
    printf("%d\n", arr[99]);
    foo(n+1);
}

void bar(int n)
{
    int arr[100];
    for(int i=0; i < 100; i++){
        arr[i] = i + n;
    }
    printf("%d\n", arr[99]);
    bar(n+1);
}

Not shown – foo should run considerably longer than bar.

d) Explain why, or why not, the segmentation fault occurred at different times for the two functions that you ran in part (c). [2 marks]

Because bar allocates additional space on the call stack for each recursive call for an array, whereas foo is allocating an equivalent amount of space in dynamic memory.

e) Assume that a computer system has a heap of 2GB and that the size of the call stack for an application is 1MB. Quicksort is used to sort an array that is stored on the heap (i.e. it is stored in the computer's internal memory not on a file). Assume that each recursive call to quicksort requires 256 bytes to be allocated to the call stack. Explain why the call to quicksort will not result in a stack overflow regardless of the size of the array being sorted. You should assume that the call to quicksort results in the average case O Notation running time for the algorithm. [2 marks]

The short answer is because an array big enough to cause a stack overflow is going to be so big that it won't fit in dynamic memory. Support for this is as follows.

1 Mb allows for approximately 4,000 recursive calls (1,048,576 / 256). Given that quicksort makes O(log n) recursive calls to partition the array to sub-arrays of size 1 we can solve for n where log₂(n) = 4,000 to find the largest array that can be sorted.2⁴⁰⁰⁰ » 1*10¹²⁰⁰ which is vastly in excess of the size of the heap.

Question 3 – Bob Does More Sorting: 10 marks

The law firm that employs Bob has a (menial) task for him. He is to sort 1,000 documents by their case number. Each document is 8.5 by 11 inches and the case number is an eight digit number in the top right hand corner of each document. Bob is to sort the documents in a small (10 by 12 feet) room which contains a single 60 by 30 inch table.

Bob knows that you have been studying sorting algorithms so asks you to help him figure out how to sort the documents in a reasonable amount of time.

a) Describe to Bob a process for sorting the documents. The process should be based on one of the four sorting algorithms we have studied in class (selection sort, insertion sort, merge sort and quick sort). Marks are allocated to how quickly Bob can sort the documents using your process. Your description should either be in (clear) English or pseudo-code. Remember that you are describing how to sort documents, not the contents of an array. Also note that Bob has not studied Computing Science and has not studied Maths since grade 11 (and he wasn't very good at it) and your description must be easy for Bob to follow. [6 marks]

Poor Bob. There are many possible answers. Note that log₂(1000) = 10. If you are going to relatively strictly follow the process for mergesort or quicksort this is roughly the number of piles of documents you need.

Here is a version based on insertion sort.

Put the pile of unsorted documents on the table to your right. Pick up the first document and put it on the table in front of you. This is the first document in what is going to be the pile of sorted documents. Pick up the next document from the pile of unsorted documents and put it in the right place¹ in the sorted pile. Once the sorted pile has more than ten documents in it you are going to find the right place to put the next document as I'm about to describe.

Pick up roughly half of the pile of sorted documents and put them to the left of the pile. Now compare the case number of the document at the top of the pile in front of you (the right hand half of the pile of sorted documents you were looking at) with the one you are inserting into the sorted pile. If the case number of the document to be inserted is less than the one at the top of the pile in front of you then repeat this process with the documents you put to your left. If the case number of the document you are inserting is greater than the one at the pile in front of you then repeat the process with that pile, the again split the pile in half and putting the lower numbered half to your left making sure that it goes in between its upper half and any other piles to the left. This way the piles of documents will remain in sorted order from left to right. Keep on doing this until either you find a matching case number or your pile of documents consists of a single document. In the first case put the document to be inserted on top of the pile with the matching case number. In the second case put the document to be inserted on top of the pile (actually a single document) if its number is less than the document's and on the bottom otherwise. Then put all the documents back into a single stack making sure that you put the piles together in the right order.²

Then repeat this for each document in your pile of unsorted documents until you've inserted all of the unsorted documents into the sorted pile.

1. We didn't specify what the "right place" was. That's because even though Bob hasn't done Maths since grade 11 and wasn't very good at it, we can assume he is smart enough to know the order of numbers from lowest to highest.

2. Bob is finding the right place to insert the documents using binary search. We could have done this with our insertion sort algorithm in class, but it isn't useful because you still have to shift the array elements up one position to make room for the value being inserted, which still makes this a O(n) process. This isn't an issue when you are inserting a document in a pile of other documents.

b) Estimate the amount of time that Bob will take to sort the documents. State your assumptions. [4 marks]

Again, there are many possible answers but here is one solution to my answer to (a).

Assumed time for each step of the process

1. It takes 3 seconds to split a pile of documents into two piles

2. It takes 2 seconds to compare one case number to another one and figure out which is bigger

3. It takes 1 second to put the inserted document above or below a single document

4. It takes 5 seconds to gather up the stacks of documents into a single stack

Every time a document is inserted into the sorted pile you have to perform steps 1, 2 and 3 log(n) times and step 4 once. There are 1,000 documents. An approximate value for the time is therefore 1000 * (log(1000) * (3+2+1) + 5) » 65,000 hours or about 18 hours!

CMPT 125 Home

John Edgar (johnwill@sfu.ca)