CMPT 125 Assignment 3 Solution
a) Using C code complete the mergesort
function shown below, your function must be implemented iteratively (i.e. using
loops instead of recursion). You can assume the existence of a merge function
with the following header (i.e. you don't have to write the merge function, and can call it in your mergesort function). [4 marks]
void
merge(int arr[], int start, int mid, int end);
// PARAM: arr = array to be sorted
// len = length
of arr
// POST: arr is sorted in ascending order
void mergesort(int arr[],
int len){
int
n = 1; // length of sub-arrays to be merged
while(n < len){
// Set indexes of sub-arrays
int
start = 0;
int
mid = start + n - 1; //end of first subarray
int
end = mid + n;
// Merge sub-arrays
while(start
< len){
// Set index of last sub-array
if(end
>= len){
end = len - 1;
}
merge(arr, start, mid, end);
// set indexes of next sub-arrays
start =
end + 1;
mid =
start + n - 1;
end = mid
+ n;
}
// Double size of sub-array
n = n * 2;
}
}
b) Using C code write a merge function that merges two
sorted arrays, not sub-arrays of the
same array, into a single sorted array and that returns a pointer to the resulting
merged array. Your function should have parameters for the two input arrays and
their lengths. [4 marks]
// PRE: arr[0..len1-1]
and arr2[0..len2-1] are sorted
// POST:
returns a pointer to an array of size len1+len2
// result array is sorted
int* merge1b(int arr1[], int len1, int arr2[], int len2)
{
// Create empty
results array and set indexes
int
len_result = len1 + len2;
int*
result = malloc (sizeof(int) * len_result);
int
arr1_index = 0;
int
arr2_index = 0;
int result_index
= 0;
// Merge input arrays
while(arr1_index
< len1 && arr2_index < len2) {
if
(arr1[arr1_index] < arr2[arr2_index]) {
result[result_index++] = arr1[arr1_index++];
} else {
result[result_index++] = arr2[arr2_index++];
}
}
// Copy remaining elements from unfinished
array
while(arr1_index
< len1){
result[result_index++] = arr1[arr1_index++];
}
while(arr2_index
< len2){
result[result_index++] = arr2[arr2_index++];
}
return result;
}
The call stack is part of main
memory that is reserved for function calling. Like all computer memory it is
finite, so can be exhausted resulting in a stack
overflow. Recursive functions allocate space on the stack for each
recursive call; if there are many such recursive calls a stack overflow can
result. The questions that follow ask you to investigate recursive functions
and stack overflows.
Note that when running programs
in the Linux terminal a stack overflow results in a segmentation fault.
a) Using C
code write a recursive function that is equivalent to the iterative function
shown below that prints the numbers from 1 to the largest possible integer
value. Note that the constant INT_MAX is contained in the limits.h header file. [2 marks]
void
overflow(){
for(int i = 1; i
<= INT_MAX; i++){
printf("%d\n", i);
}
}
Your recursive function should have
a single integer parameter, and you should initially call the function by
passing 1 to its parameter.
// PRE:
// POST:
Prints the values from n to INT_MAX (in theory)
void
overflow(int n)
{
if(n <=
INT_MAX){
printf("%d\n", n);
overflow(n+1);
}
}
b) Write a
main function and run your solution to part (a) given above. The function will
cause a stack overflow. When this has occurred take a screenshot of the terminal
and include that in your submission. Your screenshot should show the
segmentation fault and (at least) the last value that was printed. [2 marks]
Not shown.
c) Run
each of the two functions shown below in turn, initially passing each function
the value 1 to its parameter. Again, take screenshots of the terminal showing
the segmentation fault and the last value that was printed. [2 marks]
void
foo(int n)
{
int* arr = malloc(100 * sizeof(int));
assert(arr);
for(int i=0; i < 100; i++){
arr[i] =
i + n;
}
printf("%d\n", arr[99]);
foo(n+1);
}
void
bar(int n)
{
int arr[100];
for(int i=0; i < 100; i++){
arr[i] =
i + n;
}
printf("%d\n", arr[99]);
bar(n+1);
}
Not shown – foo should run considerably longer than bar.
d) Explain
why, or why not, the segmentation fault occurred at different times for the two
functions that you ran in part (c). [2 marks]
Because bar allocates
additional space on the call stack for each recursive call for an array,
whereas foo is allocating an
equivalent amount of space in dynamic memory.
e) Assume
that a computer system has a heap of 2GB and that the size of the call stack for
an application is 1MB. Quicksort is used to sort an array that is stored on the
heap (i.e. it is stored in the computer's internal memory not on a file).
Assume that each recursive call to quicksort requires 256 bytes to be allocated
to the call stack. Explain why the call to quicksort will not result in a stack
overflow regardless of the size of the array being sorted. You should assume
that the call to quicksort results in the average case O Notation running time
for the algorithm. [2 marks]
The short answer is because an array big enough to
cause a stack overflow is going to be so big that it won't fit in dynamic
memory. Support for this is as follows.
1 Mb allows for approximately 4,000 recursive calls
(1,048,576 / 256). Given that quicksort makes O(log n) recursive calls to partition the array to sub-arrays of size 1
we can solve for n where log2(n) = 4,000 to find the largest array
that can be sorted.24000 » 1*101200
which is vastly in excess of the size of the heap.
The law firm that employs Bob
has a (menial) task for him. He is to sort 1,000 documents by their case
number. Each document is 8.5 by 11 inches and the case number is an eight digit
number in the top right hand corner of each document. Bob is to sort the
documents in a small (10 by 12 feet) room which contains a single 60 by 30 inch
table.
Bob knows that you have been
studying sorting algorithms so asks you to help him figure out how to sort the
documents in a reasonable amount of time.
a) Describe
to Bob a process for sorting the documents. The process should be based on one
of the four sorting algorithms we have studied in class (selection sort,
insertion sort, merge sort and quick sort). Marks are allocated to how quickly
Bob can sort the documents using your process. Your description should either
be in (clear) English or pseudo-code. Remember that you are describing how to
sort documents, not the contents of an array. Also note that Bob has not
studied Computing Science and has not studied Maths since grade 11 (and he
wasn't very good at it) and your description must be easy for Bob to follow. [6
marks]
Poor Bob. There are many possible answers. Note that log2(1000) = 10. If you are going to relatively
strictly follow the process for mergesort or
quicksort this is roughly the number of piles of documents you need.
Here is a version based on insertion sort.
Put the pile of unsorted documents on the table to your
right. Pick up the first document and put it on the table in front of you. This
is the first document in what is going to be the pile of sorted documents. Pick
up the next document from the pile of unsorted documents and put it in the
right place1 in the sorted pile.
Once the sorted pile has more than ten documents in it you are going to
find the right place to put the next document as I'm about to describe.
Pick up roughly half of the pile of sorted documents
and put them to the left of the pile. Now compare the case number of the
document at the top of the pile in front of you (the right hand half of the
pile of sorted documents you were looking at) with the one you are inserting
into the sorted pile. If the case number of the document to be inserted is less
than the one at the top of the pile in front of you then repeat this process
with the documents you put to your left. If the case number of the document you
are inserting is greater than the one at the pile in front of you then repeat
the process with that pile, the again split the pile in half and putting the
lower numbered half to your left making sure that it goes in between its upper
half and any other piles to the left. This way the piles of documents will
remain in sorted order from left to right. Keep on doing this until either you
find a matching case number or your pile of documents consists of a single
document. In the first case put the document to be inserted on top of the pile
with the matching case number. In the second case put the document to be
inserted on top of the pile (actually a single document) if its number is less
than the document's and on the bottom otherwise. Then put all the documents
back into a single stack making sure that you put the piles together in the
right order.2
Then repeat this for each document in your pile of
unsorted documents until you've inserted all of the unsorted documents into the
sorted pile.
1. We didn't specify what the "right place"
was. That's because even though Bob hasn't done Maths since grade 11 and wasn't
very good at it, we can assume he is smart enough to know the order of numbers
from lowest to highest.
2. Bob is finding the right place to insert the documents
using binary search. We could have done this with our insertion sort algorithm
in class, but it isn't useful because you still have to shift the array
elements up one position to make room for the value being inserted, which still
makes this a O(n) process. This isn't an issue when
you are inserting a document in a pile of other documents.
b) Estimate
the amount of time that Bob will take to sort the documents. State your
assumptions. [4 marks]
Again, there are many possible answers but here is one
solution to my answer to (a).
Assumed time for each step of the process
1. It takes 3 seconds to split a pile of documents into
two piles
2. It takes 2 seconds to compare one case number to
another one and figure out which is bigger
3. It takes 1 second to put the inserted document above
or below a single document
4. It takes 5 seconds to gather up the stacks of
documents into a single stack
Every time a document is inserted into the sorted pile
you have to perform steps 1, 2 and 3 log(n) times and
step 4 once. There are 1,000 documents. An approximate value for the time is
therefore 1000 * (log(1000) * (3+2+1) + 5) » 65,000 hours or about 18 hours!
John Edgar (johnwill@sfu.ca)