Assignment 5: Spell Checking¶
Your task in this assignment is to implement a spell checker based on an open address hash table using quadratic hashing to resolve collisions.
Implement this as a command-line program where the user can type in various commands and get immediate results. Here’s a sample run of how it might work when it’s done:
$ ./a5
Welcome to SpellCheck Master!
> check apple
ok
> check bapple
don't know "bapple"
> suggest bapple
Suggested words:
apple
dapple
> add bapple
"bapple" added to the dictionary
> check bapple
ok
> remove bapple
"bapple" removed
> checkfile story.txt
// a list of all mis-spelled words, and the line they appear on
// are printed
Please call your spell checker some cool name other than “SpellCheck Master” — that name’s taken.
Implementation Constraints¶
Please follow these rules for your implementation:
The only files you can
#include
are the ones lists ina5.cpp
below.Your program should assume that all words are stored the two text files are in the same folder as the program:
maindict.txt
is the main dictionary of words.You should assume that this file contains tens of thousands of words, one word per line. The words are not in any particular order.
addedwords.txt
are words added by the user with theadd
command. Since they are stored in a file, user-added words will persist between runs of your program.
Store your words in a (lovingly) hand-crafted hash table with the following properties:
- it’s implemented as a class that hides its implementation details
- the underlying table is an array
- it uses open addressing
- collisions are resolved by quadratic hashing
- adding a string, finding a string, and removing a string are all fast O(1) operations on average; the performance of all the commands should feel instantaneous to the user
- automatically re-sizes itself (e.g. doubles the capacity of the unerlying array) when the load factor is above 0.5; this requires re-hashing all the values, and so it’s okay if operations for when this happens take much more time than when re-hashing is not done
Commands for Your Program¶
For all commands, you can, if you wish, make a few simplifying assumptions:
- strings entered by the user won’t have spaces, or be empty
- case matters, i.e.
apple
andApple
can be treated as different strings - you don’t need to do any special handling for punctuation or other non-letter characters
- the case of the command itself matters, i.e.
check
andCheck
are not the same
Here are the commands your program should implement:
check s: Prints “ok” if string
s
is a word in the dictionary, anddon't know "s"
otherwise.This checking should be instantaneous from the user’s point of view.
add s: Adds string
s
toaddedwords.txt
. Ifs
is alreadymaindict.txt
oraddedwords.txt
, then this does nothing (so there should not be multiple copies of words in either file).Note that you don’t necessarily need to add
s
toaddedwords.txt
immediately. You could, for instance, wait until the program ends to add all user-added words in one big batch.Since user-added words are saved in a file, the next time the program runs they will be treated as correct spelled words.
remove s: Removes
s
fromaddedwords.txt
, and print a message like “s removed”. Ifs
is inmaindict.txt
, then print a message like “can’t remove s: it’s in the main dictionary”. Ifs
is in neithermaindict.txt
noraddedwords.txt
, then print a message like “can’t remove s: unknown word”.checkfile s: Opens a text file named
s
, and finds all misspelled words (i.e words in neithermaindict.txt
noraddedwords.txt
) and lists them in the order they are found. Each misspelled word should be put on its own line and include the line number in the file where it was found.If
s
is not the name of a valid file, then print a message like “can’t find file s”.suggest s: Gives a list of correctly spelled words similar to
s
. Usuallys
is misspelled, but it’s certainly possible thats
is a known word. For known words, you should still give a list of suggestions.The details of exactly what counts as a similar word, and how you can find similar words, are up to you. Do a bit of research on the web to find out how simple spelling suggestions can be made.
For some words, there might only a few (or even 0) suggestions, and it other cases there may be a lot of suggestions. You probably don’t want to show the user too many possible suggests.
quit or stop or end or done: Ends the program.
If the user types an unknown command, then print a helpful error message like “unknown command”.
You are welcome to add other commands. Just make sure the commands above are implemented as described, since they will be the ones tested by the marker.
What to Submit¶
Submit just a file called a5.cpp
that contains all your code for this
assignment. It must compilable with the course makefile, and when ./a5
is
run at the command line it should read maindict.txt
and addedwords.txt
and then give the user a prompt to enter commands.
See Canvas for the exact due date and marking scheme.
Use
valgrind
to help ensure your program has no memory leaks or other memory errors, e.g.:$ make -B a5 $ valgrind ./a5 // ... lots of output ...
A program is considered to have no memory leaks or problems if:
- In the
LEAK SUMMARY
,definitely lost
,indirectly lost
, andpossibly lost
must all be 0. - The
ERROR SUMMARY
must report 0 errors. - It is usually okay if
still reachable
reports a non-zero number of bytes.
- In the
If you are using dynamic memory, use
new
anddelete
to allocate and de-allocate memory. Do not usemalloc
andfree
.Do not use C++ smart pointers, such as
unique_ptr
orshared_ptr
. Stick to raw pointers, and usenew
anddelete
(ordelete[]
) and constructors/destructors (plusvalgrind
) to manage memory.You must include this statement of originality comment at the start of
a5.cpp
(see below). If youra5.cpp
does not have this, then we will assume your work is not original and it will not be marked.
If your program meets all these basic requirements, then it will graded using the marking scheme on Canvas.
a5.cpp¶
// a5.cpp
/////////////////////////////////////////////////////////////////////////
//
// Student Info
// ------------
//
// Name : <put your full name here!>
// St.# : <put your full SFU student number here>
// Email: <put your SFU email address here>
//
//
// Statement of Originality
// ------------------------
//
// All the code and comments below are my own original work. For any non-
// original work, I have provided citations in the comments with enough
// detail so that someone can see the exact source and extent of the
// borrowed work.
//
// In addition, I have not shared this work with anyone else, and I have
// not seen solutions from other students, tutors, websites, books,
// etc.
//
/////////////////////////////////////////////////////////////////////////
//
// You are only allowed to include these files --- don't include anything else!
// If you think it's necessary to include another file, please speak to the
// instructor.
//
#include "cmpt_error.h"
#include <iostream>
#include <string>
#include <fstream> // for reading/writing text files
#include <sstream> // for using string streams
// ...