Exercise 3: Basic C++ and Design

This exercise will help you practice and demonstrate your ability to perform a few introductory C++ programming and design tasks. You will implement several simple behaviors that should not take more than a few minutes of work. However, there are again rules. In order to receive any credit for the assignment,

you may not use loops or recursion unless explicitly stated
you may not use std::for_each
You may not use any data structures other than std::vector, std::optional, and std::pair
you may not use new or delete
you may not write destructors for any classes you write unless explicitly told that you may
you may not use by-reference lambda captures or mutable lambdas
all code must compile and run without memory errors or leaks

Additional rules may be added on a per task basis. The rules exist in part because the tasks are indeed simple and in part because they will constrain you toward better basic design knowledge and skills.

Most of this exercise is again about knowing basic programming skills. The programming tasks are at a freshman level, but this exercise will force you to be more competent in your use of C++. If you have previously taken a course that uses C++, you should find it straightforward. As a reminder, a good reference for modern approaches and wisdom is the C++ Core Guidelines repository. For many of the built in algorithms, you will be interested in the <algorithm> and <numeric> headers. There is extensive documentation available online that breaks them down based on the type of operation being performed. While you cannot use it in this exercise, the Google Abseil library is also an excellent resource for algorithms, data structures, and knowledge.

A template for the project is available here. You should not add new files to the template or they will not be graded. All of your solutions should modify code in the lib/solutions directory. This helps automated checking for adhering to the rules. The template requires at least C++17, but you can use C++23 if you wish by running cmake with

cmake -DCMAKE_CXX_STANDARD=23 <<path to project>>

Some parts are easier using C++20 or C++23.

Note as well that normal plagiarism detection will be applied.
Do not share your code with or show your code to other students.
Keep in mind that this applies even after the semester ends.
It is better to lose points for not knowing something than to cheat.
University policy applies even after graduation.

NOTE: While this is Exercise 3, use the namespace ex4 as mentioned in the instructions.

Part 1 – A fistful of functions

This group of tasks is mainly focused on writing a few simple functions that demonstrate basic competence, but they also ensure that you are aware of some core APIs. You should find it trivial.

Task 1 – Pulling someone else's strings

We discussed usage of strings a little in exercise 0. You will often want to work with APIs that operate on strings. Note that many of these should be operating on domain objects instead of strings (avoid "stringly typed APIs" [6] [7] [8]), but nonetheless strings are common form of data to operate on as well. In spite of this, working with strings can be tricky. Should you take a C string as an argument? Should you take a C++ string as an argument? Are there other forms of strings that should be handled? What about arrays of characters? What about nullptr or an empty string?

C++ uses the std::string_view class to handle these concerns, as we have already discussed and demonstrated in class. It provides a generic interface for strings as described here. You can read a thorough discussion on the Google Abseil blog as well. You should use string_views in this task.

Define a function task01 in task01.cpp that takes a single argument that can be (1) a C style string, (2) a C++ std::string, or (3) even a raw array of characters. Declare the function with a prototype in task01.h.

The function should take the passed in string data and split it into substrings for each span of either (1) contiguous alphanumeric characters or (2) contiguous punctuation characters. For instance, the string data

"This is... maybe... a 1str1ng1."

would be broken into the substrings: "This", "is", "...", "maybe", "...", "a", "1string1", ".".

However, the function should not own the data passed in or the substrings returned going out. It should only perform one allocation. It should also work regardless of how large the passed in string is (as long as there is enough RAM). Note that if the function were to create a std::string for each of these substrings, it would potentially allocate for each one of them.

You may use one loop for this task. As we'll discuss later, if you use C++ ranges, you don't even need that! Using functions of the STL and string_views can solve most of the problem for you without any deep C++ knowledge.

As a final reminder, where it is possible to use a domain relevant type instead of a string, you should almost certainly do so. Stringly typed code is bad and will not count toward your project. Code that operates on strings because its purpose is to operate on strings is, on the other hand, fine.

Task 2 – Avoiding ambiguous interfaces

Related to stringly typed APIs being problematic, ambiguous APIs in general are bad because they are prone to misuse (as discussed in class). Stringly typed APIs are one form of ambiguity. Additional issues arise when multiple arguments of the same type are passed to a function. It can become easy to make mistakes in the order of arguments, leading to defects. As with stringly typed APIs the solution is to define types based upon relevant domain entities and use those instead.

Consider a function

double distance(int x1, int x2, int y1, int y2);

This function computes the Euclidean distance between the coordinates (x1,y1) and (x2,y2). It is easy to see that the above declaration is for a distance function where the components of each coordinate could be misordered, leading to bugs.

Define a function distance in task02.cpp that takes two arguments of type Position and computes the double precision Euclidean distance from these instead of from the coordinate components. Position should be able to represent integral coordinates within the range of a signed int. Declare the function and the Position type in task02.h define anything else you need for Position in task02.cpp as necessary. I should be able to call this new function with, e.g.

double result = distance(Position{1,2}, Position{3,4});

where 1 and 2 are the x and y components of a single point (or position). Note that Position defines a value type that is fine to copy around. No pointers, references, or other such complex notions are required for any of this task.

Note: we still have two arguments of the same type! Have we done this for nothing? In this case, the API is now expressive enough that we can understand exactly what is going on when reading the code. That clarity alone is worth it. In addition, we can tell from the meaning of Euclidean distance that order between the positions does not matter, so the order of the arguments is now irrelevant to the correctness of the caller!

Part 2 – Using the standard library

This group of questions focuses more on making use of the C++ standard library. Keep the rules of the exercise in mind. Note also that in the new version of C++ these are easier, as ranges should enable workflow more like LINQ in .NET or streams in Java. For these tasks, you may also find it helpful to consider the documentation for std::vector and some iterators.

Task 3

Define a function task03 in task03.cpp that takes in a std::vector of InfoBlobs as defined in the Support.h header. The function should return a vector of ex4::Actions as discussed below. Declare the function in task03.h.

The function should identify the subrange of the passed in list from the first HAPPY blob to the last SAD blob, inclusive. The resulting list should contain an Action for each HAPPY and SAD element of this subrange in reverse order. Elements with more spin than entropy should yield a PLOT action. Other elements should yield a RUN action.

You may not modify the original list. You may not allocate any memory except for the std::vector of Actions that is returned. This memory should be allocated once and only once. Recall that a std::vector may grow unless you size it correctly.

These constraints dictate a flow to the algorithm that should help guide you and simplify the code. C++20 ranges make this task particularly simple.

Task 4

Define a function task04 in task04.cpp that takes in a std::vector of InfoBlobs as defined in the Support.h header. The function should return nothing and modify the original vector. Declare the function in task04.h.

The function should identify all HAPPY InfoBlobs in the passed in sequence and place them contiguously before the first PERPLEXED InfoBlob. If there are no PERPLEXED blobs, then the HAPPY blobs should be at the end of the list. The order of all HAPPY blobs must be preserved. The order of all non-HAPPY blobs must be preserved. The rules of the game still apply. You are allowed 1 heap memory allocation of the same size as the passed in list. Note that some standard library functions will allocate memory.

Done well, this should be only a few lines of code.

Part 3 – Ownership, resources, and RAII

Finally, in this last section, you will explore ownership and lifetimes a bit more.

Task 5 – Growing a garden

In many cases, you might consider adding collections of things. You might add a list to the end of another list. You might concatenate many strings into a new string. Such operations are common.

In general the workflow for such operations should perform as few allocations as possible. You should determine the resources required, acquire the necessary resources, and then use those resources to perform the operation in question. Often there are APIs that can help you with this, such as str.join in Python. However, you should understand what is going in order to perform such operations yourself as necessary.

Define a function task05 in task05.cpp that takes in a std::vector of Planters as defined in the Support.h header. A Planter contains many Plants; The function should return a vector of ex4::Plants. The vector should be a flattened form of the passed in arguments. That is, it should first contain the Plants of the first Planter, followed then by the Plants of the second Planter, and so on. Declare the function in task05.h.

You may use at most one memory allocation. Within this one function, you may use up to one for-each loop. That for-each loop may not contain any loops, recursion, or even if statements.

Note that concatenating strings using += in a loop is considered bad practice in almost all programming languages. It is generally a sign that a developer does not understand how to program in a particular language. (The exception was JavaScript, but the future in that context is uncertain.) For C++, Google created the Abseil helper library that includes many common operations like efficient string concatenation.

Task 6 – From gardens to arboreta (on trees and graphs)

Richer data structures may have complex relationships and lifetimes. This can make them particularly tricky to track. Thinking about individual objects and elements in such scenarios is often cumbersome enough that we would rather consider collections of elements. If we can manage a lifetime of the entire collection as a whole, then there are fewer places to manage, fewer cornercases, and fewer things that can go wrong.

You must design a class that represents forests. Call this Forest. Define it in task06.cpp and declare it in task06.h.

The constructor should take a numeric argument that determines the number of nodes in the forest. This determines that nodes with IDs 0 through one less than the size should exist.

The addNode method should increase the number of nodes by one, allowing a higher ID to be used.

The grow method should take a From argument and a To argument denoting the source and sink of an edge in the forest. It will create an edge from the source to the sink. For instance,

forest.grow(From{1}, To{2});

should grow an edge from the node with ID 1 to the node with ID 2. If a target To node already has a parent, then it should be reparented to the given From node. Note that this means there is only ever one edge from a parent to a child.

The cut method should take a To argument. It should remove any edges from the parent of the target to the target itself. If there is no parent, then it does nothing.

If any of the From or To arguments are invalid, then the methods should not perform any operation.

The getEdges method should return a vector<std::pair<size_t,size_t>> containing each pair of nodes participating in and edge of the form (From, To). The edges should be ordered from low to high primarily by From and secondarily by To.

Remember the rules.

Memory leaks are not allowed.
You may not use smart pointers.
You may not use new and delete.
You may not use loops or recursion.
You may not define a destructor or other special member functions.
You may use std::vector (but nothing fancier).

There is no limit to the number of allocations that std::vector may perform in this task, but you should keep in mind the previous lessons.

Looking at Allocation Behavior

If you build the project in release mode as opposed to debug (the default) then you can use tools like heaptrack to analyze the allocation behavior if you are uncertain about it. This can be done by running:

heaptrack << command line of program to analyze>>

To record all allocation behavior of the program and then using

heaptrack --analyze << log file name generated by the first command >>

In debug mode, the template uses AddressSanitizer and LeakSanitizer to check for memory leaks. Note that this will not count your allocations for tasks where those are limited.

Submitting

Create an archive of your solution by changing into the directory containing your project template and running:

tar zcvf e3.tar.gz se-basic-cpp-template/

You can then upload this archive to CourSys.