For this project, you shall perform mutation analysis studies of existing test suites for open-source software projects. We shall be using two different mutation testing frameworks, PIT for Java code and Stryker for TypeScript, JavaScript, or C#.
You shall work in groups of three to four students (no fewer than three), but the overall requirements scale with the group size.
Part of the reason for having groups is that I find students often slip through the program without critical skills for building or running software from source code. These skills are critical for most exercises from this point on, but I will not provide explicit instructions for them within the exercise itself.
You may find tools like nvm useful if you work within CSIL. Getting the tools running will require reading a bit more documentation on top of that.
You will analyze one project with PIT and one project with Stryker together as a group. This means that you need to select two projects, one suited to each tool.
The following requirements apply to the open-source project.
sloccount.
In finding which projects to analyze, you should identify and consider the following attributes:
Make sure to include this information in your reported results. If you have questions about whether a particular project is a good choice, identify these attributes and ask.
A key part of this task is making sure that you can consistently compile the project (if applicable) and run its test suite. I will expect you to have or acquire competencies in building and running a project.
Overall, you should find both PIT and Stryker easy to use, but the process of mutation analysis can be time-consuming depending on the particular project, the nature of the test suite, and the way in which each mutation analysis infrastructure performs its analysis. Some tools are able to analyze many mutants in parallel or perform lightweight analysis to determine that mutants will behave the same way on a particular test. Other tools perform the entire analysis sequentially on a single machine. These tend to incur a substantial(!) overhead.
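To get a feel for how quickly the overhead grows, the following is a rough back-of-the-envelope sketch. It assumes the worst case, in which every mutant triggers a full run of the test suite. The names `MutationRunEstimate` and `estimateWorstCaseSeconds` and the sample numbers are hypothetical and do not come from PIT or Stryker; real tools may do considerably better than this bound.

```typescript
// Hypothetical cost model for planning purposes only; not part of PIT or Stryker.
interface MutationRunEstimate {
  mutantCount: number;     // mutants the tool generates
  suiteSeconds: number;    // wall-clock time of one full test-suite run
  parallelWorkers: number; // mutants evaluated at the same time
}

// Worst case: each mutant costs one full test-suite run,
// and mutants are processed in batches of `parallelWorkers`.
function estimateWorstCaseSeconds(run: MutationRunEstimate): number {
  if (run.parallelWorkers < 1) {
    throw new Error("parallelWorkers must be at least 1");
  }
  const batches = Math.ceil(run.mutantCount / run.parallelWorkers);
  return batches * run.suiteSeconds;
}

// Made-up numbers: 2000 mutants and a 30-second test suite.
const sequentialSeconds = estimateWorstCaseSeconds({
  mutantCount: 2000,
  suiteSeconds: 30,
  parallelWorkers: 1,
});
const parallelSeconds = estimateWorstCaseSeconds({
  mutantCount: 2000,
  suiteSeconds: 30,
  parallelWorkers: 8,
});
console.log(`sequential: ${sequentialSeconds} s, 8 workers: ${parallelSeconds} s`);
```

With these made-up numbers the sequential bound is 60000 seconds (between 16 and 17 hours), while eight workers bring it down to 7500 seconds (a little over 2 hours). Time one run of your test suite early so that you can make a similar estimate for your own projects.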
Make sure you give yourself enough time to deal with the unknown unknowns.
You can find more detailed instructions for running PIT here. You can find more detailed instructions for running Stryker here.
The two tools provide different ways to interface with the results and see the mutants that were not killed during the analysis. You will want to experiment with both in order to get acquainted with them and make sure that you can interpret the results meaningfully.
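One way to start interpreting the results is to group the mutants that were not killed by file, so that weakly tested areas stand out. The sketch below is tool-agnostic. The `MutantRecord` shape, its status values, and the sample data are hypothetical; they are not the report format of PIT or Stryker, so you would need to fill in such records from whatever each tool actually reports.

```typescript
// Hypothetical, tool-agnostic record; populate it from each tool's own report.
type MutantStatus = "killed" | "survived" | "no-coverage";

interface MutantRecord {
  file: string;
  line: number;
  operator: string; // operator label as shown in the report
  status: MutantStatus;
}

// Collect every mutant that was not killed, grouped by source file.
function liveMutantsByFile(records: MutantRecord[]): Map<string, MutantRecord[]> {
  const byFile = new Map<string, MutantRecord[]>();
  for (const record of records) {
    if (record.status === "killed") {
      continue;
    }
    const group = byFile.get(record.file) ?? [];
    group.push(record);
    byFile.set(record.file, group);
  }
  return byFile;
}

// Made-up sample data for illustration.
const sampleRecords: MutantRecord[] = [
  { file: "src/a.ts", line: 10, operator: "boundary change", status: "killed" },
  { file: "src/a.ts", line: 12, operator: "negated condition", status: "survived" },
  { file: "src/b.ts", line: 3, operator: "removed call", status: "no-coverage" },
  { file: "src/c.ts", line: 7, operator: "boundary change", status: "killed" },
];

liveMutantsByFile(sampleRecords).forEach((live, file) => {
  console.log(`${file}: ${live.length} live mutant(s)`);
});
```

A mutant that no test ever executes points to a coverage gap, while a mutant that is executed but survives points to a weak oracle. Keeping the two apart in your notes makes the later discussion of the test suite easier.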
The group as a whole will perform mutation analysis for both projects in their entirety.
In reporting your results from PIT and Stryker, you should include:
How do your results from PIT and Stryker compare? What do the results tell you about your test suite? Which system did you find easier to use (both integrate and interpret) and why? Discuss these issues in your write-up.
For your overall results, make sure to address the following issues.
Note, the above points are not yes or no questions. Present some evidence and justify your answers.
Each group member individually shall also choose one method from one of their open-source projects and consider the results of the mutation analysis specifically for that method. Choose a complex routine that you think is likely to have errors. If fewer than 10 mutants were generated for the method, then either select a different method or select additional methods until you have 10 mutants to consider.
Examine 10 of the generated mutants for your method(s). If both killed and unkilled mutants were generated, include a mix of both. For each one, document your mutation operator. What was the type of operator used? How was it applied to the code (how did the code change)?
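As an illustration of what documenting an operator can look like, here is a minimal sketch of one method and two mutants of it. The method `isEligible`, both mutants, the test data, and the helper `isKilled` are hypothetical. The operator descriptions in the comments are informal; the labels that PIT and Stryker print in their reports may differ.

```typescript
// Original (hypothetical) method under test.
function isEligible(age: number): boolean {
  return age >= 18;
}

// Mutant A: relational operator changed at the boundary, `>=` became `>`.
function isEligibleMutantA(age: number): boolean {
  return age > 18;
}

// Mutant B: return expression replaced with the constant `true`.
function isEligibleMutantB(age: number): boolean {
  return true;
}

// A test case is an input together with an oracle (the expected output).
interface AgeTestCase {
  input: number;
  expected: boolean;
}

// A mutant is killed when at least one test observes a wrong result.
function isKilled(impl: (age: number) => boolean, suite: AgeTestCase[]): boolean {
  return suite.some((test) => impl(test.input) !== test.expected);
}

// This suite never exercises the boundary value 18.
const weakSuite: AgeTestCase[] = [
  { input: 30, expected: true },
  { input: 5, expected: false },
];

// Adding a boundary test strengthens the suite.
const strongerSuite: AgeTestCase[] = [...weakSuite, { input: 18, expected: true }];

console.log("A killed by weak suite:", isKilled(isEligibleMutantA, weakSuite));
console.log("B killed by weak suite:", isKilled(isEligibleMutantB, weakSuite));
console.log("A killed by stronger suite:", isKilled(isEligibleMutantA, strongerSuite));
```

Here Mutant B is killed by the weak suite because the input 5 exposes it, whereas Mutant A survives until a test at the boundary value 18 is added. For each of your 10 mutants, a similar short note (operator type, the before and after code, and which test kills it or why none does) is what I am looking for.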
For your individual mutants, consider the following additional questions: How many mutants are killed? How many mutants are live? For every mutant that was not killed, try to determine either (a) that it is an equivalent mutant that should not be killed, or (b) how to add a test to kill it. Note that (b) involves a test case for a function with inputs and an oracle. For all mutants, try to determine whether they are duplicates of each other. What are the challenges involved? Does it affect the results? Calculate and report the mutation score / effectiveness for your particular mutants. What do these results say about the effectiveness of the test suite and the method(s) that you considered?
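The sketch below illustrates an equivalent mutant and one common way to compute a mutation score that discounts such mutants. It assumes the score is defined as killed mutants divided by non-equivalent mutants; if the definition used in lecture differs, follow that one. The functions `joinWords`, `joinWordsMutant`, and `mutationScore` and the `MutantTally` type are hypothetical. The scores printed by the tools cannot reflect equivalence judgments that you make by hand, so state clearly which number you are reporting.

```typescript
// Original (hypothetical) method: the guard is only a shortcut.
function joinWords(words: string[]): string {
  if (words.length === 0) {
    return "";
  }
  return words.join(" ");
}

// Mutant: the guard condition is replaced with `false`.
// Joining an empty array already yields "", so no test can tell
// this mutant apart from the original. It is an equivalent mutant.
function joinWordsMutant(words: string[]): string {
  if (false) {
    return "";
  }
  return words.join(" ");
}

interface MutantTally {
  killed: number;
  survived: number;   // all live mutants, including equivalent ones
  equivalent: number; // live mutants judged equivalent by manual inspection
}

// Assumed definition: killed / (total - equivalent).
function mutationScore(tally: MutantTally): number {
  if (tally.equivalent > tally.survived) {
    throw new Error("equivalent mutants are a subset of the survivors");
  }
  const nonEquivalent = tally.killed + tally.survived - tally.equivalent;
  if (nonEquivalent <= 0) {
    throw new Error("no non-equivalent mutants to score");
  }
  return tally.killed / nonEquivalent;
}

// Made-up tally: 10 mutants, 6 killed, 4 live, 2 of the live ones equivalent.
console.log(mutationScore({ killed: 6, survived: 4, equivalent: 2 }));
```

For the made-up tally, the score is 6 / (10 - 2) = 0.75, compared with 6 / 10 = 0.6 if equivalent mutants are not discounted. Arguing that a mutant is equivalent requires reasoning about all inputs, not just the ones your tests happen to use, which is part of the challenge I want you to discuss.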
Each group shall submit a final write-up including descriptions of the selected projects, the group results, and the individual results as discussed above. Include the package summaries from PIT. Include the mutation reports from Stryker.