By this point, you should already be able to use and understand the basics of
version control. Specifically, you should have used git while managing a small
team project. In this exercise, you will demonstrate how to use some of
the features of modern version control, along with integrated
tools that assist in coordinating and enforcing tasks like code review,
scheduling, and issue tracking in a team development environment. The skills
that you use in this exercise are core essentials of modern software
development practice. Not only will you be expected to use these techniques
and the features of GitHub while developing your semester projects,
but any work that you do outside of these workflows may simply not be taken
into account during grading.
There are many outside sources of information on using git. The git
website has an excellent book along with introductory videos that provide a
thorough introduction. It also has numerous tutorials. Much of the information
in this exercise comes from these tutorials or expects you to be able to follow
these tutorials and then demonstrate your understanding.
NOTE, within this exercise and for this class, all work should be completed using SFU's private GitHub Enterprise server at https://github.sfu.ca. Work submitted using github.com may not be graded.
git Basics
As you already know, a version control system (VCS) tracks the changes in files over time. Thus, if you want to know when or why a particular change was made to a file, the VCS should hold the answer. Managing these versions also means that the VCS can coordinate and track the changes made by multiple developers, ensuring that changes from multiple developers do not (textually) interfere with one another. In a modern VCS, a set of selected changes to files is applied atomically. That is, if the changes do not interfere with other changes concurrently made by another developer, then the changes are all applied simultaneously; otherwise, none of them are applied.
One feature of git and other distributed version control systems is that you
locally batch together groups of changes to files into atomic commits.
You make your changes on your own computer. Then you can choose to share
these batches of changes with others by pushing them to a remote
repository on a different machine or server.
You should have past experience with:
git clone
git add, git rm, and git mv
git commit
git status, git diff, and git log
git reset HEAD ...
git checkout -- ...
git push and git pull
.gitignore
If you are uncomfortable with these git basics from your previous classes,
then you can learn more in the following sections of the git book:
Once you are ready, the first step will just make sure that you can demonstrate these.
Create a new private repository using SFU's GitHub server.
Give both the instructor (wsumner) and the TAs (hea12, hks48)
access to the private repository.
You can do this through the menus:
"Settings" ⟶ "Collaborators" ⟶ "Manage Access" ⟶ "Add People".
This is required for your exercise to receive a grade.
You will use this repository for all tasks completed in this exercise and
submit a cloneable URL of the repository at the end.
The repository must contain exactly two commits. The first commit
to the repository must have a main branch with exactly one file called
student.txt, and the file must contain exactly one line with your SFU user ID
(your short email name) without the @sfu.ca suffix.
The second commit must rename/move student.txt to username.txt and add a
second file called readme.md.
The second file may contain anything you want.
These two commits must (naturally) be pushed to the remote GitHub server.
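As a rough local sketch of these two commits, the script below uses a throwaway bare repository as a stand-in for the GitHub remote. The user ID abc123, the temporary paths, the identity settings, and the commit messages are all hypothetical; for the real exercise you would work in a clone of your repository from https://github.sfu.ca instead.

```shell
set -eu

# Assumption: a local bare repository stands in for the remote on GitHub.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/exercise"
cd "$work/exercise"
git symbolic-ref HEAD refs/heads/main     # name the initial branch main
git config user.name "Example Student"    # hypothetical identity
git config user.email "abc123@sfu.ca"
git remote add origin "$work/remote.git"

# First commit: exactly one file holding the (hypothetical) user ID.
echo "abc123" > student.txt
git add student.txt
git commit -q -m "Add student.txt"

# Second commit: the rename and the new readme are recorded together.
git mv student.txt username.txt
echo "Exercise repository" > readme.md
git add readme.md
git commit -q -m "Rename student.txt and add readme"

# Publish both commits and make main track origin/main.
git push -q -u origin main
git log --oneline --name-status
```

Using git mv stages the rename for you; moving the file by hand and then staging both the deletion and the new file should give the same commit.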
Branching in a VCS allows different versions of a project to be worked on at the same time and even combined later. Specifically, branching refers to creating a divergent history for a project. The main history of the project can continue normally on the main branch, and you can make experimental modifications to a second branch without affecting other developers' ability to use the main one. In fact, they may be entirely unaware of the changes that you make to the second branch. Once the second branch is in a desired state, you can merge the histories again, applying the desired changes from the second branch into the main one.
A primary distinguishing feature of git is that branching is easy and
lightweight enough that it often becomes one of the primary tools for tracking
and managing changes to a software project. We will explore this more in
step 3.
Again, you most likely already know about and understand:
git switch <branch>, git switch -c <new-branch>
HEAD
git merge ...
If you feel uncertain, or want to gain a better understanding,
you can read the following portions of the git book:
You may also try one of the online tutorials.
Create a second branch called feature in your repository. Add a new file
called diary.txt to the feature branch that contains anything you want.
Commit those changes to feature.
Merge the feature branch into main.
Be careful about the direction of the merge.
After this, main should now contain diary.txt.
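The direction of a merge is easiest to see in a small sketch. The script below builds a throwaway repository (the paths, identity, and file contents are hypothetical) and merges feature into main: you first switch to the branch that should receive the changes, then name the branch to merge in.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main     # name the initial branch main
git config user.name "Example Student"    # hypothetical identity
git config user.email "abc123@sfu.ca"
echo "start" > readme.md
git add readme.md
git commit -q -m "Initial commit"

# Do the work on the feature branch.
git switch -q -c feature
echo "Dear diary" > diary.txt
git add diary.txt
git commit -q -m "Add diary"

# Switch to the branch that should RECEIVE the changes, then merge.
git switch -q main
git merge -q feature
git ls-files                              # lists diary.txt and readme.md
```

Running git merge main while on feature would do the opposite and leave main unchanged.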
You should now be on the main branch. Create a file called recipes.txt and
commit it.
Create another branch called roguechef and modify recipes.txt on
that branch. Make sure to then commit it. Don't merge yet! Back on the main
branch, modify recipes.txt differently before committing it. Now merge roguechef
into main. This will cause a conflict because the changes in the two branches
interfere. Resolve the conflict and finish merging roguechef into main.
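A conflicting merge can be reproduced in a throwaway repository. In the sketch below, the file contents, identity, and commit messages are hypothetical, and the resolution (keeping both lines) is only one of many reasonable choices.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main     # name the initial branch main
git config user.name "Example Student"    # hypothetical identity
git config user.email "abc123@sfu.ca"
echo "salt" > recipes.txt
git add recipes.txt
git commit -q -m "Add recipes"

# Change the same line differently on each branch.
git switch -q -c roguechef
echo "chili" > recipes.txt
git commit -q -am "Use chili"
git switch -q main
echo "sugar" > recipes.txt
git commit -q -am "Use sugar"

# The merge stops with a conflict, so a non-zero exit status is expected.
git merge roguechef || echo "merge stopped with conflicts"
git status --short                        # recipes.txt is listed as unmerged

# Resolve by rewriting the file without conflict markers, stage it,
# and conclude the merge with a commit.
printf 'sugar\nchili\n' > recipes.txt
git add recipes.txt
git commit -q -m "Merge roguechef into main"
```

In practice you would open recipes.txt in an editor, inspect the sections between the conflict markers, and decide what the merged content should be.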
As you saw in chapter 3.4, branching can be used to manage experimentation as part of developing a single feature, maintaining legacy versions and new development versions, and more. In addition, as we shall see shortly, branching can also be used to manage and enforce peer review of all code before it is committed to the repository. From now on, you should be following this latter practice for your projects this semester.
However, there are also some risks that ought to be recognized. In particular, maintaining many different branches can become undesirably confusing for developers, and the longer a branch lives without being merged into its eventual targets, the less likely the branch is to merge with few conflicts and little rework (or at all). The benefits and trade-offs of these workflow options need to be assessed and balanced in order to design a workflow that fits best for your project. Many companies shift to a trunk-based approach to manage the risks of stale branches.
For projects this semester, you will follow a simple GitHub Flow. Read this documentation for a full explanation of the workflow. You can also find videos online illustrating a simple use of GitHub Flow along with features like pull request management and code review. Watch this video; it contains a clear illustration of the steps to follow for every change that you make to the repository. Roughly, the usual steps are as follows:
Note, I will track the pull requests as one metric for measuring your overall contributions to your semester projects. Do not delete the source branches as you proceed, as I can use these to give you credit and resolve issues in your group.
For your semester projects, you will receive credit for solid contributions
that make their way into your main or develop branch. This provides
extra incentive to keep your branches short lived and focused.
Create a new branch topsecret and commit a single file called secrets.txt
to it.
Create a pull request for the branch. From the pull request management
page in SFU GitHub, approve the pull request to finally merge the branch into
the main branch of the repository.
Do not delete the topsecret branch.
NOTE: In a real project, you should not approve your own pull requests! In this case, you are doing so simply to learn how the system works. For your term project, your pull requests should be reviewed and approved by another team member before being merged. Branch protection rules in GitHub can enforce this rule for you automatically.
GitHub can also identify units of work as issues. An issue can be a bug fix, a documentation change, a feature addition planned for a future sprint, or any other unit of work. The issues page of the repository provides a convenient way of organizing these issues along with a platform for discussing them, assigning them to developers, and monitoring their progress. Using this system for your semester projects can help to ensure that you do not forget a task or lose track of it amongst the many requirements you must fulfill.
Issues can also be assigned to milestones, which are commonly used to identify deadlines or iteration/sprint boundaries in a project. By having both issue tracking and milestone scheduling, GitHub allows you to flexibly keep track of the tasks that need to be performed and schedule or reschedule them as necessary.
Create a new issue titled "Add more cats to the readme". Create a new milestone and assign the issue you created to it.
Now modify readme.md on a new branch by adding the word cat to it
(anywhere you like) and commit it to the repository with a commit message that
includes the words "Fixes #N" where N is the number of the issue you created.
Push the new branch in order to create a pull request. Accept the pull request.
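The command-line half of this task might look like the sketch below. A local bare repository stands in for the GitHub remote, the branch name more-cats and the commit messages are hypothetical, and N is deliberately left as a placeholder for your real issue number. The pull request itself is still created and accepted in the GitHub web interface.

```shell
set -eu

# Assumption: a local bare repository stands in for the remote on GitHub.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/exercise"
cd "$work/exercise"
git symbolic-ref HEAD refs/heads/main     # name the initial branch main
git config user.name "Example Student"    # hypothetical identity
git config user.email "abc123@sfu.ca"
git remote add origin "$work/remote.git"
echo "Exercise repository" > readme.md
git add readme.md
git commit -q -m "Add readme"
git push -q -u origin main

# Make the change on its own branch (the name more-cats is hypothetical).
git switch -q -c more-cats
echo "cat" >> readme.md

# N is a placeholder: substitute the real issue number before committing.
# The second -m becomes a separate paragraph of the commit message.
git commit -q -am "Add a cat to the readme" -m "Fixes #N"

# Pushing the branch makes it available on the remote for a pull request.
git push -q -u origin more-cats
git log -1 --format=%B
```

With a real issue number in place of N, GitHub should link the commit to the issue and close the issue once the change is merged into the default branch.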
To submit your exercise, you must submit the cloneable location of your repository via CourSys. You can find this on your repository main page. Click the "Code" button and select the "SSH" option.
For instance, my submission might be
git@github.sfu.ca:wsumner/exercise2.git
git includes many additional features that allow you to more conveniently
change, navigate, and extract useful information from the history of a project.
Three particularly useful features are interactive staging, stashing,
and tools for debugging. In addition, recall that we spoke in class about how
analyzing git logs could help you to identify problem areas in your project.
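One simple form of log analysis is counting how often each file has changed, since files that change unusually often may deserve a closer look. The sketch below is only one possible heuristic, not necessarily the analysis discussed in class; the file names and history are hypothetical.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"    # hypothetical identity
git config user.email "abc123@sfu.ca"

# Build a small hypothetical history: parser.c changes three times,
# util.c only once.
for i in 1 2 3; do
    echo "revision $i" > parser.c
    git add parser.c
    git commit -q -m "Rework parser ($i)"
done
echo "helpers" > util.c
git add util.c
git commit -q -m "Add util"

# List the files touched by every commit, then count occurrences per file
# and sort with the most frequently changed file first.
churn=$(git log --name-only --pretty=format: | grep -v '^$' | sort | uniq -c | sort -rn)
echo "$churn"
```

On a real project you would run only the final pipeline from inside the repository.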
If you make a mistake, you can "undo" a local commit while keeping the changes in your working tree by using:
git reset --soft HEAD~1
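To see the effect, the following sketch makes a premature commit in a throwaway repository and then undoes it. The file name, identity, and commit messages are hypothetical.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"    # hypothetical identity
git config user.email "abc123@sfu.ca"
echo "first" > notes.txt
git add notes.txt
git commit -q -m "First"

echo "second" >> notes.txt
git commit -q -am "Oops, committed too early"

# Move the branch back one commit. The index and working tree are untouched,
# so the change from the undone commit remains staged.
git reset --soft HEAD~1
git status --short                        # notes.txt shows as a staged change
git log --oneline                         # only the first commit remains
```

This is safest for commits that have not been pushed, since rewriting history that others have already pulled causes problems for them.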
Interactive staging allows you to look at the changes in your working directory as diffs and select exactly the combination of changes you want for a commit. This means that you can choose to commit only part of the changes to a file if you made changes to the same file that semantically belong to different commits.
You probably noticed that checking out a branch required a clean working
directory (with no uncommitted changes). Sometimes you might have changes in
your working directory that you do not yet wish to commit. In this case, you
can stash the changes for later, change branches to do your work, change back
to the original branch, and unstash your uncommitted changes on the original
branch. It may sound complicated, but it is quite straightforward to use.
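A minimal sketch of that round trip, assuming a throwaway repository with a hypothetical hotfix branch and hypothetical file names:

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main     # name the initial branch main
git config user.name "Example Student"    # hypothetical identity
git config user.email "abc123@sfu.ca"
echo "v1" > app.txt
git add app.txt
git commit -q -m "Add app"
git branch hotfix                         # hypothetical second branch

# Uncommitted work in progress on main.
echo "half-finished idea" >> app.txt

# Set the changes aside; the working directory is clean again.
git stash -q

# Do the other work on the other branch.
git switch -q hotfix
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Apply hotfix"

# Return and restore the uncommitted changes.
git switch -q main
git stash pop -q
git status --short                        # app.txt shows as modified again
```

By default git stash only sets aside changes to tracked files, which is why the sketch modifies an existing file rather than creating a new one.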
Git contains some commands that can greatly simplify debugging your
code. In particular, you can use git blame to identify the last
commit (and developer) that touched a particular line of code. Sometimes,
though, you'll want to perform a binary search on the history of a project in
order to discover when a bug was first introduced. This can be done either
interactively or automatically using git bisect. These are powerful tools to
have in your arsenal.
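The automatic form of bisection can be sketched on a small hypothetical history in which the fourth of six commits introduces a "bug" (calc.txt changes from pass to fail). The test command given to git bisect run must exit with 0 on good commits and non-zero on bad ones; here grep plays that role.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"    # hypothetical identity
git config user.email "abc123@sfu.ca"

echo "pass" > calc.txt
for i in 1 2 3 4 5 6; do
    echo "step $i" >> log.txt
    # Hypothetical bug: commit 4 breaks calc.txt.
    if [ "$i" -eq 4 ]; then echo "fail" > calc.txt; fi
    git add calc.txt log.txt
    git commit -q -m "Step $i"
done

# The oldest commit is known to be good and HEAD is known to be bad.
first=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$first" >/dev/null

# Let git drive the binary search with the test command.
git bisect run grep -q pass calc.txt >/dev/null
culprit=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1
git log -1 --format='%h %s' "$culprit"    # subject is "Step 4"

# git blame agrees: line 1 of calc.txt was last touched by the same commit.
git blame -L 1,1 calc.txt
```

In a real project the test command would typically be a build step or a failing unit test rather than grep.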
There are many git visualization tools, but on the command line you can also visualize git history and branches with:
git log --graph --abbrev-commit --decorate
Or in a more condensed form:
git log --oneline --decorate --graph --parents
What do you think the history should look like if you followed instructions? Do your classmates' graphs look like yours?
Git has many ways to make mistakes. It may have equally many ways to recover
from mistakes. If you face a problem, others have too.
Oh Shit, Git!?! is a handy collection of problems
and git recipes for how to recover from them.