Exercise: Git and GitHub

Introduction

By this point, you should already be able to use and understand the basics of version control. You should have specifically used git while managing a small team project. In this exercise, you will demonstrate how to use some of the features of modern version control along with how to use integrated tools that can assist in coordinating and enforcing tasks like code review, scheduling, and issue tracking in a team development environment. The skills that you use in this exercise are core essentials of modern software development practices. Not only will you be expected to use these techniques, and the features of GitHub, while developing your semester projects, but any work that you do outside of these workflows simply may not be taken into account during grading.

There are many outside sources of information on using git. The git website has an excellent book along with introductory videos that together give a thorough introduction. It also has numerous tutorials. Much of the information in this exercise comes from these tutorials or expects you to be able to follow these tutorials and then demonstrate your understanding.

NOTE, within this exercise and for this class, all work should be completed using SFU's private GitHub Enterprise server at https://github.sfu.ca. Work submitted using github.com may not be graded.

Step 1: git Basics

As you already know, a version control system (VCS) tracks the changes in files over time. Thus, if you want to know when or why a particular change was made to a file, the VCS should hold the answer. Managing these versions also means that the VCS can coordinate and track the changes made by multiple developers, ensuring that changes from multiple developers do not (textually) interfere with one another. In a modern VCS, a set of selected changes to files is applied atomically. That is, if the changes do not interfere with other changes concurrently made by another developer, then the changes are all applied simultaneously; otherwise, none of them are applied.

One feature of git and other distributed version control systems is that you locally batch together groups of changes to files into atomic commits. You make your changes on your own computer. Then you can choose to share these batches of changes with others by pushing them to a remote repository on a different machine or server.
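As a minimal sketch of this local-then-shared model, the script below commits a batch of changes locally and only afterwards pushes it. It assumes a throwaway sandbox in which a local bare repository stands in for the remote server; the file names and the user identity are placeholders.

```shell
# Throwaway sandbox; a local bare repository stands in for the remote server.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/main          # name the initial branch main
git config user.name "Example Student"         # placeholder identity
git config user.email "student@example.com"    # placeholder identity
git remote add origin "$work/remote.git"

# Changes are batched locally: edit, stage, then commit them as one unit.
echo "first draft" > notes.txt
echo "todo list" > todo.txt
git add notes.txt todo.txt
git commit -q -m "Add notes and todo list"

# Nothing has left this machine yet, so the remote lists no branches.
git ls-remote --heads origin

# Sharing the commit is a separate, explicit step.
git push -q -u origin main
git ls-remote --heads origin
```

Against a real server, only the URL given to git remote add changes; the commit and push steps stay the same.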

You should have past experience with:

If you are uncomfortable with these git basics from your previous classes, then you can learn more in the following sections of the git book:

Once you are ready, the first step will just make sure that you can demonstrate these.

Action Items

Create a new private repository using SFU's GitHub server. Give both the instructor (wsumner) and the TAs (hea12, hks48) access to the private repository. You can do this through the menus: "Settings" ⟶ "Collaborators" ⟶ "Manage Access" ⟶ "Add People". This is required for your exercise to receive a grade. You will use this repository for all tasks completed in this exercise and submit a cloneable URL of the repository at the end.

The repository must contain exactly two commits. The first commit to the repository must have a main branch with exactly one file called student.txt, and the file must contain exactly one line with your SFU user ID (your short email name) without the @sfu.ca suffix.

The second commit must rename/move student.txt to username.txt and add a second file called readme.md. The second file may contain anything you want.

These two commits must (naturally) be pushed to the remote GitHub server.
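The rename-plus-add pattern can be sketched as follows. This is an illustration only: the file names draft.txt, final.txt, and notes.md and the user identity are placeholders, not the names the exercise requires, and the sandbox has no remote.

```shell
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main          # name the initial branch main
git config user.name "Example Student"         # placeholder identity
git config user.email "student@example.com"    # placeholder identity

# First commit: a single file.
echo "placeholder" > draft.txt
git add draft.txt
git commit -q -m "Add draft.txt"

# Second commit: git mv renames the file and stages the rename in one step.
git mv draft.txt final.txt
echo "# Notes" > notes.md
git add notes.md
git commit -q -m "Rename draft.txt to final.txt and add notes.md"

# Inspect what each commit changed.
git log --oneline --name-status
```

Because the rename and the new file are staged together, they land in the history as one atomic commit.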

Step 2: Branching and Merging

Branching in a VCS allows different versions of a project to be worked on at the same time and even combined later. Specifically, branching refers to creating a divergent history for a project. The main history of the project can continue normally on the main branch, and you can make experimental modifications to a second branch without affecting other developers' ability to use the main one. In fact, they may be entirely unaware of the changes that you make to the second branch. Once the second branch is in a desired state, you can merge the histories again, applying the desired changes from the second branch into the main one.
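A minimal sketch of diverging and merging again, assuming a throwaway repository with a placeholder branch called experiment and placeholder file names:

```shell
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main          # name the initial branch main
git config user.name "Example Student"         # placeholder identity
git config user.email "student@example.com"    # placeholder identity
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Diverge: create the second branch and switch to it.
git checkout -q -b experiment
echo "an idea" > idea.txt
git add idea.txt
git commit -q -m "Add idea.txt on experiment"

# main is unaffected so far; idea.txt does not exist here.
git checkout -q main
ls

# Merge direction: check out the branch that should RECEIVE the changes,
# then name the branch that supplies them.
git merge -q --no-edit experiment
ls
```

Since main did not move while experiment was being developed, git can simply fast-forward main to the tip of experiment.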

A primary distinguishing feature of git is that branching is easy and lightweight enough that it often becomes one of the primary tools for tracking and managing changes to a software project. We will explore this more in step 3.

Again, you most likely already know about and understand:

If you feel uncertain or in order to gain a better understanding, you can read the following portions of the git book:

You may also try one of the online tutorials.

Action Items

Create a second branch called feature in your repository. Add a new file called diary.txt to the feature branch that contains anything you want. Commit those changes to feature.

Merge the feature branch into main. Be careful about the direction of the merge. After this, main should now contain diary.txt.

You should now be on the main branch. Create a file called recipes.txt and commit it. Create another branch called roguechef and modify recipes.txt on that branch. Make sure to then commit it. Don't merge yet! Back on the main branch, modify recipes.txt differently before committing it. Now merge roguechef into main. This will cause a conflict because the changes in the two branches interfere. Resolve the conflict and finish merging roguechef into main.
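The conflict-and-resolve cycle can be sketched like this. The file menu.txt, the branch sidebranch, and the identity are placeholders chosen for illustration; the resolution shown (keeping both lines) is just one possible choice.

```shell
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main          # name the initial branch main
git config user.name "Example Student"         # placeholder identity
git config user.email "student@example.com"    # placeholder identity
echo "soup" > menu.txt
git add menu.txt
git commit -q -m "Add menu.txt"

# Change the same line differently on two branches.
git checkout -q -b sidebranch
echo "salad" > menu.txt
git commit -q -am "Serve salad on sidebranch"
git checkout -q main
echo "stew" > menu.txt
git commit -q -am "Serve stew on main"

# The merge stops because both sides edited the same line.
git merge --no-edit sidebranch || echo "merge stopped on a conflict"
git status --short          # menu.txt is reported as unmerged
cat menu.txt                # shows the conflict markers

# Resolve: write the content you actually want, stage it, and commit.
printf 'stew\nsalad\n' > menu.txt
git add menu.txt
git commit -q --no-edit     # concludes the merge with the default message
git log --oneline --graph
```

Staging the file is what tells git the conflict is resolved; the final commit records the merge with both branches as parents.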

Step 3: Following a Workflow and Code Review

As you saw in chapter 3.4, branching can be used to manage experimentation as part of developing a single feature, maintaining legacy versions and new development versions, and more. In addition, as we shall see shortly, branching can also be used to manage and enforce peer review of all code before it is committed to the repository. From now on, you should be following this latter practice for your projects this semester.

However, there are also some risks that ought to be recognized. In particular, maintaining many different branches can become confusing for developers, and the longer a branch lives without being merged into its eventual targets, the less likely the branch is to merge with few conflicts and little rework (or at all). The benefits and trade-offs of these workflow options need to be assessed and balanced in order to design a workflow that fits your project best. Many companies shift to a trunk-based approach to manage the risks of stale branches.

For projects this semester, you will follow a simple GitHub Flow. Read the GitHub documentation on GitHub Flow for a full explanation of the workflow. You can also find videos online illustrating a simple use of GitHub Flow along with features like pull request management and code review. Watch this video. It contains a clear illustration of the steps to follow for every change that you make to the repository. Roughly, the usual steps are as follows:

  1. Perform your development of a feature, refactoring, or whatever on a separate branch.
  2. When you are ready to merge this feature, push the branch to the remote server.
  3. Create a pull request for the branch. Pull requests are GitHub's way of organizing changes that need to be discussed and approved before being merged into the workflow of other developers. After pushing a specific branch upstream, viewing the repo or the pull requests tab of the repo will allow you to start the pull request process.
  4. Assign at least one peer of your team to review the code and provide feedback.
  5. Your team member then reviews the code, and you can continue to discuss and improve it until it is eventually approved and merged to the repository. Under GitHub Flow, the source branch is removed upon merging.
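
The command-line half of the steps above (steps 1 and 2) can be sketched as follows. This assumes a throwaway sandbox where a local bare repository stands in for the GitHub server; the branch name add-greeting, the file name, and the identity are placeholders. Steps 3 to 5 happen in the GitHub web interface and are not scripted here.

```shell
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/main          # name the initial branch main
git config user.name "Example Student"         # placeholder identity
git config user.email "student@example.com"    # placeholder identity
git remote add origin "$work/remote.git"
echo "hello" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git push -q -u origin main

# Step 1: develop on a separate, short-lived branch.
git checkout -q -b add-greeting
echo "hello, world" > app.txt
git commit -q -am "Expand the greeting"

# Step 2: publish the branch; -u records origin/add-greeting as its upstream.
git push -q -u origin add-greeting

# Steps 3 to 5 (pull request, review, merge) continue in the web interface.
git branch -vv
```

Note that main on the remote is untouched by the push; the change reaches it only when the pull request is merged.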

Note, I will track the pull requests as one metric of measuring your overall contributions to your semester projects. Do not delete the source branches as you proceed, as I can use these to give you credit and resolve issues in your group.

For your semester projects, you will receive credit for solid contributions that make their way into your main or develop branch. This provides an extra incentive to keep your branches short-lived and focused.

Action Items

Create a new branch topsecret and commit a single file called secrets.txt to it. Create a pull request for the branch. From the pull request management page in SFU GitHub, approve the pull request to finally merge the branch into the main branch of the repository. Do not delete the topsecret branch.

NOTE: In a real project, you should not approve your own pull requests! In this case, you are doing so simply to learn how the system works. For your term project, your pull requests should be reviewed and approved by another team member before being merged. Branch protection rules in GitHub can enforce this rule for you automatically.

Step 4: Issues and Scheduling Milestones

GitHub can also identify units of work as issues. An issue can be a bug fix, a documentation change, a feature addition planned for a future sprint, or any other unit of work. The issues page of the repository provides a convenient way of organizing these issues along with a platform for discussing them, assigning them to developers, and monitoring their progress. Using this system for your semester projects can help to ensure that you do not forget a task or lose track of it amongst the many requirements you must fulfill.

Issues can also be assigned to milestones, which are commonly used to identify deadlines or iteration/sprint boundaries in a project. By having both issue tracking and milestone scheduling, GitHub allows you to flexibly keep track of the tasks that need to be performed and schedule or reschedule them as necessary.

Action Items

Create a new issue titled "Add more cats to the readme". Create a new milestone and assign the issue you created to it.

Now modify readme.md on a new branch by adding the word cat to it (anywhere you like) and commit it to the repository with a commit message that includes the words "Fixes #N" where N is the number of the issue you created. Push the new branch in order to create a pull request. Accept the pull request.
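A sketch of a commit message that carries a closing keyword. The issue number 7 and the branch name more-cats are placeholders; substitute the number GitHub assigned to your own issue. GitHub scans commit messages for keywords such as "Fixes" followed by an issue reference and closes the issue once the commit reaches the default branch.

```shell
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main          # name the initial branch main
git config user.name "Example Student"         # placeholder identity
git config user.email "student@example.com"    # placeholder identity
echo "# Project" > readme.md
git add readme.md
git commit -q -m "Add readme"

git checkout -q -b more-cats
echo "cat" >> readme.md

# Each -m adds a paragraph: a subject line, then a body with the keyword.
# Issue number 7 is a placeholder for your real issue number.
git commit -q -am "Add a cat to the readme" -m "Fixes #7"

# Print the full message of the latest commit to confirm the reference.
git log -1 --format=%B
```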

Step 5: Submission

To submit your exercise, you must submit the cloneable location of your repository via CourSys. You can find this on your repository main page. Click the "Code" button and select the "SSH" option.

For instance, my submission might be

git@github.sfu.ca:wsumner/exercise2.git

Bonus Steps: Additional Features and Resources to Save Your Skin

git includes many additional features that allow you to more conveniently change, navigate, and extract useful information from the history of a project. Three particularly useful features are interactive staging, stashing, and tools for debugging. In addition, recall that we spoke in class about how analyzing git logs could help you to identify problem areas in your project.

Undo the last commit

If you make a mistake, you can "undo" a local commit while keeping the changes in your working tree by using:

git reset --soft HEAD~1
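
To see what the soft reset leaves behind, here is a small sketch in a throwaway repository; the file name and identity are placeholders.

```shell
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main          # name the initial branch main
git config user.name "Example Student"         # placeholder identity
git config user.email "student@example.com"    # placeholder identity
echo "one" > file.txt
git add file.txt
git commit -q -m "First commit"
echo "two" >> file.txt
git commit -q -am "Commit made too early"

# Move the branch back one commit; the index and working tree are untouched.
git reset --soft HEAD~1

git log --oneline        # only the first commit remains in the history
git status --short       # file.txt is still modified and staged
cat file.txt             # the second line is still in the file
```

Only use this on commits that have not been pushed; rewriting history that others already have causes problems for them.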

Interactive Staging

Interactive staging allows you to look at the changes in your working directory as diffs and select exactly the combination of changes you want for a commit. This means that you can choose to commit only part of the changes to a file if you made changes to the same file that semantically belong to different commits.
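A sketch of partial staging with git add -p. Normally you would run the command and answer its prompts by hand; here the answers are piped in so the demonstration runs unattended, which is an assumption of this sketch rather than typical usage. The file name and identity are placeholders.

```shell
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main          # name the initial branch main
git config user.name "Example Student"         # placeholder identity
git config user.email "student@example.com"    # placeholder identity
seq 1 20 > numbers.txt
git add numbers.txt
git commit -q -m "Add numbers"

# Two unrelated edits to the same file, far enough apart to form separate hunks.
{ echo "one"; seq 2 19; echo "twenty"; } > numbers.txt

# y = stage the first hunk, n = leave the second hunk unstaged.
printf 'y\nn\n' | git add -p numbers.txt > /dev/null

git diff --cached    # staged: the change to the first line
git diff             # not staged: the change to the last line
```

A commit made now would contain only the first edit, leaving the second for a later commit.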

Stashing

You may have noticed that git refuses to check out a branch when uncommitted changes in your working directory would be overwritten by the switch. Sometimes you might have changes in your working directory that you do not yet wish to commit. In this case, you can stash the changes for later, change branches to do your work, change back to the original branch, and unstash your uncommitted changes on the original branch. It may sound complicated, but it is quite straightforward to use.
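The stash round trip can be sketched as follows, assuming a throwaway repository with a placeholder second branch called hotfix and placeholder file names:

```shell
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main          # name the initial branch main
git config user.name "Example Student"         # placeholder identity
git config user.email "student@example.com"    # placeholder identity
echo "stable" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git branch hotfix                              # a second branch to visit

# Uncommitted work in progress on main.
echo "half-finished idea" >> app.txt

# Shelve the change; the working tree matches the last commit again.
git stash -q

# Do unrelated work on the other branch.
git checkout -q hotfix
echo "urgent fix" > fix.txt
git add fix.txt
git commit -q -m "Apply urgent fix"

# Return and restore the shelved change.
git checkout -q main
git stash pop -q
git status --short
```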

Debugging

Git contains some commands that can greatly simplify debugging your code. In particular, you can use git blame to identify the last commit (and developer) that touched a particular line of code. Sometimes, though, you'll want to perform a binary search on the history of a project in order to discover when a bug was first introduced. This can be done either interactively or automatically using git bisect. These are powerful tools to have in your arsenal.
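A sketch of both tools on a toy history. The repository, the file app.txt, and the "bug" (a line reading BUG introduced in the third of five commits) are all made up for illustration. The automated bisect relies on a test command that exits 0 for good commits and non-zero for bad ones.

```shell
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main          # name the initial branch main
git config user.name "Example Student"         # placeholder identity
git config user.email "student@example.com"    # placeholder identity

# Five commits; the third one introduces the bug.
for i in 1 2 3 4 5; do
  if [ "$i" -eq 3 ]; then echo "BUG" >> app.txt; else echo "line $i" >> app.txt; fi
  git add app.txt
  git commit -q -m "Commit $i"
done

# git blame: which commit last touched each line?
git blame -s app.txt

# git bisect: binary search between a known-bad (HEAD) and known-good (HEAD~4) commit.
git bisect start HEAD HEAD~4 > /dev/null
git bisect run sh -c '! grep -q BUG app.txt' > /dev/null

# When the search finishes, refs/bisect/bad names the first bad commit.
first_bad=$(git rev-parse refs/bisect/bad)
git log -1 --format=%s "$first_bad"

# Leave bisect mode and return to the original branch.
git bisect reset > /dev/null 2>&1
```

In a real project the test command would typically be your build or test suite rather than a grep.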

Visualizing

There are many git visualization tools, but on the command line you can also visualize git history and branches with:

git log --graph --abbrev-commit --decorate

Or in a more condensed form:

git log --oneline --decorate --graph --parents

What do you think the history should look like if you followed instructions? Do your classmates' graphs look like yours?

Oh Shit, Git!?!

Git has many ways to make mistakes. It may have equally many ways to recover from mistakes. If you face a problem, others have too. Oh Shit, Git!?! is a handy collection of problems and git recipes for how to recover from them.