Topical Concepts
Source Version Control
The concept of tracking source code is simple enough. Over the lifetime of a software project, source code files will change frequently as features are added, tested, debugged and released. Without a tool of some sort to manage the changes to the source files, you invariably end up with discrete copies of the entire set of source files for each milestone. Not only is this inefficient storage, but it is difficult to track particular features or changes. It can also be easy to experience “regression” where features or fixes are made in one set of project source files but not in others. This approach can be made to work for individual developers working in isolation, but in a team environment, failure is inevitable.
Enter version control systems. Many systems exist that range from simple programs that create “versioned” copies of the controlled files within the same directory or subdirectory suitable for a single developer or very small team to elegant systems that might present a particular version of a set of source code as separate networked file systems and can support hundreds of individual developers. These systems exist as free and open source to proprietary systems costing thousands of dollars per developer.
There are three general categories of version control systems. The simplest might be considered “local data model” in that the revision information is stored in the local filesystem – albeit, this could be shareable on a multiuser system.
The most modern approach is known as “distributed.” In a distributed system, a server maintains the repository, much like the “client-server” model, but when a developer wishes to work with the project, the developer will create a local copy of the entire repository. This allows each developer to work in isolation and gives each developer local access to all version information.
Git
Interestingly, git has evolved to be the distributed VCS tool of choice. It is free and open-source, highly reliable, and trusted. Given these properties, git has also evolved to be the foundation for various online services that offer repository hosting services, notably GitHub, GitLab, BitBucket, and others. As a distributed system, it can copy (clone) remote repositories to the local system for development, and then update the remote repository (push) when work is completed. Interestingly, a remote repository is not needed at all if you are not sharing code with other developers. You can create a local repository for personal code management using a similar workflow.
GitHub Classroom
The Exercise
Overview
The goal of this exercise will be to create a GitHub repository (via GitHub classroom) and add your current project source code. This exercise will be largely prescriptive – simply follow each step and make observations along the way.
Step 1 – Sign in to GitHub and start assignment.
Go to this link: https://classroom.github.com/a/HPnQGMuD
Once logged in, you should be greeted with this screen:
except the repository should have your username at the end (instead of mine – highlighted in yellow).
and finally, if you refresh the page after a bit, you should see this:
Again, your username will be part of the repository name (where rothede is). This has only created the repository at GitHub, and you can check it out via the GitHub web interface if you wish. The template project was empty, so there will be no files (except perhaps a README.md file).
Step 2 – Setup GitHub Command Line Interface
GitHub has adopted some strict policies for accessing repositories in terms of authentication. Unfortunately, this means that your GitHub credentials cannot be used directly from the git command line utilities. There are a number of authentication mechanisms but they all require some setup. This setup seems OK…
First, install the GitHub CLI (GitHub, not git – we already have git). The package is not in a normal Ubuntu package repo, so follow instructions here: https://github.com/cli/cli/blob/trunk/docs/install_linux.md These instructions include the following command:
type -p curl >/dev/null || (sudo apt update && sudo apt install curl -y) curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | sudo dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg
&& sudo chmod go+r /usr/share/keyrings/githubcli-archive-keyring.gpg
&& echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archivekeyring.gpg] https://cli.github.com/packages stable main" | sudo tee
/etc/apt/sources.list.d/github-cli.list > /dev/null
&& sudo apt update
&& sudo apt install gh -y
Step 3 – Authenticate
Now, you can authenticate with GitHub using the utility ‘gh’ you will need to do this whenever you wish to use git command line commands to interact with the remote repository.
The following shows me authenticating with GitHub:
note that this did pop a web browser window to complete authentication for this session.
You will also need to configure git with your email and a username which it will use to tag commits. Do this with ‘git config’ commands, as follows:
Step 4 – Clone the Repo
We want to populate the repository, and there are a number of ways to do that. We could actually upload files through the web interface, but that is not a very productive prospect. Instead, we will use the git command line interface to clone the remote repository to your local machine, add files to the local repository, and then eventually push the new files back to GitHub. This sounds more complicated, but it more accurately represents a normal workflow when you are developing.
So, run WSL/Ubuntu. From the command line, create and/or change to whatever directory in which you wish to locate the local copy of the repository. Note that cloning the repo will create another directory, so you can be in a generic (nonproject specific) directory. Here I am ready to start:
Now issue the command: ‘git clone https://github.com/MSOE-CPE2600/vector-lab-turney-rothede’ where rothede at the end is replaced with your username. You should see something to the effect:
Note that there is a new directory with the name of repository. Moving into that directory.
This should match what you saw in the repository through the web browser interface. This is now the working directory for the repository.
Step 5 – Populating the Repository
We can now add our source code to the working directory. Copy all of your source file to this working directory. I suggest copying files and not moving them so you can try again if something get messed up. Eventually you will likely get rid of the original files once under repository control.
If you are literally starting a project from scratch, you would create the new files here. Another time to start the repository is when you have some working code. In either case, the process is about the same.
I am going to create a repository with the initial version of my “vector lab” project. I need all of the source files (.c), header files (.h) and the Makefile. Of course, if there were any other dependencies that need to be tracked (data files, README files, etc) they should be copied over too.
We never want to track build products – that is .o files, .d files or then finished executable. It would not hurt to run make to ensure everything is working and present and then make clean to get rid of the build products.
Now, these files are not yet part of the repo and are not being tracked yet. We need to “add” files to the staging area, which in the case of new files will cause git to start tracking them. After that, we can “commit” the files to the local repository.
Making sure there are no unwanted files in the working directory, you can issue the command ‘git add .’ to add all of the files in the directory to the staging area. Alternatively, you can issue ‘git add *.c’, ‘git add *.h’, and ‘git add Makefile’.
After issuing your ‘git add’ command(s). Type ‘git status’ to see what git is up to.
You are now ready to commit to the local repo. Git commit will require a comment which it will attach to the tracked version. It is easiest to provide the comment when git commit is invoked. See below:
Step 6 – Pushing to Remote
Git is now tracking these files, but they are still only in the local repo. While we are connected to the remote repo, we have to manual push our latest changes when we wish to. When should that be? It depends. Let’s do it right now. The command is simple: ‘git push’:
Now you should be able to see your project in GitHub’s web interface
Step 7 – Editing Files
Develop a bit on your project. Perhaps update main() to enhance the help message. You will do this the same as always withing the working directory. Make your project and test your changes. Do not ‘make clean’ this time. Issue ‘git status’:
Notice that git sees the changes to main(), but it also sees the .o, .d, and exe files. We can simply ignore these extra files because we know we do not want them tracked. But, it might be nice if git ignored them too since we will never want them in our repo. To make that happen, create a new file called .gitignore and list all of the files you want got to ignore.
Here is mine:
Run ‘git status’ again:
Lets add the new .gitignore file to be tracked. We also need to stage the changes to main.c. We use git add for both purposes. This will be followed by a commit:
Note that the comment encompasses two somewhat different reasons for committing changes. There is a lesson here. It would be wise to make separate commits for each sort of change made to the project.
Step 8 – Another Edit
Edit another file or two. Here I edited vectarray.c and vectarray.h:
Perhaps not obvious before is that committing changes to the local repository is a two-step process – git add and git commit. There is a good technical reason for this but we do not always want this extra step. We can shortcut is a bit by asking git commit to also perform the add/staging function. Add -a to the command line to automatically stage any tracked files that have changed and commit them with the provided message. See below:
The last step of pushing to remote is always separate and does not need to be done with every tracked change. It would be a good practice to push to remote whenever you are finished with a work session.
Future Topics
We are really just scratching the surface of version control. The following topics will be explored in future assignments:
1. Tagging (milestones or releases)
2. Branching
3. Merging
Deliverable
By the end of the lab period today, you will have:
1. “Accepted” the GitHub Classroom assignment and created your repository.
2. Cloned and populated the repository with the current / initial version of your “vector” lab (including pushed initial version to remote).
3. Made at least one local change to your source code and added a .gitignore file to your repository and pushed those changes to remote.
Show the screen captures of each of the steps and submit the report to Canvas.
Note, you will be expected to complete Lab 6 independent of Lab 5 status.