Branching, collaborating, and undoing
Week 3 - Version Control - Part III (optional self-study material)
This page contains optional self-study material if you want to dig deeper into Git. Some of it may also be useful as a reference in case you run into problems while trying to use Git.
- Branching & merging
- Collaboration with Git: multi-user remote workflows
- Contributing to repositories: Forking & Pull Requests
- Undoing (& viewing) changes that have been committed
- Miscellaneous Git
1 Branching & merging
In this section, you’ll learn about using so-called “branches” in Git. Branches are basically parallel versions of your repository, which allow you or your collaborators to experiment or create variants without affecting existing functionality or others’ work.
1.1 A repo with a couple of commits
First, you’ll create a dummy repo with a few commits by running a script (following CSB).
cd /fs/ess/PAS2700/users/$USER/CSB/git/sandbox
Take a look at the script you will run to create your repo:
cat ../data/create_repository.sh
#!/bin/bash
# function of the script:
# sets up a repository and
# immitates workflow of
# creating and commiting two text files
mkdir branching_example
cd branching_example
git init
echo "Some great code here" > code.txt
git add .
git commit -m "Code ready"
echo "If everything would be that easy!" > manuscript.txt
git add .
git commit -m "Drafted paper"
Run the script:
bash ../data/create_repository.sh
Initialized empty Git repository in /fs/ess/PAS2700/users/jelmer/CSB/git/sandbox/branching_example/.git/
[main (root-commit) 3c59d8a] Code ready
1 file changed, 1 insertion(+)
create mode 100644 code.txt
[main 7ba8ca4] Drafted paper
1 file changed, 1 insertion(+)
create mode 100644 manuscript.txt
And move into the repository’s dir:
cd branching_example
Let’s see what has been done in this repo:
git log --oneline
7ba8ca4 (HEAD -> main) Drafted paper
3c59d8a Code ready
We will later modify the file code.txt, so let’s see what it contains now:
cat code.txt
Some great code here
1.2 Using branches in Git
You now want to improve the code, but these changes are experimental, and you want to retain your previous version that you know works. This is where branching comes in. With a new branch, you can make changes that don’t affect the main branch, and can also keep working on the main branch.
Creating a new branch
First, create a new branch as follows, naming it fastercode:
git branch fastercode
List the branches:
# Without args, git branch will list the branches
git branch
fastercode
* main
It turns out that you created a new branch but are still on the main branch, as the * indicates.
You can switch branches with git checkout:
git checkout fastercode
Switched to branch 'fastercode'
And confirm your switch with git branch:
git branch
* fastercode
main
Note that you can also tell from the git status output which branch you are on:
git status
On branch fastercode
nothing to commit, working tree clean
Making experimental changes on the new branch
You edit the code, stage and commit the changes:
echo "Yeah, faster code" >> code.txt
cat code.txt
Some great code here
Yeah, faster code
git add code.txt
git commit -m "Managed to make code faster"
[fastercode 21f1828] Managed to make code faster
1 file changed, 1 insertion(+)
Let’s check the log again, which tells you that the last commit was made on the fastercode branch:
git log --oneline
21f1828 (HEAD -> fastercode) Managed to make code faster
7ba8ca4 (main) Drafted paper
3c59d8a Code ready
Moving back to the main branch
You need to switch gears and add references to the paper draft. Since this has nothing to do with your attempt at faster code, you should make these changes back on the main branch:
# Move back to the 'main' branch
git checkout main
Switched to branch 'main'
What does code.txt, which we edited on fastercode, now look like?
cat code.txt
Some great code here
So, by switching between branches, the contents of your working dir have changed!
Now, while still on the main branch, add the reference, stage and commit:
echo "Marra et al. 2014" > references.txt
git add references.txt
git commit -m "Fixed the references"
[main 1bf123f] Fixed the references
1 file changed, 1 insertion(+)
create mode 100644 references.txt
Now that you’ve made changes to both branches, let’s see the log in “graph” format with --graph, also listing all branches with --all. Note how it tries to depict these branches:
git log --oneline --graph --all
* 1bf123f (HEAD -> main) Fixed the references
| * 21f1828 (fastercode) Managed to make code faster
|/
* 7ba8ca4 Drafted paper
* 3c59d8a Code ready
Finishing up on the experimental branch
Earlier, you finished speeding up the code in the fastercode branch, but you still need to document your changes. So, you go back:
git checkout fastercode
Switched to branch 'fastercode'
Do you still have the references.txt file from the main branch?
ls
code.txt manuscript.txt
Nope, your working dir has changed again.
Then, add the “documentation” to the code, and stage and commit:
echo "# My documentation" >> code.txt
git add code.txt
git commit -m "Added comments to the code"
[fastercode d09f611] Added comments to the code
1 file changed, 1 insertion(+)
Check the log graph:
git log --oneline --all --graph
* d09f611 (HEAD -> fastercode) Added comments to the code
* 21f1828 Managed to make code faster
| * 1bf123f (main) Fixed the references
|/
* 7ba8ca4 Drafted paper
* 3c59d8a Code ready
Merging the branches
You’re happy with the changes to the code, and want to make the fastercode version the default version of the code. This means you should merge the fastercode branch back into main. To do so, you first have to move back to main:
git checkout main
Switched to branch 'main'
Now you are ready to merge with the git merge command. You’ll also have to provide a commit message, because merging two branches that have diverged, like these, creates a new “merge commit”:
git merge fastercode -m "Much faster version of code"
Merge made by the 'ort' strategy.
code.txt | 2 ++
1 file changed, 2 insertions(+)
Once again, check the log graph, which depicts the branches coming back together:
git log --oneline --all --graph
* 5bb84cd (HEAD -> main) Much faster version of code
|\
| * d09f611 (fastercode) Added comments to the code
| * 21f1828 Managed to make code faster
* | 1bf123f Fixed the references
|/
* 7ba8ca4 Drafted paper
* 3c59d8a Code ready
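Whether a merge creates a new commit depends on whether the branches have diverged. Above they had (main gained the “Fixed the references” commit), so Git made a merge commit. The sketch below illustrates the other case, a so-called “fast-forward” merge, in a throwaway repo. The repo location, file contents, identity, and the branch name sidebranch are all made up for this illustration.

```shell
# Throwaway repo in a temporary dir (names and contents are made up for this sketch)
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main     # Call the first branch 'main', whatever Git's default is
git config user.name "Example User"       # Repo-local identity, only so that commits work
git config user.email "user@example.com"

echo "v1" > code.txt
git add code.txt
git commit -q -m "First commit"

# Commit on a side branch only: 'main' does not change in the meantime
git checkout -q -b sidebranch
echo "v2" >> code.txt
git commit -q -a -m "Change on sidebranch"

# Merge back: 'main' is simply moved forward to the tip of 'sidebranch'.
# '--ff-only' tells Git to refuse the merge if a fast-forward is not possible.
git checkout -q main
git merge --ff-only sidebranch

# Only the two commits made above exist: no extra merge commit was added
n_commits=$(git rev-list --count HEAD)
echo "Commits on main: $n_commits"
```

If you prefer to always record a merge commit, even when a fast-forward would be possible, git merge has a --no-ff option for that.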
Cleaning up
You no longer need the fastercode branch, so you can delete it as follows:
git branch -d fastercode
Deleted branch fastercode (was d09f611).
1.3 Branching and merging – Workflow summary
Overview of commands used in the branching workflow
# (NOTE: Don't run this)
# Create a new branch:
git branch mybranch
# Move to new branch:
git checkout mybranch
# Add and commit changes:
git add --all
git commit -m "my message"
# Done with branch - move back to main trunk and merge
git checkout main
git merge mybranch -m "Message for merge"
# And [optionally] delete the branch:
git branch -d mybranch
Exercise (Intermezzo 2.2)
- (a) Move to the directory CSB/git/sandbox.
Solution
cd /fs/ess/PAS2700/users/$USER/CSB/git/sandbox
- (b) Create a directory thesis and turn it into a Git repository.
Solution
mkdir thesis
cd thesis
git init
- (c) Create the file introduction.txt with the line “Best introduction ever.”
Solution
echo "Best introduction ever." > introduction.txt
- (d) Stage introduction.txt and commit with the message “Started introduction.”
Solution
git add introduction.txt
git commit -m "Started introduction"
- (e) Create the branch newintro and change into it.
Solution
git branch newintro
git checkout newintro
- (f) Overwrite the contents of introduction.txt, create a new file methods.txt, stage, and commit.
Solution
echo "A much better introduction" > introduction.txt
touch methods.txt
git add --all
git commit -m "A new introduction and methods file"
- (g) Move back to main. What does your working directory look like now?
Solution
git checkout main
ls # Changes made on the other branch are not visible here!
cat introduction.txt
- (h) Merge in the newintro branch, and confirm that the changes you made there are now in your working dir.
Solution
git merge newintro -m "New introduction"
ls
cat introduction.txt
- (i) Bonus: Delete the branch newintro.
Solution
git branch -d newintro
2 Collaboration with Git: multi-user remote workflows
In a multi-user workflow, your collaborator can make changes to the repository (committing to local, then pushing to remote), and you need to make sure that you stay up-to-date with these changes.
Synchronization between your repository and your collaborator’s happens via the remote, so now you will need a way to download changes from the remote that your collaborator made. This happens with the git pull command.
In a multi-user workflow, changes made by different users are shared via the online copy of the repo. But syncing is not automatic:
- Changes to your local repo remain local-only until you push to remote.
- Someone else’s changes to the remote repo do not make it into your local repo until you pull from remote.
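However, when your collaborator has made changes, Git will tell you about “divergence” between your local repository and the remote when you run git status, provided that your local repo already knows about those changes (which is the case after running git fetch):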
# (Don't run this)
git status
In a multi-user workflow, you should use git pull often, since staying up-to-date with your collaborator’s changes will reduce the chances of merge conflicts.
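The push/pull cycle can be tried out without GitHub. The sketch below is only an illustration under made-up assumptions: a local “bare” repository (remote.git) stands in for the online remote, and the two clones you and collaborator stand in for the two users. All dir names, file names, and identities are hypothetical.

```shell
# A local bare repo plays the role of the online remote (all names are made up)
base=$(mktemp -d)
cd "$base"
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo "Line 1" > notes.txt
git add notes.txt
git commit -q -m "First commit"
cd "$base"
git clone -q --bare seed remote.git       # The stand-in "remote"
git clone -q remote.git you               # Your local repo
git clone -q remote.git collaborator      # Your collaborator's local repo

# The collaborator commits locally and then pushes to the remote
cd "$base/collaborator"
git config user.name "Collaborator"
git config user.email "collaborator@example.com"
echo "Line 2" >> notes.txt
git commit -q -a -m "Change by collaborator"
git push -q

# Your repo is not updated automatically. After a fetch, 'git status' reports
# that you are behind; '-sb' gives a short form like '## main...origin/main [behind 1]'
cd "$base/you"
git fetch -q
git status -sb
behind=$(git rev-list --count HEAD..origin/main)
echo "Commits behind the remote: $behind"

# Pulling brings the collaborator's change into your working dir
git pull -q
cat notes.txt
```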
2.1 Add a collaborator in GitHub
You can add a collaborator to a repository on GitHub as follows:
- Go to the repository’s settings:
- Find and click Manage access:
- Click Invite a collaborator:
2.2 Merge conflicts
A so-called merge conflict means that Git is not able to automatically merge two branches, which occurs when all three of the following conditions are met:
- You try to merge two branches (including when pulling from remote: a pull includes a merge)
- One or more file changes have been committed on both of these branches since their divergence.
- Some of these changes were made in the same part(s) of file(s).
When this occurs, Git has no way of knowing which changes to keep, and it will report a merge conflict as follows:
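A minimal sketch that provokes such a conflict in a throwaway repo is shown below. The file name origin.txt and the branch name conflict-branch follow the example used further down in this section, while the repo location, identity, and commit messages are made up. When the merge fails, Git's message includes a line like “CONFLICT (content): Merge conflict in origin.txt”.

```shell
# Throwaway repo (location, identity and commit messages are made up for this sketch)
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
printf 'On the Origin of Species\nLine 2\n' > origin.txt
git add origin.txt
git commit -q -m "Initial version"

# Change line 2 on a side branch...
git checkout -q -b conflict-branch
printf 'On the Origin of Species\nLine 2 - from conflict-branch\n' > origin.txt
git commit -q -a -m "Edit line 2 on conflict-branch"

# ...and change the same line in a different way on main
git checkout -q main
printf 'On the Origin of Species\nLine 2 - from main\n' > origin.txt
git commit -q -a -m "Edit line 2 on main"

# All three conditions are now met, so the merge stops with a conflict.
# ('|| true' only keeps this script going after Git's non-zero exit status)
git merge conflict-branch -m "Merge conflict-branch" || true

# 'UU' in the short status marks a file with unresolved conflicts
git status --short
conflict_files=$(git diff --name-only --diff-filter=U)
echo "Conflicting file(s): $conflict_files"
```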
Resolving a merge conflict
When Git reports a merge conflict, follow these steps:
1. Use git status to find the conflicting file(s).
2. Open and edit those file(s) manually to a version that fixes the conflict (!).
Note below that Git will have changed these file(s) to add the conflicting lines from both versions of the file, and to add marks that indicate which lines conflict.
You have to manually change the contents in your text editor to keep the conflicting content that you want, and to remove the indicator marks that Git made.
On the Origin of Species       # Line preceding conflicting line
<<<<<<< HEAD                   # GIT MARK 1: Next line = current branch
Line 2 - from main             # Conflict line: current branch
=======                        # GIT MARK 2: Dividing line
Line 2 - from conflict-branch  # Conflict line: incoming branch
>>>>>>> conflict-branch        # GIT MARK 3: Prev line = incoming branch
3. Use git add to tell Git you’ve resolved the conflict in a particular file:
git add origin.txt
4. Once all conflicts are resolved, use git status to check that all changes have been staged. Then, use git commit to finish the merge commit:
git commit -m "Solved the merge conflict"
VS Code has some nice functionality to make Step 2 (resolving the conflict) easier:
code <conflicting-file> # Open the file in VS Code
If you click on “Accept Current Change” or “Accept Incoming Change”, etc., it will keep the desired lines and remove the Git indicator marks. Then, save and exit.
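The resolution steps can also be sketched end-to-end on the command line. This is only an illustration in a throwaway repo with made-up contents: it first recreates a conflict in origin.txt, and then “edits” the file by simply overwriting it, where you would normally use a text editor.

```shell
# Recreate a conflict on line 2 of origin.txt in a throwaway repo (made-up contents)
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
printf 'On the Origin of Species\nLine 2\n' > origin.txt
git add origin.txt
git commit -q -m "Initial version"
git checkout -q -b conflict-branch
printf 'On the Origin of Species\nLine 2 - from conflict-branch\n' > origin.txt
git commit -q -a -m "Edit line 2 on conflict-branch"
git checkout -q main
printf 'On the Origin of Species\nLine 2 - from main\n' > origin.txt
git commit -q -a -m "Edit line 2 on main"
git merge conflict-branch -m "Merge conflict-branch" || true

# Find the conflicting file(s)
git status --short

# Edit the file to the version you want; overwriting it here also removes
# the conflict marks that Git added
printf 'On the Origin of Species\nLine 2 - merged by hand\n' > origin.txt

# Tell Git that the conflict in this file is resolved
git add origin.txt

# Check that everything is staged, then finish the merge with a commit
git status --short
git commit -q -m "Solved the merge conflict"

# The new commit is a merge commit, i.e. it has two parent commits
git log --oneline --graph
n_parents=$(git log -1 --format=%P | wc -w)
echo "Parents of the last commit: $n_parents"
```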
3 Contributing to repositories: Forking & Pull Requests
3.1 What can you do with someone else’s GitHub repository?
In some cases, you may be interested in working in some way with someone else’s repository that you found on GitHub. If you do not have rights to push, you can:
- Clone the repo and make changes locally (as we have been doing with the CSB repo). When you do this, you can also periodically pull to remain up-to-date with changes in the original repo.
- Fork the repository on GitHub and develop it independently. Forking creates a new personal GitHub repo, to which you can push.
- Using a forked repo, you can also submit a Pull Request with proposed changes to the original repo: for example, if you’ve fixed a bug in someone else’s program.
If you’re actually collaborating on a project, though, you should ask your collaborator to give you push (write) access to the repo, for example by adding you as a collaborator, which makes things a lot easier than working via Pull Requests.
Forking a GitHub repository
You can follow along by e.g. forking my originspecies repo.
- Go to a GitHub repository, and click the “Fork” button in the top-right:
- You may be asked which account to fork to: select your account.
- Now, you have your own version of the repository, and it is labeled explicitly as a fork:
Forking workflow
You can’t directly modify the original repository, but you can:
- First, modify your fork (with local edits and pushing).
- Then, submit a so-called Pull Request to the owner of the original repo to pull in your changes.
- Also, you can easily keep your fork up-to-date with changes to the original repository.
Editing the forked repository
To clone your forked GitHub repository to a dir at OSC, start by creating a dir there — for example:
mkdir /fs/ess/PAS2700/users/$USER/week03/fork_test
cd /fs/ess/PAS2700/users/$USER/week03/fork_test
Then, find the URL for your forked GitHub repository by clicking the green Code button. Make sure you get the SSH URL (rather than the HTTPS URL), and click the clipboard button next to the URL to copy it:
Then, type git clone and a space, and paste the URL, e.g.:
git clone git@github.com:jelmerp/originspecies.git
Cloning into 'originspecies'...
remote: Enumerating objects: 31, done.
remote: Counting objects: 100% (31/31), done.
remote: Compressing objects: 100% (19/19), done.
remote: Total 31 (delta 4), reused 30 (delta 3), pack-reused 0
Receiving objects: 100% (31/31), done.
Resolving deltas: 100% (4/4), done.
Now, you can make changes to the repository in the familiar way, for example:
echo "# Chapter 1. Variation under domestication" > origin.txt
git add origin.txt
git commit -m "Suggested title for first chapter."
And note that you can push without any setup: because you cloned the repository, the remote setup is already done (and you have permission to push because it’s your own repo on GitHub and you have set up GitHub authentication):
git push
Creating a Pull Request
If you then go back to GitHub, you’ll see that your forked repo is “x commit(s) ahead” of the original repo:
Click Pull request, and check whether the right repositories and branches are being compared (and here you can also see the changes that were made in the commits):
If it looks good, click the green Create Pull Request button:
Give your Pull Request a title, and write a brief description of your changes:
Keeping your fork up-to-date
As you saw, you can’t directly push to the original repo but instead have to submit a Pull Request (yes, this terminology is confusing!).
But, you can create an ongoing connection to the original repo, which you can use to periodically pull to keep your fork up-to-date. This works similarly to connecting your own GitHub repo, but you should give the remote a different nickname than origin; the convention is upstream:
# Add the "upstream" connection
git remote add upstream git@github.com:jelmerp/originspecies.git
# List the remotes:
git remote -v
origin git@github.com:pallass-boszanger/originspecies.git (fetch)
origin git@github.com:pallass-boszanger/originspecies.git (push)
upstream git@github.com:jelmerp/originspecies.git (fetch)
upstream git@github.com:jelmerp/originspecies.git (push)
# Pull from the upstream repository:
git pull upstream main
4 Undoing (& viewing) changes that have been committed
Whereas on the first Git page, we learned about undoing changes that have not been committed, here you’ll see how you can undo changes that have been committed.
4.1 Viewing past versions of the repository
Before undoing committed changes, you may want to look at earlier states of your repo, e.g. to know what to revert to:
First, print an overview of past commits and their messages:
# (NOTE: example code in this and the next few boxes - don't run as-is)
git log --oneline
Find a commit you want to go back to, and look around in the past:
git checkout <sha-id> # Replace <sha-id> by an actual hash
less myfile.txt # Etc. ...
Then, you can go back to where you were originally as follows:
git checkout main
The next section will talk about strategies to move your repo back to an earlier state that you found this way.
If you just want to retrieve/restore an older version of a single file that you found while browsing around in the past, then a quick way can be: simply copy the file to a location outside of your repo, move yourself back to the “present”, and move the file back into your repo, now in the present.
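The “visit the past, copy the file out, come back” approach can be sketched as follows. This is only an illustration in a throwaway repo: the file name myfile.txt follows the example above, while its contents, the commit messages, and the temporary copy location are made up.

```shell
# Throwaway repo with two versions of a file (contents are made up for this sketch)
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo "Version 1" > myfile.txt
git add myfile.txt
git commit -q -m "First version"
echo "Version 2" > myfile.txt
git commit -q -a -m "Second version"

# Move to the second-to-last commit and look around in the past
old_sha=$(git rev-parse --short HEAD^)
git checkout -q "$old_sha"
cat myfile.txt                   # Shows the old contents

# Copy the old file to a location outside of the repo
rescued=$(mktemp)
cp myfile.txt "$rescued"

# Move back to the "present", and bring the old version along
git checkout -q main
cp "$rescued" myfile.txt

# The old version is now an uncommitted change in the working dir
git status --short
```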
git checkout
Note the confusing re-use of git checkout! We have now seen git checkout being used to:
- Move between branches
- Move to previous commits to explore (figure below)
- (Revert files back to previous states, as an alternative to git restore)
4.2 Undoing entire commits
To undo commits, i.e. move the state of your repository back to how it was before the commit you want to undo, there are two main commands:
- git revert: Undo the changes made by commits by reverting them in a new commit.
- git reset: Delete commits as if they were never made.
Undoing commits with git revert
A couple of examples of creating a new commit that will revert all changes made in the specified commit:
# Undo changes by the most recent commit:
git revert HEAD
# Undo changes by the second-to-last commit:
git revert HEAD^
# Undo changes by a commit identified by its checksum:
git revert e1c5739
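To see that git revert adds a commit rather than removing one, here is a minimal sketch in a throwaway repo. File names, contents, and commit messages are made up. The --no-edit option accepts Git's default commit message for the revert, so that no editor is opened.

```shell
# Throwaway repo with a "good" and a "bad" commit (made-up contents)
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo "good line" > notes.txt
git add notes.txt
git commit -q -m "Add good line"
echo "bad line" >> notes.txt
git commit -q -a -m "Add bad line"

# Undo the changes made by the most recent commit, in a new commit
git revert --no-edit HEAD

# The file is back to its earlier state...
cat notes.txt
# ...but the history now has three commits: nothing was erased
git log --oneline
n_commits=$(git rev-list --count HEAD)
echo "Number of commits: $n_commits"
```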
Undoing commits with git reset
git reset is quite complicated as it has three modes (--hard, --mixed (default), and --soft) and can act either on individual files or on entire commits. To undo a commit, and:
- Stage all changes made by that commit:
# Resetting to the 2nd-to-last commit (HEAD^) => undoing the last commit
git reset --soft HEAD^
- Put all changes made by that commit as uncommitted working-dir changes:
# Note that '--mixed' is the default, so you could omit that
git reset --mixed HEAD^
- Completely discard all changes made by that commit:
git reset --hard HEAD^
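The difference between the three modes is easiest to see in the output of git status after the reset. The sketch below is an illustration only: make_repo is a hypothetical helper function that creates a fresh throwaway repo with two commits (made-up file name and contents), so that each mode starts from the same situation.

```shell
# Hypothetical helper for this sketch: fresh throwaway repo with two commits
make_repo() {
    cd "$(mktemp -d)"
    git init -q
    git symbolic-ref HEAD refs/heads/main
    git config user.name "Example User"
    git config user.email "user@example.com"
    echo "first" > file.txt
    git add file.txt
    git commit -q -m "Commit 1"
    echo "second" >> file.txt
    git commit -q -a -m "Commit 2"
}

# --soft: the changes made by the undone commit are staged
make_repo
git reset -q --soft HEAD^
soft_status=$(git status --short)

# --mixed: the changes are in the working dir, but not staged
make_repo
git reset -q --mixed HEAD^
mixed_status=$(git status --short)

# --hard: the changes are gone altogether
make_repo
git reset -q --hard HEAD^
hard_status=$(git status --short)

# In the short status format, the 1st column shows staged changes and
# the 2nd column shows unstaged changes
printf 'soft:  [%s]\n' "$soft_status"
printf 'mixed: [%s]\n' "$mixed_status"
printf 'hard:  [%s]\n' "$hard_status"
```

In all three cases, git log would only show “Commit 1” afterwards.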
git reset erases history
Undoing with git revert is much safer than with git reset, because git revert does not erase any history.
For this reason, some argue you should not use git reset on commits altogether. At any rate, you should never use git reset for commits that have already been pushed online.
4.3 Viewing & reverting to earlier versions of files
Above, you learned to undo at a project/commit-wide level. But you can also undo things for specific files:
Get a specific version of a file from a past commit:
# Retrieve the version of README.md from the second-to-last commit
git checkout HEAD^ -- README.md
# Or: Retrieve the version of README.md from a commit IDed by the checksum
git checkout e1c5739 -- README.md
Now, you have the old version in the working dir & staged, which you can optionally check with:
# Optional: check the file at the earlier state
cat README.md
git status
You can go on to commit this version from the past, or go back to the current version, as we will do below:
git checkout HEAD -- README.md
git checkout
Be careful with git checkout: any uncommitted changes to this file would be overwritten by the past version you retrieve!
An alternative method to view and revert to older versions of specific files is to use git show.
View a file from any commit as follows:
# View the version of README.md from the last commit
git show HEAD:README.md
# Or: View the version of README.md from a commit IDed by the checksum
git show ad4ca74:README.md
Revert a file to a previous version:
git show ad4ca74:README.md > README.md
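One practical difference between the two methods is whether the old version ends up staged. The following is a minimal sketch in a throwaway repo, with made-up file contents and commit messages, that compares them using git status.

```shell
# Throwaway repo with an old and a new version of README.md (made-up contents)
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo "old text" > README.md
git add README.md
git commit -q -m "Old README"
echo "new text" > README.md
git commit -q -a -m "New README"

# Method 1: 'git checkout' puts the old version in the working dir AND stages it
git checkout HEAD^ -- README.md
via_checkout=$(git status --short)

# Go back to the current version
git checkout HEAD -- README.md

# Method 2: 'git show' with redirection only changes the file in the working dir
git show HEAD^:README.md > README.md
via_show=$(git status --short)

# 1st column = staged changes, 2nd column = unstaged changes
printf 'checkout: [%s]\n' "$via_checkout"
printf 'show:     [%s]\n' "$via_show"
```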
5 Miscellaneous Git
5.1 Amending commits
Let’s say you forgot to add a file to a commit, or notice a silly typo in something you just committed. Creating a separate commit for this seems “wasteful” or even confusing, and including these changes along with others in a next commit is also likely to be inappropriate. In such cases, you can amend the previous commit.
First, stage the forgotten or fixed file:
# (NOTE: don't run this)
git add myfile.txt
Then, amend the commit, adding --no-edit to indicate that you do not want to change the commit message:
# (NOTE: don't run this)
git commit --amend --no-edit
Because amending a commit “changes history”, some recommend avoiding this altogether. For sure, do not amend commits that have been published in (pushed to) the online counterpart of the repo.
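What “changing history” means here can be seen in a small sketch: after amending, there is still only one commit, but it has a different hash than before. The repo and the file names below are made up for this illustration.

```shell
# Throwaway repo (file names are made up for this sketch)
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo "analysis code" > script.sh
echo "helper code" > forgotten.sh

# Commit, but forget one of the files
git add script.sh
git commit -q -m "Add analysis scripts"
hash_before=$(git rev-parse HEAD)

# Stage the forgotten file and amend the previous commit
git add forgotten.sh
git commit -q --amend --no-edit
hash_after=$(git rev-parse HEAD)

# Still a single commit, now containing both files, but with a new hash
n_commits=$(git rev-list --count HEAD)
echo "Number of commits: $n_commits"
echo "Hash before: $hash_before"
echo "Hash after:  $hash_after"
git ls-files
```

The changed hash is why amending a commit that was already pushed causes trouble: the remote still has the commit with the old hash.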
5.2 git stash
Git stash can be useful when you need to pull from the remote, but have changes in your working dir that:
- Are not appropriate for a separate commit
- Are not worth starting a new branch for
Here is an example of the sequence of commands you can use in such cases.
Stash changes to tracked files with git stash:
# (Note: add option '-u' if you need to include untracked files)
git stash
Pull from the remote repository:
git pull
“Apply” (recover) the stashed changes back to your working dir:
git stash apply
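The stash sequence can be sketched in a throwaway repo as follows. This is an illustration with made-up file contents, and because this repo has no remote, the git pull step is only indicated by a comment.

```shell
# Throwaway repo with one commit and some uncommitted work (made-up contents)
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo "line 1" > notes.txt
git add notes.txt
git commit -q -m "First commit"
echo "work in progress" >> notes.txt

# Stash the uncommitted change: the working dir is clean afterwards
git stash
status_after_stash=$(git status --short)

# (This is the point where you would run 'git pull')

# Recover the stashed change
git stash apply
cat notes.txt

# 'git stash apply' keeps the entry in the stash list
n_stashes=$(git stash list | wc -l)
echo "Entries in the stash list: $n_stashes"
```

If you don't want to keep the stash entry around, you can remove it with git stash drop, or use git stash pop instead of git stash apply to apply and remove it in one go.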
5.3 A few more tips
Git will not pay attention to empty directories in your working dir.
You can create a new branch and move to it in one go using:
git checkout -b <new-branch-name>
To show commits in which a specific file was changed, you can simply use:
git log <filename>
“Aliases” (command shortcuts) can be useful with Git, and can be added in two ways:
- By adding lines like the below to the ~/.gitconfig file:
[alias]
  hist = log --graph --pretty=format:'%h %ad | %s%d [%an]' --date=short
  last = log -1 HEAD # Just show the last commit
- With the git config command:
git config --global alias.last "log -1 HEAD"
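To try out an alias without touching your personal ~/.gitconfig, you can leave out --global, in which case the alias only applies to the current repo. Here is a minimal sketch in a throwaway repo with made-up commits.

```shell
# Throwaway repo with two commits (made-up contents)
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo "a" > file.txt
git add file.txt
git commit -q -m "First commit"
echo "b" >> file.txt
git commit -q -a -m "Second commit"

# Without '--global', the alias is stored in this repo's own config (.git/config)
git config alias.last "log -1 HEAD"

# 'git last' now runs 'git log -1 HEAD'
git last

# Extra options are passed along to the aliased command
last_msg=$(git last --format=%s)
echo "Message of the last commit: $last_msg"
```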