by Sophie Harbisher & Jack Walton
Surely storing files as I am doing now is OK...
The workshop takes the form of a single motivating example that gives context and introduces the commands.
We shall follow Dracula and Wolfman who are investigating if it is possible to send a planetary lander to Mars.
What we will learn:
How do I create an account on GitHub?
How do I set up Git?
It's simple to create an account on GitHub.
Go to https://github.com/ then enter a unique username, your email address and choose a password.
Congratulations - you've just created your GitHub account.
If you register as a student on GitHub you can access lots of additional software and features. This includes unlimited public and private repositories.
To be eligible for the GitHub Student Developer Pack, you must:
To register as a student go to https://education.github.com/pack.
If you are a student in the department of Mathematics, Statistics and Physics you have access to the school's GitLab server. This is similar to GitHub.
You already have an account - log in using your usual uni email and password.
The GitLab documentation provides additional information.
First, create a GitHub account.
In this workshop we use commands that work in Ubuntu. Other commands may be required to install git if you use other Linux distributions or Mac. Details can be found here.
Install Git using the following shell command.
$ sudo apt install git
Next we need to configure your user name and email address. Use the following commands.
$ git config --global user.name "CountVladTepes"
$ git config --global user.email "exampleemail0112358@gmail.com"
These are the basic options required to set up Git. For a list of other settings you may want to configure, use
$ git config -h
or access the Git manual using
$ git config --help
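The --global flag stores these settings for every repository on your machine. As a minimal sketch, assuming you want a different identity for one project only, the same commands without --global write to that repository alone. The temporary directory, name and email below are made up for illustration.

```shell
set -e
# Throwaway repository so the sketch does not touch your real settings.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Without --global, the values are stored in this repository's .git/config only.
git config user.name "Vlad Dracula"
git config user.email "vlad@tran.sylvan.ia"

# Read the values back to confirm what Git will use for commits made here.
name="$(git config --get user.name)"
email="$(git config --get user.email)"
echo "$name <$email>"
```

A repository-level value takes precedence over the global one, so this is one way to keep a university address for coursework and a personal address elsewhere.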
To install Git on Windows, go to https://git-scm.com/download/win where the download should start automatically.
Most of the default options are OK here. For this workshop we only need the Git command line.
What we will learn:
Repository - "A central location in which data is stored and managed."
How to create a repository using GitHub
Using Git to create a repository
README and .gitignore files
In this section we shall cover three different ways to make a repository.
To create a repository in GitHub you need to select New repository from the + menu
This will take you to a page which lets you choose a name for your new repository and enter a description
Create a directory locally where the project will be stored
Go to the directory using
$ cd path/to/directory
$ git init
First, create a repository in GitHub. You can then clone it to your local machine using the following command
$ git clone github_repo_path
There are two options for the repository path: https or ssh. You can copy the path to the clipboard from GitHub.
A README.md file is a Markdown file used to provide a brief introduction and description of the repository
This could describe the documents or detail how to compile/run code contained in the repository
When creating a repository in GitHub we have a tick box option to initialise with a README
For Git, use your favourite text editor to create a README.md file. For example,
$ nano README.md
A .gitignore file tells Git which files to ignore when adding files to your repository
Often, you ignore files which are generated from running the code in the repository
For example, if you had a repository for a LaTeX document you would want to add the .tex, .pdf, .sty, .cls files without the other files (.aux, .log) generated by your LaTeX compiler
GitHub allows you to specify a predefined .gitignore file when creating your repository (which can be edited to suit your individual needs)
In Git, we can define a .gitignore from the shell using a text editor
$ nano .gitignore
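As a rough sketch of what such a file might contain for the LaTeX case described above, the following writes two ignore patterns and checks their effect. The file name thesis.tex and the temporary directory are hypothetical.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Ignore the auxiliary files produced by the LaTeX compiler.
printf '%s\n' '*.aux' '*.log' > .gitignore

# Pretend we have just compiled a (hypothetical) thesis.tex.
touch thesis.tex thesis.pdf thesis.aux thesis.log

# --short lists one file per line; ignored files do not appear.
untracked="$(git status --short)"
echo "$untracked"

# git check-ignore prints those of the given paths that are ignored.
ignored="$(git check-ignore thesis.aux thesis.log thesis.tex || true)"
echo "$ignored"
```

Each line of .gitignore is a pattern, so *.aux covers every file ending in .aux anywhere in the repository.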
What we will learn:
cd - change directory
mkdir - make directory
ls - list directory contents
pwd - print working directory
cat - concatenate files and print to standard output
Dracula and Wolfman are using git and github to manage their next big move together.
Dracula creates a directory to work from:
$ cd ~/Documents/
$ mkdir planets
$ cd planets
We can use ls to inspect the contents of planets. The -a flag tells ls to not ignore entries starting with a dot.
$ ls -a
. ..
We tell git to make the directory planets a repository using init
$ git init
Initialized empty Git repository in /home/vlad/Documents/planets/.git/
If we look again at the contents of the directory we see git has created a hidden directory .git:
$ ls -a
. .. .git
Git uses this directory to store all the information about your project. If we delete .git we lose the project's history.
We can use git to check the status of our repository
$ git status
On branch master
Initial commit
nothing to commit (create/copy files and use "git add" to track)
Create a file mars.txt to make notes about Mars' suitability as a base.
Use whatever text editor you like here (gedit, nano, vim etc.)
Here we skip the text editor altogether and use echo to make our note
$ echo "Cold and dry, but everything is my favorite color" > mars.txt
Let's inspect our current setup:
$ pwd
/home/vlad/Documents/planets
$ ls -a
. .. .git mars.txt
$ cat mars.txt
Cold and dry, but everything is my favorite color
Now, if we consult git, we're told about a new file:
$ git status
On branch master
Initial commit
Untracked files:
(use "git add <file>..." to include in what will be committed)
mars.txt
nothing added to commit but untracked files present (use "git add" to track)
The "Untracked files" message tells us git is aware we created mars.txt but isn't tracking the file.
To instruct git to track a file we use the git add command
$ git add mars.txt
Check the status of our repository now that we are tracking mars.txt
$ git status
On branch master
Initial commit
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: mars.txt
So, git is tracking mars.txt but the changes haven't been recorded as a commit yet.
A commit is a record of a change. This change will be permanently recorded in our history, and can be reverted to at a later date.
To make a commit we use git commit. We also add a message which should be a short description of the changes made.
$ git commit -m "Start notes about possible base on Mars"
[master (root-commit) 207ef6f] Start notes about possible base on Mars
1 file changed, 1 insertion(+)
create mode 100644 mars.txt
When we run git commit, git takes everything we told it about with add and saves a copy permanently in the hidden .git directory.
Each permanent copy is called a commit, and has a unique identifier associated with it.
Our commit of mars.txt was given the unique (short) identifier 207ef6f
The -m flag allows us to add a short message to each commit. A good commit message should give a brief statement of any changes made, and should generally complete the sentence “If applied, this commit will”.
If we run git commit without the -m flag, git will launch the text editor configured in core.editor
If we check the status of our repository we see everything is up to date.
$ git status
On branch master
nothing to commit, working directory clean
We can use git log to look up recent changes made to our repository.
$ git log
commit 207ef6feac67c29478eace753ff95f18b944924a
Author: Vlad Dracula <vlad@tran.sylvan.ia>
Date: Tue Nov 27 17:26:56 2018 +0000
Start notes about possible base on Mars
Changes are listed in reverse chronological order.
Looking at the output of git log we can see the full commit identifier, the author of the commit, the time of the commit and the commit message.
Now, suppose Dracula wishes to add more notes about Mars.
Dracula could use any text editor, but here he opts to use echo
$ echo "The two moons may be a problem for Wolfman" >> mars.txt
Inspecting the contents of mars.txt:
$ cat mars.txt
Cold and dry, but everything is my favorite color
The two moons may be a problem for Wolfman
Using git status we are told that git already knows about mars.txt but that the file has been modified since the last commit.
$ git status
On branch master
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: mars.txt
no changes added to commit (use "git add" and/or "git commit -a")
Git knows that we have made changes to mars.txt.
But we haven't told git that we want to save these changes (which we do with git add)
Nor have we saved the changes (which we do with git commit)
We can use git diff to inspect the differences between the current file and the last commit:
$ git diff mars.txt
diff --git a/mars.txt b/mars.txt
index df0654a..315bf3a 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1 +1,2 @@
Cold and dry, but everything is my favorite color
+The two moons may be a problem for Wolfman
Let's break down the output of git diff:
diff --git a/mars.txt b/mars.txt
tells us the output is similar to that of the Unix diff command
index df0654a..315bf3a 100644
tells us which versions git is comparing. df0654a and 315bf3a are unique labels
--- a/mars.txt
+++ b/mars.txt
tells us the name of the file being changed
@@ -1 +1,2 @@
Cold and dry, but everything is my favorite color
+The two moons may be a problem for Wolfman
shows us the actual changes
Let's add and commit our new change
$ git add mars.txt
$ git commit -m "Add concerns about effects of moons of Mars on Wolfman"
[master 0f293e9] Add concerns about effects of moons of Mars on Wolfman
1 file changed, 1 insertion(+)
Consulting git log we can see our latest changes
$ git log
commit 0f293e9ed0f3727df26297c0bc454a49e94d9634
Author: Vlad Dracula <vlad@tran.sylvan.ia>
Date: Tue Nov 27 20:32:39 2018 +0000
Add concerns about effects of moons of Mars on Wolfman
commit 207ef6feac67c29478eace753ff95f18b944924a
Author: Vlad Dracula <vlad@tran.sylvan.ia>
Date: Tue Nov 27 17:26:56 2018 +0000
Start notes about possible base on Mars
You can think of Git as taking snapshots of changes through the life of a project:
git add specifies what will go into a snapshot (putting things in the staging area)
git commit actually takes the snapshot (records it as a commit)
If nothing is staged when you run git commit, git will suggest using git commit -a. This stages every modified tracked file for the snapshot.
However, git commit -a is generally discouraged.
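To see what git commit -a does and does not pick up, here is a small sketch in a throwaway repository. It assumes one tracked file (mars.txt) and one brand new file; venus.txt is a hypothetical name used only for this illustration.

```shell
set -e
# Identity for this sketch only (normally set once with git config).
export GIT_AUTHOR_NAME="Vlad Dracula" GIT_AUTHOR_EMAIL="vlad@tran.sylvan.ia"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q
echo "Cold and dry, but everything is my favorite color" > mars.txt
git add mars.txt
git commit -q -m "Start notes about possible base on Mars"

# Modify the tracked file and create a new, untracked one.
echo "The two moons may be a problem for Wolfman" >> mars.txt
echo "Far too hot" > venus.txt

# -a stages every modified tracked file, then commits in one step.
git commit -q -a -m "Add concerns about effects of moons of Mars on Wolfman"

# mars.txt was committed; venus.txt was never added, so it is still untracked.
after="$(git status --short)"
echo "$after"
```

Because -a sweeps up every modified tracked file without a review step, it is easy to commit changes you did not intend, which is one plausible reason for the advice above.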
Let's watch as our changes move from our editor to the staging area and then to long term storage.
$ echo "But the Mummy won't appreciate the lack of humidity" >> mars.txt
$ git diff
diff --git a/mars.txt b/mars.txt
index 315bf3a..523d756 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1,2 +1,3 @@
Cold and dry, but everything is my favorite color
The two moons may be a problem for Wolfman
+But the Mummy won't appreciate the lack of humidity
Now, let's add this change to the staging area and investigate what git diff tells us:
$ git status
On branch master
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: mars.txt
no changes added to commit (use "git add" and/or "git commit -a")
$ git add mars.txt
$ git diff
There is no output from this command as git sees no difference between what's staged and what's in the working directory.
However, we can use the --staged flag to see the difference between the last commit and what's currently staged.
$ git diff --staged
diff --git a/mars.txt b/mars.txt
index 315bf3a..523d756 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1,2 +1,3 @@
Cold and dry, but everything is my favorite color
The two moons may be a problem for Wolfman
+But the Mummy won't appreciate the lack of humidity
At this point Dracula spots a mistake in mars.txt. The final line was meant to read:
But the Mummy will appreciate the lack of humidity
instead of
But the Mummy won't appreciate the lack of humidity
Now we wish to correct this change before it is committed. Let us unstage mars.txt with the reset command
$ git reset mars.txt
Unstaged changes after reset:
M mars.txt
$ git status
On branch master
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: mars.txt
no changes added to commit (use "git add" and/or "git commit -a")
Inspecting the contents of mars.txt:
$ cat mars.txt
Cold and dry, but everything is my favorite color
The two moons may be a problem for Wolfman
But the Mummy won't appreciate the lack of humidity
We can then open our text editor and make our changes (or make them from the command line)
$ sed -i "s/won't/will/" mars.txt
$ cat mars.txt
Cold and dry, but everything is my favorite color
The two moons may be a problem for Wolfman
But the Mummy will appreciate the lack of humidity
We now finish this correction off by adding and committing our changes
$ git add mars.txt
$ git commit -m "Discuss implications of Mars' climate for Mummy"
[master 8edb142] Discuss implications of Mars' climate for Mummy
1 file changed, 1 insertion(+)
Sometimes a line-wise diff (the default of git diff) makes it difficult to see the changes that have been made to a file.
The option --color-words of git diff can be used to highlight changed words
The output of git log will eventually get too long to fit on your screen.
When this happens git will use a pager to display the output.
To get out of the pager press q
To move to the next page of the log press spacebar
To search for some_word anywhere in the log, type / followed by some_word. Find the next instance with n
To avoid git log taking over your screen you can limit the number of commits shown with the -N flag, where N is the number of commits to display.
$ git log -1
commit 8edb142c2cd6c9ce7aca04b2d8c9f511ed5d1d2e
Author: Vlad Dracula <vlad@tran.sylvan.ia>
Date: Tue Nov 27 20:34:35 2018 +0000
Discuss implications of Mars' climate for Mummy
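The -N flag can be combined with other git log options. As a sketch, the following rebuilds the three commits from this section in a throwaway repository and prints only the two most recent, one per line. The --oneline option is an extra not otherwise used in this workshop; it shortens each entry to the abbreviated identifier and the message.

```shell
set -e
# Identity for this sketch only (normally set once with git config).
export GIT_AUTHOR_NAME="Vlad Dracula" GIT_AUTHOR_EMAIL="vlad@tran.sylvan.ia"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q

echo "Cold and dry, but everything is my favorite color" > mars.txt
git add mars.txt
git commit -q -m "Start notes about possible base on Mars"

echo "The two moons may be a problem for Wolfman" >> mars.txt
git add mars.txt
git commit -q -m "Add concerns about effects of moons of Mars on Wolfman"

echo "But the Mummy will appreciate the lack of humidity" >> mars.txt
git add mars.txt
git commit -q -m "Discuss implications of Mars' climate for Mummy"

# Show only the two most recent commits, newest first, one line each.
recent="$(git log -2 --oneline)"
echo "$recent"
count="$(printf '%s\n' "$recent" | grep -c '')"
```

The identifiers printed will differ from those shown in these slides, since they depend on the commit time and author.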
Git does not track directories on their own, only the files within them.
Let's check the status of our repository:
$ git status
On branch master
nothing to commit, working directory clean
And create a new directory:
$ mkdir moons
Checking git status we may be surprised:
$ git status
On branch master
nothing to commit, working directory clean
If we want git to track moons we can add a file .gitkeep to the directory.
$ touch moons/.gitkeep
If we check our status now:
$ git status
On branch master
Untracked files:
(use "git add <file>..." to include in what will be committed)
moons/
nothing added to commit but untracked files present (use "git add" to track)
The name of the file is unimportant, i.e. there's nothing special about .gitkeep
git status shows us the status of a repository
git add puts files into the staging area
git commit saves the staged content
Commit messages should be used to give a description of changes in files
What we will learn:
What the HEAD is and how to use it
Identify and use git commit identifiers
Compare versions of tracked files
Revert to / recover previous versions of files
We've already seen that commits can be referred to by their identifiers.
The most recent commit of a repository can be referred to with the identifier HEAD
We've been adding one line at a time to easily track our progress in mars.txt. Let's add another:
$ echo "An ill considered change" >> mars.txt
$ cat mars.txt
Cold and dry, but everything is my favorite color
The two moons may be a problem for Wolfman
But the Mummy will appreciate the lack of humidity
An ill considered change
Let's use git diff to compare two versions.
$ git diff HEAD mars.txt
diff --git a/mars.txt b/mars.txt
index b36abfd..ca148ef 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1,3 +1,4 @@
Cold and dry, but everything is my favorite color
The two moons may be a problem for Wolfman
But the Mummy will appreciate the lack of humidity
+An ill considered change
Since nothing is staged, this is exactly the same as if we'd just typed git diff:
$ git diff
diff --git a/mars.txt b/mars.txt
index b36abfd..ca148ef 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1,3 +1,4 @@
Cold and dry, but everything is my favorite color
The two moons may be a problem for Wolfman
But the Mummy will appreciate the lack of humidity
+An ill considered change
So... what's the point?
Usefully, we can refer to commits before HEAD by adding ~N to refer to the Nth commit before HEAD
$ git diff HEAD~1 mars.txt
diff --git a/mars.txt b/mars.txt
index 315bf3a..ca148ef 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1,2 +1,4 @@
Cold and dry, but everything is my favorite color
The two moons may be a problem for Wolfman
+But the Mummy will appreciate the lack of humidity
+An ill considered change
Comparing the version of mars.txt in the current working directory with mars.txt two commits ago:
$ git diff HEAD~2 mars.txt
diff --git a/mars.txt b/mars.txt
index df0654a..ca148ef 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1 +1,4 @@
Cold and dry, but everything is my favorite color
+The two moons may be a problem for Wolfman
+But the Mummy will appreciate the lack of humidity
+An ill considered change
Sometimes it's useful to see the changes we made at an earlier commit, rather than compare the differences between two commits. git show allows us to do that:
$ git show HEAD~2 mars.txt
commit 207ef6feac67c29478eace753ff95f18b944924a
Author: Vlad Dracula <vlad@tran.sylvan.ia>
Date: Tue Nov 27 17:26:56 2018 +0000
Start notes about possible base on Mars
diff --git a/mars.txt b/mars.txt
new file mode 100644
index 0000000..df0654a
--- /dev/null
+++ b/mars.txt
@@ -0,0 +1 @@
+Cold and dry, but everything is my favorite color
Commits can also be referred to by their unique identifiers (which we can get from calls to git log).
Our first commit was given the ID 207ef6feac67c29478eace753ff95f18b944924a
$ git show 207ef6feac67c29478eace753ff95f18b944924a mars.txt
commit 207ef6feac67c29478eace753ff95f18b944924a
Author: Vlad Dracula <vlad@tran.sylvan.ia>
Date: Tue Nov 27 17:26:56 2018 +0000
Start notes about possible base on Mars
diff --git a/mars.txt b/mars.txt
new file mode 100644
index 0000000..df0654a
--- /dev/null
+++ b/mars.txt
@@ -0,0 +1 @@
+Cold and dry, but everything is my favorite color
Git lets us use the first few characters of a commit ID (rather than always requiring the full 40 character ID)
$ git show 207ef6f mars.txt
commit 207ef6feac67c29478eace753ff95f18b944924a
Author: Vlad Dracula <vlad@tran.sylvan.ia>
Date: Tue Nov 27 17:26:56 2018 +0000
Start notes about possible base on Mars
diff --git a/mars.txt b/mars.txt
new file mode 100644
index 0000000..df0654a
--- /dev/null
+++ b/mars.txt
@@ -0,0 +1 @@
+Cold and dry, but everything is my favorite color
We can see the changes we made at any point in our repository's history.
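If you want Git to produce the abbreviated identifier for you rather than copying characters by hand, git rev-parse can do so. This is a small sketch in a throwaway repository; git rev-parse is not otherwise covered in this workshop, and the identifiers it prints will differ from those in the slides.

```shell
set -e
# Identity for this sketch only (normally set once with git config).
export GIT_AUTHOR_NAME="Vlad Dracula" GIT_AUTHOR_EMAIL="vlad@tran.sylvan.ia"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q
echo "Cold and dry, but everything is my favorite color" > mars.txt
git add mars.txt
git commit -q -m "Start notes about possible base on Mars"

# Full identifier of the most recent commit.
full_id="$(git rev-parse HEAD)"
# Abbreviated form of the same identifier.
short_id="$(git rev-parse --short HEAD)"
echo "$full_id"
echo "$short_id"

# The abbreviated form resolves back to the same commit.
resolved="$(git rev-parse "$short_id")"
```

Any command that accepts a commit, such as git show or git diff, accepts either form.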
Now we wish to learn how to revert to earlier commits.
Suppose we accidentally overwrite our file:
$ echo "We will need to manufacture our own oxygen" > mars.txt
$ cat mars.txt
We will need to manufacture our own oxygen
Oh dear. We now want to revert mars.txt to the last committed version.
Normality can be restored with git checkout
$ git checkout HEAD mars.txt
$ cat mars.txt
Cold and dry, but everything is my favorite color
The two moons may be a problem for Wolfman
But the Mummy will appreciate the lack of humidity
The command git checkout is used to restore an old version of a file.
We told git to recover the version of mars.txt recorded in HEAD.
However, we can use commit identifiers to go back further.
Using a commit ID:
$ git checkout 207ef6f mars.txt
$ cat mars.txt
Cold and dry, but everything is my favorite color
Let's put things back to how they were:
$ git checkout HEAD mars.txt
Actually, as it turns out, git checkout can be used to do lots of different operations.
If we had forgotten to type mars.txt in our previous command we would have entered 'detached HEAD' state:
$ git checkout 207ef6f
Note: checking out '207ef6f'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:
git checkout -b <new-branch-name>
HEAD is now at 207ef6f... Start notes about possible base on Mars
In 'detached HEAD' mode you can inspect the files in your repository exactly as they were in that commit.
$ cat mars.txt
Cold and dry, but everything is my favorite color
However, you shouldn't make changes in this state.
Instead reattach your HEAD:
$ git checkout master
Previous HEAD position was 207ef6f... Start notes about possible base on Mars
Switched to branch 'master'
$ cat mars.txt
Cold and dry, but everything is my favorite color
The two moons may be a problem for Wolfman
But the Mummy will appreciate the lack of humidity
What we will learn:
So far we have considered only local changes to repositories
How do we communicate these changes to our remote repository on GitHub?
To push your local changes to the remote repository we use the following command
$ git push origin master
To pull the current version of the repository from the remote server use
$ git pull
The pull command actually combines two actions:
fetch to get all the changes from the remote repository
merge to combine the remote changes with any changes made locally
What we will learn:
What are branches?
How to use branching effectively
Branches are initially a copy of the master branch
Changes can be made on a branch without changing the master
This is particularly useful when code on master works but you want to try something new with it. If it works then the changes can be merged into master. If it does not work or breaks your code then master still contains the working code.
Branching structures are easy to implement and use!
In your repository on GitHub go to the drop-down menu and either choose or create a branch.
To create a new branch on your local machine use the following command
$ git checkout -b new_branch_name
This will switch you onto the new branch. To change your working branch use
$ git checkout new_branch_name
Push the branch to the remote repository using
$ git push origin new_branch_name
Fetch a branch from the remote repository (note if you don't fetch the branch then it will not appear locally), then switch to it
$ git fetch origin
$ git checkout new_branch_name
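The whole round trip can be tried without a GitHub account. In this sketch a bare repository in a temporary directory stands in for the remote, and two clones play Dracula and Wolfman. The branch name moon-notes, the file moons.txt and the directory names are made up for illustration. The last step spells out git pull as its two parts, fetch followed by merge.

```shell
set -e
# Identity for this sketch only (normally set once with git config).
export GIT_AUTHOR_NAME="Vlad Dracula" GIT_AUTHOR_EMAIL="vlad@tran.sylvan.ia"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work="$(mktemp -d)"
cd "$work"

# A bare repository stands in for the remote on GitHub.
git init -q --bare remote.git

# Dracula clones it (Git warns that it is empty, which is expected),
# makes a first commit and pushes his current branch.
git clone -q remote.git dracula 2>/dev/null
cd "$work/dracula"
echo "Cold and dry, but everything is my favorite color" > mars.txt
git add mars.txt
git commit -q -m "Start notes about possible base on Mars"
git push -q origin HEAD

# Wolfman clones the repository.
cd "$work"
git clone -q remote.git wolfman

# Dracula creates a branch, commits on it and pushes the branch.
cd "$work/dracula"
git checkout -q -b moon-notes
echo "Phobos" > moons.txt
git add moons.txt
git commit -q -m "Add notes on the moons of Mars"
git push -q origin moon-notes

# Wolfman fetches; only then can he switch to the new branch.
cd "$work/wolfman"
git fetch -q origin
git checkout -q moon-notes
wolfman_branch="$(git rev-parse --abbrev-ref HEAD)"

# Dracula pushes one more commit to the branch.
cd "$work/dracula"
echo "Deimos" >> moons.txt
git add moons.txt
git commit -q -m "Add second moon"
git push -q origin moon-notes

# Wolfman does by hand what git pull would do: fetch, then merge.
cd "$work/wolfman"
git fetch -q origin
git merge -q origin/moon-notes
wolfman_notes="$(cat moons.txt)"
echo "$wolfman_branch: $wolfman_notes"
```

Running fetch and merge separately gives you a chance to inspect the incoming changes (for example with git diff) before combining them with your own.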
What we will learn:
Working on a project with others using GitHub
Different roles - maintainer, developer etc.
Using branches on collaborative projects
Introduce workflows
GitHub and Git make it easy to collaborate on a project with other users!
Each (user) repository has one owner who is responsible for hosting the project but can have multiple collaborators. This is different if you have an organisation account with GitHub (not covered here).
The permissions for the different roles can be found on the GitHub user documentation
To add collaborators to a project, go to the settings tab in your repository and add collaborators via their GitHub username or email address.
When collaborating on a project, it is good practice to use branches for the development stage
Once the branch is approved it can then be merged into the master
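On GitHub the merge is usually done through the web interface, but the same step can be sketched locally with git merge. The branch name oxygen-notes is hypothetical, and the sketch looks up the name of the default branch rather than assuming it is master, since that depends on your Git settings.

```shell
set -e
# Identity for this sketch only (normally set once with git config).
export GIT_AUTHOR_NAME="Vlad Dracula" GIT_AUTHOR_EMAIL="vlad@tran.sylvan.ia"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q
echo "Cold and dry, but everything is my favorite color" > mars.txt
git add mars.txt
git commit -q -m "Start notes about possible base on Mars"

# Name of the default branch (master or main, depending on configuration).
main_branch="$(git rev-parse --abbrev-ref HEAD)"

# Develop on a separate branch.
git checkout -q -b oxygen-notes
echo "We will need to manufacture our own oxygen" >> mars.txt
git add mars.txt
git commit -q -m "Note need to manufacture oxygen"

# Back on the default branch the file is unchanged: still one line.
git checkout -q "$main_branch"
before="$(grep -c '' mars.txt)"

# Once the work is approved, merge the branch in.
git merge -q oxygen-notes
after="$(grep -c '' mars.txt)"
echo "$before -> $after"
```

Until the merge, anything that goes wrong on oxygen-notes leaves the default branch untouched.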
What we will learn:
You can keep track of which version of your code produced different outputs by embedding the information in your executables.
From there this can be included in any output text or data files - this makes it very simple to check exactly what version of a code generated a certain output.
An example output file might be something like the following.
git version - v0.0.1
git commit - 65c92648502e5693f5b530e798d10e10895682d8
git date - 2019-11-13T20:13:04+13:00
build date - 2019-11-13T21:04:55+13:00
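A header like the one above could be generated by asking Git for the details at run time. The following is a rough sketch rather than the method used to produce the example: it writes the header to a hypothetical results.txt, and git describe falls back to the abbreviated commit identifier because the throwaway repository has no tags.

```shell
set -e
# Identity for this sketch only (normally set once with git config).
export GIT_AUTHOR_NAME="Vlad Dracula" GIT_AUTHOR_EMAIL="vlad@tran.sylvan.ia"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q
echo "Cold and dry, but everything is my favorite color" > mars.txt
git add mars.txt
git commit -q -m "Start notes about possible base on Mars"

# Most recent tag if there is one, otherwise the abbreviated commit ID.
version="$(git describe --tags --always)"
# Full identifier of the commit the output was produced from.
commit="$(git rev-parse HEAD)"
# Commit date in ISO 8601 format.
commit_date="$(git log -1 --format=%cI)"
# Time at which this script ran, in UTC.
build_date="$(date -u +%Y-%m-%dT%H:%M:%SZ)"

{
  echo "git version - $version"
  echo "git commit - $commit"
  echo "git date - $commit_date"
  echo "build date - $build_date"
} > results.txt
cat results.txt
```

In a compiled project the same values would typically be passed to the build so they end up inside the executable; how to do that depends on the language and build system.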
Here are some links which provide more information on including version information in your output and a package in python which helps manage your versions.
Tags point to a specific point in the Git history. A tag is like a branch that does not change. Unlike branches, tags, after being created, have no further history of commits.
More information on tags can be found here.
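As a small illustration of a tag staying put while the branch moves on, the sketch below tags the first commit, adds another commit, and checks where the tag points. The tag name v0.0.1 and its message are examples only.

```shell
set -e
# Identity for this sketch only (normally set once with git config).
export GIT_AUTHOR_NAME="Vlad Dracula" GIT_AUTHOR_EMAIL="vlad@tran.sylvan.ia"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q
echo "Cold and dry, but everything is my favorite color" > mars.txt
git add mars.txt
git commit -q -m "Start notes about possible base on Mars"

# Create an annotated tag on the current commit.
git tag -a v0.0.1 -m "First notes on Mars"
tagged="$(git rev-parse 'v0.0.1^{commit}')"

# Carry on working: the branch moves forward by one commit.
echo "The two moons may be a problem for Wolfman" >> mars.txt
git add mars.txt
git commit -q -m "Add concerns about effects of moons of Mars on Wolfman"
head_now="$(git rev-parse HEAD)"

# The tag still points at the commit it was created on.
still_tagged="$(git rev-parse 'v0.0.1^{commit}')"

# git describe reports the nearest tag plus the distance from it.
described="$(git describe)"
echo "$described"
```

Tags are not sent by a plain git push; to share one, push it by name, for example git push origin v0.0.1.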
Versions of your project are useful for identifying which changes have been made. These are given as numbers indicating major, minor and patch changes, for example 2.0.1.
More information on versioning can be found here and a python package which identifies versions and declares dependencies can be found here.
What we will learn:
How to make the most of Git for thesis style documents
How Git can be used for collaborative documents
Split the document into separate files and use \include or \input to include them into a main file
Use a .gitignore for any files that are produced by TeX during the compile process
What we have learned:
Why it is beneficial to use version control
Set up GitHub accounts
How to create a repository
That we can track changes using the modify-add-commit cycle
Understand the staging in Git
Know how to explore changes & revert to previous versions
Know how to communicate between local and remote repositories
What branching is & how it can be used for collaborative projects
How we can benefit from using Git for large documents such as a thesis
These slides are available on Jack's teaching website
Atlassian provide good resources and tutorials on using git.
If you are a Windows user and prefer a graphical interface for Git then you may want to download Git Extensions (for free!). Here is a short guide to help you get started.
Interesting dropbox vs git blog: https://michaelstepner.com/blog/git-vs-dropbox/
Some of the work which appears in these slides is derived from work of Software Carpentry and was shared and adapted under the creative commons license