Dive into Git
Git is a decentralized version control system (VCS) created by Linus Torvalds. It is a very powerful and popular tool, used to develop the Linux kernel and many other famous projects. It is so appreciated that the kernel developers now wonder how they ever survived without it.
With so much praise, I wanted to test this tool. It’s not difficult to tame. However, the way it works is a bit different from other VCSs (especially SVN, which I had already used), mainly because of the decentralization that characterizes Git.
Indeed, while most version control software requires a central server to which everyone connects to send and receive changes, Git gives each user a local repository that can synchronize with those of other users. Moreover, it’s entirely possible to combine this decentralization with a master server, in order to have a repository that everybody can reach, just like a classic centralized VCS.
Understanding Git
Git lets you take snapshots of your project files. You specify which files to back up, then perform the backup (this is called a commit). When you commit, Git writes a manifest describing what your files look like at that moment. This avoids saving the whole directory each time.
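To make the idea of a manifest a bit more concrete, here is a minimal sketch in a throwaway repository. The temporary directory and the file contents are assumptions made for illustration, and the commands used are explained in the sections that follow. After a commit, git ls-tree lists what the snapshot recorded: one entry per file, each pointing to its stored content.

```shell
# Throwaway repository in a temporary directory (illustrative only)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Your Name'
git config user.email you@yourdomain.com

# Two small files with made-up contents
echo 'hello' > README
echo 'echo hello' > hello.sh
git add README hello.sh
git commit -q -m 'first snapshot'

# List the manifest recorded by the last commit:
# one line per file, each referring to a stored object
git ls-tree HEAD
```

A file that does not change between two commits keeps the same entry in the manifest, which is why Git does not need to save it again.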
Here are the basics of using Git.
Create and retrieve repositories
The first thing you need in order to use Git is a Git repository. There are two ways to get one: create it or retrieve it.
Create a repository: git init
To create a repository, simply run the command git init in the desired directory.
$ cd fake-project
$ ls
README hello.sh
$ git init
Initialized empty Git repository in ~/fake-project/.git/
Get a repository: git clone
You can get a Git repository via the Internet with the command git clone [URL]. This copies the files from the repository locally, along with the history of commits, branches, etc.
$ git clone git://github.com/octocat/Hello-World.git
Managing Snapshots
Before discussing the commands, you should first understand that a file can have four states:
- Unversioned: Git ignores this file.
- In the staging area: the file or file changes will be saved in the next commit.
- Modified: the file has changed since the last commit, but it is not in the staging area, and therefore will not be updated in the next commit.
- Committed: the file is saved and has not been modified since.
In summary, we will use git add to put one or more files in the staging area, git status and git diff to see the state and what changes have been made (or not) to files, and finally git commit to save.
Add files to the staging area: git add
Let’s return to our fake-project example. A git status shows us the state of the files in the repository.
$ git status -s
?? README
?? hello.sh
Question marks indicate that the files are not versioned. To put them in the staging area, we use git add.
$ git add README hello.sh
or
$ git add .
to add all the files from the repository (recursive mode).
A git status confirms that the files are ready to be committed:
$ git status -s
A README
A hello.sh
An important thing to understand: when files are committed, they are in the state they were in when the last git add was performed. So if a file changes after git add, these changes will not be included in the Git snapshot. We need to run git add again to save the latest version of the file.
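This behaviour is easy to check for yourself. The following is a small sketch in a throwaway repository (the temporary directory and the file contents are assumptions for illustration): the file is changed after git add, and the commit still records the staged version.

```shell
# Throwaway repository in a temporary directory (illustrative only)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Your Name'
git config user.email you@yourdomain.com

echo 'first version' > README
git add README                   # the staging area now holds 'first version'
echo 'second version' > README   # changed after git add, not added again

git commit -q -m 'my first commit'

git show HEAD:README   # prints the committed content: first version
git status -s          # README is still listed as modified
```

Running git add README followed by another commit would then save the second version.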
See the state of the repository’s files: git status
As we saw earlier, git status shows whether the files are ready to be committed, have been modified or deleted.
$ vim README
$ git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
# (use "git rm --cached ..." to unstage)
#
# new file: README
# new file: hello.sh
#
# Changed but not updated:
# (use "git add ..." to update what will be committed)
# (use "git checkout -- ..." to discard changes in working directory)
#
# modified: README
#
or with -s, for a shorter output:
$ git status -s
AM README
A hello.sh
View differences: git diff
The git diff command can be used in several ways. Without parameters, git diff shows the difference between what is in the staging area and what has been changed since.
$ vim README
$ git status -s
AM README
A hello.sh
$ git diff
diff --git a/README b/README
index 8b13789..6a4238f 100644
--- a/README
+++ b/README
@@ -1 +1 @@
-
+blablabla
The --cached parameter shows the changes that have been moved to the staging area.
$ git status -s
AM README
A hello.sh
$ git add .
$ git diff
$
$ git diff --cached
diff --git a/README b/README
new file mode 100644
index 0000000..6a4238f
--- /dev/null
+++ b/README
@@ -0,0 +1 @@
+blablabla
diff --git a/hello.sh b/hello.sh
new file mode 100644
index 0000000..e69de29
The git diff HEAD command shows all the changes since the last commit, whether they are in the staging area or simply modified.
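To compare the three forms side by side, here is a minimal sketch in a throwaway repository. The temporary directory and the line contents are assumptions for illustration. One change is staged and another is left in the working directory only.

```shell
# Throwaway repository in a temporary directory (illustrative only)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Your Name'
git config user.email you@yourdomain.com

echo 'line 1' > README
git add README
git commit -q -m 'my first commit'

echo 'line 2' >> README
git add README            # 'line 2' is now in the staging area
echo 'line 3' >> README   # 'line 3' exists only in the working directory

git diff            # staging area vs working directory: adds 'line 3'
git diff --cached   # last commit vs staging area: adds 'line 2'
git diff HEAD       # last commit vs working directory: adds both lines
```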
Create a snapshot of the staging area: git commit
Now that our files are in the staging area, we can save them using the git commit command. Before that, we set our name and email address, which will be recorded with the commit.
$ git config --global user.name 'Your Name'
$ git config --global user.email you@yourdomain.com
We can now commit. We provide a message which describes the commit with the -m option.
$ git status -s
A README
A hello.sh
$ git commit -m 'my first commit'
[master (root-commit) cba1144] my first commit
1 files changed, 1 insertions(+), 0 deletions(-)
create mode 100644 README
create mode 100644 hello.sh
And that’s that! The staging area has been saved and cleaned. A git status confirms it.
$ git status
# On branch master
nothing to commit (working directory clean)
A little diagram that summarizes what happened:
A git commit -a lets you commit all modified or deleted files without going through the git add step. This command does not commit new unversioned files: you need to add them to the staging area with git add at least once before you can commit them directly.
$ vim README
$ git commit -am 'My second commit'
[master 8151463] My second commit
1 files changed, 2 insertions(+), 0 deletions(-)
This way, we skip the “staging step” to jump directly to the snapshot.
It is possible to remove a file from the staging area with the git reset HEAD command.
$ git status -s
 M README
$ git add README
$ git status -s
M  README
$ git reset HEAD -- README
Unstaged changes after reset:
M README
To cancel the last commit: git reset HEAD^
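Here is a small sketch of what this does, in a throwaway repository (the temporary directory and the file contents are assumptions for illustration). Used this way, without any other option, git reset removes the commit from the branch history but leaves its changes in the working directory.

```shell
# Throwaway repository in a temporary directory (illustrative only)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Your Name'
git config user.email you@yourdomain.com

echo 'v1' > README
git add README
git commit -q -m 'my first commit'
echo 'v2' >> README
git commit -q -am 'My second commit'

git reset -q HEAD^    # cancel the last commit

git log --oneline     # only 'my first commit' remains
git status -s         # README is modified again: its changes were kept
```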
Manage files: git rm and git mv
The git rm command deletes a file from your Git repository. The file is literally removed from the working directory and the index. The --cached option deletes the file from the index only. The file then becomes unversioned and is therefore no longer managed by Git.
$ git rm --cached README
rm 'README'
$ ls
hello.sh README
$ git status -s
D README
?? README
$ git commit -m 'bye README'
[master a870e51] bye README
0 files changed, 0 insertions(+), 0 deletions(-)
delete mode 100644 README
$ git status -s
?? README
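For comparison, here is a sketch of git rm without --cached, in a throwaway repository (the temporary directory and the file contents are assumptions for illustration). This time the file disappears from the working directory as well.

```shell
# Throwaway repository in a temporary directory (illustrative only)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Your Name'
git config user.email you@yourdomain.com

echo 'hello' > README
echo 'echo hello' > hello.sh
git add .
git commit -q -m 'my first commit'

git rm -q README     # removed from the index and from the working directory
ls                   # only hello.sh is left
git status -s        # the deletion is staged, ready to be committed
git commit -q -m 'bye README'
```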
git mv renames a file and updates the Git index accordingly.
$ git mv oldname newname
is the same as
$ mv oldname newname
$ git add newname
$ git rm oldname
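The equivalence can be checked with a short sketch in a throwaway repository. The temporary directory, the file content and the name othername are hypothetical and chosen for illustration. Both routes leave the index in the same kind of state, and git status reports a staged rename in each case.

```shell
# Throwaway repository in a temporary directory (illustrative only)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Your Name'
git config user.email you@yourdomain.com

echo 'some content' > oldname
git add oldname
git commit -q -m 'add oldname'

# Route 1: git mv does everything in one step
git mv oldname newname
git status -s      # reports a staged rename of oldname to newname
git commit -q -m 'rename with git mv'

# Route 2: the same thing by hand, with a second (hypothetical) name
mv newname othername
git add othername
git rm -q newname
git status -s      # again reported as a staged rename
```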
Branches
We now come to one of Git’s great strengths: its branch management. Forget most of the nightmares of other VCSs: here, branches are quite simple to manage. For those who do not know the principle, branches let you create ramifications of your project. Imagine you want to add a feature to your project. Committing the changes directly to the main branch would make your software unstable until the development of this new feature is complete. To work around this problem, you can create a test branch, which will be a copy of your main branch, then make and test your changes without altering the main branch. Once the testing branch seems stable, you can merge the code into your main branch.
Branch management: git branch and git checkout
Let’s start by taking inventory. git branch tells us which branch we are working on:
$ git branch
* master
We have a branch called master. It’s preceded by a star, which means that we currently use it. It was created by default by git init.
Let’s create a branch with git branch [name], and then switch to it with git checkout [name].
$ git branch testing
$ git branch
* master
testing
$ git checkout testing
Switched to branch 'testing'
$ git branch
master
* testing
There is a command that combines the branch creation and the move: git checkout -b [name]
The testing branch has been created. It starts from the last commit of the master branch. Although identical so far, our branches are now separate: commits made on one will not be passed to the other.
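A quick way to see this separation is the following sketch, run in a throwaway repository (the temporary directory and the file contents are assumptions for illustration). The name of the starting branch is read from Git rather than hard-coded, since it depends on your Git configuration.

```shell
# Throwaway repository in a temporary directory (illustrative only)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Your Name'
git config user.email you@yourdomain.com

echo 'hello' > README
git add README
git commit -q -m 'my first commit'
base=$(git symbolic-ref --short HEAD)   # name of the branch we started on

git checkout -q -b testing
echo 'experiment' > trololo
git add trololo
git commit -q -m 'trololo commit'
ls                      # README and trololo

git checkout -q "$base"
ls                      # README only: the commit stayed on testing
```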
To delete a branch, use git branch -d [name].
You can’t delete a branch if you’re currently on it.
$ git branch
* master
testing
$ git branch -d testing
Deleted branch testing (was a870e51).
Merge branches: git merge
Branches are isolated, which allows you to make changes to one without affecting the other. What if we decide to incorporate changes from one branch into another? This is what git merge [branch] does: it merges the specified branch into the current branch.
Here is an example where we create a file and commit it in the testing branch, before merging it into the master branch:
$ git branch
* master
$ git checkout -b testing
Switched to a new branch 'testing'
$ ls
hello.sh README
$ git branch
master
* testing
$ touch trololo
$ git add trololo
$ git commit -m 'trololo commit'
[testing e49b586] trololo commit
0 files changed, 0 insertions(+), 0 deletions(-)
create mode 100644 trololo
$ ls
hello.sh README trololo
$ git checkout master
Switched to branch 'master'
$ git merge testing
Updating a870e51..e49b586
Fast-forward
0 files changed, 0 insertions(+), 0 deletions(-)
create mode 100644 trololo
Most of the time, Git merges your branches without any problem. But if the same block of code in a file is changed on both branches, it causes a conflict. In this case, we have to resolve the conflict by hand, in the file. Once done, a git add followed by a git commit concludes the merge.
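Here is a minimal sketch of such a conflict and its resolution, in a throwaway repository. The temporary directory, the file contents and the commit messages are assumptions for illustration. Resolving by hand is simulated here by simply overwriting the file with the final content.

```shell
# Throwaway repository in a temporary directory (illustrative only)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Your Name'
git config user.email you@yourdomain.com

echo 'hello' > README
git add README
git commit -q -m 'my first commit'
base=$(git symbolic-ref --short HEAD)   # name of the branch we started on

git checkout -q -b testing
echo 'hello from testing' > README
git commit -q -am 'change on testing'

git checkout -q "$base"
echo 'hello from the main branch' > README
git commit -q -am 'change on the main branch'

# Both branches changed the same line: the merge stops with a conflict
git merge testing || echo 'merge stopped: conflict in README'
git status -s                       # UU README

# Resolve by hand (here we just write the final content), then conclude
echo 'hello from both branches' > README
git add README
git commit -q -m 'merge testing, conflict resolved'

git log --oneline --graph           # the last commit joins both branches
```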
Branch information: git log
To trace the history of commits and merges on a branch, we have git log.
$ git log
commit e49b586f5cd44e053f3491094e484c160cd3b052
Author: jmathevet
Date: Wed Aug 16 20:00:14 2011 +0200
trololo commit
commit a870e513b9a2fdb8c6f0940b237dea59b240c5be
Author: jmathevet
Date: Wed Aug 16 18:59:31 2011 +0200
bye README
Project sharing
As already mentioned, Git is based on a decentralized architecture. Thus, any Git repository can be both client and server, which is very useful when working offline. That said, it is also useful to have a Git repository accessible to the whole team as a common base, which everyone can retrieve and modify locally before passing their changes back to the common repository.
Manage repository aliases: git remote
So that we don’t have to type the repository URL each time we need to access it, Git has an alias system. git remote displays the list of remote repositories saved for a repository. By default, if you cloned yours, an origin alias has been created automatically. git remote add [alias] [URL] adds an alias, and git remote rm [alias] removes one.
$ git clone http://android.git.kernel.org/platform/external/iptables
Cloning into iptables...
remote: Counting objects: 617, done.
remote: Compressing objects: 100% (347/347), done.
remote: Total 617 (delta 285), reused 576 (delta 263)
Receiving objects: 100% (617/617), 351.46 KiB | 436 KiB/s, done.
Resolving deltas: 100% (285/285), done.
$ cd iptables
$ git remote
origin
$ git remote -v
origin http://android.git.kernel.org/platform/external/iptables (fetch)
origin http://android.git.kernel.org/platform/external/iptables (push)
$ git remote add bob git://github.com/bob
$ git remote
bob
origin
$ git remote -v
bob git://github.com/bob (fetch)
bob git://github.com/bob (push)
origin http://android.git.kernel.org/platform/external/iptables (fetch)
origin http://android.git.kernel.org/platform/external/iptables (push)
$ git remote rm bob
$ git remote
origin
Update local branches from the remote repository: git fetch, git pull
You have cloned a repository and can commit anything you want locally. Meanwhile, a developer commits a new feature on the remote repository. Your work is not lost: you can get the changes via a git fetch [alias], and then merge them via a git merge [alias]/[branch]. The git pull [alias] command is a shortcut: it performs a fetch followed immediately by a merge.
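This scenario can be tried without any server. In the sketch below, a second local directory plays the role of the remote repository. The directory names remote-repo and local-repo, the file names and the commit messages are all hypothetical and chosen for illustration.

```shell
work=$(mktemp -d)
cd "$work"

# 'remote-repo' stands in for the remote repository (illustrative only)
git init -q remote-repo
cd remote-repo
git config user.name 'Your Name'
git config user.email you@yourdomain.com
echo 'hello' > README
git add README
git commit -q -m 'my first commit'
branch=$(git symbolic-ref --short HEAD)   # name of its branch
cd ..

# Our local copy; the origin alias is created by git clone
git clone -q remote-repo local-repo

# Meanwhile, a developer commits a new feature on the remote repository
cd remote-repo
echo 'new feature' > feature.txt
git add feature.txt
git commit -q -m 'new feature'

# Two steps: fetch, then merge
cd ../local-repo
git fetch -q origin
git merge -q "origin/$branch"
ls                                  # feature.txt has arrived

# Another change on the remote side, retrieved in one step with git pull
cd ../remote-repo
echo 'another feature' > feature2.txt
git add feature2.txt
git commit -q -m 'another feature'
cd ../local-repo
git pull -q origin "$branch"
ls                                  # feature2.txt has arrived too
```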
Send your branches to the remote repository: git push
Conversely, to send branches to the remote repository, use the git push [alias] [branch] command. If the branch you are sending already exists on the remote server, it will be updated; if it does not exist, it will be created.
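Both cases can be sketched locally. Here, a bare repository in a temporary directory plays the role of the remote server. The names shared.git and project, the file contents and the commit messages are hypothetical and chosen for illustration.

```shell
work=$(mktemp -d)
cd "$work"

# A bare repository stands in for the remote server (illustrative only)
git init -q --bare shared.git

git init -q project
cd project
git config user.name 'Your Name'
git config user.email you@yourdomain.com
echo 'hello' > README
git add README
git commit -q -m 'my first commit'
branch=$(git symbolic-ref --short HEAD)   # name of the current branch

git remote add origin "$work/shared.git"

git push -q origin "$branch"    # the branch does not exist remotely: created

echo 'more' >> README
git commit -q -am 'My second commit'
git push -q origin "$branch"    # the branch already exists: updated

git ls-remote origin            # the remote branch points to our last commit
```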
Wrap-up
We have just finished our discovery of Git. I hope I have made you want to get started. For my part, I am fully satisfied to have spent a few hours getting familiar with this software. I think it would now be very difficult for me to go back to a classic centralized VCS like SVN. Git is so convenient!
Some resources to go further:
- The Pro Git Book
- Git Magic