Git Basics
==========
:author: Aaron Ball
:email: nullspoon@iohq.net

Git can be a very complicated thing. Someone once told me that we mere humans have a very difficult time with it at first. I myself have had a tremendous[ly difficult] time learning how to use Git (many thanks to http://marktraceur.info/[marktraceur] for all the help). It is an incredibly robust, and therefore very complicated, solution. What source code management system isn't though (especially one that is command line)?

This document should serve as a very high-level view of how to use Git. It will not cover advanced functionality such as http://git-scm.com/docs/git-cherry-pick[cherry-picking], http://git-scm.com/docs/git-merge[merging], http://git-scm.com/docs/git-rebase[rebasing], etc. If something is not documented here, please see the http://git-scm.com/docs[Git docs] or suggest it on the discussion page.

[[working-with-branches]]
Working with Branches
---------------------

Branches in Git are like tree branches. The Git repository itself is the trunk and the branches are the various projects in the repository. Typically (hopefully) these projects are related to each other. For example, a development project with a frequently changing database schema that you wanted to back up might have two branches: the files branch, where the code files are stored, and the database branch, where the database dumps are stored.

[[viewing-branches]]
Viewing Branches
~~~~~~~~~~~~~~~~

Viewing branches is simple. Type *git branch* and you should see output similar to the following, where the asterisk marks the branch that is currently checked out:

----
$ git branch
* database
  master
----

To use a different branch, the checkout command is required. In this case, we will switch from the _database_ branch to the _master_ branch.

Note: Some decompression happens here, so if the branch to be checked out is very large, this will likely take a few seconds.

----
$ git checkout master
Checking out files: 100% (6110/6110), done.
Switched to branch 'master'
----

[[commits]]
Commits
-------

Git does not have commitmentphobia. In fact, it loves commits as if it were its only purpose in life.

In most if not all source code management software, a commit is essentially a set of changes to be merged into the master repository.

To create a commit, there are several steps that need to take place. Firstly, the changed files to be pushed to the repository need to be added. For this, we use the _git add_ command.

----
$ git add ./ex1.blah
$ git add ./example2.blah
----

One handy bit for this is the _-A_ switch. If used, git will recursively add every file in the specified directory that has been created, changed, or deleted. This is very handy if many files were changed.

----
$ git add -A .
----

Once the changed files are set up for commit, we just need one more step. Run _git commit_ and you will be taken to a text editor (likely vi, unless a different editor is set in your Git configuration or environment) to add comments on your commit, so you and other developers know what was changed in case something is broken or someone wants to revert.

_This piece is key if you are using the git repository as a code repository rather than a versioning repository for backups. Please write meaningful comments._

There is actually one more piece to committing a change if you have a remote repository on another box or in a different location on the local box. So that other developers can pull the repository and get your changes, you need to _push_ your changes to the remote repository. To do this, we use the _git push_ command. Please see the link:#pushing-changes-to-the-remote-repository[Pushing Changes to the Remote Repository] section for more information on this.

[[logs]]
Logs
----

All of this commit and commit log business is a bit worthless if we can't look at logs. To look at the logs we use the _git log_ command. This will open up your system's pager (typically less) to view the logs for the current branch.
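
As a rough sketch of the add, commit, and log steps described so far, the following script builds a throwaway repository and walks through them. The temporary directory, the file contents, the commit message, and the author identity are all placeholders invented for illustration. The _-m_ and _--no-pager_ options are used only so the script does not stop to open an editor or a pager.

```shell
#!/bin/sh
# Sketch only: the repository, file contents, and identity are made-up placeholders.
set -e

repo=$(mktemp -d)            # throwaway working directory
cd "$repo"
git init -q .

# Identity for this repository only, so the commit does not fail on a fresh box.
git config user.name "Example User"
git config user.email "user@example.org"

echo "first draft" > ex1.blah
git add ./ex1.blah           # stage the changed file

# -m supplies the commit message inline instead of opening the editor.
git commit -q -m "Add first draft of ex1.blah"

# --no-pager prints the log directly instead of opening less.
git --no-pager log
```

Run interactively, you would normally leave off _-m_ and _--no-pager_ and let git open the editor and the pager for you.
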
If you wish to view the logs on a different branch, you can either check out that branch, or you can type __git log BranchName__.

A handy option for the _git log_ command is the _--name-status_ switch. If you use this switch, git will list all of the commit logs along with all of the files affected and what was done (modified, deleted, created, renamed) in each individual commit.

[[remote-repositories]]
Remote Repositories
-------------------

Git is a distributed code versioning system, which means that everyone who has pulled the repository has a complete copy of the original. This is really great for working remotely because you don't have to be online and able to talk to the remote repository to see change history.

[[adding-a-remote-repository]]
Adding a Remote Repository
~~~~~~~~~~~~~~~~~~~~~~~~~~

Git needs several things to add a remote repository. Firstly, it needs a local alias for the remote repository. It also needs a username to log in to the repo with, the IP address or hostname of the server hosting the repository, and the path to the actual repo directory on that server. With that, the command to add a remote repository looks somewhat like this:

----
git remote add origin gitman@someserver.org:repos/CleverProjectName
----

Now, let's break down what that all means since it seems a tad complicated.

[cols=",,,,,",options="header",]
|===
|git remote |add |origin |gitman |@someserver.org |:repos/CleverProjectName
|This is the command to work with remote servers in git.
|Tells git we are adding a remote.
|The local alias for the remote. Origin is typically used here.
|The username to log in to the remote server with.
|This is the server where the repo is stored.
|This is the path to the actual repository directory. Since it does not start with a /, it starts in the home directory of gitman (\~/).
|===

[[fetching-a-remote-repository]]
Fetching a Remote Repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Now that we have a remote repository added to our local git repository, we simply need to fetch the repo. To do this we use the _git fetch_ command. Here is where that alias from the remote add command comes in handy.

----
git fetch origin
----

This command will fetch all branches of the origin repository.

[[pushing-changes-to-the-remote-repository]]
Pushing Changes to the Remote Repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Now that we have a local copy of a repository to work on and have made some changes, some amount of code synchronization needs to take place with the origin repository so each of the developers can have the latest-and-greatest. Keep in mind that a commit only records your changes in your local copy of the repository. What needs to happen after a commit is to push the change to the origin repository so everyone else will also have access to your change set. To do this, we use the _git push_ command.

There are two parameters for this though. The first is the local alias for the remote repository (typically referred to as origin since presumably the remote server is where your repository originated). The second parameter is the branch name. Since we often have more than one branch, this is a good piece to pay attention to so you don't submit a database dump file to the code branch.

----
git push origin master
----

[[dealing-with-size-issues]]
Dealing with Size Issues
------------------------

Since git is a code versioning system that keeps every committed version of every file, its size can grow out of hand rather quickly, especially when dealing with binaries. Luckily, there is a handy command for this very situation: **git gc**.

This command compresses all of your repository branches in the context of each other. This can reduce the size of your local and/or remote repositories very effectively.
I have a repository that should be several gigabytes with about 60 commits per branch (it's a repo used for versioned backups), and _git gc_ reduced it to about 370 megabytes.
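
To get a feel for what _git gc_ does, a small experiment along the following lines may help. It is only a sketch: the temporary repository, the _dump.sql_ file name, the author identity, and the five commits are all invented for illustration, and the numbers it prints will be nowhere near those of the backup repository described above. The _git count-objects -v_ command reports how many objects are stored loose and how many are stored in packs.

```shell
#!/bin/sh
# Sketch only: the repository, file name, and identity are made-up placeholders.
set -e

repo=$(mktemp -d)            # throwaway working directory
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.org"

# Five commits, each one changing the same file.
for i in 1 2 3 4 5; do
    echo "dump revision $i" >> dump.sql
    git add dump.sql
    git commit -q -m "Backup $i"
done

echo "Before gc:"
git count-objects -v         # objects are still stored loose

git gc --quiet               # pack and compress the objects

echo "After gc:"
git count-objects -v         # objects now live in a pack
```

Comparing the _count_ and _in-pack_ lines before and after shows the loose objects being moved into a pack. The savings only become noticeable on repositories with a real history.
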