Git Fundamentals
-
A Brief History
We'll start with a brief history of version control so that we can understand where we've come from and how we got to where we are now. The very first version control systems were developed in the early 70s, operated on a single file, and had no networking support. These were systems such as SCCS and RCS. Because they operated on a single file, you could have a file such as foo.c and have multiple versions of that file, but there was no correspondence between different files within a repository. There was no notion that version 1.1 of foo.c went with version 1.1 of bar.c; the pairing could be arbitrary. So we only had single files. This led to the obvious innovation of a multi-file system, the second generation, which is exemplified by centralized version control systems such as CVS, Visual SourceSafe, Subversion, Team Foundation Server, and Perforce. All of these are multi-file centralized systems, so you can check out into a working copy on your local system all the files necessary for a particular version of a repository. Then along came the third generation, the distributed version control systems such as Git, Mercurial, Bazaar, and BitKeeper. These work on changesets. Changesets can be shipped around, and both clients and servers can have the entire repository present, which allows us to do some interesting things. So you can see this gradual evolution: going from single file to multi-file to changesets, and going from no networking to centralized to distributed, with additional capabilities added in each new generation of version control systems. If you want to read more about the history of version control, I will refer you to Eric Sink's article at this address.
-
Advantages of DVCS
Some of the advantages of a distributed version control system over a centralized one include the ability to have different topologies. If we want to use a centralized model, we still can, by having developers push their changes to one central repository. This is commonly done in enterprise environments. We can also use a hierarchical model. The hierarchical model has developers pushing their changes to a subsystem-based repository, and those subsystem repositories are periodically merged into a main repository. This is done in Linux kernel development because the Linux kernel is so large. There are separate subsystem repositories for graphics, networking, file systems, and other portions of the Linux kernel. Those subsystem repositories are periodically merged with the main Linux kernel so that development can continue on its way. We can also use a distributed model where developers push their changes to their own repository and then the project maintainers pull those changes into the official repository if they're deemed valuable. This is very common in open source projects on GitHub where, if you want to contribute changes, you can fork the main repository, make your changes, and then issue a pull request to the project maintainer. Another advantage of DVCS is that backups are extremely easy. A backup is simply a clone of the repository. So your failover strategy if something happens to your main server is very straightforward: simply stand up another server and clone the repository to it. Another advantage of DVCS is reliable branching and merging. Branching and merging is a very straightforward operation and doesn't entail the pain that you might be familiar with from large merges in centralized version control systems. This allows us to do things like feature branches or bug fix branches. So if we're creating a new feature, we create a separate branch for that and eventually merge it back into our mainline of work.
This allows us to always work under version control. Even if we've got a set of changes that might not pass all of our tests at the moment, we can still commit it locally so that we are working under version control and have stable rollback points, and only then push that change up to our central server or share it with the public when it's ready for public consumption. We can also easily apply fixes to different branches. If we've made a fix on a version one branch, we can pull it into our master or mainline work very easily by taking that patch and applying it onto different branches. DVCSs make this operation easy. We've also got the full local history for the repository, which allows us to do some very interesting things like computing repository statistics on our local machines very quickly. We can also analyze regressions. Most DVCSs have the notion of a bisect command, which will search the repository looking for where a bug was introduced. So you can search back through your repository and find where a bug got introduced, which gives you additional information on how to fix it: what change actually caused that bug to appear in the codebase? It's not necessarily the last change that introduced the bug; it might not have been found for a while. So there are some interesting things that you can do by having a full local history present. Most operations in a DVCS are local operations. You can also introduce new ideas such as using your version control system for deployment. Heroku does this. You can actually do a git push heroku prod_branch, which pushes a local branch, which I'm calling prod_branch here, up to a server on Heroku. Heroku will then look at the repository and deploy our solution from there.
So, we can use DVCSs in new and interesting ways for doing reliable branching and merging, different server topologies, computing repository statistics, performing deployments. There's a lot of interesting things that can be done because we have a complete repository at our fingertips.
-
About Git
Git was created by Linus Torvalds, who is also the creator of Linux. Git's creation was prompted by the Linux-BitKeeper separation. BitKeeper is a commercial DVCS that was used by the Linux kernel team from 2002 to 2005. When BitKeeper decided to stop supplying the Linux kernel team with free licenses, Linus started up the Git project in 2005 to create their own DVCS. It's written in Perl and C and runs on a wide variety of operating systems including Linux, Mac OS X, Windows, and many other commonly used operating systems available today. Its main design goals include speed, simplicity, strong branching and merging support, a fully distributed nature, and the ability to scale well for large projects. Remember, this was designed to be used on the Linux kernel, which is a very large piece of software.
-
Installing Git on Windows
Let's look at how we can install Git onto a variety of operating systems. First up is Windows, where I would recommend using the msysgit project. On the msysgit site I can go to the Downloads tab and download the very latest, which is the full installer for Git 1.7.10. We'll download that to our downloads folder and launch the installer. Let's walk through the setup wizard and some of the options that you are going to want to set. We will select the default install folder. I personally do not like to have an icon on the Desktop, and I don't really need Git in the Quick Launch either. We can choose to have Windows Explorer integration, but if you want Windows Explorer integration I would highly recommend looking at Git Extensions. Stay away from TortoiseGit, as that is an older project and much more closely mimics TortoiseSVN. It doesn't expose the full power of Git. We will leave the rest of these options on, and I'll choose to install the TrueType fonts for all console windows. The program group is fine. Now, this is the important menu item. I can adjust my PATH to include the Git commands so that I can use a normal Command Prompt and PowerShell. By default, Git is only available in Git Bash. I personally don't mind including some Unix tools in my Windows Command Prompt. This is only going to replace, for instance, the find command, which isn't commonly used on Windows anyway. It replaces it with the much more powerful Unix version, which I think is a good overall change. So, yes, there's a big red warning, but I'm going to run both Git and the Unix tools from the Windows Command Prompt. Next, we can choose the line-ending style. By default, Git only has line feeds in the repository. On Windows, we use both carriage returns and line feeds as line endings.
So I can choose which way I want to deal with line endings. Some people will advocate checkout as-is, commit as-is, which means that your repository will have carriage return, line feeds. It really depends on whom you're sharing with. If this is going to be a cross-OS project that is going to be buildable on Windows, Mac, and Linux, you want to use the first option. If you are only going to be working on Windows, you do have the option of checkout as-is, commit as-is. These days, I would recommend the first option, which is the default. We'll go ahead and install Git. All right, now that's done, I will click Finish and bring up PowerShell. PowerShell is here, so I can type git version and see that msysgit 1.7.10 is in fact installed. If I change to my code directory, I can now make a directory called test, change to test, and do a git init to create a repository. I've successfully created a repository, which assures me that Git is now working on the system.
-
Installing Git on Mac OS X
If you're using Mac OS X, you can use Homebrew to install Git using brew install git. If you're not using Homebrew, you can also download a DMG package, which will allow you to install Git onto your system. On OS X, installing Git using Homebrew is very straightforward: just do a brew install git. To verify that it was successfully installed, we can do a git version and see that the correct version is now available at our command prompt.
-
Installing Git on Linux
If you're using Linux, you can use apt-get install git-core on Debian and Ubuntu distros, or yum install git-core on Fedora. Most other package managers have Git available. You'll just have to check your distro for instructions. Let's see how we can install Git onto Ubuntu. I'll do a sudo apt-get install git-core and agree to the prompt. If I now do a git version, we can see that Git has been installed and is now ready to use.
-
Configuring Git
Now that Git is installed on your system, let's look at how we can configure it. Git provides three different configuration stores. The first of these is the system-level configuration, and it's stored in /etc/gitconfig or, if you're on Windows, in Program Files\Git\etc\gitconfig. This configuration applies to the entire computer that Git is installed on, and you can access it by using git config --system. The second level is the user level. We use git config --global; it's global for a particular user, and it's stored in the user's home directory in a file called .gitconfig. The last is the repository-level configuration. You access this by using git config without any specifier, and it's stored in the .git/config file in each repo.
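As a rough sketch of where these stores live, the commands below write one setting at the user level and one at the repository level, then print the file each one landed in. HOME is pointed at a throwaway directory purely so the example does not touch a real .gitconfig; the directory name, user name, and email address are placeholders and not part of the course.

```shell
# Sandbox assumption: a throwaway HOME keeps your real ~/.gitconfig untouched.
export HOME="$(mktemp -d)"
# Some environments relocate the user-level file through these variables; clear them here.
unset XDG_CONFIG_HOME GIT_CONFIG_GLOBAL

mkdir "$HOME/demo" && cd "$HOME/demo"
git init -q

# User level: written to ~/.gitconfig
git config --global user.name "Example User"

# Repository level (no specifier): written to .git/config in this repo
git config user.email "example@example.com"

# System level would be "git config --system ...", which usually needs
# administrator rights, so it is not run in this sketch.

cat "$HOME/.gitconfig"
cat .git/config
```

Each file is plain text with a section header such as [user] followed by name = value lines.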
-
How to Configure Git
Let's see how we can configure Git. Right now I don't have a configuration file. It doesn't exist yet. It is not very common to modify the system level with git config; it's much more common to modify the global, or user-level, config and the repository-based one. So let's start off with a git config --global --list to list all global options. It says that file does not exist, which we already knew. I'm now going to do a git config with the --global option, because I'm going to configure some global user options: git config --global user.name with James Kovacs as the value, and then a git config --global user.email. Now if I run git config --global --list, you will see that we've got both user.name and user.email set. If I cat that .gitconfig file, you'll see that it is a simple file of name-value pairs, and we've got a header called user with individual properties called name and email. I can add additional properties. Another common one that you are going to want to set up is git config --global core.editor. Your core editor is the default editor that you want to use when editing commit messages and other text that Git asks you to edit. If you're an Emacs user, you can use Emacs. If you are a Vim user like me, you can use Vim. If you want to use Notepad or Notepad++, all of these are possible. I'll also add another config option called help.autocorrect, and I'll set it to 1. To see what help.autocorrect 1 does, let's go to the git fundamentals directory and do a git status, a command that we'll see in just a second, but misspell it. With autocorrect, Git will do a fuzzy match on that command name and guess what you want to use instead. By setting it to 1, it waits 0.1 seconds before actually executing the command, so you are basically saying do it immediately. If you set help.autocorrect to 0, it doesn't do any autocorrecting.
If you set it to a higher number, then it will wait that many tenths of a second before performing the command. I find it helpful, especially when typing quickly. If you make a minor spelling error in a Git command, it will use a fuzzy match to determine which command you wanted to use. Another option that we are going to want to set is git config --global color.ui auto. With auto, Git uses colors to show a lot of its information, so when we're doing diffs or showing status, it will colorize the output. By setting it to auto, it's going to try to detect whether it's running within a script. If it's running within a script, it will not output any color escape sequences, so that logs are easier to parse, but if it detects that it's running within a terminal, it will output the escape codes to colorize the output. The last option that we're going to look at is git config --global core.autocrlf. So, what should Git do when it encounters carriage return, line feeds? There are a variety of options: we can use true, false, or input, and we'll talk about each of these in turn. True means to convert carriage return, line feeds into solely line feeds. When you commit to the repository, it will change the carriage return, line feed combination, which is typically used on Windows, into solely a line feed, which is then stored in the repository. When you check those files out, it will convert those text files back. It only performs this action on text files, not on binary files, so you are not going to corrupt your binary files. Another option is false, which says do nothing. That means commit carriage return, line feeds to the repository, store them there, and don't do anything when you pull them back out. If you're only doing Windows development with Git, then this option is fine.
But if you're doing cross-platform development, you will end up having carriage return, line feeds in your repository, which will then end up checked out onto other platforms like Linux, BSD, and Mac OS X, which don't generally use carriage return, line feeds. They use line feeds only. The last option is input, which means convert carriage return, line feeds into line feeds when you put files into the repository, but don't do any conversion on the way back out. So that said, where should you use each of these options? If you're on Windows, I would recommend using true: store solely line feeds in your repository for text files and convert them to carriage return, line feeds when you're pulling them out of the repository. On Mac or Linux, I would recommend using input, so that if you do happen to grab a Windows text file and it has carriage return, line feeds, it will be properly converted into a line-feed-only version in your repository. If you are doing Windows-only development and don't want Git messing around with your line endings, then you can use false. This can have consequences if you ever do check this out on a Linux or Mac system later on, as you'll have unexpected carriage return, line feeds. This is running on Mac OS X, so I'm going to use input. So let's check the result of all these configuration options. I'm going to change back out of this repository and do a git config --global --list to see all of the options that have been set. I can also see these options by catting my .gitconfig; there they are. If you're going to use diff tools, which we will talk about in module 4, there are some configuration options that you can specify for configuring your own personal favorite diff tool for performing diffs and merges. You'll just have to look up in the documentation for your diff tool how to configure Git for it. If you can't find it within the diff tool's documentation, often you will be able to find it on Stack Overflow or another information source.
Let's change to our git fundamentals directory, and here we have our .git folder, and it's got a config file as well. It specifies all the information for this repository. Now, I can say git config user.name and change it to something else, John Smith. And if I now do a git config --list, we can see that user.name has been overridden with John Smith at the very bottom. So these config sources are hierarchical. The user-level one overrides any system-level settings, and repo-level settings override user-level settings. So if you did want to change how, for instance, line endings were handled, you could do a git config core.autocrlf true to change it just for this repository. Now, if you want to remove something, you can do a git config --unset core.autocrlf to remove that setting, and we'll do the same thing for user.name. And if I do a git config --list, we can see that those settings have been stripped back out again. You can also simply edit the config files themselves; it's really up to you. If I go down here, there's an empty heading left over from those changes, so I'll go ahead and remove that. It has the same effect. It's up to you whether you prefer doing it in a text editor or using the git config commands.
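The settings walked through in this section can be sketched as one short session. This is only an illustration under some assumptions: HOME is redirected to a temporary directory so a real .gitconfig is not modified, and the names, email address, and demo directory are placeholders.

```shell
# Sandbox assumption: throwaway HOME so the real user-level config is left alone.
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME GIT_CONFIG_GLOBAL

# User-level (global) settings discussed above
git config --global user.name "Example User"
git config --global user.email "example@example.com"
git config --global core.editor vim
git config --global help.autocorrect 1
git config --global color.ui auto
git config --global core.autocrlf input   # the walkthrough suggests true on Windows

git config --global --list

# A repository-level value overrides the user-level one
mkdir "$HOME/demo" && cd "$HOME/demo"
git init -q
git config user.name "John Smith"
git config user.name        # prints: John Smith

# Removing the override falls back to the user-level value
git config --unset user.name
git config user.name        # prints: Example User
```

Running git config with just a key, as in the last lines, prints the effective value after all levels have been applied.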
-
Summary
In this module, we looked at a brief history of version control systems, where we came from and where we are now. We talked about some of the advantages of Distributed Version Control Systems. We started talking about git, what is it, where did it come from and what are some of its design goals? We saw how we could install git on a variety of different operating systems and how we could configure it for our own particular environment.
-
Working Locally with Git
Overview
In this module, we're going to look at working locally with Git. We're going to start by talking about how we can create a local repository, add files, and commit those pending changes to the repository. We'll view the history of commits made and look at differences between commits in the history. We'll cover the difference between the working copy, the staging area, and the repository itself, deleting files as well as updating files, and how we can clean the working copy of any extraneous files that shouldn't be there. And then we'll wrap up with how we can ignore files using a .gitignore for common things like log files or build artifacts that we don't want in a repository.
-
Creating a local repository, adding files, and committing changes
I have an empty directory which I'd like to turn into a local Git repository. I can do this by running git init. Git init has created a .git directory, which contains the repository and all of its metadata. I can add a file to this repository by first echoing Hello Git to a readme.txt file and running git status. That tells me that readme.txt is an untracked file; Git does not have it in its repository yet. I can run a git add and run status again, and Git now notices that that file is a new file and is staged to be added to the repository, which I can do by running git commit. Git commit brings up the default text editor, in my case Vim, though it could be Notepad, Notepad2, or any other text editor of your choice, and I can say that I added readme.txt. If I take a look at the history with git log, I can see that I have one commit in my repository now. You can see the author information, date information, as well as the comment I made. You can also see the commit SHA. Git identifies commits by a SHA-1 hash of the commit. Often, you can refer to these commits by simply using the first 5 to 8 characters of that commit SHA rather than the full SHA. Now, if I update that readme.txt file by adding a second line and run git status, Git knows about the readme.txt file and notes that it is modified. I can add all modified files by doing a git add -u, for updated, and running git status again, I've now staged that change to be committed. So I will do a git commit, and I can use the -m option, which allows me to provide the message directly inline so I don't need to go to my text editor.
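The steps above can be sketched roughly as follows. The temporary directory and the identity set with git config are placeholders so the example runs anywhere, and -m is used for both commits so that no editor opens.

```shell
# Work in a scratch directory (placeholder location)
cd "$(mktemp -d)"
git init -q

# Placeholder identity, stored only in this repository's config
git config user.name "Example User"
git config user.email "example@example.com"

echo "Hello, Git" > readme.txt
git status --short                 # ?? readme.txt  (untracked)

git add readme.txt                 # stage the new file
git commit -q -m "Added readme.txt"

echo "Hello again" >> readme.txt   # modify the tracked file
git add -u                         # stage all updated tracked files
git commit -q -m "Updated readme.txt"

git log --oneline                  # two commits, newest first
```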
-
Viewing history and diffs
If I do a log again, you can see I now have two commits in my Git repository, listed in reverse chronological order. So the most recent commit is at the top and earlier commits are at the bottom. Now, if I want to find what has changed between these commits, I can run a git diff. I can specify the initial commit, so that one, the very first commit to my repository, and the later commit, and that notes that the Hello again line has been added. It also provides some context around it so that I can see what other source lines are close to the changed lines. Now, always working with the SHA-1 hashes can be difficult, and so Git provides an easier way of specifying these things. The latest commit is known as HEAD. I can also go back from HEAD by using a tilde syntax, so HEAD~1 is one commit back from HEAD, and I can go from HEAD~1 to HEAD, which gives me the same diff. If I do not specify a commit, then Git assumes that I mean HEAD, so I can abbreviate this further to just HEAD~1 followed by two dots. Now, let's add a few files. Add file one, add file two, and if I look at status, both of these are untracked files. If I run a git add -u, that's going to add all updated files to my staging area. Git has this notion of a staging area, which holds the files or changes that are going to be added in the next commit. If I run a git add -u and run status again, you'll notice that nothing has changed. That's because the -u option only adds updated files, files that have changed that Git already knows about. These could be either changes to files or deletions of files; -u will notice those as well. I can either add these explicitly by name, so I can say file1.txt and file2.txt, or I can use the -A option. Capital A adds all files, including untracked ones.
You have to be careful when you're using this option to make sure that you're not accidentally adding files you don't intend to. Running git status, you can see that both files are staged and ready to be committed, so I'm going to say git commit -m added cool new feature. If I look at the log, you can see that I've got the new commit in there, and if I do a diff on HEAD~1, remember I'm going back from HEAD, so this is between the updated readme.txt commit and the added cool new feature commit, you can see that I've added two new files. They're empty files; if I cat file1.txt, it's empty, which is why we're not seeing any content. But those files have been added to the repository.
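A minimal sketch of the diff shorthand and of the difference between git add -u and git add -A follows. The first few lines only rebuild a two-commit repository in a scratch directory so the example is self-contained; the identity values are placeholders.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "example@example.com"
echo "Hello, Git" > readme.txt
git add readme.txt && git commit -q -m "Added readme.txt"
echo "Hello again" >> readme.txt
git add -u && git commit -q -m "Updated readme.txt"

# Two spellings of the same comparison: previous commit versus the latest
git diff HEAD~1 HEAD
git diff HEAD~1..          # the omitted right-hand side defaults to HEAD

# Two new, empty, untracked files
touch file1.txt file2.txt
git add -u                 # only stages changes to files Git already tracks
git status --short         # still shows ?? file1.txt and ?? file2.txt
git add -A                 # stages everything, including untracked files
git status --short         # now shows A  file1.txt and A  file2.txt
git commit -q -m "Added cool new feature"

git diff --name-only HEAD~1..   # lists the two files added by the last commit
```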
-
Staging changes as multiple commits
Now let's go ahead and edit file1.txt, adding some code here, and I'm also going to edit the readme, updating it with new information. If I do a git status, I've got two pending changes in my working copy, but they might be completely different. I might have been adding a feature or fixing a bug in my code and noticed a typo in the readme, so I've done two things at once. So I'd like to actually stage this as two different commits. Git allows me to do this easily. I can say git add file1.txt, and if I look at the status, you can see that the file1.txt changes have been staged but the readme.txt changes have not been. I can now git commit and say fixed bug #1234, and now I can say git add readme.txt, and git commit with a message saying I fixed a typo in readme.txt and added additional information about other features. So by having this staging area, I can pull in parts of my working copy at one time in order to break commits up into logical units. Now, you do want to be careful about this, because if I had multiple code changes, I haven't actually run the staged pieces on their own, without the rest. So be careful about what you're actually staging. You wouldn't want to accidentally pull in only part of a feature so that your commit doesn't build; it doesn't actually compile. So just be aware of that.
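Splitting one set of working-copy edits into two commits might look like the sketch below. The setup lines just create a small repository to work in; the file contents, identity, and commit messages are illustrative.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "example@example.com"
echo "Hello, Git" > readme.txt
touch file1.txt
git add -A && git commit -q -m "Initial files"

# Two unrelated edits sitting in the working copy at once
echo "some code" >> file1.txt
echo "more information" >> readme.txt

# First commit: only the code change
git add file1.txt
git status --short         # M  file1.txt (staged) and  M readme.txt (not staged)
git commit -q -m "Fixed bug #1234"

# Second commit: only the readme change
git add readme.txt
git commit -q -m "Fixed typo in readme.txt"

git log --oneline
```

In the short status output, a letter in the first column means the change is staged and a letter in the second column means it is still only in the working copy.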
-
Deleting and renaming files
Now let's say I realize that I don't need file two anymore. File two is no longer used, so I'll just remove file2.txt using an operating system command, and if I run a git status, Git notices that that file has been deleted. If I do a git add -u and a status again, that stages that deletion into the staging area for the next commit. Now, although I have that file deletion staged for the next commit, if I realize that there are additional changes that I want to make, I can still do so. So if I need to add file three, with some text for file three, and do a git status, I can go ahead and add this as well. If I realize that I need to move a file or rename it, I can do that using normal operating system commands. For instance, I can rename file one to a new file name, and if I do a git status, it will look to Git initially like there is a deletion of file one and a new untracked file. Now, I'm going to do a git add -A, which will add any deletions to the staging area as well as any new files, and let's take a look at what happens. Git is going to examine the contents, see that the contents of those two files are the same, and realize that this was a rename operation. So, now that all of these are in the staging area, I'll commit them as a reorganization of the feature, and there we go. And if I do a git status, all those changes have been committed to my repository. You'll also notice now that when I do a git log, I've got more commits than will fit on a single screen, and it's actually paging.
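Here is a small sketch of the delete, add, and rename sequence using ordinary operating system commands. The starting files, their contents, and the name new_file_name.txt are hypothetical stand-ins for what is on screen in the recording.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "example@example.com"
echo "Hello, Git" > readme.txt
echo "some code" > file1.txt
echo "other code" > file2.txt
git add -A && git commit -q -m "Starting point"

rm file2.txt                       # plain OS delete
git add -u                         # stages the deletion

echo "text for file three" > file3.txt
git add file3.txt                  # more changes can still join the same commit

mv file1.txt new_file_name.txt     # plain OS rename (hypothetical new name)
git add -A                         # stages the removal and the new file together
git status --short                 # the move is reported as a rename

git commit -q -m "Reorganized the feature"
git diff -M --name-status HEAD~1 HEAD
```

Git does not record renames explicitly; it infers them by comparing file contents, which is why the file in this sketch is given some content before it is moved.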
-
Undoing changes to the working copy
So let's take a look at what happens if I make an edit that I don't like. Let's go into the readme, and I'm going to just delete everything. If I do a git status, I've got a modified file. I realize, oh, I didn't actually want to make that change. I can do a git checkout of that file to pull it back out of the repository; by default it grabs the HEAD version. Looking at git status, I've got nothing to commit, so I have no pending changes. And if I look at the contents of the readme, it has all of its contents back again. So you can check out files from the repository in order to clean up or revert changes that you might have made by mistake or realized in hindsight were a bad idea. Now, let's say I go ahead and edit the readme file again, deleting all of the contents, and I'm also going to remove file1.txt, which, remember, we renamed to the new file name. If I do a git status, I have a bunch of changes. I could do individual checkouts on each of them. The other thing that I can do is a git reset. Here I'm going to do a git reset --hard, and looking at the status, I've reset my working copy back to HEAD, so I've actually removed all of those changes.
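Both ways of discarding working-copy changes can be sketched as follows. The repository contents and identity are placeholders, and the double dash before the file name is just a conventional way to tell Git that what follows is a path.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "example@example.com"
echo "Hello, Git" > readme.txt
echo "some code" > file1.txt
git add -A && git commit -q -m "Starting point"

# One unwanted edit: restore a single file from HEAD
: > readme.txt                 # empties the file
git status --short             #  M readme.txt
git checkout -- readme.txt     # pulls the HEAD version back into the working copy
cat readme.txt                 # prints: Hello, Git

# Several unwanted edits: throw them all away at once
: > readme.txt
rm file1.txt
git reset --hard               # working copy and staging area go back to HEAD
git status --short             # prints nothing
```

Note that git reset --hard discards uncommitted changes for good, so it is worth checking git status first.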
-
Undoing/redoing changes in the repository
Now, let's look at the log. That feature reorganization, I realize, hey, that didn't go quite as planned. I can actually do a git reset --soft HEAD~1. So let's take a look at what that actually did. If I look at the log, my HEAD now has that reorganization of the feature omitted. I've basically taken that last commit out of my repository. If I look at my status, you can see that all of the changes that I made in the reorganization have been moved back into the staging area, and they're also reflected in my working copy. So I can now go in and change that commit around. I can make corrections: I realize, oh, I forgot to run my unit tests and there's a broken one. I can roll that back out, make the fixes, and recommit a working commit into my repository. So let's go ahead and commit that with the same message as before. Let's do a git commit -m reorganized files for feature. If I do a git status, my working copy is clean. If I do a git log, that new commit is in there. Now, let's say that I decide there is something fundamentally broken with that commit. If I do a git reset --hard HEAD~1, that is going to move my HEAD commit back; it's going to delete that last commit, reorganized files for feature, and discard all the changes. So those changes are gone. If I do a git status, you can see I have nothing in my working copy, and if I do a git log, that commit has vanished entirely. There are ways to get it back, which we'll look at in a later module, but this allows me to fix things in my local repository before I might push to a public repository.
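The contrast between a soft and a hard reset of the last commit can be sketched like this. The two setup commits, the file names, and the identity are placeholders chosen to keep the example short.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "example@example.com"
echo "Hello, Git" > readme.txt
git add -A && git commit -q -m "Added readme.txt"
echo "text for file three" > file3.txt
git add -A && git commit -q -m "Reorganized the feature"

# Soft reset: drop the last commit but keep its changes staged
git reset --soft HEAD~1
git status --short             # A  file3.txt  (still staged, still on disk)

# Fix things up as needed, then commit again
git commit -q -m "Reorganized files for feature"

# Hard reset: drop the last commit and discard its changes entirely
git reset --hard HEAD~1
git status --short             # prints nothing
git log --oneline              # only the first commit remains
```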
-
Cleaning the working copy
Now, let's say that I've created a bunch of temp files, so I've got temp 1 and temp 2, and I look at a git status. So I've got these spare files kicking around. They could be build artifacts, they could be extraneous log files, it could be anything, and I'd like to clear these out. There is a git clean command which allows me to remove untracked files. If I run git clean by itself, by default it refuses, and I have to specify a force option. I can also specify -n, which means, what would it do? The result of this would be removing the temp1.txt and temp2.txt files, and the -f option actually performs the operation. So if I do a git status now, you can see that my working copy is once again clean. It's a very easy way to clean up: if you have stray files kicking around in your working copy, you can do a git clean to remove those.
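A dry run followed by the real clean might look like the following sketch, again in a scratch repository with placeholder names.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"          # placeholder identity
git config user.email "example@example.com"
echo "Hello, Git" > readme.txt
git add -A && git commit -q -m "Added readme.txt"

touch temp1.txt temp2.txt      # stray untracked files
git status --short             # ?? temp1.txt and ?? temp2.txt

git clean -n                   # dry run: lists what would be removed
git clean -f                   # actually removes the untracked files
git status --short             # prints nothing
```

Running the dry run first is a cheap safeguard, since files removed by git clean were never committed and cannot be recovered from the repository.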
-
Ignoring files with .gitignore
Now, let's say I do have something like a log directory. So let's make a directory called logs, and I'm going to add a log in there, log.txt, and do a git status. Logs are constantly changing as the application runs, and so I don't want to actually commit this to my repository. Git provides a .gitignore file. I can add a .gitignore to my root, and this specifies files that I don't want to commit to my repository. I can make a pattern relative, so if I say logs, that will match any logs directory anywhere, however deep, in my application. Or I can explicitly root it with a leading slash, and this is relative to the root of my repository. It is not the root of the file system; it's the root of the repository. So I can say /logs/*.txt to omit every text file in logs, or I might want to say /logs/*.log, or I can even say just /logs for anything in that logs directory. It's your choice how you want to put it together. If I now do a git status, you'll notice that there is one changed file: it's the .gitignore file. The .gitignore file does get committed to the repository and is then shared across the team, if you've got multiple people working in this repository, but logs is no longer being pulled in. So this is a great way to ignore build artifacts, binaries, log files, anything that you don't want committed to your repository. So I'm going to go ahead and add the .gitignore file and commit it: added .gitignore. Looking at the status, you can see that my working copy is clean. Looking at the log, you can see I've got this nice history of everything that has been done in this repository up to this point. We'll look in a future module at how we can search through this repository and find interesting commits, and we'll also be looking at how we can branch, so we can have a local feature branch, do some development, and merge back in.
We'll also be looking at how we can share this repository with the world so that multiple developers can be working on the same code base at the same time and be able to merge their changes back together.
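As a small illustration of the ignore rules described in this section, the sketch below builds a scratch repository with a logs directory and a rooted pattern. The directory layout and the pattern /logs/*.txt are assumptions chosen to match the walkthrough, not a recommended set of rules.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir logs
echo "started" > logs/log.txt

# The leading slash anchors the pattern to the repository root,
# not to the root of the file system.
echo "/logs/*.txt" > .gitignore

status=$(git status --porcelain)      # only .gitignore is reported as untracked
echo "$status"
```

Dropping the leading slash (a bare logs entry) would instead match a logs directory at any depth.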
-
Summary
In this screencast, we have looked at how we can create a local repository and add, update, and delete files in it; how we can commit a set of changes and break up larger sets of changes into individual commits; and how we can view the history of the repository and dig into the depths of that history. We talked a bit about the difference between the working copy, the staging area, and the repository itself, how we can clean up the working copy of any extraneous files, and lastly, we wrapped up with ignoring files with .gitignore.
-
Working Remotely with Git
Overview
In this module, we will look at Working Remotely with Git. To start with, we'll talk about cloning a remote repository to our local machine. We'll then look at how we can list off which remote repositories are associated with our local repository, how we can fetch changes from that remote, as well as merge those changes into our local working copy. We'll look at pulling from a remote, which is a combination of fetching and merging. We can also push changes remotely that we've made to our local repo. And finally, we'll wrap up with how we can work with tags.
-
Cloning a Remote Repository
Let's look at cloning a remote repository to our local machine. For this example, I'll use the jQuery repository on GitHub. I need to get my clone URL, which is the URL where the source is located. I'll be using the HTTP-based one and I'll copy it to my clipboard. Coming over to Git, I can say git clone and then provide the URL. This is going to download the entire history of the project, all of the commits that have ever been made to the jQuery repository. Now, if you've used other version control systems, you might think this will take quite a while. But even for a repository of a fair size such as jQuery, it only takes about 20 seconds to download all of the commits. And there we have it. Let's change into the jquery directory and take a look at what we've got. I'll do a git log and you can see that we've got a list of commits from the project. If I want to see a more condensed version, I can say git log and provide the --oneline option, so we have one commit per line.
-
Basic Repository Statistics
Now, in addition to --oneline, I might want to know how many commits are in this repository. I'll pipe the output to the word count command, wc, with -l to count lines. And you can see that there are 4,073 commits that we've downloaded from GitHub. We can get a slightly more interesting view by adding the --graph option, which draws a graph on the left-hand side showing the different branches and merges that have happened. You can see on that fourth line down that there was a separate branch that was then merged into the third line. So we can see how the history of this project has changed. I can continue on and you can see some more branches and merges for various pull requests that have been made. Someone might have branched or forked the repository and then issued a pull request back to the project saying, "Here's a bunch of fixes," and Git makes it very easy to incorporate these fixes back into the master repository, that central repository where all the coordination is done. Now, we can get a variety of stats on this Git repository because we've got the entire history right on our local machine. In addition to the log command, we've got shortlog, which summarizes the log by author. What does shortlog give us? It lists the authors and the commit messages from each of them. It also gives us the number of commits each has made. Right now, we're listing them in alphabetical order. I can also ask for a shortlog and specify -s for a summary, so I don't get the individual commit messages, -n to order the authors by number of commits, decreasing, and -e to include the users' email addresses. Displaying this, we can see that John Resig has made 1209 commits. He committed another 503 under a different user name, and Jorn Zaefferer has made 308.
This gives you some statistics regarding who has made these different commits. Now, if we want to take a deeper look, there are a variety of packages that will create Git statistics locally, but GitHub also provides a number of statistics for us. We can look at the Graphs option and see the contributors over time and the commit activity, and we can visualize additions and deletions. There are many, many statistics that can be computed over these Git repositories. So let's look at the contributors over time. GitHub has joined together John Resig's and jeresig's commits, grouping by email address and doing a bit of additional processing there, and you can see his commits to the project over time as well as those of the many others who have contributed. So you can see that a variety of statistics can be computed over a Git repository very easily, because the entire repository is local to your machine.
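The local statistics commands from this section can be tried on a small stand-in repository. This is a sketch under assumptions: the three commits and the author identity below are made up so the example runs without downloading jQuery.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
for n in 1 2 3; do
  echo "change $n" >> README.txt
  git add README.txt
  git commit -q -m "Change $n"
done

# One line per commit, so counting lines counts commits.
count=$(git log --oneline | wc -l | tr -d ' ')
echo "commits: $count"

git log --graph --oneline                 # same list with the branch graph on the left
git shortlog -sne HEAD                    # -s summary, -n sort by count, -e show emails
```

HEAD is passed to shortlog explicitly because, when it is not run from a terminal, shortlog without a revision reads log text from standard input instead.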
-
Viewing Commits
Coming back over to jQuery, we can also take a look at any of the commits that have been made. I can, for instance, look at HEAD. So what was the last commit made to jQuery? You can see that there was a copyright change and some other changes to the repository. Interestingly, the major thing that was done here is that we are reverting a previous commit. If I say git show HEAD~1, it is going to show the commit that was just reverted. We can see this too in git log --oneline: the first commit here, 247d, reverts the previous commit for issue 741, and that 532b is the commit that has been reverted. I can also do a git show HEAD~10, or I can provide a hash to git show. So, if I wanted to look at that very bottom commit, I pass its hash, and there it is right there. I've got the full history of all commits that have been made. I can also set my history back to look at what the entire source repository looked like. I can create a branch and look at the state of the repository at any given point in time. We'll be talking much more about branching in the next module. We can also take a look at git remote. Git remote shows that we've got one remote called origin. Origin is just Git's default name for where this source came from. If I add the -v option, for verbose, it will show the URLs, both the fetch and the push URL, for that particular remote. These can be different. For instance, you might be fetching from an HTTPS URL but want to push to an SSH-based one. We'll talk about reasons for doing that.
-
Git Protocols
Git can operate over a variety of different protocols. The first is http or https. These use the default ports of 80 and 443, though these can be configured just as with any http URL. These URLs allow both read and write access, and you can demand a password for one or both of reading and writing. So if you have a private repo, you can demand a password for reading as well as writing. More commonly, such as on GitHub, a public repository will allow anonymous read access but require a password for writes. These URLs are firewall-friendly and don't require configuration by your corporate IT infrastructure. There's also the git protocol. It operates on port 9418 and its URLs start with git://. This is a read-only URL and only allows anonymous access. It's also commonly used on GitHub. Its main disadvantage is that it isn't firewall-friendly. 9418 is not a well-known port that is commonly open, so you need to talk to your corporate IT department to open up that port if you want to pull repos down using that protocol. You can also use the SSH protocol on port 22. This is the standard secure shell that's very common in UNIX environments, and you can see the URL starts with git@. Git is the username used to log in to the remote system. It is both read and write, so if you have permissions, you can both read and write to this URL, and it uses SSH keys for authentication. So if you have given the Git server your public SSH key, the SSH protocol will use your private key to authenticate you. You don't need to provide a username and password. It's all done by the SSH infrastructure. The last protocol is the file protocol. There's no port associated with it. It is only useful for local operations, but it is both read and write.
So if you need to play around with cloning repos and pushing and pulling changes, and you want to do that locally, it's very easy to set up by just pointing Git to the fully qualified path name of that repo on your local system. Now let's list the entire directory contents. You can see there's a .git directory here. If I look at .git/config, and I'm just going to display the file, you can see there is that remote, origin, as well as the branch that we're working on, master. There's some additional data here that says where to fetch from and also, when we're merging, which branch to merge. We'll talk about merging very shortly.
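To make the URL forms concrete, the sketch below lists one URL shape per protocol as comments and then clones over a local path, the only form that needs no server. The host example.com, the team/project.git path, and the directory names source and copy are placeholders, not real repositories.

```shell
set -e
# URL shapes for the four protocols (host and paths are placeholders):
#   https://example.com/team/project.git    http/https, ports 80/443
#   git://example.com/team/project.git      git protocol, port 9418
#   git@example.com:team/project.git        SSH, port 22
#   /path/to/project.git                    local file path, no port
work=$(mktemp -d)
cd "$work"
git init -q source
cd source
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
echo "hello" > README.txt
git add README.txt
git commit -q -m "Initial commit"
cd ..

git clone -q "$work/source" copy          # clone using the fully qualified local path
cd copy
git remote -v                             # origin, with its fetch and push URLs
cat .git/config                           # the [remote "origin"] section and the tracked branch
origin_url=$(git config remote.origin.url)
```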
-
Viewing Branches and Tags
I can display all the branches in this repository with git branch. We only have our one local branch, called master. I can add the -r option, which will display remote branches. And you can see that there are a number of branches that have been shared remotely by the jQuery team. Branches are often used for temporary working copies or to separate mainline development from bug fixes. We can also look at the tags with git tag. These are stable, known points in your code base, and they are often used to tag versions. So here are all the different versions of jQuery that have been tagged by the jQuery team.
-
Fetching from a Remote
Let's switch over to our GitFundamentals repository. It is as it was before: if I do a git log, you can see the "Added .gitignore" commit. If I ask for the remotes, we don't have any. This is a local repository and does not communicate with any other repositories. The remote was added automatically when I cloned the jQuery repository. But if I have a local repository and I want to add a remote to it, I can use git remote add. I'm going to add origin, but I could call this anything I wanted; origin is an arbitrary name. And you can have more than one, so you can pull from multiple repositories. If someone sends you a pull request, you could add their public repository, which might be a fork of yours, and then pull their changes into your local working copy to examine them. So you can have multiple remotes, and this is commonly done in Git in order to evaluate patches or pull requests that have been made to your project. Now that the remote has been added, I can run a git fetch. Git fetch will pull down any changes from that remote repository; you can run it as many times as you want. If I have multiple remotes, I can specify the remote to fetch from. Now, if I look at git log, no changes have been incorporated into my local repository. If I do a git log on origin/master, which is the name of that remote branch, you can see that I've got this new "Updated README from another location" commit. That is a new commit that is in the remote repository but not in my local working copy. So how do I get it into my working copy? I can do a git merge and specify that I'm merging from origin/master into my current branch, called master. Often you have this correspondence between the local branch name and the remote branch name, though not always. Now that I've run that merge, I can do a git log to see that it's there. One thing I do want to point out is that this is a fast-forward.
That means that the local branch had everything up to the CACC commit but didn't have this 9523 commit. So Git was able to simply apply that new commit on top and didn't have to modify the code, merge changes from multiple streams, or create a new commit. It was able to fast-forward, basically moving the HEAD pointer to the new location. So I now have this commit from the remote location. We'll be talking much more about merging and branching in the next module, where things can get more complicated.
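The add, fetch, and merge sequence above can be reproduced entirely on one machine. In this sketch a second local repository named upstream stands in for the GitHub remote; that directory, the identity, and the commit messages are assumptions made for illustration.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Setup: "upstream" stands in for the repository hosted elsewhere.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
echo "first" > README.txt
git add README.txt
git commit -q -m "Added README"
cd ..
git clone -q upstream local               # local starts out identical
cd upstream
echo "update" >> README.txt
git commit -q -am "Updated README from another location"
cd ../local
git remote rm origin                      # start from a repository with no remotes

# The steps from the walkthrough:
git remote add origin "$work/upstream"    # origin is an arbitrary name
git fetch -q origin                       # download new commits; local master is untouched
git log --oneline origin/master           # the new commit is visible on the remote branch
git merge origin/master                   # fast-forward: master just moves to the new commit
```

Because local master had no commits of its own since the common point, the merge only moves the branch label and creates no merge commit.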
-
Pulling from a Remote
If I do a git branch -r, I can see that remote branch, origin/master, that I just merged from. Now, this act of doing a git fetch followed by a git merge origin/master is so common that Git has a shortcut for it. That shortcut is git pull. Git pull does exactly what we just did. Now, I've done a git pull, but there is no correspondence set up between my master branch and origin/master, the remote one. So Git is saying, "I don't know what to do." We could set this up by modifying that .git/config file, but Git 1.7 and above provides an easier way of doing it. I can say git branch --set-upstream, which sets what's called an upstream tracking branch. The upstream tracking branch answers the question: which remote branch does my local branch mirror? I'm going to establish the correspondence between my master branch, the local one, and the one coming from origin. Once I've set that tracking branch, I can do a git pull and pull any changes down. If I didn't want to establish this upstream tracking, I could always do a git pull origin master, specifying the remote name and the remote branch that I want to pull from. But it is very common to set the upstream branch and then just perform plain git pulls. When I cloned the jQuery repository, the act of cloning set up these upstream tracking branches automatically for me.
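Here is a hedged sketch of both forms of pull, again with a local directory named upstream standing in for the remote (an assumption, as are the identity and commit messages). Note that the --set-upstream spelling from the walkthrough has been deprecated since Git 1.8 in favor of --set-upstream-to, which this sketch uses.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Setup: "upstream" stands in for the remote repository.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
echo "first" > README.txt
git add README.txt
git commit -q -m "Added README"
cd ..

git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/upstream"
git pull -q origin master                 # explicit remote and branch: no tracking needed
git fetch -q origin                       # make sure origin/master exists locally

# Tell Git that local master mirrors origin/master.
git branch --set-upstream-to=origin/master master

# A new commit appears upstream...
(cd ../upstream && echo "more" >> README.txt && git commit -q -am "More changes")

git pull -q                               # ...and a plain git pull now fetches and merges it
```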
-
Pushing to a Remote
Now, let's take a look at what it takes to push changes back up to a remote repository. I'm going to edit my README file and say, "Sharing remotely is fun and easy." Looking at git status, we've got one modified file. I'm going to do a git commit -am, which adds any modified files that Git knows about. I don't have any new files, so I don't have to perform a separate git add on those. I can just do a -am and provide the message "Sharing is easy." So I've committed that change, and if I run a git status, it notes that there's nothing to commit, and it also notes that my branch is ahead of origin/master by one commit. There are pending changes that I need to push. So I'm going to do a git push, and I'm prompted for a GitHub username and password. Now, I could type these in here, but having to manage usernames and passwords isn't exactly ideal. So I'm going to do something slightly different. I'm going to remove the origin with git remote rm, so if I do a git remote -v, you'll see I don't have it anymore. And I'm going to re-add it, but this time with the SSH version of the URL. The advantage of the SSH version is that it's going to use my SSH key to authenticate with GitHub. So now when I do a git push, it's not going to prompt me for my password, it's simply going to push that change up to GitHub. Let's come over to our browser and go to the GitFundamentals repository on GitHub.com, and you can see, coming down here, that README.txt has been updated and has that new content in it. We've authenticated using the SSH key that I've configured for my user and that has been authorized by GitHub. So when you are pushing changes back up to a Git repository, it's easier to use the SSH URL rather than the http URL. Http requires a username and password, whereas the SSH version can use your SSH key to do the authentication for you.
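The push workflow can be rehearsed against a local bare repository before pointing at a real server. In this sketch hub.git stands in for the hosted repository, and the SSH URL with the account name example-user is hypothetical; the last step only rewrites the remote's URL and contacts nothing.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare hub.git                # stands in for the hosted repository
git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
git remote add origin "$work/hub.git"
echo "GitFundamentals" > README.txt
git add README.txt
git commit -q -m "Added README"
git push -q -u origin master              # first push; -u also records the upstream branch

echo "Sharing remotely is fun and easy." >> README.txt
git commit -q -am "Sharing is easy"       # -a stages files Git already tracks
git status -sb                            # reports master as ahead of origin/master by 1
git push -q                               # publish the pending commit
pushed=$(git -C "$work/hub.git" rev-parse master)

# Swap the remote for an SSH URL (hypothetical account name; nothing is contacted).
git remote rm origin
git remote add origin git@github.com:example-user/GitFundamentals.git
git remote -v
```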
-
Creating and Verifying Tags
Now that those changes are up there, let's look at noting something of interest. I want to tag my repository, so I say git tag and provide a name. I'm going to release this master branch right now as version 1, so I'm going to tag this as v1.0. And now if I do a git tag, we'll see that I've got a v1.0 tag. I'm now able to branch from that point; it's basically a stable point that refers to the 2232 commit at the very top. Regardless of what happens, that v1.0 tag is always going to point to 2232. That was an unsigned tag. I can also add an annotation, a message to associate with the tag, so this one is v1.0 with message. I can provide the -m option to give the message, "This is v1.0", or else it will bring up my default editor. And if I do a git tag, we now have two tags. A third option is to create a signed tag, which is done with the -s option, and I'll call it v1.0 signed. Signing a tag automatically requires a message, "Signing v1.0". Now I'm asked for my passphrase to unlock my signing key. I'll type that in, and if I do a git tag, you'll see I now have three tags. I can also say git tag with the -v option, which in this case means verify. Let's try verifying an unsigned tag, so I'll verify the v1.0 with message tag. It displays the actual tag and who tagged it. Me. But it notes that no signature was found, so it couldn't verify that this tag was actually created by James Kovacs. Now, if I verify v1.0 signed, you'll see that not only does it have the tagger, the tag name, and the message, but it also notes that it was signed by me and that the signature is valid. It's using public-key cryptography to ensure that no one has come in and modified it. So if you're exposing a public project and you want certain commits to be verifiable as official, you can use signing to identify them, essentially signing the commits to say, "I, James Kovacs, have said that this is the official v1.0 release."
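The three kinds of tag can be sketched as follows. The tag names v1.0_with_message and v1.0_signed are assumed spellings (the exact names are not shown in the narration), and the signed variant is left as comments because it needs a configured GPG signing key.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
echo "release" > README.txt
git add README.txt
git commit -q -m "Ready for release"

git tag v1.0                                       # lightweight tag: just a name for the commit
git tag -a v1.0_with_message -m "This is v1.0"     # annotated tag: its own object with a message

# Signed tags require a GPG key, so they are shown but not run here:
#   git tag -s v1.0_signed -m "Signing v1.0"
#   git tag -v v1.0_signed

git tag                                            # list the tags
git cat-file -t v1.0                               # prints "commit"
git cat-file -t v1.0_with_message                  # prints "tag"
```

The cat-file output shows the structural difference: a lightweight tag is only a reference to a commit, while an annotated tag is a separate object that records the tagger and the message.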
-
Pushing Tags to a Remote
Now, if I do a git push, it says everything is up-to-date. If I go to the website and look at the code, you will notice that I don't have any tags. Let's go to the tags over here. There are no tags. Now why is that? By default, Git will not push tags. So I need to do a git push and provide the --tags option. When I perform that, it creates the new tags in that remote Git repository on GitHub. Let's come back over. I'll refresh the browser, and taking a look at the tags, we can see the different tags. And if I choose one of these tags, it will show the state of the code base at the time that tag was made. You can see this more easily by switching over to the jQuery code base, where they have tagged releases. So you can see here, this is the master branch, so it is the latest and greatest version of jQuery; it hasn't been officially endorsed. I can come over here and say I want to see the tags. What was the state of the code base at version 1.7, or, let's go with 1.6.4. And this is the version of all those files for the 1.6.4 release. If we look at the version.txt file, you see it says 1.6.4. Coming back over here, I'm going to switch back to the master branch. So now I'm back on master, and if I look at version.txt, you can see it's 1.7.3 pre. So tagging gives you stable points in your source code, for instance when you have an official release, a beta, or an RC, or maybe you want to tag each of the individual builds that succeed on your build server. That can all be done with Git and then shared widely with whoever is interested. So, I hope you've seen that sharing with Git is easy and straightforward. Much of what you'll be doing is pulling down changes from a remote repository or from your collaborators, which will then get merged into your own development.
You will make your own changes that can be committed locally and when you're ready to share with the world, you can do a git push to share those changes back out again and make them available for all to see.
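To see that an ordinary push leaves tags behind, the sketch below pushes to a local bare repository and inspects its tags before and after. The hub.git directory, the version.txt contents, and the identity are assumptions for illustration.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare hub.git                # stands in for the hosted repository
git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
git remote add origin "$work/hub.git"
echo "1.0" > version.txt
git add version.txt
git commit -q -m "First release"
git tag v1.0

git push -q origin master                     # pushes the branch only
tags_before=$(git ls-remote --tags origin)    # empty: the tag stayed local
git push -q --tags origin                     # pushes every local tag
tags_after=$(git ls-remote --tags origin)     # now lists refs/tags/v1.0
echo "$tags_after"
```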
-
Summary
In this module, we've looked at how we can clone a remote repository to our local machine, how we can fetch and pull from that remote repository to get the changes down into our local working copy, how we can push our changes back up to the remote, and finally, how we can work with tags, both creating tags and sharing them remotely.
-
Branching, Merging, and Rebasing with Git
Overview
In this module, we're going to talk about Branching, Merging, and Rebasing with Git. We're going to talk about how we can work with local branches, how we can stash changes that we might not want to commit right now but want to save for later, how we can merge branches together, how we can rebase commits onto another branch, how we can cherry-pick commits from one branch onto another, and also how we can work with remote branches.
-
Visualizing branches
In order to talk about branching in Git, we need some way to visualize those branches. In the last module, we looked at the git log command with the --graph and --oneline options, which gives us a list of the commits on the current branch with a graph of those commits, there on the left-hand side, represented by those asterisks. Now, we want to add a few more options to allow us to visualize branches. Adding --all allows us to visualize all branches rather than just the current one, and adding --decorate applies any labels to the commits, such as the HEAD label, tags, remote branches, and local branches. You can see examples of all of these types of labels on the 2232 commit at the top of this log. Now, typing this out every time can get a bit cumbersome. So what we can do is add it as an alias in our Git config. I'm going to say git config and specify the --global option, meaning that I want to add it to the .gitconfig in my home directory. I'm then going to say alias, and I'm going to give this alias a name, lga, for log graph all. And then I can provide all the options that I want to use. I can omit that initial git and simply specify log with the --graph option, plus --oneline, --all, and --decorate. With this in place, I can now run git lga and it does exactly the same thing. If you want to take a look at how this is stored in the Git config, we can simply display the .gitconfig file, and you can see at the bottom that the new alias for lga has been added. So it's up to you. You can either add it using a git config command or directly edit that .gitconfig file.
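A minimal sketch of the alias follows. To avoid touching your real ~/.gitconfig, it stores the alias in the scratch repository's own configuration; adding --global to the git config command, as in the walkthrough, would write it to the home directory file instead.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
echo "readme" > README.txt
git add README.txt
git commit -q -m "Added README"

# Repository-local alias; the leading "git" is omitted from the value.
git config alias.lga "log --graph --oneline --all --decorate"

git lga                                   # same output as the full command
alias_value=$(git config alias.lga)
```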
-
Creating local branches
Now that we have this in place, let's go ahead and start adding some branches. I've decided I'm going to add a new feature branch. So I'm going to say git branch and I'm going to call this feature1. I've now created a new feature branch, which I can check out. If I do a git lga, you can see that I've got two branches, master and feature1. These are local branches and they both point to the 2232 commit. Remember, branches are simply labels on the SHA-1 hashes of individual commits. Now, with feature1 checked out, I'm going to echo Feature1 and append it to the README. If I do a git status, README is modified, so I do a git commit -am "Added feature1". And if I now do a git lga, you can see that feature1 is now pointing to the 4D55 commit, as is HEAD. So both of those are pointing to that particular location. You can see master is still pointing to 2232.
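The branch-as-label idea can be seen in a scratch repository. The sketch below is an illustration under assumptions (the README contents and identity are made up): it creates feature1, commits on it, and leaves master where it was.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
echo "readme" > README.txt
git add README.txt
git commit -q -m "Added README"

git branch feature1                       # new label on the current commit
git checkout -q feature1                  # HEAD now follows feature1
echo "Feature1" >> README.txt
git commit -q -am "Added feature1"        # feature1 moves forward, master stays put
git log --graph --oneline --all --decorate
```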
-
Difference between branches and tags
The big difference between branches and tags is that branches follow the commits. As you add additional commits on a branch, the branch moves along. Tags always stay on the same commit. They're just a friendly name for that SHA-1 hash. So, let's go ahead and check out master. We've checked out the master branch again, and if I do a git lga, you can see that HEAD is now pointing to the 2232 commit. There are a number of other things I can do. Let's say I want to do a fix. I'm going to call this branch fix1, and I'm going to base it off a particular commit, the 974B commit. So I can specify a commit to base a branch on. More frequently, you're going to be creating fix branches from well-known tags or from other branches. With this in place, I'm going to check out fix1, echo "Fixing bug #1234", and just append it to the README. I'll do a git commit -am and say "Fixed bug1234", and if I do a log now, you'll see that our branching structure is getting a bit more complicated. We can see feature1 is pointing to the 4D55 commit. Master is still pointing to 2232, but we've got this new fix1 that's pointing to the 5A78 commit.
-
Renaming and deleting branches
Let's go ahead. I'm going to check out master again. We've decided that fix1 was really fixing bug 1234, so we want to call it bug1234. I can do this by giving git branch the move option, -m. So to rename a branch, I basically move it: I say fix1 and I move it to bug1234. You can see that we've now easily renamed that branch. Now I decide, you know what, I'm done with bug1234. I'm going to do a git branch -d to delete that branch. Git prevents me from doing this. It says, "This branch hasn't been merged into master yet." It hasn't been merged into another branch, so we're going to lose that commit. If we really want to go ahead and do it, we can say git branch -D bug1234, and by using the capital D, we're telling Git to force that deletion. So, okay, that is deleted. Now I might go ahead and say, "Okay, I want to work on another feature, and I'm going to work on feature2." I'm sitting on master, so I could do a git branch feature2 and then check it out. Git provides a faster way of doing this. I can do a git checkout -b, which means create the branch, and say feature2. I now echo Feature2 onto the README and git commit, "Added feature2". And we can see we now have feature1 and feature2 off of our master branch. So we can create feature branches very easily. They're local and only visible within this repository.
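The rename and the two delete forms can be sketched like this. The scratch repository, identity, and file contents are assumptions; the refused delete is expected, so the script prints a note and carries on.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
echo "readme" > README.txt
git add README.txt
git commit -q -m "Added README"
git checkout -q -b fix1
echo "Fixing bug #1234" >> README.txt
git commit -q -am "Fixed bug 1234"
git checkout -q master

git branch -m fix1 bug1234                # rename (move) the branch
git branch -d bug1234 || echo "refused: bug1234 is not merged yet"
git branch -D bug1234                     # capital D forces the deletion
git checkout -q -b feature2               # create a branch and check it out in one step
```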
-
Recovering deleted commits
Now, I might think, just a second, I deleted bug1234. I said, "Go ahead, delete it," and then I realize, "Oh, no, I really didn't want to delete that." Although I have deleted it, Git provides me a way of getting it back. If I do a git reflog, this is a log of all the places where HEAD has pointed. If we look back in our reflog, we can see where HEAD was changed to that commit for adding feature2, and where it changed from master to the feature2 branch. We can go all the way back. If you go to HEAD@{3}, we see the 5A78 commit, which is the commit to fix bug 1234. So what I want to do is a git branch, I'm going to say bug1234, and I can specify that commit SHA, 5A78C8B. If I do a git lga, you can see that bug1234 is back. I've reapplied that branch label to that particular commit. I can now do a git checkout bug1234, and I can even do a git show HEAD, which shows the addition to README.txt for fixing that bug. Now, I should warn you that Git doesn't keep these dangling commits around forever. By default, Git will keep them around for 30 days, and after 30 days its garbage collection mechanism will clean them up. So you can't rely on this long term, but what usually happens is that you're working on a feature, delete something, and then realize, "Oh, you know what, I really wanted that," and you can use the reflog to get that information back.
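Here is a sketch of the recovery in a scratch repository. The reflog position HEAD@{1} is specific to this short history (the walkthrough's own history needed HEAD@{3}); in practice you would read the position or hash from the git reflog output. The variable fix_commit exists only so the result can be checked afterwards.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
echo "readme" > README.txt
git add README.txt
git commit -q -m "Added README"
git checkout -q -b bug1234
echo "Fixing bug #1234" >> README.txt
git commit -q -am "Fixed bug 1234"
fix_commit=$(git rev-parse HEAD)          # remembered only to verify the recovery
git checkout -q master
git branch -D bug1234                     # the commit is now unreachable from any branch

git reflog                                # every position HEAD has held, newest first
git branch bug1234 'HEAD@{1}'             # re-create the label on the lost commit
git log --graph --oneline --all --decorate
```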
-
Stashing changes
I'm going to switch over to feature2 and start working on a new piece of it. I'm going to just echo "Feature2 changes" onto the README.txt file. So I've made some code changes somewhere in my project. I've got things happening, and someone comes along with a bug report. It's a production problem, and I need to fix it right away. The changes that I've made are not ready to be checked in to feature2. They're halfway through, half-thought-out ideas. But I don't want to lose them. I can save this work off using git stash. So I'm going to do a git stash, and you'll notice that those pending changes in my working copy have been rolled back. If I look at the README file, it doesn't have the "Feature2 changes" line added to it. If I do a git stash list, those changes are in my stash. This is a little holding area for pending changes. So I can go off, git checkout bug1234, and fix the bug and get things working. I can echo another fix for bug 1234 to the README.txt file, commit that, and then I can check out feature2 again and do a git stash apply to pull those changes back. If I cat the README file, you can see the "Feature2 changes" line is now reapplied to that file. This covers the whole working tree, so the stash could hold deeply nested changes to README files, code files, resources, what have you. They were all rolled back, and now I can reapply them. If I do a git stash list, you can see that the stash is still on the list. Now let me do a git reset --hard back to HEAD to toss out those changes. If I do a git status, you can see that the working copy is clean. And catting the README file, I no longer have "Feature2 changes" at the end of the file. There's another command that I can use, which is git stash pop. This pops the top item off of the stash and applies it to my current working copy. So it's exactly the same as apply; the only difference is that it removes the item from my stash list.
Let's re-stash things, and let's say I've made another change, so I'm going to echo "more changes" into a new file called AdditionalFile.txt. If I run a git status, you can see I've got that AdditionalFile.txt, and I'll git stash that as well. First I git add it so that Git actually knows about it. And if I git stash this, you'll notice that my working copy is once again clean, and in my git stash list, I've now got two pending changes in the stash. Now, let's say I've decided that I don't really need AdditionalFile.txt. I can do a git stash drop, and it drops that stash. Running git stash list again, you can see I've only got one pending change. I might decide that these additional changes on feature2 really need their own branch, so I can do a git stash branch and give it a new branch name, such as feature2_additional, and it's going to create that new branch, check it out, and apply the stash to it. Doing a git stash list, you'll notice that it popped that change off of the stash. I can now run a normal git commit -am, "Added additional features to feature2". And doing a git log, you can see that it has been added to that feature2_additional branch. So the stash is a very useful temporary holding area for changes that are not ready to commit to a branch but that you don't want to lose.
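The stash commands from this section can be walked through in order in a scratch repository. This is a condensed sketch under assumptions: it uses a single change to README.txt and a made-up identity, and it leaves out the detour to the bug-fix branch.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"       # hypothetical identity
git config user.email "user@example.com"
echo "readme" > README.txt
git add README.txt
git commit -q -m "Added README"
git checkout -q -b feature2
echo "Feature2 changes" >> README.txt     # half-finished work

git stash                                 # save it away; the working copy is clean again
git stash list                            # one entry: stash@{0}
git stash apply                           # reapply it, keeping the entry in the list
git reset -q --hard HEAD                  # throw the reapplied changes away
git stash pop                             # reapply it and remove the entry

git stash                                 # stash it once more
git stash branch feature2_additional      # new branch + checkout + apply + drop
git commit -q -am "Added additional features to feature2"
stash_count=$(git stash list | wc -l | tr -d ' ')
```

Untracked files are not stashed by default, which is why the walkthrough runs git add on the new file before stashing it.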
-
Merging branches
Let's go back to master and take a look at merging. Let's see our current branch structure. You can see that we've got this branch going from master. You can see our HEAD is currently at 2232, and we've got feature1 sitting over here at 4D55. We've got some bug fixes that branch off from the earlier commit, and then we've got these two branches, feature2 and feature2_additional. Let's say we've decided that it's time to merge in feature1. So I'm going to do a git merge feature1. You'll notice that this was a fast-forward merge, because Git was able to take the master label and just move it to feature1. It didn't have to merge any files; it could literally just move that label to a new location. In the log, you can see that that's exactly what it did. Now that feature1 is merged, I can do a git branch -d, with a little d, on feature1. Git knows feature1 has already been merged, so it's safe to delete this label. We're not going to lose any commits, so I don't need to use the capital -D option. So that is gone. We've done a fast-forward merge, and we now have the feature1 changes integrated into master. Now, let's say that we want to merge in that feature2_additional branch, so git merge feature2_additional, and there are merge conflicts. Let's take a look at the merge conflicts in README.txt. Git has added its standard merge markers in here, saying this is what HEAD looks like, and here is what was in feature2_additional. There were changes to the same lines, and Git doesn't know what to do with them. We can either resolve this merge manually using our text editor, or we can use git mergetool. Mergetool allows us to use a variety of different tools for performing the merge. I've got it set up with KDiff3, which is a three-way merge tool that is available for Windows, Linux, and Mac OS. So this is a three-way merge.
We've got in the middle here our local copy, which is the current branch. As you can see, it's got feature1 in it, which is where we are now. We've got the remote branch, the branch that we're trying to merge in, that's feature2_additional, and the base is the common commit that both of these come from. So you can see that we're trying to change the same line and add both feature1 and feature2, and that's where we have this merge conflict. We can right-click and then select: do we want the lines from the base version, the local version, or the remote version? So I'm going to go ahead and pick the local version to get feature1, and then I'm going to just copy in the changes from the other one. So the merged result will probably look something like this. There are other merge tools such as Beyond Compare, which works on both Windows and Linux but, at the time of this recording, isn't available for Mac. It allows you to say, "I want the changes from the first branch followed by the changes from the second branch," or vice versa, or one branch without the other. It's a much more full-featured tool. So I've got these changes in here now. I'm going to save those changes and quit the merge tool. If I do a git status, you can see that I've got a modified README.txt file. I've also got this .orig file kicking around, which we'll need to clean up. Now, what we want to do is a git diff --cached. The --cached option asks git to compare the repository to the staging area. So you can see that we've added these 2 lines and we're ready to commit. Now that we've resolved this merge conflict, we do a git commit -m "Merged feature2_additional into master". And now that we've done that, we can remove the README.txt.orig file because it's no longer necessary. So we've seen a fast-forward merge and we've also seen dealing with merge conflicts.
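To make those two kinds of merge concrete, here is a minimal sketch that reproduces a fast-forward merge followed by a conflicting merge in a scratch repository. The repository setup and file contents are assumptions for illustration, and the conflict is resolved by writing the file directly rather than through kdiff3, so that the example can run unattended.

```shell
# Scratch repository (assumed setup, not the repository from the demo).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo User"

echo "Project" > README.txt
git add README.txt
git commit -q -m "Initial commit"

# Two branches that each append a line at the same place in README.txt.
git checkout -q -b feature1
echo "feature1" >> README.txt
git commit -q -am "Added feature1"

git checkout -q master
git checkout -q -b feature2_additional
echo "feature2" >> README.txt
git commit -q -am "Added feature2"

# Fast-forward merge: master has not moved, so git only moves the label.
git checkout -q master
feature1_tip="$(git rev-parse feature1)"
git merge -q feature1
master_after_ff="$(git rev-parse master)"
git branch -d feature1 > /dev/null   # little -d is enough, it is merged

# Conflicting merge: both sides changed the same region of README.txt.
merge_result="clean"
git merge feature2_additional > /dev/null 2>&1 || merge_result="conflict"

# Resolve by hand (a merge tool would produce a similar file), then stage.
printf 'Project\nfeature1\nfeature2\n' > README.txt
git add README.txt
git diff --cached --stat            # what is staged relative to HEAD
git commit -q -m "Merged feature2_additional into master"
```

After the last commit, HEAD is a merge commit with two parents, while the earlier fast-forward left no merge commit at all.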
-
Rebasing changes
Let's say we're working on a new feature, feature3, which is based off of that tag v1.0. So I'm going to do a git branch, I'll call this feature3 and base it off the v1.0 tag, and I'll do a checkout of feature3. Let's go ahead and edit file1.txt, and I'll say, "adding yet more code". I'll do a git commit, "Added feature3", and if we look at our log you can see that we've got this new feature3 that is based off of origin/master, which is also the same at the moment as tag v1.0. Now, I might decide that I really don't want to do a merge, which is going to result in these intertwining branches. Maybe I've pulled down this feature2 from another coworker, and what I really want to do is just take the changes I've made in feature3 and replay them, moving that commit to make it look like it's always been off of the master branch. I can do this using a rebase. I'm right now on feature3, and I'm going to say, rebase this branch onto master, which basically pulls it off its current location and relocates it on top of master. Git was able to just move this over because there were no conflicting changes. So let's take a look at our graph now, and you can see that feature3 has just been added on top of master. Now if I check out master and do a git merge of feature3, it's going to be a fast-forward merge, because it can once again just move that master label to the same SHA-1 hash as feature3. Now, rebasing doesn't always go as planned. Let's say that we decided to rebase the bug1234 fix on top of the master branch as well. So let's go ahead. I'll git checkout bug1234 and I'll do a rebase onto the master branch. 
So I try to replay these changes, but there's been a merge conflict. Taking a look at the README.txt file where the merge conflict occurred, we can see that we've got this set of changes from HEAD, but we're trying to apply this line 11 change to the same line as line 4. So git doesn't know what to do. We can use the same technique: I can run git mergetool and I'll use kdiff3 again. We can say, "Okay, we want to use those lines," and then I'm going to just simply copy and paste. Let's say it belongs in the middle here. I'll just paste it in, save that and quit. Looking at the git status, we've got these README.txt changes staged. So if I do a git diff --cached, same as when we were doing a merge, we can see these additional lines that were inserted when we were fixing the merge conflict. I can now do a git rebase --continue to continue the rebase, and it notes that we've got additional conflicts. So, let's go ahead and look at these additional conflicts, because we're basically replaying commits one at a time. We do a git lga. We tried to replay this first 5A78 change on top of master, and that resulted in a merge conflict, and now we're trying to replay the CDDE change and that resulted in a conflict too. So I'll use the same technique. I'll do a git mergetool. And I'll say, "Okay, what I want to do is take the changes from B and then apply C." Actually, I can just apply the C changes in this case. So I will go ahead and save these changes. I've resolved my merge conflicts, so I can quit this, I can say git rebase --continue, and it's done applying that bug fix. If I do a git status, you can see that the working copy is dirty because it left a stray .orig file kicking around, so let's remove that. Let's look at the graph and you can see that originally the changes were off of this bug fix down here. But we've rebased them, we've moved these changes up to sit on top of master. 
So now those changes have been replayed on top of master, and if I git checkout master and do a merge of bug1234, rather than merging these 2 branches, I'm going to get a fast-forward merge, and I now have HEAD pointing to the right place. I can now do some cleanup. I can do a git branch -d on feature3 and a git branch -d on bug1234, because these have been merged into master and I no longer need them. If I wanted to, I could also remove feature2 and feature2_additional, because these are just labels on particular commits. And you can see from the branching behavior how those different commits were merged into the 2790 commit, but you'll notice that these 2 bug fixes, when we rebased them, were simply replayed on top, and now our HEAD is pointing to master.
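Here is a small sketch of both rebase outcomes we just saw: one that replays cleanly and one that stops on a conflict and is finished with git rebase --continue. The scratch repository and its file contents are assumptions for the example, the conflict is resolved by writing the file directly instead of using a merge tool, and GIT_EDITOR=true is only there so the example never opens an editor.

```shell
# Scratch repository (assumed setup for illustration).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo User"

echo "code" > file1.txt
echo "Project" > README.txt
git add file1.txt README.txt
git commit -q -m "Initial commit"
git tag v1.0

# master moves ahead of the v1.0 tag.
echo "feature1" >> README.txt
git commit -q -am "Added feature1"

# feature3 starts at v1.0 and touches a different file: no conflict.
git branch feature3 v1.0
git checkout -q feature3
echo "adding yet more code" >> file1.txt
git commit -q -am "Added feature3"
git rebase -q master                 # replay feature3 on top of master
git checkout -q master
git merge -q feature3                # fast-forward, just moves the label

# bug1234 starts at v1.0 and edits README.txt: the replay conflicts.
git checkout -q -b bug1234 v1.0
echo "bug1234 fix" >> README.txt
git commit -q -am "Fixed bug1234"
rebase_result="clean"
git rebase master > /dev/null 2>&1 || rebase_result="conflict"

# Resolve, stage, and let the rebase carry on.
printf 'Project\nfeature1\nbug1234 fix\n' > README.txt
git add README.txt
GIT_EDITOR=true git rebase --continue > /dev/null 2>&1

git checkout -q master
git merge -q bug1234                 # fast-forward again
git branch -d feature3 bug1234 > /dev/null
git log --oneline --graph
```

The resulting graph is a single straight line with no merge commits, which is the whole point of rebasing before merging.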
-
Cherry-picking changes
So we've seen merging and rebasing. Let's take a look at another technique. I'm going to do a git branch, I'm going to call this v1.0_fixes, I'm going to base it on v1.0, and I will check out v1.0_fixes. So I'm going to make some fixes here. I will echo fix1, append that to file1.txt, and do a git commit -am "Added fix1". And now I'm going to make another change. I'm going to echo fix2 to file2.txt and do the same thing here. Looking at the history, you can see that we've got these 2 fixes, fix1 and fix2. So I'm working on some bug fixes to this, and I realize that I need one of these fixes on the master branch. So let's check out the master branch. Now, I don't want to do a merge, because I really don't want to merge both changes, fix1 and fix2, into master. All I want to do is grab that one fix, fix1, and apply it to master. A rebase also doesn't work because I'd be applying that whole piece. The other challenge is that, although I could in this case do a git rebase or merge on strictly that commit, that commit might be buried in amongst a whole bunch of other ones. I really want that one commit from amongst a whole variety of them. I can use a technique called git cherry-pick. Git cherry-pick allows me to select one single commit and apply it. So I'm going to grab the 6FA4324 commit and cherry-pick it onto master. If I look at my log, you can see that I've added fix1. It's taken that one commit and applied it on master. It hasn't done a rebase, it hasn't done a merge, it's just added it on. So cherry-picking is a very convenient technique for moving patches. If you've applied a patch to a maintenance branch, reapplying those same patches to other branches such as your master is very easy in git, because you're just taking certain commits and applying them to different places in your tree. 
Now, git does cope with changes that have already been applied. So if later on I decide to merge the fixes, I'll do a git merge and I'm going to merge in v1.0_fixes. It's going to ask me for a merge message here. I'll just say sure, that looks good. It will notice that the change from that earlier fix, the 6FA4 commit, is already there, so it won't try to reapply it. If I look at file1, you'll see just the one fix1. It won't add a second fix1. So git's very good about merging, rebasing, and cherry-picking. It works from the shared history and the content on each side, so it can safely apply just the changes that haven't been made yet. If I look at file2, those changes are there, but it hasn't duplicated that fix1 line by reapplying that change a second time.
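A minimal sketch of that cherry-pick followed by the later merge might look like this. The scratch repository, the extra commit on master, and the file contents are assumptions for the example. The commit hash is captured in a variable instead of being typed by hand.

```shell
# Scratch repository (assumed setup for illustration).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo User"

echo "base" > file1.txt
echo "base" > file2.txt
git add file1.txt file2.txt
git commit -q -m "Initial commit"
git tag v1.0

# Some unrelated work on master so the two branches diverge.
echo "master work" > file3.txt
git add file3.txt
git commit -q -m "More work on master"

# Two fixes on a maintenance branch based on the v1.0 tag.
git checkout -q -b v1.0_fixes v1.0
echo "fix1" >> file1.txt
git commit -q -am "Added fix1"
fix1_commit="$(git rev-parse HEAD)"
echo "fix2" >> file2.txt
git commit -q -am "Added fix2"

# Take only fix1 across to master.
git checkout -q master
git cherry-pick "$fix1_commit" > /dev/null
fix2_on_master_before_merge="$(grep -c fix2 file2.txt || true)"

# Later, merge the whole maintenance branch. Both sides made the
# identical change to file1.txt, so it is taken once and not duplicated.
git merge -q --no-edit v1.0_fixes

grep -c fix1 file1.txt
grep -c fix2 file2.txt
```

Both grep counts come out as 1: fix2 arrives through the merge, and fix1 is not applied a second time.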
-
Creating a remote branch
All right, so let's talk a bit about how we can work with remote branches. Here we've got origin/master, and it's far behind the current HEAD. If I just do a normal git log, you can see all the changes that are pending. If I do a git status, you can see that we're on the master branch and we've got pending changes that haven't been pushed yet. So how do we go about this? First, let's say I want to do a git fetch for our master branch, so I'm going to fetch from origin master. Nothing came down, so I don't have any additional changes. I might have, but in this case, since I'm the only one working on this one repo, I don't have any changes. So I now want to push my changes back up to origin master. I can do a git push origin master or, because this branch is being tracked, I can simply do a git push. It's going to push those changes up. Looking at the graph, you can see that the origin/master remote branch has now been fast-forwarded to the current master. If we come over and take a look at GitHub, let's refresh the browser. You can see that we've got a tag, and we'll switch over to the master branch, but there is only that one branch, and that branch has all of our changes in it. And you can see the latest commit message, "Merge branch v1.0_fixes", the same as over here. Now what I'd like to do is expose the v1.0_fixes branch. So how can I do that? I can do a git push, I'm going to push to origin, and I'm going to push v1.0_fixes, so I'm giving it the name of the local branch. It's going to create a new branch. Coming over to GitHub, we can see that we now have a new v1.0_fixes branch. So this is a way that we can push changes up to our remote repository. If we do a git branch -r, that will list the remote branches. So you can see we've got both origin/master and origin/v1.0_fixes.
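To try the push commands without a GitHub account, here is a rough sketch that uses a local bare repository as a stand-in for the remote. The bare repository, the paths, and the commits are assumptions for the example. Only the commands themselves mirror the demo.

```shell
# A local bare repository plays the role of GitHub here (assumption).
remote_dir="$(mktemp -d)/origin.git"
git init -q --bare "$remote_dir"

work="$(mktemp -d)"
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo User"
git remote add origin "$remote_dir"

echo "hello" > file1.txt
git add file1.txt
git commit -q -m "Initial commit"

# First push of master; -u sets up tracking so a plain "git push" works later.
git push -u origin master > /dev/null 2>&1

echo "more work" >> file1.txt
git commit -q -am "More work on master"
git fetch -q origin                  # nothing new comes down
git push -q                          # tracked branch, so no arguments needed

# Expose a local branch on the remote under the same name.
git checkout -q -b v1.0_fixes
echo "fix1" >> file1.txt
git commit -q -am "Added fix1"
git push -q origin v1.0_fixes

# List the remote-tracking branches.
git branch -r
```

The last command lists origin/master and origin/v1.0_fixes, matching what the GitHub page shows in the demo.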
-
Deleting a remote branch
Let's say that we want to delete that remote branch. We've decided we don't need it anymore. Deleting remote branches, because people might have based changes off of them, is something you should do with caution, but it is possible. So here I decide that the v1.0_fixes branch is no longer necessary and I want to delete it. Git makes this rather quirky. Before we delete this remote branch, let's see how we actually push a remote branch. What I'm going to do is a git push, and I'm going to push to origin, so that is the name of the remote, and I'm going to push v1.0_fixes, which is the local branch name. If we don't specify anything for the remote branch name, git assumes that it's the same as the local branch name. If we want to specify something different, we put a colon and we say v1.0_fixes_remote_branch_name, something really obtuse, and off it goes, and it pushes that local branch to a new branch called v1.0_fixes_remote_branch_name. Refreshing over in the browser, we can see that new branch name. So if we want to delete a remote branch, we're going to push to origin and we want to push nothing. So we leave the local branch name empty, put the colon, and say v1.0_fixes_remote_branch_name. I hit tab and it automatically fills in the full name, including the origin prefix. The origin prefix isn't actually necessary, and in fact the push has failed with origin in there. That is a problem with the completion. So I'm just going to push that remote branch name, and you'll note that it is deleted. I can do the same thing, git push origin :v1.0_fixes, and that will delete that remote branch from GitHub. So let's refresh here and you can see both of those remote branches have now been deleted. So deleting remote branches is kind of quirky in git, but it's straightforward. You just have to git push origin and then not specify the local branch name. Leave it empty, put a colon, and then specify that remote branch name.
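The local:remote form of the push argument is easier to see written down. This is a sketch, again assuming a local bare repository standing in for GitHub, with made-up paths and commits. The branch names are the ones from the demo.

```shell
# A local bare repository plays the role of GitHub here (assumption).
remote_dir="$(mktemp -d)/origin.git"
git init -q --bare "$remote_dir"

work="$(mktemp -d)"
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo User"
git remote add origin "$remote_dir"

echo "hello" > file1.txt
git add file1.txt
git commit -q -m "Initial commit"
git push -q origin master
git branch v1.0_fixes

# local name only: the remote branch gets the same name.
git push -q origin v1.0_fixes

# local:remote: push the local branch under a different remote name.
git push -q origin v1.0_fixes:v1.0_fixes_remote_branch_name
heads_before="$(git ls-remote --heads origin | wc -l | tr -d ' ')"

# :remote (empty local side): push "nothing", which deletes the remote branch.
git push -q origin :v1.0_fixes_remote_branch_name
git push -q origin :v1.0_fixes

git ls-remote --heads origin
```

Only master is left on the remote afterwards. Git also accepts git push origin --delete followed by the branch name, which does the same thing and reads a little less cryptically.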
-
Summary
In summary, we have looked at working with both local and remote branches, stashing changes, and managing your history through effective use of merging, rebasing, and cherry-picking.
-
Course author
-
James Kovacs
James Kovacs is a Technical Evangelist for JetBrains. He is passionate about sharing his knowledge of OO, SOLID, TDD/BDD, testing, object-relational mapping, dependency injection, refactoring,...