New Git Stuff

Since the internship started, I feel I have learnt a good deal that I would not have picked up through self-learning or school projects. This includes setting the environment path and using symbolic links so that any program’s binary file can be run like a Linux command (wherever the file is on the disk), as well as Git-related stuff (important for collaboration and version control). These things are hard to master on your own because you often do not know they exist, so you never think to Google them. Thankfully, the people at my workplace were helpful, hence my statement at the beginning of this post. I’ll only talk about Git here (more people will use it) and leave the Linux stuff (fewer would use it) for your own research.

For my software engineering projects in the past 2 semesters, the only Git commands I knew were push, pull, status and commit. Conflicts seemed like the end of the world, and I had a policy of banning people from editing my code files to prevent them. It turns out that on a large project, while you are implementing a feature in a file, others’ code for another feature in the same file gets merged into the main branch, so you have to update your branch from the remote from time to time. So here are a few things I learnt.

First, when working on another branch, don’t blindly git pull from the remote main branch! By default, pull is a fetch followed by a merge, done in one go. Any conflicts from the whole merge land on you at once, and I find those hard to fix (personal opinion). Also, the merge weaves the remote’s commits into your branch’s history and usually adds a merge commit on top. The person reviewing your pull request will be cursing you for making them read through a long, tangled commit history just to get a glimpse of what you have done from your commit messages. Instead, do a git fetch followed by a git rebase (for example, git rebase origin/main). Rebasing replays your commits on TOP of the changes from the remote repository. In addition, the process pauses whenever a commit conflicts, so you can fix conflicts one commit at a time, git add the fixed files, and resume with just git rebase --continue.
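To make the fetch-then-rebase flow concrete, here is a small sandbox sketch. Everything in it is made up for illustration: the temporary directory, the bare repository standing in for the shared remote, the clones named mate and me, the branch name my-feature and the file names. It only shows the conflict-free case.

```shell
#!/bin/sh
set -e

# Demo identity so commits work even on a machine without Git configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"

# Seed repository with one commit on "main", then a bare copy acting as the remote.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main   # pin the branch name to "main"
echo "base" > seed/app.txt
git -C seed add app.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed remote.git

# Two clones: a teammate's and mine.
git clone -q remote.git mate
git clone -q remote.git me

# The teammate's feature lands on the remote main branch.
echo "mate" > mate/mate.txt
git -C mate add mate.txt
git -C mate commit -q -m "Add teammate feature"
git -C mate push -q origin main

# Meanwhile, I commit on my own branch.
git -C me checkout -q -b my-feature
echo "mine" > me/mine.txt
git -C me add mine.txt
git -C me commit -q -m "Add my feature"

# Update my branch: fetch first, then replay my commits on top of origin/main.
git -C me fetch -q origin
git -C me rebase -q origin/main
# Had a commit conflicted, the rebase would stop here: fix the files,
# `git add` them, then run `git rebase --continue`.

# My commit now sits on top of the teammate's commit, with no merge commit.
log_subjects=$(git -C me log --format=%s)
printf '%s\n' "$log_subjects"
```

The last command lists the commit subjects newest first, so my commit should appear above the teammate’s.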

Secondly, make commits clean. Commit messages should really reflect the changes committed. As an amateur, I made the mistake of doing a bunch of unrelated tasks, feeling that I should save my changes, running git add . and then writing some useless commit message. If you are contributing to TEAMMATES, your pull request will most likely get rejected and you will be asked to fix/clean up your commits. Another situation where git add . is undesirable is when some properties or settings file got changed as a result of a project build. These files may have changed so that the project works on your machine, but as they can be machine-specific, they may not work on others’ machines. Pushing changes to these files can cause others to waste precious time debugging code, only to find that the problem is caused by the configuration. Thus, you should make a conscious effort to avoid dirty commits caused by these files. Add the files to a commit by doing git add on the relevant files one by one. Alternatively, for lazy types like me, you can use tab-completion (in Bash) or shell wildcard patterns (globs, which are not true regular expressions) if the names of the files to be committed together can be captured by one pattern. For example, to stage index.html and index.js together, you can use git add index.*. Before adding files for commit, you might want to review the changes using git diff -- <filename>. The filename can be replaced with a wildcard pattern too. It might be inconvenient to have to commit multiple times, but it is better than having to fix your commits later on when asked to.
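Here is a rough sandbox sketch of staging only the related files and leaving a machine-specific settings file out of the commit. The file build.properties, its contents and the commit messages are hypothetical, and the repository lives in a throwaway temporary directory.

```shell
#!/bin/sh
set -e

# Demo identity so commits work even on a machine without Git configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .

# Starting point: two page files and one (hypothetical) settings file.
echo "<h1>v1</h1>" > index.html
echo "console.log('v1');" > index.js
echo "db.host=shared-default" > build.properties
git add index.html index.js build.properties
git commit -q -m "Initial commit"

# Real work on the page, plus a machine-specific tweak I do NOT want to share.
echo "<h1>v2</h1>" > index.html
echo "console.log('v2');" > index.js
echo "db.host=localhost" > build.properties

# Review a file before staging it.
git --no-pager diff -- index.html

# The shell expands index.* to index.html and index.js; build.properties is untouched.
git add index.*
staged=$(git diff --cached --name-only)
printf 'staged:\n%s\n' "$staged"

git commit -q -m "Update index page heading and log message"

# Only the settings file is still modified and uncommitted.
left_over=$(git status --porcelain)
printf 'left over:\n%s\n' "$left_over"
```

Since the shell expands the pattern before Git even sees it, it is worth checking git status before committing in case the pattern matched more files than intended.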

Thirdly, if you have to pull in changes but have uncommitted changes that you might not want later on, don’t commit for the sake of committing just so that you can pull from the remote repository. These can become dirty commits if you end up with a future commit that undoes all these changes. Instead, use git stash save <stashmessage> (newer versions of Git prefer git stash push -m <stashmessage>) to put aside your temporary changes and clean up your working directory, allowing you to do a git pull or git rebase. To see the list of stashed changes, use git stash list. The items in the stash are identified by something like “stash@{0}”. To apply the changes from an item in the stash (and remove it from the stash), use git stash pop <itemid>. This brings back the uncommitted changes, which you can then keep editing or commit.
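The stash round trip can be sketched in a sandbox like this. The file name notes.txt and the stash message are made up, and I am using the newer git stash push -m spelling here, which assumes a reasonably recent Git (git stash save is the older equivalent).

```shell
#!/bin/sh
set -e

# Demo identity so commits work even on a machine without Git configured.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .
echo "stable" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# A half-finished change that I am not ready to commit.
echo "half-finished experiment" >> notes.txt

# Put it aside; the working directory becomes clean again.
git stash push -q -m "experiment with notes"
after_stash=$(git status --porcelain)      # empty when clean

# This is the point where a git fetch + git rebase (or git pull) would go.

# See what is stashed; entries look like stash@{0}.
git stash list

# Bring the change back and drop it from the stash.
git stash pop -q "stash@{0}"
after_pop=$(git status --porcelain)
remaining=$(git stash list)                # empty again after the pop

printf 'after pop: %s\n' "$after_pop"
```

If the popped changes conflict with what was pulled in, Git reports the conflict and keeps the stash entry, so nothing is lost while you sort it out.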

The last part of this post was supposed to be about fixing/cleaning up commit history. However, I find it difficult to write about without a contextual demonstration, as I learnt it by watching others fix my commits, scrolling through the command history afterwards and then going through the same painful process several times on my own. Hence, I am omitting it from this post.

If you are new to Git, I hope this helps you create a less painful commit history for others to read. If you are a veteran at this, just ignore this post. You may correct me if I am wrong, but DO NOT flame me for any wrong info. After all, my mastery of Git has only levelled up a little bit, so don’t expect too much out of this post.
