Hulu Tech Blog

Dominate Dragons with Git

November 26th, 2012 by Jeff Yang

People love git for many reasons. Git is super fast and works offline. Git offers cheap branches and effortless merging. Git has customizable project workflows.

Those reasons are all great. But git means much more than that to me. For most people, git is a tool used for managing source code. These people probably interact with git somewhat frequently. To me, git is a tool used for writing source code. I interact with git all the time.

But wait, you say. Git helps you write code? Don’t you have to write code before you use git? Write logical chunks of code and commit them. That’s how it works, right?

Nope. That’s not how I roll. I commit a lot. I don’t care if my code is logical. I don’t care if it’s hacky, or ugly, or if it isn’t DRY. I don’t even care if my code works. I commit it all anyways.

I wasn’t always like this. Git changed me. Git transformed the way I code. A lot has been written about customizable project workflows in git. This is different. This is a customized personal workflow. My customized personal workflow.

Be a Hero


You are playing a video game. At some point in the game, you encounter a dragon. What do you do when you reach that point? You save. You call it something like “before dragon”. As a rule, you should save before every critical point. You attack the dragon with a sword. Ouch! The dragon roasts you with his fiery breath.

[Illustration: fire dragon]

The dragon killed you! What do you do? You restore of course! No loss there. This time, you search around and find a shield to help in battle. Now you successfully block the stream of fire and kill him with the sword. Yay! Now you save again. You do this because if something happens, you can always restore to “killed dragon” so you don’t have to fight the dragon again. Another rule, save after every critical point.

[Illustration: dead dragon]

Congratulations! We saved the kingdom! But that poor dragon! Maybe we don’t have to kill him. Let’s try it out! After all, if you aren’t happy with the result, you can always restore to “killed dragon”. First, restore to “before dragon”. Now try negotiating with the dragon. As it turns out, the dragon loves riddles.

If you answer the riddle correctly, he asks the next riddle. If you answer incorrectly… chomp! What do you do? Once again, you save after every riddle. But now you just need to make progress with each question; you don’t need to restore to the previous question. Many games offer a quicksave (and quickload) option. This is a very convenient way to save that overwrites the same save game over and over. So, before every answer you hit quicksave. If you’re wrong, hit quickload. Take a moment to think about how trivial this makes the “negotiation” even if the riddles are very hard.

[Illustration: riddle dragon]

Congratulations! We saved the kingdom AND the dragon! Think about how great it is to be able to save. You can do anything you want. Explore! Experiment! Be fearless! Don’t hesitate–just do it! Don’t like it? Want to try something different? Restore! Something great happens? Save! When you take advantage of this it doesn’t only save you time, you actually end up playing an entirely different game.

This is so powerful. This is freedom without consequences. It’s like the movie Groundhog Day. In it, Bill Murray picks up all sorts of skills, learning, among other things, French, the piano, and ice sculpting as he repeats the same day over and over. Through multiple iterations he is able to fine tune all of his interactions, allowing him to get any girl, and allowing him to become the town hero.

Imagine you could have as many do-overs in life as you wanted. How would you approach things differently? You wouldn’t need to worry as much. You wouldn’t need to prepare as much either. You could try crazy things. You could aim for your perfect outcome. Think about how awesome you’d be!

An Example


The real world might not offer you these powers, but real-life source control can. In this example, I’ll be using Ruby/Rails. I’m going to write something that converts a date filter into a MySQL WHERE clause.

So basically something like:

equal current_year

would translate to

"BETWEEN '2012-01-01 00:00:00' AND '2012-12-31 23:59:59'"

Let’s start with some scaffolding code.

class Blog
  def self.build_where_clause(date_filter)
    return nil
  end

  class DateFilter
    attr_accessor :operator, :value

    def initialize(operator, value)
      @operator = operator
      @value = value
    end
  end
end

Now add this file and commit it.

git add blog.rb && git commit -m "scaffolding for Blog"

I’ll write some unit tests and add them (code not shown).

git add test.rb && git commit -m "unit tests"
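
The post does not show the tests. As a rough sketch of the kind of check they might contain, the snippet below builds the expected clause for the "equal current_year" filter using only Ruby's standard library. The names expected_current_year_equal and check are hypothetical helpers made up for this illustration; they are not part of the original code.

```ruby
require 'date'

# Hypothetical helper (not from the post): the clause the finished
# build_where_clause should return for an :equal / :current_year filter.
def expected_current_year_equal(today = Date.today)
  "BETWEEN '#{today.year}-01-01 00:00:00' AND '#{today.year}-12-31 23:59:59'"
end

# Minimal assertion helper so the sketch runs without a test framework.
# Prints PASS or FAIL and returns true when the values match.
def check(description, expected, actual)
  passed = expected == actual
  puts "#{passed ? 'PASS' : 'FAIL'}: #{description}"
  passed
end

# With only the scaffolding in place, build_where_clause still returns nil,
# so a check like this is expected to fail until the method is implemented.
check('equal current_year', expected_current_year_equal, nil)
```

Starting from a failing check is the point: it gives every later commit something concrete to run against.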

Let’s start working on our build_where_clause method.
Start by adding the possible values.

 class Blog
   def self.build_where_clause(date_filter)
+    case date_filter.value
+    when :current_year
+    when :next_year
+    when :current_quarter
+    when :next_quarter
+    when :current_month
+    end
+
     return nil
   end

Now, I need to actually implement something… remember, before every critical point… commit.

git commit -am "starting implementation"

Quicksave


Let’s implement one case to get a feel for things.

+require 'active_support/core_ext'
+
 class Blog
   def self.build_where_clause(date_filter)
     case date_filter.value
     when :current_year
+      start_date = Date.today.beginning_of_year
+      end_date = Date.today.end_of_year.end_of_day
+
+      case date_filter.operator
+      when :equal
+        return "BETWEEN '#{start_date.to_s(:db)}' AND '#{end_date.to_s(:db)}'"
+      end
+

Here, I make a commit. I’m going to name the commit “stuff”. What a terrible name! Well, I don’t really want to stop and think of a name–that slows me down! I want to code! More on this later.
The test failed. Let’s fix it.

     when :current_year
-      start_date = Date.today.beginning_of_year
+      start_date = Date.today.beginning_of_year.beginning_of_day
       end_date = Date.today.end_of_year.end_of_day

This is really part of the current commit, so I’d like to do a quicksave here. The equivalent in git is:

git commit -a --amend
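
The post does not say why the first version failed. One plausible reason, stated here as an assumption: a plain date has no time of day, so it formats without the "00:00:00" part that the expected clause contains. The sketch below shows the difference with the standard library only; format_date_only and format_with_time are hypothetical names, and to_time stands in for what beginning_of_day does in ActiveSupport.

```ruby
require 'date'

# Format a Date as a date-only string (no time component).
def format_date_only(date)
  date.strftime('%Y-%m-%d')
end

# Convert the Date to a Time at local midnight first, then format it with
# a time component. This mirrors the effect of adding beginning_of_day.
def format_with_time(date)
  date.to_time.strftime('%Y-%m-%d %H:%M:%S')
end

start_date = Date.new(2012, 1, 1)
puts format_date_only(start_date)  # 2012-01-01
puts format_with_time(start_date)  # 2012-01-01 00:00:00
```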

Let’s finish implementing current_year.

       start_date = Date.today.beginning_of_year.beginning_of_day
       end_date = Date.today.end_of_year.end_of_day
+      db_start_date = start_date.to_s(:db)
+      db_end_date = end_date.to_s(:db)
+
       case date_filter.operator
       when :equal
-        return "BETWEEN '#{start_date.to_s(:db)}' AND '#{end_date.to_s(:db)}'"
+        return "BETWEEN '#{db_start_date}' AND '#{db_end_date}'"
+      when :not_equal
+        return "NOT BETWEEN '#{db_start_date}' AND '#{db_end_date}'"
+      when :less_than
+        return "< '#{db_start_date}'"
+      when :less_than_or_equal
+        return "<= '#{db_end_date}'"
+      when :greater_than
+        return "> '#{db_end_date}'"
+      when :greater_than_or_equal
+        return ">= '#{db_start_date}'"
       end

And quicksave again.

git commit -a --amend
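
It is worth pausing on which boundary each operator uses, because it is easy to mix them up. Here is a minimal sketch in plain Ruby (no ActiveSupport) that mirrors the case statement above. The names db_format and clause_for are hypothetical, and db_format is only a stand-in for to_s(:db).

```ruby
# Hypothetical stand-in for to_s(:db): "YYYY-MM-DD HH:MM:SS".
def db_format(time)
  time.strftime('%Y-%m-%d %H:%M:%S')
end

# Build the clause for a period, given its first and last instant.
def clause_for(operator, period_start, period_end)
  s = db_format(period_start)
  e = db_format(period_end)

  case operator
  when :equal                 then "BETWEEN '#{s}' AND '#{e}'"
  when :not_equal             then "NOT BETWEEN '#{s}' AND '#{e}'"
  when :less_than             then "< '#{s}'"   # strictly before the period starts
  when :less_than_or_equal    then "<= '#{e}'"  # up to the end of the period
  when :greater_than          then "> '#{e}'"   # strictly after the period ends
  when :greater_than_or_equal then ">= '#{s}'"  # from the start of the period on
  end
end

year_start = Time.new(2012, 1, 1, 0, 0, 0)
year_end   = Time.new(2012, 12, 31, 23, 59, 59)
puts clause_for(:equal, year_start, year_end)
# BETWEEN '2012-01-01 00:00:00' AND '2012-12-31 23:59:59'
```

The "or equal" variants include the whole period, so they compare against the far boundary; the strict variants exclude the period, so they compare against the near one.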

Slay the Dragon


We’ve finished current_year, time to do next_year. I can do this super quick! Copy, paste, change start and end date, DONE!

     when :next_year
+      start_date = (Date.today.beginning_of_year + 1.year).beginning_of_day
+      end_date = start_date.end_of_year.end_of_day
+
+      db_start_date = start_date.to_s(:db)
+      db_end_date = end_date.to_s(:db)
+
+      case date_filter.operator
+      when :equal
+        return "BETWEEN '#{db_start_date}' AND '#{db_end_date}'"
+      when :not_equal
+        return "NOT BETWEEN '#{db_start_date}' AND '#{db_end_date}'"
+      when :less_than
+        return "< '#{db_start_date}'"
+      when :less_than_or_equal
+        return "<= '#{db_end_date}'"
+      when :greater_than
+        return "> '#{db_end_date}'"
+      when :greater_than_or_equal
+        return ">= '#{db_start_date}'"
+      end
+

This works and was really easy to do, but now I obviously need to refactor it. This isn’t going to be quite as simple as the above implementation. I’ll equate this to fighting the dragon in our analogy. Therefore I’ll create a “before refactor” commit. Then I start fighting… er, coding.

 class Blog
   def self.build_where_clause(date_filter)
-      [deleted code]
+    start_date = get_start_date(date_filter.value)
+    end_date = get_end_date(date_filter.value)
+    return nil if start_date.nil? || end_date.nil?
       db_start_date = start_date.to_s(:db)
       db_end_date = end_date.to_s(:db)
@@ -25,33 +24,6 @@ class Blog
         return ">= '#{db_start_date}'"
       end
-      [deleted code]
     return nil
   end

git commit -am "stuff"

+  def self.get_start_date(value)
+    case value
+    when :current_year
+      return Date.today.beginning_of_year.beginning_of_day
+    when :next_year
+      return (Date.today.beginning_of_year + 1.year).beginning_of_day
+    when :current_quarter
+    when :next_quarter
+    when :current_month
+    end
+
+    return nil
+  end

git commit -am "stuff"

+  def self.get_end_date(value)
+    case value
+    when :current_year
+      return Date.today.end_of_year.end_of_day
+    when :next_year
+      return (Date.today + 1.year).end_of_year.end_of_day
+    when :current_quarter
+    when :next_quarter
+    when :current_month
+    end
+
+    return nil
+  end

git commit -am "stuff"

Friend the Dragon


This works, but I don’t like it. There is repetitive code and the start and end date calculations should really be treated as one unit. It’s a mess. I want to start over. With a project of this size, you could go either way, forward or back. But I’m sure you’ve encountered hairy refactoring scenarios with multiple files, and you just want to rethink it and start all over again. Often, I end up restarting the refactoring with a completely different perspective. What you might be tempted to do is hit undo many many times, and maybe redo a few times if you went back too much. You have to remember how much to undo and you have to remember which files to undo. Not this time! Instead of undo we will restore to the exact spot I want to start from.

To look at my log I type:

git log --oneline master..HEAD
5d66c8c stuff
5290a6a stuff
fd0173a stuff
4f1031b before refactor
e71ed80 stuff
606ca1b starting implementation
5d28af8 unit tests
36efc45 scaffolding for Blog

To restore my “before refactor” commit I type:

git reset --hard HEAD~3

This tells git to reset to 3 commits before the HEAD commit. You could also specify the hash like so:

git reset --hard 4f1031b

If this is an alternative that you might come back to, you can type git tag experiment1 before you do the reset to “save” the current commit as a tag called experiment1. Very quick and easy.

Side note: git has something called the reflog. Every change you make, every commit, every reset, it’s all recorded in the reflog. It’s kind of like git log, but it shows your change history rather than your repository history. As long as you’ve committed your code, it’s very hard to lose it. For example, what if you did want to tag the commit but you did the reset already? Or what if you typed git reset --hard HEAD~4 by accident? Try it out with “git reflog” or “git log -g”.

Now, let’s try to refactor this code in a different way.

 class Blog
   def self.build_where_clause(date_filter)
-      [deleted code]
+    start_date, end_date = get_date_range(date_filter.value)
+    return nil if start_date.nil? || end_date.nil?
       db_start_date = start_date.to_s(:db)
       db_end_date = end_date.to_s(:db)
@@ -25,33 +23,6 @@ class Blog
         return ">= '#{db_start_date}'"
       end
-      [deleted code]
     return nil
   end

git commit -am "stuff"

+  def self.get_date_range(value)
+    today = Date.today
+
+    case value
+    when :current_year
+      start_date = today.beginning_of_year
+      end_date = today.end_of_year
+
+    when :next_year
+      start_date = today.beginning_of_year + 1.year
+      end_date = start_date.end_of_year
+
+    when :current_quarter
+    when :next_quarter
+    when :current_month
+    end
+
+    start_date = start_date.beginning_of_day if !start_date.nil?
+    end_date = end_date.end_of_day if !end_date.nil?
+
+    return start_date, end_date
+  end

git commit -am "stuff"

Show Off!


Great! We are done refactoring! Let’s implement current_quarter.

     when :current_quarter
+      current_quarter = today.month / 3
+      start_date = Date.new(today.year, (current_qurter * 3) + 1, 1)
+      end_date = start_date + 3.months - 1.day
+

A coworker stops by and wants to see a demo of my code. But my code isn’t working! Let’s even assume that I don’t trust the refactored code yet. Not a problem, as long as the work in progress is committed first (git reset --hard discards uncommitted changes).

Create a show_off branch:

git checkout -b show_off

Reset to “before refactor”:

git reset --hard 4f1031b

Show off my code.

Go back to dev branch:

git checkout dev

Isn’t this cool? We can jump anywhere we want in the code and then jump right back to where we left off! If you prefer, you can also use tags.

Track Down Bugs


Back to current_quarter. Something is failing. I spot a typo and fix it.

-      start_date = Date.new(today.year, (current_qurter * 3) + 1, 1)
+      start_date = Date.new(today.year, (current_quarter * 3) + 1, 1)

git commit -a --amend

My unit tests are failing. Let’s put in some debugging code and fix the problem. Here is my diff.

   def self.get_date_range(value)
     today = Date.today
+    puts "today = #{today}"
     case value
     when :current_year
@@ -39,10 +40,16 @@ class Blog
       end_date = start_date.end_of_year
     when :current_quarter
-      current_quarter = today.month / 3
+      current_quarter = (today.month - 1) / 3
       start_date = Date.new(today.year, (current_quarter * 3) + 1, 1)
       end_date = start_date + 3.months - 1.day
+      puts "current_quarter = #{current_quarter}"
+      test_var = current_quarter * 3
+      puts "test_var = #{test_var}"
+      puts "start_date = #{start_date}"
+      puts "end_date = #{end_date}"
+
     when :next_quarter
     when :current_month
     end
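
To see why the one-line fix in the diff above matters, here is a small sketch in plain Ruby that compares the two formulas for every month. The method names are hypothetical and exist only for this illustration. The quarter index is zero-based, so months 1 to 3 should map to index 0 and a start month of 1.

```ruby
require 'date'

# Zero-based quarter index as first written (buggy): 3 / 3 is already 1,
# so March lands in the second quarter and December gets index 4.
def quarter_index_buggy(month)
  month / 3
end

# Zero-based quarter index after the fix: shift months to 0..11 first.
def quarter_index_fixed(month)
  (month - 1) / 3
end

# First month of the quarter, as passed to Date.new(year, month, 1).
def quarter_start_month(index)
  (index * 3) + 1
end

(1..12).each do |month|
  buggy = quarter_start_month(quarter_index_buggy(month))
  fixed = quarter_start_month(quarter_index_fixed(month))
  puts format('month %2d: buggy start month %2d, fixed start month %2d', month, buggy, fixed)
end

# With the buggy formula, December asks for month 13, which is not a date.
puts Date.valid_date?(2012, 13, 1)  # false
```

With the original formula, March, June, and September fall into the following quarter, and December produces an invalid month.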

Take Advantage of Diffs


Look at that last diff. It’s quite clear which part is the fix and which part is not. Even though I have debugging code in multiple places, I don’t have to remember where it is. All I have to do is pull up this diff. Now, when I encounter an issue, I am free to throw whatever I can think of at the problem. I can add debugging code all over the place, in multiple files. I can delete whole chunks of code or short circuit it to quickly get to the meat of the problem. I don’t have to worry about what I changed or about screwing anything up. In this case, I commit the one line and throw away the rest. Sometimes I will even create a debugging code commit so that I can leave it in while working. Later, I can remove that commit in one clean stroke without having to remember any of what I did.

Diffs are a central part of my workflow. I always have my diffs pulled up on my screen. We’ve already seen how useful they are for filtering out debug code. Diffs are great for spotting bad code in general. You essentially get to code review your code before every commit and it’s all in digestible chunks. I’ve caught many potential bugs this way. You can reduce your mistakes drastically by having a habit of checking your diffs.

More importantly, the diff allows git to keep context for me. At any given point I have a current train of thought. The diff highlights my current train of thought for me. There was a typo that I fixed in my code above. Notice how in the previous diff I don’t even see that? I already fixed it! I don’t need to think about it anymore. In this way, I keep my brain free of baggage. Every piece of code in this post is in the form of a diff because it’s such a great way to display your current context. You know exactly what I’m thinking and exactly what changed.

Squash!


Let’s go back to the code.

While I’m working, I notice that there are some cases that I missed in my unit tests. I fix that and create a commit.

git commit -am "squash with unit tests"

I don’t want to deal with squashing it right now, I’m in the middle of something. I leave the commit there. More on squashing in a bit.

I finish the rest of the work and commit it.

git commit -am "finish rest of work"
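
The post does not show the finished code. Purely as an illustration, here is one way the remaining cases could look in plain Ruby without ActiveSupport. Everything here is an assumption: date_range and quarter_start are hypothetical names, `today` is passed in so the result can be checked, and Date#>> (shift by months) replaces the 3.months arithmetic used above.

```ruby
require 'date'

# First day of the quarter that contains `today`.
def quarter_start(today)
  Date.new(today.year, ((today.month - 1) / 3) * 3 + 1, 1)
end

# Returns [first_day, last_day] for the period, or nil for an unknown value.
def date_range(value, today = Date.today)
  case value
  when :current_quarter
    first = quarter_start(today)
    [first, (first >> 3) - 1]   # day before the next quarter starts
  when :next_quarter
    first = quarter_start(today) >> 3
    [first, (first >> 3) - 1]
  when :current_month
    first = Date.new(today.year, today.month, 1)
    [first, (first >> 1) - 1]   # day before the next month starts
  end
end

puts date_range(:next_quarter, Date.new(2012, 11, 26)).map(&:to_s).join(' to ')
# 2013-01-01 to 2013-03-31
```

Shifting by whole months and stepping back one day avoids hard-coding month lengths, and it rolls over the year boundary on its own.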

Now I review the log of all my work so far.

git log --oneline master..HEAD
ebf3815 finish rest of work
99d4ce5 squash with unit tests
7ef5312 stuff
ddd0260 stuff
2e0a5b3 stuff
4f1031b before refactor
e71ed80 stuff
606ca1b starting implementation
5d28af8 unit tests
36efc45 scaffolding for Blog

We’ve created a bunch of commits to facilitate our work. Some of those commits have logic split up across several commits, some are missteps, some are incomplete, and some leave you in a non-working state. We need this while working but later on nobody cares how you got here. They care about being able to view (and roll back) changes in logically consistent and atomic operations. You want to clean up your history so everything is clear and easy to understand. To do this you combine your commits together. Git calls this squashing. The steps are:

  1. Create an interactive rebase
  2. Reorder your commits
  3. Choose which commits to squash

Let’s start the interactive rebase:

git rebase -i master

This shows you the following (note this is in reverse order from git log):

pick 36efc45 scaffolding for Blog
pick 5d28af8 unit tests
pick 606ca1b starting implementation
pick e71ed80 stuff
pick 4f1031b before refactor
pick 2e0a5b3 stuff
pick ddd0260 stuff
pick 7ef5312 stuff
pick 99d4ce5 squash with unit tests
pick ebf3815 finish rest of work

Now we reorder our commits. Move the “squash with unit tests” commit after “unit tests”. Then we choose which commits to squash by changing the “pick” to “squash”.

pick 36efc45 scaffolding for Blog
pick 5d28af8 unit tests
squash 99d4ce5 squash with unit tests
pick 606ca1b starting implementation
squash e71ed80 stuff
squash 4f1031b before refactor
squash 2e0a5b3 stuff
squash ddd0260 stuff
squash 7ef5312 stuff
pick ebf3815 finish rest of work

What squashing does is combine the commits together. We are moving the unit test commit next to the other unit test commit and combining them together. We are combining the implementation commits together.

Here is the log after we are done (once again in reverse order from above):

git log --oneline master..HEAD
81c0339 finish rest of work
6cd3185 starting implementation
27ed5cc unit tests
36efc45 scaffolding for Blog

This seems like a lot of extra work. Or is it?

Sort Cards Efficiently


You need to sort a deck of cards. You pick up the deck in your right hand and go through each card, moving it to your left hand in the correct place. It takes a while and when you are done you have a sorted deck of cards in your left hand.

Reset to before sort. You pick up the deck and toss the cards on the family room floor. You spread them out, sliding them to the correct positions. When all the cards are spread out you clean them all up into one deck at the end. Yes, you have to clean up the deck at the end but this is MUCH faster. You are optimizing for the actions that occur most frequently (sorting) rather than the one action (keeping the deck orderly).

I LOVE this. You are trading space for time. In the real world you have to handle garbage collection yourself but often that cost is negligible. This is a life hack. I use it EVERYWHERE. Folding laundry. Building a crib. Planning a trip. Browsing the web. This is an entire blog post by itself.

In exchange for speed while coding I use a lot of extra commits. At the end I garbage collect that and figure out which commits belong where. You can adjust depending on your needs. The better you name your commits, the easier this part is. I find that thinking about names slows me down so I try to spend as little time on naming as I can get away with.

Git Gives You Freedom


Git gives you the freedom to work the way you want to work.

  1. Freedom from Baggage. Baggage is all the stuff you try to keep track of in your head. You can only remember so many things. Baggage takes time to retrieve. Baggage causes stress. Baggage makes you look back. Get baggage out of your head and into git. You don’t have to keep track of your code or remember your context anymore.

  2. Freedom from Fear. Fear holds you back, makes you slow, and causes hesitation before you act. Don’t be afraid of forgetting things. Don’t be afraid of moving forward or doing too much. You can easily jump back to any previous state in an instant. It doesn’t matter which path you took, or how much code you’ve changed since then. You’re free to do anything you want once you’ve saved your commit. You can forge ahead and back out without fear of consequences.

High Quality, High Productivity


Ultimately our goal is to write high quality code and be super productive. We want to get stuff done, get it done fast, and make it great.

  1. Focus. Freedom gives you focus. You don’t have to remember, you don’t have to think, you don’t have to worry. Your head should be clear, focused, and streamlined. In my example, every step was focused and concise–even if it was incorrect. Focus helps with quality because you are only concentrating on one thing and git only shows this one thing. With only one thing to think about, bad code pops out at you. Focus lets you be productive because you’re not wasting effort on other things and because it’s hard to become distracted.

  2. Speed. Freedom gives you speed. When you don’t have to do extra work, or think about other things, you can work faster. When you aren’t afraid to act, when you aren’t afraid to try, there is no need to hesitate. It’s a given that speed can help your productivity. What about quality? Speed is awesome for quality. Fast iterations are so powerful. We solved the dragon’s riddles easily. We tried both ways of dealing with the dragon and chose the best option. Iteration gives you the ability to tweak. Be aggressive. Improve your code, improve your product.

In the beginning of this post I asked how you would approach things differently if you could have do-overs in life. This is what I chose to do with git. What are you going to do? Explore how you can use git to tailor a workflow that fits you.

Jeff Yang is a software developer on the Ad Platform team at Hulu.

Edits:
Explained HEAD~3 better.

Last comment: Apr 15th 2014 19 Comments
  • yao says:

    gotta get with git!

  • Peter says:

    This was really informative. One of the rare articles that actually makes me was to try what the author is talking about and see if I have the same success.

    The one section that was confusing was when you used the tilde. I had to search around for what that meant. It’s a little clearer now, but personally, I spent too much time focused on trying to understand how you got to ~3 and not understanding.

  • Frigeon says:

    Nice, make a mess to clean up faster method…been using that one almost everywhere but programming, thanks for sharing!

  • Scott says:

    Jeff, very nice article. I’d add one thing. I happen to know Jeff and he is an extraordinary coder. I would speculate that having good coding skills is a prerequisite to using this technique. The skills outlined in this post are similar to the discipline needed for superior code.

    However, it would be interesting to see if this technique could be taught to a person learning to code as a way of developing some of that discipline.

  • Jen says:

    Great post. Well-written, convincing, and informative. Git is the way to go! Sharing this with all my software developer friends. By the way, who is the talented artist that drew the dragons? I don’t see mention of the artist anywhere.

  • Bruce says:

    5/5 would read again

  • BeyondSora says:

    This is awesome!!! This is something I’ve always wanted to do but was never aware that git is actually capable of all this. Thanks for the great post :-)

  • Derick says:

    Excellent read. I love Git, and I’ve always known that its power allows for techniques like these, but I’ve never thought about employing a workflow quite like this myself. I’ll certainly be giving it a shot starting tomorrow morning.

  • Kaali says:

    You can improve the workflow a bit by using interactive rebase autosquash. If you prefix your commit message with “squash!” or “fixup!”, git rebase -i --autosquash will automatically mark those commits as squash or fixup commits.

    With Git 1.7.4, you can use the new commit aliases to make this easier, see the man for --fixup and --squash

  • Jim S says:

    Cool write-up — I wish more companies had a tech blog like Hulu.

    (Although somewhere around your edit note at the bottom, you are missing a closing italics tag, which causes the entire http://tech.hulu.com/blog/ page to be italicized below that point, at least on my Mac Chrome.)

  • Dan Stewart says:

    This post inspired me to record a screencast on git rebase versus merge. It’s close to 15 min, but it covers git init, commit, branch, rebase, and merge. I added autosquash just for Kaali. Enjoy, and thanks for this article: http://youtu.be/Z8kBgrF98cA

    • Jeff Yang says:

      Wow thanks Dan! Great screencast. Definitely makes some concepts easier to understand. And nice job using git to tell the tale of the dragon!

  • Jim says:

    Great article! I hope I could read it earlier!

  • Hang says:

    Great one Jeff! Sometimes I will have a “dirty” branch and do all the similar kill-the-dragon approach on that, but I never realize how it helps.

  • http://ii.ca says:

    Hi there! This post couldn’t be written any better! Reading through this post reminds me of my previous roommate! He constantly kept preaching about this. I will forward this information to him. Pretty sure he will have a good read. Thank you for sharing!
