Drew's World

Rants, News, Etc on my Life and Projects

Git Based Workflow Revised

by Andrew De Ponte (@cyphactor)

This post follows up on my previous post, An Ideal Git Based Team Workflow. It is the result of taking that concept, discussing it heavily with my team, and further contemplating the issues and solutions.

The Real Requirements

After reviewing my thoughts and discussing things further with my team, I determined that a number of aspects of the previously defined workflow are not needed. In fact, they were overcomplicating the workflow because I had misunderstood the actual requirements. One such supposed requirement was the idea that branches should parallel deployment environments. This is not actually a requirement, and I was sadly imposing it as one in my initial workflow. In hindsight this was primarily because it was the focal point of a number of discussions with one of my developers, Michael Genereux. In my opinion, this single mistaken requirement imposes a huge amount of complexity and constraint on the workflow for no significant value. The requirements that I now believe are true requirements of our workflow are presented in the list below.

  • Overhead on the developers should be as minimal as possible
  • Overhead on the integrators should be as minimal as possible
  • Hotfix deployments should be trivial to make
  • Content deployments should be trivial to make
  • Iterative development cycles can happen in parallel with Hotfix and Content development & deployment
  • Code review and feedback is needed for our inexperienced developers so that they can learn

The Two Roles

After isolating the correct requirements, I determined that in order to satisfy both the minimal-overhead requirements and the code review/feedback requirement, we were going to need two different roles within our workflow. The first role is what I will from this point on refer to as an “Integrator”. In my mind an “Integrator” is simply a developer who is skilled enough to follow carefully designed policy, doesn’t need much guidance, and fully understands the core concepts of Git and why the workflow policies exist. The second role is what I will refer to as a “Newb”. A “Newb” in this case is simply a less experienced developer who may need more guidance, heavier code review, and more direction in general.

The Integrator

The developers who fall under the “Integrator” title are responsible not only for doing their normal iteration development but also for performing code review and providing feedback to the “Newbs”. Beyond that, they are responsible for following the provided policies to manage Hotfixes, Content Changes, and the normal iterative development cycle. The policies for each of these areas with respect to the “Integrator” role are provided below.

Hotfixes

In this workflow I am defining hotfixes simply as changes that have to be made and deployed abruptly, outside of a normal iteration deployment. These are normally severe bugs that drastically hinder the user experience and must be fixed as soon as possible.

When creating a hotfix or series of hotfixes, developers in the “Integrator” role are required to create a remote branch for the hotfixes within the current week’s iteration. The branch should be based on the release that the hotfixes are going to apply to. The name of the branch should use the following format: "week_<week number>_hotfixes". Once the remote branch exists, development of the hotfixes should occur on it. Once the branch reaches a state in which it is ready to be released into the accelerated pipeline, it should be tagged using the release naming scheme and deployed.
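As a rough illustration of the hotfix policy above, here is a minimal Python sketch that builds the Git commands an “Integrator” might run to create and publish the weekly hotfixes branch. The helper name, the remote name "origin", and the example tag release-10.33-rf2 are assumptions made for illustration; only the branch name format comes from the policy itself.

```python
def hotfix_branch_commands(week_num, base_release_tag, remote="origin"):
    """Return git commands that create and publish a weekly hotfixes branch.

    The helper itself is hypothetical; it only formats command strings
    and does not run git.
    """
    branch = f"week_{week_num}_hotfixes"
    return [
        # Start the branch from the release the hotfixes apply to.
        f"git checkout -b {branch} {base_release_tag}",
        # Publish the branch so the rest of the team can use it.
        f"git push -u {remote} {branch}",
    ]


# Example with a hypothetical week number and release tag.
for command in hotfix_branch_commands(34, "release-10.33-rf2"):
    print(command)
```

Basing the branch on a tag rather than on master keeps the hotfix limited to what is actually deployed, which is the point of the policy.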

Content Changes

In this workflow I am defining content changes simply as changes which need to be deployed on a more frequent basis than the normal iteration deployments. Beyond that they are generally changes to assets or textual changes needed by marketing, etc. which have less of a functional involvement.

Developers in the “Integrator” role are not usually the ones making these changes directly. However, they are responsible for reviewing and integrating the changes, and for providing feedback to the “Newbs” who make them through the “Newb” workflow defined below. Hence, the “Integrators” should pull in the “Newbs’” content changes after reviewing them and merge them into the current week’s remote branch for content changes. This branch should have a name of the following format: "week_<week number>_content". If this branch doesn’t exist when an “Integrator” needs it, they are responsible for creating it. Once this branch reaches a state in which it is ready to be deployed, it should be tagged using the release naming scheme and deployed into the pipeline.

Normal Iteration

In this workflow a normal iteration is simply a release cycle of two weeks. Within a normal iteration, “Integrators” are responsible not only for managing content and hotfixes from “Newbs” but also for managing normal iteration development from “Newbs”. Beyond that, “Integrators” are responsible for performing their own development work for the current iteration. All development changes should be performed inside specific topic branches and never directly on the master branch. If features or bugs are large enough that they may consume multiple days of development, they should have their own remote topic branch. This allows other developers to participate, or at a minimum to obtain the partially completed changes you have made. Never developing directly on master also makes abrupt context switching trivial. Once a topic branch is ready to be merged into the current week’s iteration and is ready for the pipeline, the “Integrator” should merge it into the master branch and push the updated master branch to the central repository.
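The final merge step described above can be sketched as a short list of Git commands. This is only an illustration under stated assumptions: the helper name, the remote name "origin", the example branch f482_invoice_export, and the step that updates master before merging are mine, not part of the policy.

```python
def integrate_topic_branch_commands(topic_branch, remote="origin"):
    """Return git commands that merge a finished topic branch into master.

    Hypothetical helper: it only formats command strings and does not
    run git.
    """
    return [
        "git checkout master",
        # Assumption: bring the local master up to date before merging.
        f"git pull {remote} master",
        # Merge the completed topic branch into master.
        f"git merge {topic_branch}",
        # Push the updated master to the central repository.
        f"git push {remote} master",
    ]


# Example with a hypothetical topic branch name.
for command in integrate_topic_branch_commands("f482_invoice_export"):
    print(command)
```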

The Newb

The developers who fall under the “Newb” title are responsible for performing their development tasks and for learning the process and value of the workflow policies, as well as the technologies and languages they are using. Ideally the “Newbs” eventually progress to “Integrators”. Their tasks, like those of the “Integrators”, may consist of development work for Hotfixes, Content Changes, or Normal Iteration development. However, they differ in the process and policies they have to follow when performing these tasks. The workflow for each of these types of changes with respect to “Newbs” is provided below.

Hotfixes

A “Newb”, like an “Integrator”, is required to do all development within topic branches. However, when a “Newb” believes they are ready to have their changes included in the current week’s hotfixes branch, they must make a Pull Request via GitHub to their upstream repository. GitHub will then notify all of the “Integrators” of the new pull request and allow them to review the changes, provide any feedback, and merge the changes in. GitHub will also allow the “Integrators” to send the changes back to the “Newb” with feedback rather than merging them in.

Content Changes

For a “Newb”, content changes work exactly the same way as Hotfixes. They simply develop their content changes and submit an appropriate Pull Request via GitHub to their upstream repository so that an “Integrator” can deal with their changes and provide any feedback.

Normal Iteration

“Newbs” follow the same process for Normal Iteration development as they do for Hotfixes and Content Changes: develop the changes in a topic branch and submit a Pull Request via GitHub.

Requirements Review

I believe that the above workflows and policies provide a solid basis for a well rounded team with very little overhead and a decent amount of flexibility with respect to parallel releases being made at different rates. Please note additional parallel releases can be added simply by adding additional conceptual types of changes to my current list of Hotfixes, Content Changes, and Normal Iteration Development.

Release Naming Scheme

Given that we are developing in either one-week or two-week iterations, I have decided that we should use the following naming scheme for tagging releases:

release-YY.<week_num>-rc<release candidate counter>

The YY represents the two-digit year; for 2010 it should be 10. The <week_num> is the week number, and the <release candidate counter> is a counter for this iteration that resets at the beginning of each iteration. Once a final tagged release for production is ready to be made, it should use the following naming scheme.

release-YY.<week_num>-rf<release candidate counter>

All the variables in the naming scheme for final tags are the same as those in the latest rc for a weekly iteration. For example, if the latest rc is release-10.34-rc6, no changes have been made since that tag, and the code is ready to be tagged for a production release, it should be tagged as release-10.34-rf6.
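To make the naming scheme concrete, here is a small Python sketch that builds a release candidate tag and derives the matching final tag from it. The function names are hypothetical, and the choice not to zero-pad the week number is an assumption, since the scheme above does not say how single-digit weeks are written.

```python
import re

# release-YY.<week_num>-rc<counter> or release-YY.<week_num>-rf<counter>
TAG_PATTERN = re.compile(r"^release-(\d{2})\.(\d+)-(rc|rf)(\d+)$")


def release_candidate_tag(year, week_num, counter):
    """Build an rc tag, e.g. (2010, 34, 6) gives release-10.34-rc6."""
    # Assumption: the week number is written without zero padding.
    return f"release-{year % 100:02d}.{week_num}-rc{counter}"


def final_tag_from(rc_tag):
    """Derive the final (rf) tag that corresponds to the latest rc tag."""
    match = TAG_PATTERN.match(rc_tag)
    if match is None or match.group(3) != "rc":
        raise ValueError(f"not a release candidate tag: {rc_tag!r}")
    yy, week, _, counter = match.groups()
    # The final tag keeps the year, week, and counter of the rc tag.
    return f"release-{yy}.{week}-rf{counter}"


print(release_candidate_tag(2010, 34, 6))   # release-10.34-rc6
print(final_tag_from("release-10.34-rc6"))  # release-10.34-rf6
```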

Topic Branch Naming Scheme

When naming topic branches it is required that you use the following naming scheme.

<type char><ticket id>_<name>

The <type char> is a character that defines if it is a task (t), bug (b), or feature (f). The ticket id is the unique identifier of the ticket in the ticketing system that corresponds to this topic branch. The <name> is a short but meaningful name that describes the topic branch.
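A quick way to check a branch name against this scheme is a regular expression. The sketch below is only an illustration: the function name is hypothetical, and it assumes ticket ids are numeric and that names use letters, digits, underscores, and hyphens, none of which the scheme above specifies.

```python
import re

# <type char><ticket id>_<name>
# Assumptions: ticket ids are numeric; names use letters, digits, "_" and "-".
TOPIC_BRANCH_PATTERN = re.compile(r"^([tbf])(\d+)_([A-Za-z0-9_-]+)$")

TYPE_NAMES = {"t": "task", "b": "bug", "f": "feature"}


def parse_topic_branch(branch_name):
    """Return (type, ticket id, name) for a valid topic branch, else None."""
    match = TOPIC_BRANCH_PATTERN.match(branch_name)
    if match is None:
        return None
    type_char, ticket_id, name = match.groups()
    return TYPE_NAMES[type_char], ticket_id, name


# Hypothetical branch names for illustration.
print(parse_topic_branch("b1234_login_redirect"))  # ('bug', '1234', 'login_redirect')
print(parse_topic_branch("login_redirect"))        # None
```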

Don’t Tie Deployment Environments to Branches

So, as I stated in the opening I was assuming in the previous article that each deployment environment should have a branch that parallels it. This was an idea that was presented to me by one of my developers and I mistakenly ran with it. The following is an explanation of why I believe doing this is not valuable and potentially costly. Also, I am not pointing this out to pick on my developer. In fact it is quite the opposite. I am glad that he forced me to think about alternatives in great depth because the process has only more solidified my belief in the workflow and policies presented above. Beyond that, the explanation of my opinion may help him or others come to the same or even a better conclusion/workflow.

For some reason I couldn’t initially place my finger on what was wrong with using branches that paralleled the deployment environments of development, qa, staging, and production. I knew that I really hated the overhead of having to deal with cherry picking changes from one branch to the next to simulate the pipeline but I still simply saw that as its own issue which inspired me to come up with the previous and very flawed workflow, An Ideal Git Based Team Workflow.

Today after an insane amount of contemplation, discussion, and fiddling with the workflow the answer finally came to me. The problem with the Deployment Environments mapping to branches is that from a workflow perspective we don’t really care about the deployment environments other than the tagged release that is currently deployed in each environment. This information can easily be obtained without branches that map to the environments by correlating the deployed tag in a given environment to the matching tag in Git.
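As a sketch of that correlation, suppose each environment simply records the tag it is running. The mapping below is made-up example data and the helper name is hypothetical; the only Git feature relied on is the standard "<older>..<newer>" range syntax of git log.

```python
# Hypothetical record of which tagged release each environment is running.
deployed_tags = {
    "qa": "release-10.34-rc6",
    "staging": "release-10.34-rc5",
    "production": "release-10.33-rf4",
}


def pending_changes_command(deployed, ahead_env, behind_env):
    """Return a git command listing commits in ahead_env but not behind_env."""
    # "A..B" selects the commits reachable from B but not from A.
    return f"git log --oneline {deployed[behind_env]}..{deployed[ahead_env]}"


# What is in qa that production does not have yet?
print(pending_changes_command(deployed_tags, "qa", "production"))
```

No environment branch is involved here: the tags alone identify what each environment is running and how the environments differ.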

What we do care about with respect to a workflow, however, is the ability to cleanly share code changes and to handle a number of parallel releases. I have chosen to represent these parallel releases via different classifications of changes in the above workflow: specifically, Hotfixes, Content Changes, and Normal Iteration Development.

When you test these two models the results are very interesting. In the model where you have branches for each deployment environment, you end up spending a huge amount of overhead simulating something for no reason. The primary cost in overhead is the amount of merging and cherry-picking that is necessary to keep the states of the branches in line. Secondly, it does NOT allow you to make parallel deployments of various types unless those changes happen to fall within the various stages of your deployment pipeline.

In the second model, in which you have classifications of changes that map to parallel releases, the overhead of maintaining the deployment environment branches is completely eliminated. It also models the actual situation with respect to code and the natural requirements surrounding what needs to be done with or to that code. Therefore, it requires no additional overhead or maintenance.

Conclusion

The above is why I believe the model in which branches parallel deployment environments is costly and has no perceived value in comparison to its costs. As always, I am interested to hear people’s thoughts and opinions on my postings, so don’t hesitate to share.

An Ideal Git Based Team Workflow

by Andrew De Ponte (@cyphactor)

If you don’t know me, there is one thing you should know about me: I love tools that help make things easier. Git has definitely been one of those tools for me. However, I always felt like there was something in the power of Git that wasn’t really being taken advantage of because of people’s past knowledge of and training in centralized source control systems such as SVN. Git provides many advantages for an individual developer’s workflow that are difficult to argue against. However, I am more interested in a solid Git workflow for a team that has very little overhead.

The team of developers that I lead up for RealPractice just released our first Beta (still pretty alpha) of our product. Before this point in time I really hadn’t put in place a specific workflow largely because we had a small enough team that it was easily manageable and secondly because we simply had a huge amount of development before one release. Therefore, if things were broken during that period of time it was fine as long as what was in got fixed for the release. However, now that we have made the release I require a team workflow that gives us much more control of the development process, what gets included in the product, and at what stages things get included.

Centralized Branch Workflow

To begin with I looked heavily at a model that I have used in the past and seen used before. One of my developers Michael Genereux recommended this workflow as well when I told my team I was looking for a good workflow. However, I have only used this with very very small teams (2 man teams) in the past. Michael seemed to be a big supporter of this method so I discussed it with him and started playing with this workflow again.

This workflow uses Git in a centralized manner where you have an origin, and that origin has branches for each of the stages/environments: experimental, development, qa, staging, and production. The idea is that development happens on the respective branches and gets merged backwards into experimental so that it can follow the normal flow through the environments. In my mind this has a few major issues, which are as follows:

  • Development occurs in upper level branches. Ideally even if you are writing a bug fix you want that bug fix to go through all of the stages starting at the beginning. One could argue that you could develop the bug fix in the experimental branch and merge it into development branch and so on. However, this leads to the second issue.
  • This model works fine if you have a very small team that is extremely good at managing and knowing their commit history because when a change is made to experimental it somehow needs to get put into the development branch. Some would say you simply need to merge it into development branch. This argument is fine if you are a single dev and develop in a linear fashion on the experimental branch. However, if there is a team of people all sharing the same experimental branch then it can be a pain in the ass to identify which commits belong to bug fix Y or feature X and need to be cherry-picked into the next branch.

As you can probably see from the above points alone, managing the code base and what gets included in it can be very hard and consume an insane amount of time, especially if you have to go through and isolate commits that need to be cherry-picked into the next branch all the time. Plus, keep in mind that I am only talking about one level, experimental to development. This same process has to occur at every level, from experimental all the way through production.

GitHub to the Rescue (Pull Requests)

So after playing with the above a bit more and thinking pretty heavily, I decided there has to be a better way. I started thinking about how I develop and how I could commit code that would then go through the environment pipeline appropriately without the same overhead. That’s when I realized that the biggest overhead in the above process is the fact that the devs have to isolate commits for cherry-picking all the time. Hence, I started looking for a solution in the Git space and didn’t find anything directly within Git that seemed clean. Therefore, I started thinking about tools outside of Git, and GitHub came to mind. If you are using the pull-based distributed model, GitHub provides a feature called a “Pull Request”. This basically allows a developer to send a message to the upstream repository requesting that they pull in some changes. The beauty of the GitHub “Pull Request” is that it associates a range of commits with the pull request.

Hence, it allows a developer to develop a bug fix or feature in their repository and then simply create a “Pull Request” that includes the proper commit range. Then when the upstream developer/integrator receives the “Pull Request”, they can pull it into their repository and merge it in appropriately, knowing exactly what the proper commit range is. This drastically reduces the time spent isolating commits compared with the Centralized Branch Workflow. In my mind, having the Pull Requests contain the commit range is brilliant on GitHub’s part because it is an unbelievable time saver. Plus, they provide a web interface for testing conflict states of “Pull Requests” and merging them in, as well as a mechanism for communicating with the submitting developers in case you have to deny a pull request.

How to use this GitHub Pull Request Model

The following is how I want my developers to use this model. In one’s own local repository, they should create a topic branch for every task, bug, etc. Generally the branches should be named with the id of the task or bug so that we can identify commits with bugs and tasks in history. This also eases the local workflow by allowing me to switch a dev from one focus to another abruptly (I try to do this almost never, but sometimes it happens). Generally, a dev should be developing the bug fix, task, or feature and reach a point at which they believe that they have enough for me to pull their changes into the development branch. At that point they should create a “Pull Request”, because GitHub will help them by figuring out the commit range. If they continue to develop past the point where they want me to pull and neglect to make the “Pull Request” until later, then they will have to isolate the commit range they want me to pull using GitHub (which is still easier than just using Git by itself). Note that keeping each task, bug fix, etc. in its own topic branch makes commit isolation much easier for developers.

Once they have sent the pull request, all integrators are notified of it and then have the opportunity to pull the changes into the appropriate branch and move those changes up the pipeline as they see fit. Some developers see the act of making a pull request as overhead. However, they should note that it isn’t a side effect of GitHub; it is a side effect of having to move change sets up the environment pipeline. GitHub just provides a tool that makes it easier than doing it without GitHub.

Moving Changes up the Pipeline

The last step in this process is the act of moving things up the environment pipeline. This is much the same process as in the Centralized Branch Workflow with one distinct difference: it happens in a much more controlled environment because you don’t have all of your devs sharing the same branch space. Instead you, as the integrator, control the branch space. You may be asking, who cares? Well, I do for one, because it makes moving changes up the pipeline much easier by requiring far less cherry-picking. In fact, the majority of the time all that is needed is to merge pull requests into development, merge development up the pipeline, and so on. Do note, however, that every once in a while you do need to cherry-pick, but such cases are few and far between in comparison with the Centralized Branch Workflow.

So this is the current workflow that I am going with. I haven’t completely decided on how much access I will give my team with respect to direct pushing if any. But, that is just a balance that has to be found over time in my experience. Anyways, I hope the above gives some insight into the workflow and why I have chosen it.