In Part 1 of this 3-part series on Continuous Integration, we introduced Version Control and the CD workflows around it as the common underlying technologies for fast-moving technology companies. In Part 2, we created the private Gitlab system on AWS.
In this final post of the series, with a private Gitlab system in place, we will work toward a simpler workflow. Once you have a Version Control System in place, the key decision is choosing a “Branching Model” that best fits your environment and organization.
This link is a good overview of how the Gitlab organization recommends people branch using their tool, although almost any Branching Model is fair game with this system.
In this post, I’ll show a simple example of how you can take a project and start or stop various pieces of it in parallel or as things come up. This might include things like random urgent requests, even when you haven’t finished the previous portion of the same project, or tracking the tasks you need to accomplish.
I’ll provide milestones where you can combine and track major pieces of what you are trying to accomplish, so that as you get pulled in different directions, you can “stash” or commit your changes for that branch and come back to it later. You can also have many personal or group branches for the same project.
If you are diligent about keeping your own notes in a README, a TODO file or in the commit message, it is easy to pick back up where you left off, even after a long period of time. Just having a place where you can keep all of your “stuff” and keep track of where you left off (some place remote from your local machine and in an organized manner) is a huge benefit, versus trying to keep it all in your head or relying on a remote “backup” (with no branches).
We’re going to create a simple Python project called “python-getting-started” to step through the workflow. Go through the same steps as we did above, or follow this link again (Creating a project). Follow the commands for “Creating new repository” that are displayed after you create the project in Gitlab.
git add README.md
git commit -m "add README"
git push -u origin master
Then let’s create a “develop” branch.
git branch develop
git checkout develop
This gives you a master branch with the README in it, and that master branch is pushed up to your Gitlab server. You also created a “develop” branch which takes a copy of the master branch while splitting away from the master branch.
The thing to remember is that when you do git commands on your machine, until you do a “git push,” you haven’t moved anything to the server and everything is local.
Doing “git push --set-upstream origin develop” pushes the develop branch with the same README file up to the server. The “--set-upstream origin develop” part is only necessary the first time you push a branch to the server; from then on, a plain “git push” from within that branch is all that is necessary.
If you run the command “git branch” from your command prompt, you will see which branches you have locally. The asterisk represents the branch you are currently in.
“git branch -r,” for example, shows the branches that exist on the remote server, which in this case will be “origin/master” and “origin/develop.”
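To see the local-versus-server distinction for yourself, here is a minimal sketch that replays the steps so far in a scratch directory. It assumes a local bare repository (server.git, a hypothetical stand-in) in place of your Gitlab server, plus a placeholder user name and email; nothing here touches a real remote.

```shell
set -eu

# Scratch area; a bare repository stands in for the Gitlab server (assumption).
workdir="$(mktemp -d)"
git init -q --bare "$workdir/server.git"
git init -q "$workdir/python-getting-started"
cd "$workdir/python-getting-started"
git symbolic-ref HEAD refs/heads/master   # call the first branch "master" whatever the local default is
git config user.name "Example User"       # placeholder identity for the demo commit
git config user.email "user@example.com"
git remote add origin "$workdir/server.git"

# Same commands as in the post: commit the README and push master.
echo "# python-getting-started" > README.md
git add README.md
git commit -q -m "add README"
git push -q -u origin master

# Create and switch to develop (equivalent to "git branch develop" + "git checkout develop").
git checkout -q -b develop
remote_before="$(git branch -r)"          # develop exists only locally at this point
git push -q --set-upstream origin develop
remote_after="$(git branch -r)"

echo "remote branches before pushing develop:"
echo "$remote_before"
echo "remote branches after pushing develop:"
echo "$remote_after"
```

The first listing should contain only origin/master; origin/develop shows up only after the push.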
From here we have a simple basis for working and keeping things in a structured way. The idea is that “master” is production, the final version that the world can see, and should always be in a working, production-ready state.
The internal working version we will call “develop.” In this example, you could call it pre-production, or staging, or whatever you like. You would then create new branches from “develop” for the tasks that you need to work on. If collaborating with others, you would work in your branch, while others would work in their own branches. Everyone would merge to develop. Once the develop branch is deemed ready, you would merge it to master, or create a release branch that would be moved into production.
This is the idea of a branch-per-task workflow. Every organization has a way of breaking work down into individual tasks, tracked as issues, tasks, tickets and so on. Issues then become the team’s central point of contact for that piece of work. Task branching, also known as issue branching, directly connects those issues with the final product. Each issue is implemented on its own branch, ideally with the issue name or identifier in the branch name.
You’ll want to keep branches as small as is reasonable, limited to things that can be changed at one time. Long-lived branches run the risk that many things change in the develop or master branch in the meantime, so your updates may no longer line up with the branch you are trying to merge into. In other words, if many people are working in the same project, you might have to make more changes to integrate with the new version of the master branch. This is just an example of how you can work. It is not specific to “programming”; rather, it shows a workflow and the structure and organization inherent to it.
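One way to keep the issue identifier in the branch name is to derive the name from the issue number and title. The helper below is only a sketch: task_branch_name is a hypothetical function of my own, and the number-then-title format is an assumed convention, not something Gitlab requires.

```shell
# Build a branch name such as "21-update-readme-wording" from an issue
# number and title. Hypothetical helper; the naming scheme is an assumption.
task_branch_name() {
  issue_id="$1"
  title="$2"
  slug="$(printf '%s' "$title" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//')"
  printf '%s-%s\n' "$issue_id" "$slug"
}

task_branch_name 21 "Update README wording"
```

You could then pass the result straight to git, for example: git checkout -b "$(task_branch_name 21 "Update README wording")" develop.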
Some people say Git is overly complex, but you don’t have to be a git expert to start taking advantage of its benefits. The commands below are the majority of git commands many people use on a regular basis. You will use them in such a repetitive fashion that the patterns become second nature and you won’t even think about it.
One thing I would not recommend, if you can help it, is starting to learn git with one of the git GUIs. Use the CLI git commands, and from Gitlab’s Project page for the project you are working on, click on Commits, then click Network. This gives you a graphical view of your commits and branches, very similar to the view many of the GUIs give of how the activities are related. With that view, and using the CLI to learn the operations that a “button” in a GUI performs, you end up with a similar experience to using a GUI, but you will understand more.
In my experience, the GUIs try to assume what you want to do, and they hide functionality. If you learn a subset of the functionality from the CLI, you will better understand what is happening, and if you still prefer the GUI afterward, you will have a much better understanding of what is going on. Obviously you can do what you like, but having shown git to many people who had never used it or any other version control system, the ones who started with a GUI ended up having more problems and didn’t understand nearly as much of what they were doing. The ones who used the CLI had far fewer issues and a better understanding by comparison.
Below is an example of a common pattern you might use. You will want to create the issue on your gitlab site, if one wasn’t already created, for tracking purposes. Then when you need to start working on something you:
git checkout develop
git branch my-task-branch
git checkout my-task-branch
# Do your work
git commit -am "message describing this revision, Fixes #21"
# If all seems good
git checkout develop
git merge my-task-branch
Quick note: because “Fixes #21” is in the commit message, if issue 21 exists for that project, it will be closed automatically once that commit reaches the server. The server parses commit messages and matches on that pattern to know that you “Fix”ed that issue. (Gitlab closes the issue when the commit lands on the project’s default branch, which is master here.)
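If you want to check a commit message before you push, the matching can be approximated locally. This is a simplified sketch, not Gitlab’s actual pattern (which accepts more keywords and forms); closing_issues is a hypothetical helper name.

```shell
# Print the issue numbers that a commit message would ask the server to close.
# Simplified approximation of the server-side pattern; not Gitlab's real regex.
closing_issues() {
  printf '%s\n' "$1" \
    | grep -oiE '(fix(es|ed)?|close[sd]?|resolve[sd]?) #[0-9]+' \
    | grep -oE '[0-9]+$' || true
}

closing_issues "message describing this revision, Fixes #21"
closing_issues "update docs for #21"
```

The first call prints 21. The second prints nothing, because “for #21” only mentions the issue without a closing keyword in front of it.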
Additional common commands:
git add . # adds all files in your directory and subdirectories to be tracked as part of your project.
git branch -d my-task-branch # deletes the branch in the local repo.
git push origin :remote_branch_name # removes the remote branch on the server (same as: git push origin --delete remote_branch_name)
git pull # updates the local branch from the remote branch
These are only a few of the git commands, and there are many ways of using git. It is possible that you could use git for a reasonably long time and only use these commands and possibly a few more. My goal is to lower the barrier to entry for using git and a system built around it, Gitlab, so that more people who are not familiar with this type of automated, collaborative, distributed way of working can take advantage of its benefits. If people assume it is complex and think they will not be able to learn enough in a reasonable amount of time, they won’t even start.
In this next part, I’m referring to some sort of infrastructure or environment, such as a production environment, rather than using a version control system only to track text documents.
This workflow, with a branch that is intended to match production, can work in basically two ways: you either push or pull between your production environment and the master branch. You can have your system push from your master branch to your real production environment, or you can make changes to your production environment and save its state back to your master branch. If the direction is always from master to production, you can start to automate that process to make it fast, clean, efficient and tested.
This change, from making sure your version control system matches production (backing up production into a version control system) to pushing from version control, is the fundamental shift from people making unstructured changes to production to driving consistent changes through a structured, automated Continuous Integration System. If master is pushed to production, you always have a known working version of “production” as a version in your system, and you can switch back to “known good” if issues arise.
There are tools and systems that make the move from version control system to real-world environment (Production) easier to implement. This is the function of Continuous Integration and Continuous Deployment systems. This is where Gitlab’s builds, multi-runner, runners and their .gitlab-ci.yml instruction set come in, along with testing frameworks such as serverspec, unit testing frameworks, and many others. They also allow changes or tasks to be verified with repeatable tests as those changes move or merge from branch to branch, being “promoted” or merged upstream to production.
Here is a link for installing the gitlab-ci-multi-runner. The gitlab-ci-multi-runner controls your runners, which in turn run your builds and tests and send the results to GitLab. GitLab CI is the open-source continuous integration service included with GitLab that coordinates builds and testing. It is where all the actions run that you want to happen for your project as changes move between your branches. This can really be anything you can think of and instruct it to do. It is recommended that the multi-runner be installed on a separate system from your Gitlab server.
We’re now going to create a Project Specific Runner for this project, using a standard Python Docker image. Once your multi-runner is installed and registered with your Gitlab server instance, go to your project page, go to Settings in the left margin and click on Runners. We’re going to register a project-specific Runner to run in your multi-runner. The runner is a Docker image that runs as a container within your multi-runner process. There is a link for how to set up a new project-specific runner in the middle of the page.
The Continuous Integration example I am going through below is from this link. I previously set up my Heroku account (follow the instructions from the Continuous Integration example link above), which is used as the Production and Staging environment. I have variables for my Heroku API keys set up in my Gitlab project, which keeps their values hidden during builds. When the .gitlab-ci.yml file refers to a variable defined in the variables section under the project in Gitlab, the “key” is what is displayed in the console during the build process for all to see, and the “value” is what is actually used. So developers or anyone else can branch and run builds without having access to the Heroku environment or accounts in this case.
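To illustrate the key-versus-value idea, here is a small sketch of how a deploy script might refer to a secret by name only. The variable name HEROKU_STAGING_API_KEY and the helper require_secret are assumptions for illustration; use whatever keys you defined in your own project.

```shell
# Check that a secret is present without ever printing its value.
# $1 is the variable's name (the "key"); the value is looked up indirectly.
# Only pass trusted, literal variable names, since eval is used for the lookup.
require_secret() {
  eval "value=\${$1:-}"
  if [ -z "$value" ]; then
    echo "missing secret: $1" >&2
    return 1
  fi
  echo "using secret: $1 (value hidden)"
}

# In a real build the CI system would inject this; a dummy value is set here.
HEROKU_STAGING_API_KEY="dummy-value-for-demo"
require_secret HEROKU_STAGING_API_KEY
```

Anyone reading the build log sees the name of the key, while the value itself stays in the project settings.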
Use this for creating the runner (“Create runner” section) using the registration token that is specified from that page.
From the runner page for your project, you should see the runner you just created. If you click on that runner, you should see a “Last contact” section, and the time since last contacted should be recent. Refresh if necessary. Run “git branch” to confirm you are still in the develop branch (an asterisk should be next to “develop”).
We will now create a new branch for bringing files into your project, as if you were creating the files as part of working on the project. Run “git branch creating-base-files” and “git checkout creating-base-files,” then download the project zip file here. Extract the files to your python-getting-started project folder on your machine. Since you should still be in the creating-base-files branch, run “git add .” to add all the new files to it. Then run ‘git commit -m "files added to create base project."’ to commit them.
With your project-specific runner set up and the .gitlab-ci.yml file in the root of your project on this branch, the CI system uses that file as the instructions for what to do for your “Build” (.gitlab-ci.yml documentation). You should have the files in the picture below on your local system in your project folder, in the creating-base-files branch. (.gitlab-ci.yml will be “hidden,” so use “ls -la” on a Linux-like system to see the hidden files. You will also have a .git folder, which is where git keeps all of its own files.)
You will see in the .gitlab-ci.yml file there is a “test” section, a “staging” section and a “production” section. These are arbitrary names; you can choose whatever you like. The “test” section is run against all branches as they are pushed to the server. The staging section is run when the develop branch is pushed to the server, and it deploys to the Heroku staging environment. The production section is run for the master branch, which is deployed to production (Heroku in this case).
Your python-getting-started project and a project-specific runner are both ready to use. You now have your own Continuous Integration System set up, so let’s see what it does for you.
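The branch rules just described can be summarized as a small lookup. This is not how Gitlab evaluates .gitlab-ci.yml; it is only the rules from this post restated as a hypothetical shell function, jobs_for_branch, so you can see at a glance what runs where.

```shell
# Which sections of the example .gitlab-ci.yml run for a pushed branch,
# according to the rules described in this post (illustrative only).
jobs_for_branch() {
  case "$1" in
    develop) echo "test staging" ;;     # tested, then deployed to staging
    master)  echo "test production" ;;  # tested, then deployed to production
    *)       echo "test" ;;             # task and feature branches are only tested
  esac
}

jobs_for_branch my-task-branch
jobs_for_branch develop
jobs_for_branch master
```

The three calls print “test”, “test staging” and “test production” respectively.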
This video shows creating a feature or task branch from develop, making a trivial change to the README, committing that change, pushing the new branch up to the server, and watching the output from the CI’s build console. This runs only the “test” section of the .gitlab-ci.yml file.
Next I show promoting or merging that feature or task branch into the develop branch. This tests the branch, and deploys the changes to a staging environment, heroku in this case.
Lastly, I show promoting those same changes from develop into master, or production. This tests the branch and deploys those changes to the production environment, also in Heroku.
The example workflow shown is: branch from develop to create your own task or feature branch. Then do your work, committing wherever it makes sense to have a point you can revert back to if necessary. Push your task branch up to the server to have the “test” section of your .gitlab-ci.yml run against it, to confirm all works as expected and passes the tests. When everything works as expected, check out the develop branch, and merge your task branch into develop. This will run the test section again, and deploy your merged changes up to the staging environment. The ultimate goal is that all testing can then be performed as part of this stage from your CI/CD system, and if those tests pass, that same system will push your changes to production without manual intervention.
I hope I have shown you how to start using a Continuous Integration model of working. Once you make the fundamental switch from changing production directly, backing it up, and always being in “catch up” mode, to pushing changes in an automated way that is tested and trusted, you can drive your production environment wherever you want it to go.
The structured workflow allows individuals to work within a system with automation, while still getting the collaborative benefits of working in a group toward a common goal of merging your portions of the overall project. If you are not familiar with this way of working, you’ll see what is involved to make this work on your own systems. It requires a different way of working, and spending more time on some things to spend less time on others. The idea is that spending more time up front on building the basis for automation, testing, and deployment will pay off significantly. Once the basic system is set up, additional projects should be like ‘variations on a theme.’ If your organization is not using a Continuous Integration type model for a production environment today, I encourage you to test this out and attempt to move just portions to a more continuous, automated, tested process. Automate the easy stuff, so you can focus on more complex and interesting problems until those become easy and automated. Then repeat. Good luck!
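The whole promotion loop can be rehearsed locally before you try it against your own server. As a sketch, the script below uses a local bare repository (server.git, a hypothetical stand-in for Gitlab) and a placeholder identity, so no CI actually runs; the comments note which sections of the example .gitlab-ci.yml each push would trigger on a real setup.

```shell
set -eu

# Scratch setup: a bare repository stands in for the Gitlab server (assumption).
workdir="$(mktemp -d)"
git init -q --bare "$workdir/server.git"
git init -q "$workdir/project"
cd "$workdir/project"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
git remote add origin "$workdir/server.git"
echo "# python-getting-started" > README.md
git add README.md
git commit -q -m "add README"
git push -q -u origin master
git checkout -q -b develop
git push -q --set-upstream origin develop

# 1. Branch from develop, make a trivial change, and push the task branch.
git checkout -q -b my-task-branch
echo "A trivial change." >> README.md
git commit -q -am "trivial README change"
git push -q --set-upstream origin my-task-branch   # real setup: runs "test" only

# 2. Promote the task branch into develop.
git checkout -q develop
git merge -q my-task-branch
git push -q                                        # real setup: "test", then "staging"

# 3. Promote develop into master.
git checkout -q master
git merge -q develop
git push -q                                        # real setup: "test", then "production"

# Confirm that the change reached master on the stand-in server.
server_readme="$(git --git-dir="$workdir/server.git" show master:README.md)"
echo "$server_readme"
```

The final output is the README as stored on the stand-in server’s master branch, including the trivial change made on the task branch.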