Published by breki on 15 Dec 2008
Since I haven’t posted anything on development stuff for a long time, I decided to write about something that I think is very important – doing continuous integration (CI) builds the right way. Below is a collection of advice I have gathered over the years on what to do and what to avoid when developing build scripts, if you’re oriented towards CI principles:
- DO make the main build procedure easy to run. Create a “Build.bat” batch file that is obvious and simple to run, put it in an obvious place, and make the main build target the default one. There’s nothing worse than spending the whole day looking around the code to find out how to build it (and this is a common problem once the main bulk of developers leave a project and someone else arrives a year later and is given the task of maintaining the project).
- DO document the main build procedure. No Asimov’s Foundation series here, just a few short sentences describing the build targets. Or you could simply add “description” attributes to your NAnt/MSBuild targets.
- DO use the same build steps for local (developer’s) builds and CI server builds. The developer must be able to verify the build before he/she commits the code to the repository. Having a different procedure running on the CI server just means that sooner or later you’ll end up with half of your CI builds marked as failed because developers could not reproduce builds on their local machines.
- DO keep the main (“indicator”) build running under 10 minutes. Developers need to be able to detect any problems with their code changes as soon as possible. Forcing them to sit around and wait for half an hour to get feedback will only result in fewer commit cycles and more problems with integration. “Continuous” means continuous, not “I’ll do it after lunch”.
- DO separate the build procedure into several stages if your build takes longer than 10 minutes. The indicator build should make sure the code is built, analyzed, unit-tested and packaged. Everything else can be moved to the next stage(s), so…
- DO separate tests into unit and integration/acceptance tests. Unit tests typically work on isolated classes (usually with the help of mocks) and do not use external resources like databases, Web services and similar. This means they are fast, which makes them ideal for the first stage of the build. Integration tests, on the other hand, test interactions between various parts of the system. Mocks are still used, but only to mimic certain external systems. These tests typically run on an actual database, which means setting them up and running them is slow, so they should be moved to the later stages of the build (or even moved physically to a different CI server so we can parallelize the build). While we’re on the subject…
- DO use test categories to separate unit tests from integration tests. This way you can have both types of tests in the same assembly while telling the test runner (like Gallio or NAnt) to run only certain categories of tests.
- DO select some of the important integration tests as “smoke” tests and run them before all other tests. This way you won’t have to wait for all of the tests to finish before detecting that the build has failed. The important thing is to make the build fail fast so that bugs can be fixed as soon as possible.
- DO use FxCop, StyleCop and similar tools – and right from the start of the project. These tools can be a very useful way to auto-review the code of less experienced developers in the team. Which means less work for the project lead – he can concentrate on the substance of the code and leave the form to be polished by authors themselves. And FxCop can sometimes really discover bugs which would be difficult to detect otherwise.
- DON’T rely on developers’ locally installed 3rd party libraries and tools to help you build the code. Instead, store all of the libraries and tools you need (OK, I’m not talking about Visual Studio and SQL Server here 😉 ) under source control and reference them from there. Also, communicate to the developers the potential problems of having these tools installed in the GAC – the GAC is your enemy! Why? By having the 3rd party stuff stored under the project’s source control, your team has control over which versions of these tools are actually used in the project. Since people tend to work on different projects (sometimes at the same time), relying on their local environment means that sooner or later there will be a conflict or a hidden bug because of different versions. From our experience there’s really no need for any of the commonly used stuff (NUnit, MbUnit, Gallio, NCover, Sandcastle, FxCop… just take a look at the lib directory of one of our open source projects) to be installed on the machine to be able to use it. One notable exception is TestDriven.NET, but then again, you don’t really use it to run builds from the command line.
- DO separate the build script into a common one and one specific to the individual project. This will make your common script reusable for other projects and you will also have a chance to polish the script in small steps. The project-specific script should just define the stuff that’s unique to the project (for example what files to include in the build package) and then just call the common script to do the rest of the work. You can see a sample common NAnt script and a project-specific script (which I managed to construct in 2 years of working on different .NET projects). Feel free to abuse it. We’re moving to a different building tool anyway, but I’ll write about this some other time.
- DON’T spam your developers! Do not send them e-mail messages for each successful CI build. Set up the CI server to send e-mail notifications on failed and fixed builds only. This is the only way to make sure they are informed about problems with the build. Otherwise they’ll just ignore all CI e-mails altogether.
- DO treat the database as just a script. Some people think of databases as evil deities which, once set up, should be left alone or they will inflict horrible curses on the developer who dares to touch them. Databases are just text files containing SQL code and data, and they should be stored under source control. I would go one step further and say that they should be recreated/re-migrated as part of each build. This of course means that every developer should have a database engine installed on their development machine. But this is the only way to make sure your .NET code and DB code are synchronized. And that your SQL scripts work! Some would say that they have huge databases with millions of records and they can’t afford to obliterate them 10 times a day, but I would ask them: do you really need 1 million user records to test the code for reading user records? Don’t confuse integration tests with performance tests. Performance tests are usually not part of the CI build because they take too long and setting them up can be a bit of a pain.
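Two of the points above, failing fast across build stages and mailing only on failed and fixed builds, boil down to very little logic. Here is a minimal Python sketch of the idea. It is not NAnt and not taken from our build scripts; the names `run_stages` and `should_notify`, and the way a stage is represented, are made up purely for illustration:

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical representation of a build stage: a name plus a callable
# that returns True when the stage succeeds.
Stage = Tuple[str, Callable[[], bool]]


def run_stages(stages: List[Stage]) -> Optional[str]:
    """Run stages in order; return the name of the first failing stage, or None."""
    for name, step in stages:
        if not step():
            # Fail fast: the slower stages that follow are never started.
            return name
    return None


def should_notify(previous_ok: bool, current_ok: bool) -> bool:
    """Send e-mail only for failed builds and for the first good build after a failure."""
    failed = not current_ok
    fixed = current_ok and not previous_ok
    return failed or fixed


if __name__ == "__main__":
    # Fast stages first, smoke tests before the full integration suite.
    pipeline: List[Stage] = [
        ("compile", lambda: True),
        ("unit tests", lambda: True),
        ("smoke tests", lambda: False),        # simulated failure
        ("integration tests", lambda: True),   # not reached in this run
    ]
    failed_stage = run_stages(pipeline)
    build_ok = failed_stage is None
    if should_notify(previous_ok=True, current_ok=build_ok):
        print("notify developers, failed stage:", failed_stage)
```

Ordering the stages from fastest to slowest is what makes the early return useful: a broken smoke test stops the run before the expensive integration suite is even started, and a green build that follows another green build stays silent.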
- Automation for the people: Continuous Integration anti-patterns by Paul Duvall
- Evolutionary Database Design by Martin Fowler et al.
- Continuous Database Integration by Peter Hancock
- Get Your Database Under Version Control by Jeff Atwood