The Subjective Geek: Software Development 101

Indulge me while I take a paragraph or two to get to the point.

I've worked as a Software Engineer at a lot of places now. I've stayed in some for many years, others only months; if there's not anything interesting left for me to achieve, I usually start to feel like moving on. I'd like to think this is best for everyone: I get to take my ideas and experience and give a different company the benefit of them, and my (now ex-)employer gets to bring in a new employee with a different set of ideas and experience. This cross-pollination of ideas is a part of the life of software development, but there is a balance. Keep your pool of developers too static, and you risk stagnating. Turn them over too quickly, and you'll be spending too much money on recruitment and training, and not keeping people around for long enough to see a return on that investment. Get big enough, and you can get these benefits just by letting people transfer within your company: the best of both worlds. That's not what I want to talk about, though.

I want to talk about the baseline ideas and experiences I bring to a new company. The stuff that I just want to get out of the way so I can start actually innovating.

Now, not everyone needs these ideas. In fact, my ideal employer doesn't need any of them. Every time I start somewhere new, these are the things that I hope are already covered, so I can get on with solving interesting problems. These are also some of the things I ask about when I'm interviewing a new employer; they're problems that I've solved over and over, and I know the consequences of not solving them. If I sound a prospective employer out in an interview, find out that they haven't solved these problems, and it sounds like there won't be any support for me solving them when I start, I'll seriously think about turning down the offer. Some of them are optional; some of them are even inappropriate for some teams. Some of them are so basic that you can't expect to build decent software at all without them. Every one of them I've had to introduce to an established team at some stage or other.

Source Control
This is the number one must-have for software development. Even when I'm working on a small, personal project which nobody else will ever touch, I still use source control. It's just too simple to set up, and the benefits are too enormous.

One of the first things I learned about source control was that Microsoft Visual SourceSafe is not source control. At least, not at the sort of level you want even for a small, personal project, and certainly not at the level you need to build commercial software. I've suffered through it with several employers now, and there are just too many things I keep wanting to do that it doesn't let me. Git seems to be the current darling of the open-source world; it's free, it's quick to get started with, and tasks like branching and merging are said to be quick and straightforward. I haven't worked with Git yet: I always seem to end up using the well-known staple, Subversion (and earlier in my career, its spiritual predecessor, CVS).

If you haven't solved your source control problem yet (or if you think you have, but you're using VSS), Subversion is a good place to start. It provides all of the basics, it's easy to set up, and it does most of the things you're going to need to do for a small- to medium-sized team. There are plenty of hosted solutions available (with all sorts of optional add-ons), but I really think something this important is worth hosting in-house. It's easy to do, you're in control, and you won't go dark every time your ISP hiccups. Do back your repository up. If you go with a hosted solution, make sure you have some way of backing up the repository yourself in case the worst happens.

What are the benefits? Well, for starters, you can see your entire development history. If you introduce a bug but can't quite remember exactly what you changed, you can see the entire history of a single file, a whole folder, or your entire project. If someone else introduces a bug, you know who to go to to discuss a possible fix. If you really mess things up, you can roll back to a previous version. You can merge in code from multiple developers as you go, even within a single file, and you can branch the code off while you work on significant changes, and then merge it back into the trunk later. If you don't have a decent source control solution in place, you have all the makings of multiple on-going headaches; if you do, you have a powerful tool on hand.

There's one thing I'd like to add here, and not everyone will agree with me. I believe in small, frequent, buildable commits. I don't like seeing a commit which touches 35 source files, has two pages of comments (which still finish with "lots of other little fixes"), and resolves a dozen bugs and nine different feature requests. You lose a lot of the benefits. Keep your commits small and granular; try to fix only one or two things each commit, and keep your repository buildable and passing all of your tests. As a bonus, your team will spend much less time on merging, particularly if they're updating frequently as well.

Database Version Control
Closely related to source control is database version control. Many large projects end up with some sort of database back end, and that database will grow and evolve with your code. You will end up with various versions of the database lying around, and you will need a matching build of your product to work with each one. If you don't have a good plan in place before you start putting builds out there, you're going to create a headache for yourself.

Let me illustrate this one a little further. Let's say you've built a billing system. You offer a hosted solution, where each client gets their own separate virtual server running a full-blown database and your web-based product, and you manage it for them. You also support client installations, where you generate the database on their server and allow them both to host the web product and to run their own copy of the management system. So for n clients, you have n servers, n database instances, and n builds of your system.

Now you make some changes. You need to add a table or two. Change some keys. You now have version 2, which you proudly release, along with a tool to update the database from the previous version. Not everyone updates, but you think you will be fine to support version 1 as well.

After a few bug fixes to version 1, and a few minor updates to version 2, both databases look a little different than they did when you built the tool to update a version 1 database to version 2. Some clients aren't on the original version 1 or the latest bugfix build, but somewhere in between. Most of your version 2 clients are fully patched, but some of them are holding out because your latest release broke some functionality they relied on. A few haven't patched since 2.0. You're spending too much money supporting version 1 in its various builds, version 2 has a few builds out there, migrating any given installation from 1 to 2 now costs much more than the update licence fee, marketing is screaming at you to release version 3, and you have folders full of SQL script files, a few spreadsheets to keep track of who's applied which patches to which databases (mostly accurate), and your DBA just took six weeks' medical leave (citing stress).

What went wrong?

You didn't manage your database versioning. This seems to be a common problem, and (depending on your development platform) there may not be many tools out there to help you. Believe me when I tell you that a few folders full of SQL scripts and a couple of spreadsheets are not going to cut it.

What you should do about this is very much dependent on your development platform. If you're using Hibernate and developing in Java, your solution will be very different to that of someone using Ruby on Rails. I do a lot of development in C# on .NET at the moment, and we have a fairly complex NHibernate-based data layer. After looking at a few of the different tools out there, I decided to roll my own. Whenever we make a change to our data layer, we write a script to update the previous version to the current one. Whenever any of our builds accesses a database, the data layer looks at a table in that database which is automatically kept up to date with the scripts which have been run. If the versions don't match, a client tool or website will display a message, and an administration tool will provide the option of running any pending update scripts available to that build of the tool. With a single click, we can take any database we've produced since I introduced the system and update it to any newer version we have a working build for. This is similar to the migrations system Ruby on Rails uses. There are other approaches that will work, but whatever you decide to do, a well-thought-out database versioning plan will save you both headaches and money.
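To make the idea concrete, here is a minimal sketch of that kind of version table in Python, using SQLite as a stand-in database. It is not the system I actually built (that one sits on NHibernate and C#); the table name `schema_version`, the function names, and the example scripts are all made up for illustration.

```python
import sqlite3

# Hypothetical update scripts: script N takes the schema from version N-1 to N.
# In a real product these would ship with each build.
MIGRATIONS = [
    (1, "CREATE TABLE invoice (id INTEGER PRIMARY KEY, amount INTEGER NOT NULL)"),
    (2, "ALTER TABLE invoice ADD COLUMN currency TEXT NOT NULL DEFAULT 'USD'"),
    (3, "CREATE TABLE payment (id INTEGER PRIMARY KEY, invoice_id INTEGER)"),
]


def applied_versions(conn):
    """Return the set of script versions already recorded in this database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version ("
        "version INTEGER PRIMARY KEY, "
        "applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return {row[0] for row in conn.execute("SELECT version FROM schema_version")}


def pending_migrations(conn, migrations=MIGRATIONS):
    """List the (version, sql) scripts this build knows about but the database lacks."""
    done = applied_versions(conn)
    return [(version, sql) for version, sql in sorted(migrations) if version not in done]


def upgrade(conn, migrations=MIGRATIONS):
    """Run every pending script in order and record each one as applied."""
    applied = []
    for version, sql in pending_migrations(conn, migrations):
        # `with conn` commits on success and rolls back any open transaction on error.
        with conn:
            conn.execute(sql)
            conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
        applied.append(version)
    return applied
```

A client tool would call `pending_migrations` and show a message if the list is non-empty; an administration tool would call `upgrade`. Running `upgrade` a second time does nothing, which is exactly what you want when you can't remember which patches a given database has seen.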

A Build Server
I'm a huge fan of Continuous Integration. I like small, regular commits, automated builds and unit tests, and a clean build environment. I'm currently using CruiseControl.NET, but there's also the original Java-based CruiseControl, and other products which accomplish the same task. Basically, every time somebody commits some code, your CI server checks out the latest changes, builds your system, and (optionally) runs some or all of your test suite. I like to run a cut-down version of our unit tests on every build, and schedule a longer set of tests to run overnight. This gives developers rapid feedback about the success or failure of their commits, while still ensuring builds are tested exhaustively on a regular basis.
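Stripped of the scheduling and reporting, the core of what a CI server does on each commit is small. The sketch below is my own illustration in Python, not how CruiseControl.NET is implemented; `run_pipeline` and the step names are hypothetical, and the `make` targets in `EXAMPLE_STEPS` are placeholders for whatever your project actually uses.

```python
import subprocess
import sys


def run_pipeline(steps, cwd=None):
    """Run (name, command) steps in order and stop at the first failure.

    Returns (succeeded, results), where results lists (name, exit_code)
    for every step that was actually run.
    """
    results = []
    for name, command in steps:
        completed = subprocess.run(command, cwd=cwd)
        results.append((name, completed.returncode))
        if completed.returncode != 0:
            # A broken build should never go on to run the tests.
            return False, results
    return True, results


# Hypothetical steps for a Subversion working copy; the build and test
# commands are assumptions and will differ for every project.
EXAMPLE_STEPS = [
    ("update", ["svn", "update"]),
    ("build", ["make"]),
    ("fast tests", ["make", "test-fast"]),
]
```

The overnight run is the same function with a longer list of steps. The important design choice is stopping at the first failure, so the feedback a developer gets points at the step that actually broke.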

I don't think a continuous integration server is nearly as critical to a team's success as having a good source control solution. I'll even accept that CI might be overkill in some circumstances. But for any decent-sized project, I really think it's worth having one.

Issue Tracking
All too often, companies don't have a good issue tracking system. By which I really mean: All too often, companies use Excel as their issue tracking system.

Now, Excel is actually fairly good at keeping lists of things. It will even work reasonably well at issue tracking for very small projects. It doesn't take long to outgrow it, though, and you shouldn't cling to it once it stops working well. There are some really good systems out there, many of them with hosted solutions, some of them free. If you're after free and open-source, Bugzilla comes immediately to mind. Wikipedia has an extensive list of issue tracking software. There are plenty of hosts out there which offer integrated issue tracking and source control, and it turns out you get some nice benefits out of combining the two into a single solution. Google Code offers free hosting to open source projects which combines source control, issue tracking, and a documentation wiki.

A Coding Standard
The problem with not having a coding standard is that some developers are lazy. Lazy can be good: a good lazy developer won't write code three times when they could factor it out and write it once. Unfortunately, lazy can also be bad. An otherwise good developer who never writes any comments or documentation can cause all sorts of headaches for maintenance. A developer who isn't lazy enough may write a dozen similar functions rather than spending a little thought to do the same task in a tenth as many lines.

You could end up with a religious war over the size of tab stops. I've seen this happen.

A good coding standard is a very useful tool. A long, wordy, and generally bloated coding standard, on the other hand, is mainly only useful when printed out and used as a door-stop. I won't go into too many details here, but if you don't have a standard at all, try to come up with a page or two, dealing with comments, member naming, namespaces, and the like. If it's written down, you can hold people to it.

Team Meetings
I just can't emphasize this enough. Developers often hate team meetings ("Can't I just get on with writing code?"), but that's usually because team meetings aren't run well. You get three major benefits from having team meetings:

1. You won't end up with two developers doing the same thing.
   You only get this benefit, of course, if you use the meeting to talk about what you're going to do. Meetings are not information-gathering sessions. They're not show-off sessions. You're here to plan what to do, and if you talk about doing things before you dive in, you can avoid duplicating work.
2. Developers will have a basic idea of the achievements of other team members.
   There's no point doing lots of work getting a framework up to date if the people using the framework don't know about it. I lied before. Meetings are show-off sessions. Just keep it short and to the point. We're not here to listen to an hour-long lecture on the intricate object architecture your lead architect is so proud of.
3. You'll pick up bad decisions before they hurt you, and maybe stumble across some good ones too.
   One developer might hear about something someone else is working on, and point out something you've already built which will help. Or just having a bunch of smart engineers in the room for ten minutes, chilling out at the end of the meeting, will be enough to come up with a great new idea.

I'm sure there are other benefits, too. As a team leader, you might pick up on some under-current running through the team that's not obvious when you're not all together. You might discover that one of your developers has heaps of experience in an area you were about to start a project in. The possibilities are endless.

Just keep them short and relevant, and always always ALWAYS head off private discussions early. Your whole team doesn't need to listen while two developers hash out a network protocol because they don't bother having meetings outside the general team meeting. This is the number one cause of engineers hating meetings. If you've got a whole-team discussion which seems to be going around in circles, appoint one or two people to investigate the issue later, and move on.

Finally, make sure someone is taking notes and distributing them. There's no point in spending time making a decision if you're all going to forget it. There's no point in giving someone a task if there's no follow-up. There's no point in deciding what everyone will be working on for the rest of the week if everyone's forgotten by mid-afternoon on the same day. Send out a summary email right away. Make it so short that even the laziest of developers will skim it.

Mentors
This obviously depends a little more on your team make-up, but I love having a mentor system in place where possible. Junior developers won't develop if they don't have someone to advise them. Don't leave them to struggle with an architecture, trying and failing until they end up learning a bad solution: give them someone to go to for advice, so they can go straight to implementing a good solution.

I also love having periodic training seminars. Make your senior engineers run seminars. Make your junior engineers run seminars, and make your senior engineers show up to give them feedback. Provide pizza.

A Fast Deployment Process
If your product is quick and easy to deploy, people won't screw it up. The highest-pressure deployments are often the most crucial: marketing has a huge client on the boil, they're ready to sign a massive contract, and they were even nice enough to give you reasonable notice about the feature they simply must demo to make the sale, but the timeline is tight, you've barely finished the feature, and the meeting is about to start. Now is not the time to start on the intricate multi-multi-step deployment process with a dozen things that you can get wrong. Your deployment process should include backing up the previous version, a simple build installation (preferably from your clean, always-tested build server), and a rollback process in case your deployment breaks everything. It needs to deal with database versioning on the spot (and database backup too!). There's a lot to be done there, potentially serious consequences if you mess it up, and the opportunity to look really slick to your clients when you roll updates in seamlessly, with virtually zero downtime and never a mistake made.
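The back-up, install, and roll-back steps are simple enough to script, and a script doesn't get flustered ten minutes before a demo. Here's a rough Python sketch for a product that deploys as a directory of files. The `deploy` function and its `health_check` callback are hypothetical names of mine, and the database backup and versioning step is deliberately left out; a real process would need it.

```python
import shutil
from pathlib import Path


def deploy(build_dir, live_dir, backup_dir, health_check):
    """Replace live_dir with build_dir, restoring the backup if the check fails.

    health_check is a callable that takes the live directory and returns
    True if the new installation looks healthy.
    Returns True if the new build is live, False if we rolled back.
    """
    build_dir, live_dir, backup_dir = Path(build_dir), Path(live_dir), Path(backup_dir)

    # 1. Back up the current version, discarding any older backup.
    if backup_dir.exists():
        shutil.rmtree(backup_dir)
    if live_dir.exists():
        shutil.copytree(live_dir, backup_dir)

    # 2. Install the new build, then 3. check that it works.
    try:
        if live_dir.exists():
            shutil.rmtree(live_dir)
        shutil.copytree(build_dir, live_dir)
        healthy = bool(health_check(live_dir))
    except OSError:
        healthy = False
    if healthy:
        return True

    # 4. Roll back to the backed-up version.
    if live_dir.exists():
        shutil.rmtree(live_dir)
    if backup_dir.exists():
        shutil.copytree(backup_dir, live_dir)
    return False
```

Copying directories like this is not a zero-downtime deployment; it only shows the shape of a process where the rollback is a code path you've already written, not something you improvise under pressure.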

And more...
These ideas, and more, are the baseline things I bring to a new employer. Not every idea works for every employer, but they're the low-hanging fruit for making life immediately better.

Once things like these are out of the way, you can get down to doing some serious innovating, knowing that the basics aren't going to trip you up every step of the way.


Tuesday, July 27, 2010
