11 - Collaborative Development

Notes:

21.1 Overview of Collaborative Development Practices

"Collaborative construction" refers to pair programming, formal inspections, informal technical reviews, and document reading, as well as other techniques in which developers share responsibility for creating code and other work products. At my company, the term "collaborative construction" was coined by Matt Peloquin in about 2000. The term appears to have been coined independently by others in the same time frame.

All collaborative construction techniques, despite their differences, are based on the ideas that developers are blind to some of the trouble spots in their work, that other people don't have the same blind spots, and that it's beneficial for developers to have someone else look at their work. Studies at the Software Engineering Institute have found that developers insert an average of 1 to 3 defects per hour into their designs and 5 to 8 defects per hour into code (Humphrey 1997), so attacking these blind spots is a key to effective construction.

Collaborative Construction Complements Other Quality-Assurance Techniques

The primary purpose of collaborative construction is to improve software quality. As noted in Chapter 20, "The Software-Quality Landscape," software testing has limited effectiveness when used alone-the average defect-detection rate is only about 30 percent for unit testing, 35 percent for integration testing, and 35 percent for low-volume beta testing. In contrast, the average effectivenesses of design and code inspections are 55 and 60 percent (Jones 1996). The secondary benefit of collaborative construction is that it decreases development time, which in turn lowers development costs.

Early reports on pair programming suggest that it can achieve a code-quality level similar to formal inspections (Shull et al 2002). The cost of full-up pair programming is probably higher than the cost of solo development-on the order of 10-25 percent higher-but the reduction in development time appears to be on the order of 45 percent, which in some cases may be a decisive advantage over solo development (Boehm and Turner 2004), although not over inspections which have produced similar results.

Technical reviews have been studied much longer than pair programming, and their results, as described in case studies and elsewhere, have been impressive:

IBM found that each hour of inspection prevented about 100 hours of related work (testing and defect correction) (Holland 1999).
Raytheon reduced its cost of defect correction (rework) from about 40 percent of total project cost to about 20 percent through an initiative that focused on inspections (Haley 1996).
Hewlett-Packard reported that its inspection program saved an estimated $$ 21.5$ million per year (Grady and Van Slack 1994).
Imperial Chemical Industries found that the cost of maintaining a portfolio of about 400 programs was only about 10 percent as high as the cost of maintaining a similar set of programs that had not been inspected (Gilb and Graham 1993).
A study of large programs found that each hour spent on inspections avoided an average of 33 hours of maintenance work and that inspections were up to 20 times more efficient than testing (Russell 1991).
In a software-maintenance organization, 55 percent of one-line maintenance changes were in error before code reviews were introduced. After reviews were introduced, only 2 percent of the changes were in error (Freedman and Weinberg 1990). When all changes were considered, 95 percent were correct the first time after reviews were introduced. Before reviews were introduced, under 20 percent were correct the first time.
A group of 11 programs were developed by the same group of people, and all were released to production. The first five were developed without reviews and averaged 4.5 errors per 100 lines of code. The other six were inspected and averaged only 0.82 errors per 100 lines of code. Reviews cut the errors by over 80 percent (Freedman and Weinberg 1990).
Capers Jones reports that of all the software projects he has studied that have achieved 99 percent defect-removal rates or better, all have used formal inspections. Also, none of the projects that achieved less than 75 percent defectremoval efficiency used formal inspections (Jones 2000).

A number of these cases illustrate the General Principle of Software Quality, which holds that reducing the number of defects in the software also improves development time.

Various studies have shown that in addition to being more effective at catching errors than testing, collaborative practices find different kinds of errors than testing does (Myers 1978; Basili, Selby, and Hutchens 1986). As Karl Wiegers points out, "A human reviewer can spot unclear error messages, inadequate comments, hard-coded variable values, and repeated code patterns that should be consolidated. Testing won't" (Wiegers 2002). A secondary effect is that when people know their work will be reviewed, they scrutinize it more carefully. Thus, even when testing is done effectively, reviews or other kinds of collaboration are needed as part of a comprehensive quality program.

Collaborative Construction Provides Mentoring in Corporate Culture and Programming Expertise

Software standards can be written down and distributed, but if no one talks about them or encourages others to use them, they won't be followed. Reviews are an important mechanism for giving programmers feedback about their code. The code, the standards, and the reasons for making the code meet the standards are good topics for review discussions.

In addition to feedback about how well they follow standards, programmers need feedback about more subjective aspects of programming: formatting, comments, variable names, local and global variable use, design approaches, the-way-we-do-things-around-here, and so on. Programmers who are still wet behind the ears need guidance from those who are more knowledgeable, and more knowledgeable programmers who tend to be busy need to be encouraged to spend time sharing what they know. Reviews create a venue for more experienced and less experienced programmers to communicate about technical issues. As such, reviews are an opportunity for cultivating quality improvements in the future as much as in the present.

One team that used formal inspections reported that inspections quickly brought all the developers up to the level of the best developers (Tackett and Van Doren 1999).

Collective Ownership Applies to All Forms of Collaborative Construction

With collective ownership, all code is owned by the group rather than by individuals and can be accessed and modified by various members of the group. This produces several valuable benefits:

Better code quality arises from multiple sets of eyes seeing the code and multiple programmers working on the code.
The impact of someone leaving the project is lessened because multiple people are familiar with each section of code.
Defect-correction cycles are shorter overall because any of several programmers can potentially be assigned to fix bugs on an as-available basis.

Some methodologies, such as Extreme Programming, recommend formally pairing programmers and rotating their work assignments over time. At my company, we've found that programmers don't need to pair up formally to achieve good code coverage. Over time we achieve cross-coverage through a combination of formal and informal technical reviews, pair programming when needed, and rotation of defect-correction assignments.

Collaboration Applies As Much Before Construction As After

This book is about construction, so collaboration on detailed design and code are the focus of this chapter. However, most of the comments about collaborative construction in this chapter also apply to estimates, plans, requirements, architecture, testing, and maintenance work. By studying the references at the end of the chapter, you can apply collaborative techniques to most software development activities.

21.2 pair Programming

When pair programming, one programmer types in code at the keyboard and the other programmer watches for mistakes and thinks strategically about whether the code is being written correctly and whether the right code is being written. Pair programming was originally popularized by Extreme Programming (Beck 2000), but it is now being used more widely (Williams and Kessler 2002).

Key to Success with Pair Programming

The basic concept of pair programming is simple, but its use nonetheless benefits from a few guidelines:

Support pair programming with coding standards

Pair programming will not be effective if the two people in the pair spend their time arguing about coding style. Try to standardize what Chapter 5, "Design in Construction," refers to as the "accidental attributes" of programming so that the programmers can focus on the "essential" task at hand.

Don't let pair programming turn into watching

The person without the keyboard should be an active participant in the programming. That person is analyzing the code, thinking ahead to what will be coded next, evaluating the design, and planning how to test the code.

Don't force pair programming of the easy stuff

One group that used pair programming for the most complicated code found it more expedient to do detailed design at the whiteboard for 15 minutes and then to program solo (Manzo 2002). Most organizations that have tried pair programming eventually settle into using pairs for part of their work but not all of it (Boehm and Turner 2004).

Rotate pairs and work assignments regularly

In pair programming, as with other collaborative development practices, benefit arises from different programmers learning different parts of the system. Rotate pair assignments regularly to encourage cross-pollination-some experts recommend changing pairs as often as daily (Reifer 2002).

Encourage pairs to match each other's pace

One partner going too fast limits the benefit of having the other partner. The faster partner needs to slow down, or the pair should be broken up and reconfigured with different partners.

Make sure both partners can see the monitor

Even seemingly mundane issues like being able to see the monitor and using fonts that are too small can cause problems.

Don't force people who don't like each other to pair

Sometimes personality conflicts prevent people from pairing effectively. It's pointless to force people who don't get along to pair, so be sensitive to personality matches (Beck 2000, Reifer 2002).

Avoid pairing all newbies

Pair programming works best when at least one of the partners has paired before (Larman 2004).

Assign a team leader

If your whole team wants to do 100 percent of its programming in pairs, you'll still need to assign one person to coordinate work assignments, be held accountable for results, and act as the point of contact for people outside the project.

Benefits of Pair Programming

Pair programming produces numerous benefits:

It holds up better under stress than solo development. Pairs encourage each other to keep code quality high even when there's pressure to write quick and dirty code.
It improves code quality. The readability and understandability of the code tends to rise to the level of the best programmer on the team.
It shortens schedules. Pairs tend to write code faster and with fewer errors. The project team spends less time at the end of the project correcting defects.
It produces all the other general benefits of collaborative construction, including disseminating corporate culture, mentoring junior programmers, and fostering collective ownership.

21.4 Other Kinds of Collaborative Development Practices

Other kinds of collaboration haven't accumulated the body of empirical support that inspections or pair programming have, so they're covered in less depth here. The collaborations covered in this section includes walk-throughs, code reading, and dog-and-pony shows.

Walk-Throughs

A walk-through is a popular kind of review. The term is loosely defined, and at least some of its popularity can be attributed to the fact that people can call virtually any kind of review a "walk-through."

Because the term is so loosely defined, it's hard to say exactly what a walk-through is. Certainly, a walk-through involves two or more people discussing a design or code. It might be as informal as an impromptu bull session around a whiteboard; it might be as formal as a scheduled meeting with an overhead presentation prepared by the art department and a formal summary sent to management. In one sense, "where two or three are gathered together," there is a walk-through. Proponents of walk-throughs like the looseness of such a definition, so I'll just point out a few things that all walkthroughs have in common and leave the rest of the details to you:

The walk-through is usually hosted and moderated by the author of the design or code under review.
The walk-through focuses on technical issues-it's a working meeting.
All participants prepare for the walk-through by reading the design or code and looking for errors.
The walk-through is a chance for senior programmers to pass on experience and corporate culture to junior programmers. It's also a chance for junior programmers to present new methodologies and to challenge timeworn, possibly obsolete, assumptions.
A walk-through usually lasts 30 to 60 minutes.
The emphasis is on error detection, not correction.
Management doesn't attend.
The walk-through concept is flexible and can be adapted to the specific needs of the organization using it.

What Results Can You Expect from a Walk-Through?

Used intelligently and with discipline, a walk-through can produce results similar to those of an inspection-that is, it can typically find between 20 and 40 percent of the errors in a program (Myers 1979, Boehm 1987b, Yourdon 1989b, Jones 1996). But in general, walk-throughs have been found to be significantly less effective than inspections (Jones 1996).

Used unintelligently, walk-throughs are more trouble than they're worth. The low end of their effectiveness, 20 percent, isn't worth much, and at least one organization (Boeing Computer Services) found peer reviews of code to be "extremely expensive." Boeing found it was difficult to motivate project personnel to apply walk-through techniques consistently, and when project pressures increased, walk-throughs became nearly impossible (Glass 1982).

I've become more critical of walk-throughs during the past 10 years as a result of what I've seen in my company's consulting business. I've found that when people have bad experiences with technical reviews, it is nearly always with informal practices such as walk-throughs rather than with formal inspections. A review is basically a meeting, and meetings are expensive. If you're going to incur the overhead of holding a meeting, it's worthwhile to structure the meeting as a formal inspection. If the work product you're reviewing doesn't justify the overhead of a formal inspection, it doesn't justify the overhead of a meeting at all. In such a case you're better off using document reading or another less interactive approach.

Inspections seem to be more effective than walk-throughs at removing errors. So why would anyone choose to use walk-throughs?

If you have a large review group, a walk-through is a good review choice because it brings many diverse viewpoints to bear on the item under review. If everyone involved in the walk-through can be convinced that the solution is all right, it probably doesn't have any major flaws.

If reviewers from other organizations are involved, a walk-through might also be preferable. Roles in an inspection are more formalized and require some practice before people perform them effectively. Reviewers who haven't participated in inspections before are at a disadvantage. If you want to solicit their contributions, a walk-through might be the best choice.

Inspections are more focused than walk-throughs and generally pay off better. Consequently, if you're choosing a review standard for your organization, choose inspections first unless you have good reason not to.

Code Reding

Code reading is an alternative to inspections and walk-throughs. In code reading, you read source code and look for errors. You also comment on qualitative aspects of the code, such as its design, style, readability, maintainability, and efficiency.

A study at NASA's Software Engineering Laboratory found that code reading detected about 3.3 defects per hour of effort. Testing detected about 1.8 errors per hour (Card 1987). Code reading also found 20 to 60 percent more errors over the life of the project than the various kinds of testing did.

Like the idea of a walk-through, the concept of code reading is loosely defined. A code reading usually involves two or more people reading code independently and then meeting with the author of the code to discuss it. Here's how code reading goes:

In preparation for the meeting, the author of the code hands out source listings to the code readers. The listings are from 1000 to 10,000 lines of code; 4000 lines is typical.
Two or more people read the code. Use at least two people to encourage competition between the reviewers. If you use more than two, measure everyone's contribution so that you know how much the extra people contribute.
Reviewers read the code independently. Estimate a rate of about 1000 lines a day.
When the reviewers have finished reading the code, the code-reading meeting is hosted by the author of the code. The meeting lasts one or two hours and focuses on problems discovered by the code readers. No one makes any attempt to walk through the code line by line. The meeting is not even strictly necessary.
The author of the code fixes the problems identified by the reviewers.

The difference between code reading on the one hand and inspections and walkthroughs on the other is that code reading focuses more on individual review of the code than on the meeting. The result is that each reviewer's time is focused on finding problems in the code. Less time is spent in meetings in which each person contributes only part of the time and in which a substantial amount of the effort goes into moderating group dynamics. Less time is spent delaying meetings until each person in the group can meet for two hours. Code readings are especially valuable in situations in which reviewers are geographically dispersed.

A study of 13 reviews at AT&T found that the importance of the review meeting itself was overrated; 90 percent of the defects were found in preparation for the review meeting, and only about 10 percent were found during the review itself (Votta 1991, Glass 1999).

Dog-and-Pony Shows

Dog-and-pony shows are reviews in which a software product is demonstrated to a customer. Customer reviews are common in software developed for government contracts, which often stipulate that reviews will be held for requirements, design, and code. The purpose of a dog-and-pony show is to demonstrate to the customer that the project is OK, so it's a management review rather than a technical review.

Don't rely on dog-and-pony shows to improve the technical quality of your products. Preparing for them might have an indirect effect on technical quality, but usually more time is spent in making good-looking presentation slides than in improving the quality of the software. Rely on inspections, walk-throughs, or code reading for technical quality improvements.

Comparison of Collaborative Construction Techniques

What are the differences among the various kinds of collaborative construction? Table 21-1 provides a summary of each technique's major characteristics.

Table 21-1 Comparison of Collaborative Construction Techniques

Property	Pair Programming	Formal Inspection	Informal Review (Walk-Throughs)
Defined participant roles	Yes	Yes	No
Formal training in how to perform the roles	Maybe, through coaching	Yes	No
Who "drives" the collaboration	Person with the keyboard	Moderator	Author, usually
Focus of collaboration	Design, coding, testing, and defect correction	Defect detection only	Varies
Focused review effort-looks for the most frequently found kinds of errors	Informal, if at all	Yes	No
Follow-up to reduce bad fixes	Yes	Yes	No
Fewer future errors because of detailed error feedback to individual programmers	Incidental	Yes	Incidental
Improved process efficiency from analysis of results	No	Yes	No
Useful for nonconstruction activities	Possibly	Yes	Yes
Typical percentage of defects found	40-60%	45-70%	20-40%

Pair programming doesn't have decades of data supporting its effectiveness like formal inspection does, but the initial data suggests it's on roughly equal footing with inspections, and anecdotal reports have also been positive.

If pair programming and formal inspections produce similar results for quality, cost, and schedule, the choice between them becomes a matter of personal style rather than one of technical substance. Some people prefer to work solo, only occasionally breaking out of solo mode for inspection meetings. Others prefer to spend more of their time directly working with others. The choice between the two techniques can be driven by the work-style preference of a team's specific developers, and subgroups within the team might be allowed to choose which way they would like to do most of their work. You should also use different techniques with a project, as appropriate.

Relevant Standards

IEEE Std 1028-1997, Standard for Software Reviews
IEEE Std 730-2002, Standard for Software Quality Assurance Plans

Key Points

Collaborative development practices tend to find a higher percentage of defects than testing and to find them more efficiently.
Collaborative development practices tend to find different kinds of errors than testing does, implying that you need to use both reviews and testing to ensure the quality of your software.
Formal inspections use checklists, preparation, well-defined roles, and continual process improvement to maximize error-detection efficiency. They tend to find more defects than walk-throughs.
Pair programming typically costs about the same as inspections and produces similar quality code. Pair programming is especially valuable when schedule reduction is desired. Some developers prefer working in pairs to working solo.
Formal inspections can be used on work products such as requirements, designs, and test cases, as well as on code.
Walk-throughs and code reading are alternatives to inspections. Code reading offers more flexibility in using each person's time effectively.