
12/9/2020 How Many Eyeballs Tame Complexity


How Many Eyeballs Tame Complexity


It's one thing to observe in the large that the bazaar style greatly accelerates debugging and code evolution.
It's another to understand exactly how and why it does so at the micro-level of day-to-day developer and
tester behavior. In this section (written three years after the original paper, using insights by developers who
read it and re-examined their own behavior) we'll take a hard look at the actual mechanisms. Non-technically
inclined readers can safely skip to the next section.

One key to understanding is to realize exactly why it is that the kind of bug report non source-aware users
normally turn in tends not to be very useful. Non source-aware users tend to report only surface symptoms;
they take their environment for granted, so they (a) omit critical background data, and (b) seldom include a
reliable recipe for reproducing the bug.

The underlying problem here is a mismatch between the tester's and the developer's mental models of the
program; the tester, on the outside looking in, and the developer on the inside looking out. In closed-source
development they're both stuck in these roles, and tend to talk past each other and find each other deeply
frustrating.

Open-source development breaks this bind, making it far easier for tester and developer to develop a shared
representation grounded in the actual source code and to communicate effectively about it. Practically, there
is a huge difference in leverage for the developer between the kind of bug report that just reports
externally-visible symptoms and the kind that hooks directly to the developer's source-code based mental
representation of the program.

Most bugs, most of the time, are easily nailed given even an incomplete but suggestive characterization of
their error conditions at source-code level. When someone among your beta-testers can point out, "there's a
boundary problem in line nnn", or even just "under conditions X, Y, and Z, this variable rolls over", a quick
look at the offending code often suffices to pin down the exact mode of failure and generate a fix.

Thus, source-code awareness by both parties greatly enhances both good communication and the synergy
between what a beta-tester reports and what the core developer(s) know. In turn, this means that the core
developers' time tends to be well conserved, even with many collaborators.

Another characteristic of the open-source method that conserves developer time is the communication
structure of typical open-source projects. Above I used the term "core developer"; this reflects a distinction
between the project core (typically quite small; a single core developer is common, and one to three is
typical) and the project halo of beta-testers and available contributors (which often numbers in the hundreds).

The fundamental problem that traditional software-development organization addresses is Brooks's Law:
"Adding more programmers to a late project makes it later." More generally, Brooks's Law predicts that the
complexity and communication costs of a project rise with the square of the number of developers, while
work done only rises linearly.

Brooks's Law is founded on experience that bugs tend strongly to cluster at the interfaces between code
written by different people, and that communications/coordination overhead on a project tends to rise with
the number of interfaces between human beings. Thus, problems scale with the number of communications
paths between developers, which scales as the square of the number of developers (more precisely, according
to the formula N*(N - 1)/2 where N is the number of developers).

The Brooks's Law analysis (and the resulting fear of large numbers in development groups) rests on a hidden
assumption: that the communications structure of the project is necessarily a complete graph, that
everybody talks to everybody else. But on open-source projects, the halo developers work on what are in
effect separable parallel subtasks and interact with each other very little; code changes and bug reports stream
through the core group, and only within that small core group do we pay the full Brooksian overhead. [SU]
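
To make the difference concrete, here is a minimal Python sketch that counts communication paths under the two structures. The complete-graph count is the N*(N - 1)/2 formula given above. The core/halo count is a simplifying assumption of this sketch, not a formula from the essay: the core is fully connected, and each halo contributor has exactly one channel, to the core as a whole. The function names and the sample project sizes are hypothetical.

```python
def complete_graph_paths(n: int) -> int:
    """Communication paths when everybody talks to everybody: N*(N-1)/2."""
    return n * (n - 1) // 2


def core_halo_paths(core: int, halo: int) -> int:
    """Paths when only the core is fully connected.

    Assumption of this sketch: each halo contributor has a single channel
    to the core group and none to other halo contributors.
    """
    return complete_graph_paths(core) + halo


if __name__ == "__main__":
    # Hypothetical project sizes: (core developers, halo contributors).
    for core, halo in [(1, 10), (3, 100), (3, 300)]:
        n = core + halo
        print(
            f"{n:>4} people: complete graph = {complete_graph_paths(n):>6}, "
            f"core/halo = {core_halo_paths(core, halo):>4}"
        )
```

Under these assumptions the complete-graph count grows quadratically with project size, while the core/halo count grows only linearly with the halo; the quadratic term is confined to the small core.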

There are still more reasons that source-code level bug reporting tends to be very efficient. They center
around the fact that a single error can often have multiple possible symptoms, manifesting differently
depending on details of the user's usage pattern and environment. Such errors tend to be exactly the sort of
complex and subtle bugs (such as dynamic-memory-management errors or nondeterministic
interrupt-window artifacts) that are hardest to reproduce at will or to pin down by static analysis, and which
do the most to create long-term problems in software.

A tester who sends in a tentative source-code level characterization of such a multi-symptom bug (e.g. "It
looks to me like there's a window in the signal handling near line 1250" or "Where are you zeroing that
buffer?") may give a developer, otherwise too close to the code to see it, the critical clue to a half-dozen
disparate symptoms. In cases like this, it may be hard or even impossible to know which externally-visible
misbehaviour was caused by precisely which bug, but with frequent releases, it's unnecessary to know.
Other collaborators will be likely to find out quickly whether their bug has been fixed or not. In many cases,
source-level bug reports will cause misbehaviours to drop out without ever having been attributed to any
specific fix.

Complex multi-symptom errors also tend to have multiple trace paths from surface symptoms back to the
actual bug. Which of the trace paths a given developer or tester can chase may depend on subtleties of that
person's environment, and may well change in a not obviously deterministic way over time. In effect, each
developer and tester samples a semi-random set of the program's state space when looking for the etiology of
a symptom. The more subtle and complex the bug, the less likely that skill will be able to guarantee the
relevance of that sample.

For simple and easily reproducible bugs, then, the accent will be on the "semi" rather than the "random";
debugging skill and intimacy with the code and its architecture will matter a lot. But for complex bugs, the
accent will be on the "random". Under these circumstances many people running traces will be much more
effective than a few people running traces sequentially, even if the few have a much higher average skill
level.

This effect will be greatly amplified if the difficulty of following trace paths from different surface symptoms
back to a bug varies significantly in a way that can't be predicted by looking at the symptoms. A single
developer sampling those paths sequentially will be as likely to pick a difficult trace path on the first try as an
easy one. On the other hand, suppose many people are trying trace paths in parallel while doing rapid
releases. Then it is likely one of them will find the easiest path immediately, and nail the bug in a much
shorter time. The project maintainer will see that, ship a new release, and the other people running traces on
the same bug will be able to stop before having spent too much time on their more difficult traces [RJ].
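
The argument in the last paragraph can be illustrated with a small Python calculation. It is a toy model under stated assumptions, not a measurement: each trace path has a fixed cost in hours (the numbers below are invented), a tracer cannot tell the costs apart in advance and so picks a path uniformly at random, parallel tracers pick distinct paths, and everyone stops as soon as the first tracer nails the bug. The function name is hypothetical.

```python
from itertools import combinations


def expected_time_to_first_fix(path_costs, tracers):
    """Expected hours until the first of `tracers` parallel tracers finishes.

    Each tracer follows a different trace path chosen at random, so every
    set of `tracers` distinct paths is equally likely. The bug is nailed
    when the cheapest path in the chosen set has been followed to the end.
    """
    if not 1 <= tracers <= len(path_costs):
        raise ValueError("tracers must be between 1 and the number of paths")
    # Cheapest path in each equally likely assignment of distinct paths.
    fastest = [min(chosen) for chosen in combinations(path_costs, tracers)]
    return sum(fastest) / len(fastest)


if __name__ == "__main__":
    # Invented costs (hours) of five trace paths leading to the same bug.
    costs = [1, 4, 8, 16, 40]
    for tracers in (1, 3, 5):
        hours = expected_time_to_first_fix(costs, tracers)
        print(f"{tracers} tracer(s): expected {hours:.1f} hours to a fix")
```

With these invented costs, a lone tracer expects to spend 13.8 hours (the plain average), three parallel tracers expect 2.6 hours, and five tracers always hit the one-hour path. The wider the spread in path costs, the larger the gap, which is the amplification the paragraph describes.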


catb.org/esr/writings/cathedral-bazaar/cathedral-bazaar/ar01s05.html
