Software Engineering for Students: A Programming Approach
part so that the results are statistically significant? Is the problem that has been chosen typical, or is it a small “toy” problem from which it is unreasonable to extrapolate? Is there any difference between the motivation of the participants in the experiment and that of practitioners in a real situation? These questions are serious challenges to the validity of experiments and the significance of the results. The design of experiments must be examined carefully and the results used with caution.

While the problem of measuring and comparing productivity is fearsome, the story gets worse when we consider software quality. Again, what we desire is a statement like, “Method A gives rise to software that is 50% more reliable than method B.” Whereas with productivity we have a ready-made measure – person months – how do we measure reliability? If we use the number of bugs as a measure, how can we actually count them? Again, do we count all the bugs equally, or are some worse than others? Such questions illustrate the difficulties. Similarly, if we want to quantify how well a method creates software that is easy to maintain, then ideally we need an objective measure, or metric.

There are, of course, additional criteria for assessing and comparing methods (see Chapter 30 on project management). We might choose from amongst the following checklist:

■ training time for the people who will use the method
■ level of skill required by the people using the method
■ whether the software produced is easy to maintain
■ whether the software will meet performance targets
■ whether documentation is automatically produced
■ whether the method is enjoyable to use
■ whether the method can be used in the required area of application.

The outcomes of experiments that assess methods are not encouraging. For example, it is widely accepted in the computer industry that structured programming is the best approach.
But one review of the evidence (see the references below) concluded that the evidence was inconclusive (because of problems with the design of the experiments). Similarly, there seems to be very limited evidence that object-oriented methods are better than older methods. Clearly there are many problems to be solved in assessing methods, but equally clearly developers need hard evidence to use in choosing between methods. We can expect that much attention will be given to the evaluation of tools and methods; it is, in fact, an active area of current research. This research centers on the design of experiments and the invention of useful metrics.

31.3 ● Case study – assessing verification techniques

We now discuss the results of one of the few small-scale experiments that have been conducted to assess methods. This particular study assessed verification techniques – in particular, black box testing, white box testing and walkthroughs. Black box and white box testing techniques are explained in Chapter 19. Structured walkthroughs are explained in Chapter 20 on groups.
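As a reminder of the distinction between the two testing styles, here is a minimal sketch; the routine and its specification are invented for illustration and are not taken from the study:

```python
# Hypothetical example: a tiny routine with a specification.
# Specification: classify(n) returns "negative" for n < 0,
# "zero" for n == 0 and "positive" for n > 0.

def classify(n):
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"

# Black box tests are derived from the specification alone: one case
# per partition of the input, plus the boundaries around zero.
assert classify(-5) == "negative"
assert classify(0) == "zero"
assert classify(5) == "positive"
assert classify(-1) == "negative"   # boundary value just below zero
assert classify(1) == "positive"    # boundary value just above zero

# White box tests are derived from the program listing instead: cases
# are chosen so that every branch is executed at least once.
assert classify(-1) == "negative"   # exercises the first branch
assert classify(0) == "zero"        # exercises the second branch
assert classify(1) == "positive"    # exercises the third branch
```

Note that the two styles arrive at similar cases here only because the routine is trivial; for realistic programs the specification and the listing suggest quite different test cases.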
In the experiment, 59 people were asked to test a 63-line PL/1 program. The people were workers in the computer industry, most of whom were programmers, with an average of 11 years’ experience in computing. They were told of a suspicion that the program was not perfect and asked to test the program until they felt that they had found all the errors (if any). An error meant a discrepancy between the program and the specification. The people were provided with the program specification, the program listing, a computer to run the program on and as much time as they wanted. Different groups used different verification methods.

Although the people were experienced and the program was quite small, their performance was surprisingly bad. The mean number of bugs found was 5.7. The most errors any individual found was 9; the least was 3. The actual number of bugs was 15, and there were 4 bugs that no one found. The overwhelming conclusion must be that people are not very effective at carrying out verification, whichever technique they use.

Additional findings from this study were that the people were not careful enough in comparing the actual output from the program with the expected output; bugs that the tests had actually revealed were missed in this way. Also, the people spent too long testing the normal conditions that the program had to handle, rather than testing special cases and invalid input situations.

The evidence from this and other experiments suggests that inspections are a very effective way of finding errors. In fact, inspections are at least as good a way of identifying bugs as actually running the program (doing testing). So, if you had to choose one method for verification, it would have to be inspection. Studies show that black box testing and white box testing are roughly equally effective. However, evidence suggests that the different verification techniques tend to discover different errors.
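The point that different techniques expose different faults can be sketched with a small hypothetical routine (again invented for illustration, not drawn from the study):

```python
# Hypothetical buggy routine. Specification: percentage(part, whole)
# returns 100 * part / whole; callers must supply a positive whole.

def percentage(part, whole):
    if part == whole:
        return 100    # needless special case; it also fires when
                      # part == whole == 0, masking invalid input
    return 100 * part / whole

# A white box tester, reading the listing, notices the special-case
# branch and exercises it, exposing the suspicious result at (0, 0):
assert percentage(0, 0) == 100   # invalid input silently "succeeds"

# A black box tester, working only from the specification, probes
# invalid input directly and finds the missing check a different way:
try:
    percentage(50, 0)            # whole == 0 violates the specification
except ZeroDivisionError:
    print("invalid input surfaces as a crash, not a clear rejection")
```

Each style of test case selection uncovers a symptom of the missing validation that the other is liable to miss, which is why the techniques complement one another.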
Therefore the more techniques that are employed, the better – provided that there is adequate time and money. So black box testing, white box testing and inspection all have a role to play. If there is sufficient time and effort available, the best strategy is to use all three methods.

The conclusion is that small-scale experiments can give useful insights into the effectiveness of software development techniques.

Incidentally, there is another technique for carrying out verification which has not been assessed against the above techniques. Formal verification is very appealing because of its potential for rigorously verifying a program’s correctness beyond all possible doubt. However, it must be remembered that formal methods are carried out by fallible human beings who make mistakes.

31.4 ● The current state of methods

The methods, tools and approaches discussed in this book are well established and widely used. There is a diversity of approaches to software development. This, of course, reflects the infancy of the field. But it is also part of the joy of software engineering; it would be boring if there were simply one process model, one design method, one