Software Engineering for Students A Programming Approach



part so that the results are statistically significant? Is the problem that has been chosen typical, or is it a small “toy” problem from which it is unreasonable to extrapolate? Is there any difference between the motivation of the participants in the experiment and that of practitioners in a real situation? These questions are serious challenges to the validity of experiments and the significance of the results. The design of experiments must be examined carefully and the results used with caution.

While the problem of measuring and comparing productivity is fearsome, the story gets worse when we consider software quality. Again, what we desire is a statement like, “Method A gives rise to software that is 50% more reliable than method B.” Whereas with productivity we have a ready-made measure – person-months – how do we measure reliability? If we use the number of bugs as a measure, how can we actually count them? Again, do we count all the bugs equally, or are some worse than others? Such questions illustrate the difficulties. Similarly, if we want to quantify how well a method creates software that is easy to maintain, then ideally we need an objective measure or metric.

There are, of course, additional criteria for assessing and comparing methods (see Chapter 30 on project management). We might choose from amongst the following checklist:

■ training time for the people who will use the method
■ level of skill required by the people using the method
■ whether the software produced is easy to maintain
■ whether the software will meet performance targets
■ whether documentation is automatically produced
■ whether the method is enjoyable to use
■ whether the method can be used in the required area of application.

The outcomes of experiments that assess methods are not encouraging. For example, it is widely accepted in the computer industry that structured programming is the best approach.
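One way to make a checklist like the one above operational is a weighted scoring matrix: score each candidate method against each criterion and weight the criteria by importance. The weights, scores and method names in the sketch below are entirely invented for illustration; they are not real assessment data.

```python
# Hypothetical weighted scoring matrix for comparing development methods.
# All weights, scores and method names are invented for illustration.

criteria_weights = {
    "training time": 2,
    "skill required": 2,
    "maintainability": 5,
    "performance": 3,
    "documentation": 1,
}

# Scores out of 10 for each (imaginary) method against each criterion.
scores = {
    "Method A": {"training time": 7, "skill required": 6, "maintainability": 8,
                 "performance": 5, "documentation": 4},
    "Method B": {"training time": 5, "skill required": 4, "maintainability": 6,
                 "performance": 8, "documentation": 9},
}

def weighted_total(method_scores, weights):
    """Sum of score x weight over all criteria."""
    return sum(method_scores[c] * w for c, w in weights.items())

for method, s in scores.items():
    print(method, weighted_total(s, criteria_weights))
```

Such a matrix does not remove the measurement problems discussed above – the individual scores are still subjective judgments – but it does at least make the trade-offs between criteria explicit.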
But one review of the evidence (see the references below) concluded that it was inconclusive (because of problems with the design of experiments). Similarly, there seems to be very limited evidence that object-oriented methods are better than older methods.

Clearly there are many problems to be solved in assessing methods, but equally clearly developers need hard evidence to use in choosing between methods. We can expect that much attention will be given to the evaluation of tools and methods, and it is, in fact, an active area of current research. This research centers on the design of experiments and the invention of useful metrics.

31.3 ● Case study – assessing verification techniques

We now discuss the results of one of the few small-scale experiments that have been conducted to assess methods. This particular study assessed verification techniques, in particular black box testing, white box testing and walkthroughs. Black box and white box testing techniques are explained in Chapter 19. Structured walkthroughs are explained in Chapter 20 on groups.
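As a reminder of the distinction, the two testing techniques can be sketched on a small invented function: black box tests are derived from the specification alone, while white box tests are chosen by reading the code so that every branch is exercised. The function and its marks boundaries are purely illustrative.

```python
def classify(mark):
    """Return 'fail', 'pass' or 'distinction' for a mark in 0..100.
    (Invented example for illustration only.)"""
    if mark < 0 or mark > 100:
        raise ValueError("mark out of range")
    if mark < 40:
        return "fail"
    if mark < 70:
        return "pass"
    return "distinction"

# Black box tests: chosen from the specification alone, at the
# boundaries of each range, without looking at the code.
assert classify(0) == "fail"
assert classify(39) == "fail"
assert classify(40) == "pass"
assert classify(69) == "pass"
assert classify(70) == "distinction"
assert classify(100) == "distinction"

# White box tests: chosen by reading the code, so that every branch -
# including the error branch - is executed at least once.
for bad in (-1, 101):
    try:
        classify(bad)
        raise AssertionError("expected ValueError")
    except ValueError:
        pass
```

Note that the two sets of tests overlap but are not identical: the error branch is obvious from the code, while the range boundaries come straight from the specification.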

388 Chapter 31 ■ Assessing methods

In the experiment, 59 people were asked to test a 63-line PL/1 program. The people were workers in the computer industry, most of whom were programmers, with an average of 11 years’ experience in computing. They were told of a suspicion that the program was not perfect and asked to test the program until they felt that they had found all the errors (if any). An error meant a discrepancy between the program and the specification. The people were provided with the program specification, the program listing, a computer to run the program on and as much time as they wanted. Different groups used different verification methods.

While the people were experienced and the program was quite small, their performance was surprisingly bad. The mean number of bugs found was 5.7. The most errors any individual found was 9. The least any person found was 3. The actual number of bugs was 15. There were 4 bugs that no one found. The overwhelming conclusion must be that people are not very effective at carrying out verification, whichever technique they use.

Additional findings from this study were that the people were not careful enough in comparing the actual output from the program with the expected outcome. Bugs that were actually revealed were missed in this way. Also, the people spent too long testing the normal conditions that the program had to encounter, rather than testing special cases and invalid input situations.

The evidence from this and other experiments suggests that inspections are a very effective way of finding errors. In fact, inspections are at least as good a way of identifying bugs as actually running the program (doing testing). So, if you had to choose one method for verification, it would have to be inspection. Studies show that black box testing and white box testing are roughly equally effective. However, evidence suggests that the different verification techniques tend to discover different errors.
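The finding that testers missed bugs whose symptoms were actually in front of them suggests recording expected outputs explicitly and letting the machine do the comparison, while deliberately including special cases and invalid inputs among the test data. A minimal sketch of this style, using an invented function and invented test data:

```python
# Sketch: automate the actual-vs-expected comparison that the study's
# participants did carelessly by eye. The function and data are invented.

def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    if not values:
        raise ValueError("mean of empty list is undefined")
    return sum(values) / len(values)

# Each case pairs an input with the output the specification demands,
# deliberately including special cases, not just "normal" inputs.
cases = [
    ([2, 4, 6], 4.0),   # normal case
    ([5], 5.0),         # boundary: single element
    ([-1, 1], 0.0),     # special case: negative values, zero result
]

for inputs, expected in cases:
    actual = mean(inputs)
    # The machine, not the tester's eye, decides pass or fail.
    assert actual == expected, f"mean({inputs}): expected {expected}, got {actual}"

# Invalid input: the empty list must be rejected, not silently mishandled.
try:
    mean([])
    raise AssertionError("expected ValueError for empty list")
except ValueError:
    pass
```

Recorded expected values guard against exactly the failure the study observed: a discrepancy that is displayed but overlooked.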
Therefore the more techniques that are employed the better – provided that there is adequate time and money. So black box testing, white box testing and inspection all have a role to play. If there is sufficient time and effort available, the best strategy is to use all three methods.

The conclusion is that small-scale experiments can give useful insights into the effectiveness of software development techniques.

Incidentally, there is another technique for carrying out verification, which has not been assessed against the above techniques. Formal verification is very appealing because of its potential for rigorously verifying a program’s correctness beyond all possible doubt. However, it must be remembered that formal methods are carried out by fallible human beings who make mistakes.

31.4 ● The current state of methods

The methods, tools and approaches discussed in this book are well-established and widely used. There is a diversity of approaches to software development. This, of course, reflects the infancy of the field. But it is also part of the joy of software engineering; it would be boring if there were simply one process model, one design method, one

