Can Reuse Improve Reliability?
Giancarlo Succi
ENEL - University of Calgary
Luigi Benedicenti
University of Regina
Andrea Valerio, Tullio Vernazza
DIST - University of Genova
Software reuse has great potential
for improving the quality of a software artifact and for reducing
its development time. However, this potential has seldom
been assessed in terms of the reliability of the software produced.
We present an account of an experiment that investigated
the importance of reuse in improving the reliability of a software
product. Other reports exist, such as [Agr 92], but they do not
explain reliability in terms of software reuse.
The experiment took place in a
medium-sized Italian software company that produces business
software. We analyzed two development teams working concurrently
on two similar projects: both are parts of an accounting
system, both are written in RPG, both are roughly
100,000 lines of code in size, and both took about 12 months to complete.
The two development teams were roughly the same size and had
similar skill levels.
The first team worked without
a corporate reuse policy, while the second team developed
a domain library of reusable artifacts and employed it in
the development of the product. We designed the experiment according
to [Con 88]. The experiment design
is a single control-group random assignment. The treatment is
the adoption of a reuse library. The control group was not treated,
while the experimental group was. Particular care was taken
to ensure that no bias would affect the experiment's results.
This involved the "blind" assignment of people to groups
(the team leaders were not aware of the experiment) and the collection
of data by automatic inspection systems in order to minimize
the chance of error. After development was over, data analysts
joined the resulting databases using statistical software
packages.
The data analysis and the choice of measures followed [Fen 96]. The study has two variables
of interest:
- Reliability. The reliability
data in this experiment were obtained as the number of malfunctions
reported during customer tests (Customer Complaint Density, CCD). This
avoided the difficulty of thoroughly testing a business system without
requiring full coverage. The variable we measured
is the number of customer complaints relating to each source
file divided by the number of lines of code in that file. The
variable is on a ratio scale, since it is a ratio of
two counts. Note that a value of 0 indicates perfect reliability,
whereas there is no upper limit to the unreliability of a system (theoretically,
the number of malfunctions could be infinite). We did not apply
any transformation to the variable, so that it remains normalized by
the denominator.
- Reuse Level. For this
experiment we considered reuse as a binary variable, 1 meaning
that a reuse technique was employed and 0 meaning it was not.
This separated the cases in the first group from the cases in
the second group.
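The two variables above can be sketched in a few lines of Python. This is only an illustration of the definitions: the record layout, the names SourceFile, ccd, and split_by_reuse, and the sample records are hypothetical and are not taken from the experiment's databases.

```python
from dataclasses import dataclass


@dataclass
class SourceFile:
    name: str            # hypothetical file identifier
    complaints: int      # customer complaints traced to this file
    lines_of_code: int   # size of the file in lines of code
    reuse: int           # 1 if a reuse technique was employed, 0 otherwise


def ccd(f: SourceFile) -> float:
    """Customer Complaint Density: complaints per line of code."""
    if f.lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return f.complaints / f.lines_of_code


def split_by_reuse(files):
    """Return (control, treated) lists of CCD values."""
    control = [ccd(f) for f in files if f.reuse == 0]
    treated = [ccd(f) for f in files if f.reuse == 1]
    return control, treated


# Made-up illustrative records, not data from the experiment.
files = [
    SourceFile("A.rpg", complaints=4, lines_of_code=800, reuse=0),
    SourceFile("B.rpg", complaints=0, lines_of_code=500, reuse=0),
    SourceFile("C.rpg", complaints=1, lines_of_code=1000, reuse=1),
]
control, treated = split_by_reuse(files)
```

Each source file contributes one CCD value, and the binary reuse variable only decides which of the two samples that value belongs to.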
Preliminary data analysis indicates a difference between the
two means: the adoption of reuse nearly halves
the normalized number of errors per line of code. Figure 1 shows
the means of the two groups.

Figure 1 - Descriptives for CCD for the two groups
The technique we applied to obtain the result was an independent
samples t-test. Given the large number of samples we obtained
(137 and 156), the t-test is relatively robust and is therefore
not strongly affected by departures from normality in the distribution
of the samples.
The Levene test was employed to test the equality of the variances
of the two groups. The variances appear different, so
we used the version of the t-test corrected for unequal
variances, which yielded a p-value of 0.033 for the difference
between the two means. The difference is therefore statistically significant.
The results of the t-test are shown
in Figure 2.
Figure 2 - Independent samples t-test result
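A t-test corrected for unequal variances is commonly computed as Welch's t-test. The following is a minimal sketch under that assumption, not the statistical package used in the study. It does not implement the Levene test, the function name welch_t_test and the sample values are hypothetical, and the p-value comes from a normal approximation, which is only reasonable for large samples such as the ones reported here.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t_test(a, b):
    """Welch's unequal-variance t-test for two independent samples.

    Returns (t, df, p_approx). p_approx is a two-sided p-value from the
    normal approximation, an assumption of this sketch that is only
    reasonable when df is large.
    """
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)   # sample variances (n - 1 denominator)
    sa, sb = va / na, vb / nb           # squared standard errors of the means
    t = (mean(a) - mean(b)) / math.sqrt(sa + sb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    p_approx = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, df, p_approx


# Made-up illustrative CCD values, not data from the experiment.
no_reuse = [0.004, 0.010, 0.000, 0.007, 0.012, 0.003]
reuse = [0.002, 0.003, 0.000, 0.004, 0.001, 0.002]
t, df, p = welch_t_test(no_reuse, reuse)
```

With samples as small as the illustrative ones, the exact t distribution with df degrees of freedom should be used for the p-value instead of the normal approximation.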
The effect size for this test is
d = 0.03 and the eta squared is 0.016. This indicates
that the difference between the means, although significant, is small
compared with the variance of the two samples, and that there
is not a perfect replication of the values in the two samples.
This was to be expected, since the variances of the two samples
differ. The situation is depicted in Figure 3.
Figure 3 - Error bars for the two samples
The error bars represent two standard deviations from the
mean for each sample.
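Effect sizes of this kind are usually computed from the two samples directly. The sketch below uses common textbook definitions, Cohen's d with a pooled standard deviation and eta squared as the between-group share of the total sum of squares. Whether the study used exactly these formulas is an assumption, and the function names are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d: difference of means divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def eta_squared(a, b):
    """Eta squared: between-group sum of squares over total sum of squares."""
    values = list(a) + list(b)
    grand = mean(values)
    ss_between = (len(a) * (mean(a) - grand) ** 2
                  + len(b) * (mean(b) - grand) ** 2)
    ss_total = sum((x - grand) ** 2 for x in values)
    return ss_between / ss_total
```

Both measures relate the difference between the group means to the spread of the data, which is why a statistically significant difference can still correspond to a small effect.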
The findings in this experiment show that the adoption of
a reuse policy in an RPG business software development environment
can significantly improve the reliability of the software, as
perceived by the customer.
References
[Agr 92] Agresti, W.W., and W.M.
Evanco. "Projecting Software Defects from Analyzing Ada
Design." IEEE Transactions on Software Engineering,
18(11) (Nov. 1992), 988-997.
[Con 88] Conte, S.D., H.E. Dunsmore,
and V.Y. Shen. Software Engineering Metrics and Models. Benjamin
Cummings, 1988.
[Fen 96] Fenton, N., and S.L. Pfleeger.
Software Metrics: A Rigorous and Practical Approach. International
Thomson Computer Press, 1996.
Author contact: Giancarlo Succi, University
of Calgary, 2500 University Drive NW, Calgary, AB, Canada, Tel
1-403-585-8347, E-Mail: Giancarlo.Succi@enel.ucalgary.ca.