Selecting a Cost-Effective Test Case Prioritization Technique

Sebastian Elbaum, Gregg Rothermel, Satya Kanduri, Alexey G. Malishevsky

April 20, 2004

Abstract

Regression testing is an expensive testing process used to validate modified software and detect whether new faults have been introduced into previously tested code. To reduce the cost of regression testing, software testers may prioritize their test cases so that those which are more important, by some measure, are run earlier in the regression testing process. One goal of prioritization is to increase a test suite's rate of fault detection. Previous empirical studies have shown that several prioritization techniques can significantly improve rate of fault detection, but these studies have also shown that the effectiveness of these techniques varies considerably across various attributes of the program, test suites, and modifications being considered.




This variation makes it difficult for a practitioner to choose an appropriate prioritization technique for a given testing scenario. To address this problem, we analyze the fault detection rates that result from applying several different prioritization techniques to several programs and modified versions. The results of our analyses provide insights into which types of prioritization techniques are and are not appropriate under specific testing scenarios, and the conditions under which they are or are not appropriate. Our analysis approach can also be used by other researchers or practitioners to determine the prioritization techniques appropriate to other testing scenarios.

Keywords: test case prioritization, regression testing, empirical studies

1 Introduction

As software evolves, test engineers regression test it to validate new features and detect whether new faults have been introduced into previously tested code.

Regression testing is important, but also expensive, so many approaches for improving its cost-effectiveness have been investigated. Among these approaches we find test case prioritization. Test case prioritization techniques help engineers execute regression tests in an order that achieves testing objectives earlier in the testing process. One testing objective involves rate of fault detection: a measure of how quickly a test order detects faults. An improved rate of fault detection can provide earlier feedback on the system under test, enable earlier debugging, and increase the likelihood that, if testing is prematurely halted, those test cases that offer the greatest fault detection ability in the available testing time will have been executed.

Many prioritization techniques have been described in the research literature (Elbaum et al., 2001b, 2002; Jones and Harrold, 2001; Rothermel et al., 2001; Wong et al., 1997). Studies (Elbaum et al., 2002; Rothermel et al., 1999, 2001) have shown that at least some of these techniques can significantly increase the rate of fault detection of test suites in comparison to the rates achieved by unordered or randomly ordered test suites. More recently, researchers at Microsoft (Srivastava and Thiagarajan, 2002) have applied prioritization to test suites for several multi-million line software systems, and found it highly efficient even on such large systems.

These early indications of potential are encouraging; however, studies have also shown that the rates of fault detection produced by prioritization techniques can vary significantly with several factors related to program attributes, change attributes, and test suite characteristics (Elbaum et al., 2001a, 2003). In several instances, techniques have not performed as expected. For example, one might expect that techniques that take into account the location of code changes would outperform techniques that simply consider test coverage without taking changes into account, and this expectation is implicit in the technique implemented at Microsoft (Srivastava and Thiagarajan, 2002). In our empirical studies (Elbaum et al., 2002), however, we have often observed results contrary to this expectation. It is possible that engineers choosing to prioritize for both coverage and change attributes may actually achieve poorer rates of fault detection than if they prioritized just for coverage, or did not prioritize at all.

More generally, to use prioritization cost-effectively, practitioners must be able to assess which prioritization techniques are likely to be most effective in their particular testing scenarios, that is, given their particular programs, test cases, and modifications.
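The contrast drawn above between change-aware and purely coverage-based techniques can be made concrete. The following is a minimal sketch of a "total coverage" style prioritization, which orders tests by how many code units each covers while ignoring where changes occurred; the test names and coverage sets are invented for illustration, not taken from the study.

```python
# Sketch of a coverage-based ("total coverage") prioritization:
# run tests covering the most code units first, ignoring change data.
# Test names and per-test statement-id sets below are hypothetical.

def total_coverage_order(coverage):
    """Order tests by descending coverage size; break ties by name."""
    return sorted(coverage, key=lambda t: (-len(coverage[t]), t))

coverage = {
    "t1": {1, 2, 3},
    "t2": {2, 3, 4, 5, 6},
    "t3": {7},
}
print(total_coverage_order(coverage))  # ['t2', 't1', 't3']
```

A change-aware variant would instead weight each test by the modified code units it covers; as the paragraph above notes, the empirical results do not always favor that refinement.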

Toward this end, we might seek an algorithm which, given various metrics about programs, modifications, and test suites, calculates and recommends the technique most likely to succeed. The factors affecting prioritization success are, however, complex, and interact in complex ways (Elbaum et al., 2001a). We do not possess sufficient empirical data to allow creation of such a general prediction algorithm, and the complexities of gathering such data are such that it may be years before it can be available. Moreover, even if we possessed a general prediction algorithm capable of distinguishing between existing prioritization techniques, such an algorithm might not extend to additional techniques that may be created.

In this paper, therefore, we pursue an alternative approach. Using data obtained from the application of several prioritization techniques to several substantial programs, we compare the performance of these prioritization techniques in terms of effectiveness, and show how the results of this comparison can be used, together with cost-benefit threshold information, to select a technique that is most likely to be cost-effective. We then show how an analysis strategy based on classification trees can be incorporated into this approach and used to improve the likelihood of selecting the most cost-effective technique.

Our results provide insight into the tradeoffs between techniques, and the conditions underlying those tradeoffs, relative to the programs, test suites, and modified programs that we examine.
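To illustrate the shape of a classification-tree-based selection strategy, the sketch below hand-codes the kind of if/then rules a fitted tree yields: measure scenario attributes, then walk threshold tests to a recommended technique. The attribute names, thresholds, and technique labels here are all invented for illustration; they are not the rules derived in this study.

```python
# Hypothetical decision rules of the kind a fitted classification tree
# might encode for technique selection. Attributes, thresholds, and
# technique names are invented, not results from the paper.

def select_technique(scenario):
    """Walk hand-written tree rules over measured scenario attributes."""
    if scenario["change_density"] > 0.3:        # heavily modified version?
        if scenario["suite_granularity"] == "fine":
            return "additional-coverage"
        return "total-coverage"
    return "random"

print(select_technique({"change_density": 0.5,
                        "suite_granularity": "fine"}))  # additional-coverage
```

In practice such a tree would be learned from measured fault-detection data rather than written by hand; the point is only that its output is an easily applied rule set.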

If these results generalize to other workloads, they could guide the informed selection of techniques by practitioners. More generally, however, the analysis strategy we use demonstrably improves the prioritization technique selection process, and can be used by researchers or practitioners to evaluate techniques in a manner appropriate to their own testing processes.

The rest of this article is organized as follows. Section 2 describes the test case prioritization problem in greater detail, presents a measure for assessing rate of fault detection and techniques for prioritizing test cases, and summarizes related work. Section 3 presents the details of our study, our results, and our approaches for selecting appropriate techniques. Section 4 presents further discussion of our results.

2 Test Case Prioritization

Rothermel et al.

(Rothermel et al., 2001) define the test case prioritization problem and describe several issues relevant to its solution; this section reviews the portions of that material that are necessary to understand this article.

The test case prioritization problem is defined as follows:

The Test Case Prioritization Problem:
Given: T, a test suite; PT, the set of permutations of T; and f, a function from PT to the real numbers.
Problem: Find T′ ∈ PT such that (∀T″)(T″ ∈ PT)(T″ ≠ T′)[f(T′) ≥ f(T″)].

Here, PT represents the set of all possible prioritizations (orderings) of T, and f is a function that, applied to any such ordering, yields an award value for that ordering.

There are many possible goals for prioritization. In this article, we focus on increasing the likelihood of revealing faults earlier in the testing process. This goal can be described, informally, as one of improving a test suite's rate of fault detection.
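Taken literally, the definition above could be solved by exhaustive search over PT, which is feasible only for tiny suites (|PT| = n! for n tests) but makes the roles of PT and f concrete. The test names, fault data, and award function f below are toy inventions for illustration.

```python
from itertools import permutations

# Which faults each (hypothetical) test reveals.
detects = {"a": set(), "b": {1}, "c": {1, 2}}

def award(order):
    """A toy award function f: faults revealed by the first test run."""
    return len(detects[order[0]])

def best_order(tests):
    """Exhaustively search PT for an ordering T' maximizing f(T')."""
    return max(permutations(tests), key=award)

print(best_order(("a", "b", "c")))  # ('c', 'a', 'b')
```

Practical prioritization techniques avoid this factorial search by using heuristics (coverage, change information, and so on) that approximate a good ordering directly.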

To quantify this goal, Rothermel et al. (Rothermel et al., 2001) introduced a metric, APFD, which measures the weighted average of the percentage of faults detected over the life of the suite. APFD values range from 0 to 100; higher numbers imply faster (better) fault detection rates.

Let T be a test suite containing n test cases, and let F be a set of m faults revealed by T. Let TFi be the first test case in ordering T′ of T which reveals fault i. The APFD for test suite T′ is given by the equation:

APFD = 1 − (TF1 + TF2 + … + TFm)/(nm) + 1/(2n)    (1)

For example, consider a program with a test suite of five test cases, A through E, such that the program contains eight faults detected by those test cases, as shown by the table in Figure 1.A. Consider two orders of these test cases, order T1: A–B–C–D–E, and order T2: C–E–B–A–D.

[Figure 1: Example illustrating the APFD measure. A: test suite and faults exposed. B: APFD for prioritized test suite T1 (area = 50%). C: APFD for prioritized test suite T2 (area = 84%).]

Figures 1.B and 1.C show the percentage of faults detected versus the fraction of the test suite used, for these two orders. The area inside the inscribed rectangles (dashed boxes) represents the weighted percentage of faults detected over the corresponding fraction of the test suite. The solid lines connecting the corners of the inscribed rectangles interpolate the gain in the percentage of detected faults. The area under the curve thus represents the weighted average of the percentage of faults detected over the life of the test suite. Test order T1 (Figure 1.B) produces an APFD of 50%, and test order T2 (Figure 1.C) is a much "faster detecting" test order than T1 (and in fact, an optimal order) with an APFD of 84%.

Prioritization Techniques

Numerous prioritization techniques have been described in the research literature (Elbaum et al., 2001b, 2002; Jones and Harrold, 2001; Rothermel et al., 2001; Wong et al., 1997).
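Equation 1 is straightforward to compute directly from a test-order and a fault matrix. A minimal sketch follows, using an invented five-test, eight-fault matrix (not the one tabulated in Figure 1.A); the function returns APFD as a fraction in [0, 1], so multiply by 100 for the percentage form used above.

```python
# Sketch of the APFD computation (Equation 1).
# `detects` below is a hypothetical fault matrix, not Figure 1.A's.

def apfd(order, detects, num_faults):
    """APFD = 1 - (TF1 + ... + TFm)/(n*m) + 1/(2n), where TFi is the
    1-based position of the first test in `order` revealing fault i.
    Returns a fraction in [0, 1]."""
    n, m = len(order), num_faults
    position = {test: i + 1 for i, test in enumerate(order)}
    tf_sum = sum(
        min(position[t] for t in order if fault in detects[t])
        for fault in range(1, m + 1)
    )
    return 1 - tf_sum / (n * m) + 1 / (2 * n)

detects = {"A": {1, 2}, "B": {3}, "C": {1, 3, 4, 5},
           "D": {5}, "E": {2, 6, 7, 8}}
print(round(apfd(list("ABCDE"), detects, 8), 3))  # 0.475
print(round(apfd(list("CEBAD"), detects, 8), 3))  # 0.8
```

As in Figure 1, front-loading the tests that expose many faults (here C and E) raises the APFD of the same suite substantially.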

