THOMAS J. McCABE - Literate Programming

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. SE-2, , DECEMBER 1976308 THOMAS J. McCABEA bstract- This paper describes a graph-theoretic complexity measureand illustrates how it can be used to manage and control program com-plexity .The paper first explains how the graph-theory concepts applyand gives an intuitive explanation of the graph concepts in programmingterms. The control graphs of several actual Fortran programs are thenpresented to illustrate the conelation between intuitive complexity andthe graph-theoretic complexity .Several properties of the graph-theoretic complexity are then proved which show, for example, thatcomplexity is independent of physical size (adding or subtractingfunctional statements leaves complexity unchanged) and complexitydepends only on the decision structure of a issue of using non structured control flow )s also discussed.

Acharacterization of nonstructured control graphs is given and a methodof measuring the "structuredness" of a program is developed. The re-lationship between structure and reducibility is illustrated with last section of this paper deals with a testing methodology usedin conjunction with the complexity measure; a testing strategy is de-fined that dictates that a program can either admit of a certain minimaltesting level or the program can be structurally Temls-Basis, complexity measure, control flow, decomposi-tion, graph theory , independence, linear, modularization, Programming ,reduction, software, A COMPLEXITY MEASUREIn this sl~ction a mathematical technique for program mod-ularization will be developed. A few defmitions and theoremsfrom graph theory will be needed, but several examples willbe presented in order to illustrate the applications of complexity measure approach we will take is to mea-sure and control the number of paths through a program.

Thisapproach, however, immediately raises the following nastyproblem: "Any program with a backward branch potentiallyhas an infinite number of paths." Although it is possible todefme a set of algebraic expressions that give the total numberof possible paths through a (structured) program,l using thetotal number of paths has been found to be impractical. Be-cause of this the complexity measure developed here is defmedin terms of basic paths-that when taken in combinatio~ willgenerate every possible following mathematical preliminaries will be needed, allof which can be found in Berge [I] .Definition 1: The cyclomatic number V(G) of a graph Gwith n vertices, e edges, and p connected components isv(G) = e -n + 1: In a strongly connected graph G, the cyclo-matic number is equal to the maximum number of linearlyindependent applications of the above theorem will be made asfollows: Given a program we will associate with it a directedgraph that has unique entry and exit nodes.

Each node in thegraph corre,~ponds to a block of code in the program where theflow is sequential and the arcs correspond to branches taken inthe program. This graph is classically known as the programcontrol graph (see Ledgard [6] ) and it is assumed that eachnode can be reached by the entry node and each node canreach the e:dt node. For example, the following is a program /control graph with entry node "a" and exit node "f."I. INTRODUCTIONTHERE is a critical questi?n facing software engineeringtoday: How to modularlZe a software system so theresulting modules are both testable and maintainable?That the issues of testability and maintainability are impor-tant is borne out by the fact that we often spend half of thedevelopment time in testing [2] and can spend most of ourdollars maintaining systems [3].

What is neededjs a mathe-matical technique that will provide a quantitative basis formodularization and allow us to identify software modulesthat will be difficult to test or maintain. This paper reportson an effort to develop such a mathematical technique whichis based on program control currently used practice that attempts to ensure a reason-able modularization is to limit programs by physical size( , IBM-50 lines, TRW-2 pages). This technique is notadequate, which can be demonstrated by imagining a 50 lineprogram consisting of 25 consecutive "IF THEN" a program could have as many as million distinctcontrol paths, only a small percentage of which would prob-ably e'ler be tested. Many such examples of live Fortran pro-grams that are physically small but untestable have been iden-tilled and analyzed by the tools described in this paper.

-,II,,G:Manuscript received April 10, author is with the Department of Defense, National SecurityAgency, Ft. Meade, MD See the AppendixMC CASE: A COMPLEXITY MEASURE309 CYCtoMATIC COMPLEXITY*v = e -n + 2pv=1-2+2=1v= +2=2 Theorem 1 is applied to G in the following way. -Imagine thatthe exit node (f) branches back to the entry node (a). Thecontrol graph G is now strongly connected (there is a pathjoining any pair of arbitrary distinct vertices) so Theorem 1applies. Therefore, the maximum number of linearly indepen-dent circuits in G is 9-6+2. For example, one could choosethe following 5 independent circuits in G:Bl: (abefa), (beb), (abea), (acfa), (adcfa).It follows that Bl forms a basis for the set of all circuits in Gand any path through G can be expressed as a linear combina-tion of circuits from Bl.

For instance, the path (abeabebebef)is expressable as (abea) +2(beb) + (abefa). To see how thisworks its necessary to number the edges on G as inNow for each member of the basis Bl associate a vector asfollows:Notice that the sequence of an arbitrary number of nodes al-ways has unit complexity and that cyclomatic complexityconforms to our intuitive notion of "minimum number ofpaths." Several properties of cyclomatic complexity are statedbelow:I) v(G)~ ) v(G) is the maximum number of linearly independentpaths in G; it is the size of a basis ) Inserting or deleting functional statements to G does notaffect v(G).4) G has only one path if and only if v(G) = ) Inserting a new edge in G increases v(G) by ) v(G) depends only on the decision structure of WORKING EXPERIENCE WITH THECOMPLEXITY MEASUREIn this section a system which automates the complexitymeasure will be described.

The control structures of severalPDP-1O Fortran programs and their corresponding complexitymeasures will be aid the author's research into control structure complex-ity a tool was built to run on a that analyzes thestructure of Fortran programs. The tool, FLOW, was writtenin APL to input the source code from Fortran ftles on would then break a Fortran job into distinct subrou-tines and analyze the control structure of each subroutine. Itdoes this by breaking the Fortran subroutines into blocks thatare delimited by statements that affect control flow: IF, GOTO ,referenced LABELS, DO, etc. The flow between the blbCks isthen represented in an n by n matrix (where n is the numberof blocks), having a 1 in the i-jth position if block i can branchto block j in 1 step.

FLOW also produces the "blocked" listingof the original program, computes the cyclomatic complexity,and produces a reachability matrix (there is a 1 in the i-jthposition if block i can branch to block j in any number ofsteps). An example of FLOW'S output is shown INTEGER(A-Z)COMMON I ALLOC I MEM( ),LM,LU,LV,LW,LX,LY,LQ,LWEXNCHARS,NWORDS DIMENSION MEMORY( ),INHEAD(..),ITRANS(128)TYPE 1 FORMATl'DOMOLKI STRUCTURE FIL:: NAME?' $)NAMDML=OACCEPT 2, (AS )CALL ALCHAII(ICHAN)CALL IFIL::(ICHAN,'DSK' ,NAMDML,'DAT',O,O)CALL READB(ICliAN,I!IHrAD,I~2,NREAD,$990,$99Q )NCHARS=INH AIJ( liNWORDS=INHEAD(:)12345 678910(abefa) lOO lOO 0 1 0 1(beb) 000 1 lOO 00 0(abea) lOO 100 000 0(acfa) 0 1 0 0 0 1 0 00 1(adcfa) 00 1 00 1 1 00 1 The path (abea(be)3fa) corresponds to the vector 2004200111and the vector addition of (abefa), 2(beb), and (abea) yieldsthe desired using Theorem lone can choose a basis set of circuitsthat correspond to paths through the program.

The set B2 is abasis of program : (abef), (abeabef), (abebef), (acf), (adcf).linear combination of paths in B2 will also generate any example,(abea(be )3 f) = 2(abebef) -(abef)and(a(be)2abef) = (a(be)2f) + (abeabef) -(abef).The overall strategy will be to measure the complexity of aprogram by computing the number of linearly independentpaths v(G), control the "size" of programs by setting an upperlimit to v(G) (instead of using just physical size), and use thecyclomatic complexity as the basis for a testing methodology .A few simple examples may help to illustrate. Beloware thecontrol graphs of the usual constructs used in structured pro-gramming and their respective complexities.*The role of the variable p will be explained in Section IV .For theseexamples assume p = TRANSACTIONS ON SOFTWARE ENGINEERING, DECEMBER 1976 HTCT=(NCHAitS+7)-NWORj)SLTOT={NCHARS+5)" NWORDS' BLOCK 00.}

THOMAS J. McCABE - Literate Programming

Tags:

Information

Advertisement

Transcription of THOMAS J. McCABE - Literate Programming

Related search queries

THOMAS J. McCABE - Literate Programming

Tags:

Information

Advertisement

Documents from same domain

Software Documentation - Literate Programming

Related documents

User’s Guide for SNOPT Version 7.6: Software for …

Abstract Algebra Theory and Applications

A Computational Introduction to Number Theory …

BRANCH-AUTOMATION AND ROBOTICS

1.1.22 UEL User subroutine to define an element. …

MATHEMATICS (CLASSES XI –XII)

Related search queries