Errors, Warnings, and Notes (Oh My): A Practical Guide to Debugging SAS Programs

Susan J. Slaughter, University of California Extension, Davis, CA
Lora D. Delwiche, IT/ANSA, University of California, Davis, CA

Why a paper on debugging SAS programs?

Most of the documentation about the SAS System doesn't even mention bugs, as if debugging wasn't worth talking about. This paper, on the other hand, is based on the belief that debugging is a good way to get insight into how SAS works. Once you understand why you got an error, you'll be better able to avoid it in the future. In other words, people who are good debuggers are good programmers.

Bugs can have different origins; some are accidentally built into the software by developers, others are introduced by programmers. Recently, one of the authors of this paper had a conversation about this topic with her father, who is an aerospace engineer but not knowledgeable about the SAS System.
The conversation went like this:

   Susan: I'm writing about how to debug SAS programs.
   Father: I thought they would have gotten the bugs out of SAS by now.
   Susan: SAS Institute has done a good job of getting the bugs out of SAS software. However, nobody has yet figured out how to get the bugs out of programmers. [1]

The SAS System even fixes some mistakes made by programmers. For example, SAS has gotten so smart over the years that it is now almost impossible to get an error by misspelling a keyword. If you misspell a keyword in a SAS program, SAS will almost always figure out what you meant to say and run the statement correctly in spite of your poor typing skills. But SAS can't fix all programming errors, so this paper discusses some of the most common bugs and how to exterminate them.

[1] Debugging Myself by Hayes (1995) contains an entertaining discussion of human error.

What is a bug?

Scientists have identified approximately 1 1/4 million species of animals.
Of those, about 3/4, or 932,000, are insects. However, only the 82,000 species belonging to the order Hemiptera are considered by scientists to be true bugs (McGavin, 1993). Fortunately, a taxonomy of SAS bugs would not identify nearly so many species.

Entomology aside, a bug is an error in a computer program that causes an undesirable, usually unexpected, result. One way of classifying computer bugs is to divide them into three types of errors: syntax, data, and logic. Syntax errors result from failing to follow SAS's rules about the way keywords are put together to make statements. With data errors you have a program that is syntactically sound but fails because of data values that do not fit the program as it was written. With logic errors you have a program that runs, and data that fits, but the result is wrong because the program does something different than you intended.

The bugs discussed in this paper can be classified as:

   Syntax
      missing semicolon
      uninitialized variable and variable not found
   Data
      missing values were generated
      numeric to character conversion
      invalid data
      character field is truncated
   Logic
      DATA step produces wrong results but no error messages

Listen to the SAS Log

The first and most important rule in debugging SAS programs is to always, always check the SAS log. After running a SAS program many people turn immediately to the output.
This is understandable, but not advisable. It is entirely possible (and sooner or later it happens to all of us) to get output that looks fine but is totally bogus. Often checking the SAS log is the only way to know whether a program has run properly.

SAS logs contain 3 types of messages: errors, warnings, and notes.

If you get an error message in your program, you will know it. Error messages get your attention because SAS will not run a job with one of these bugs. Error messages are not quiet, discreet, or subtle; they are the loud rabble-rousers of SAS messages. This message, for example:

   ERROR: No CARDS or INFILE statement.

stops a program dead in its tracks. This message tells you that SAS could not find any data to read with the INPUT statement.

Warnings are less dire than errors. SAS prints warnings in your log and then goes ahead and runs the job anyway. Many people, including some professional programmers, try to ignore warnings. Don't you be one of them.
Sometimes the situations that result in warnings are indeed harmless; other times they indicate grave problems which, if unresolved, will render your results worthless. You should check all warnings to see if they are harmless or hazardous. This message:

   WARNING: The data set may be incomplete. When this step was stopped there were 0 observations and 3 variables.

tells you that SAS did run a DATA step, but for some reason there are zero observations. This could be OK, but generally speaking when you go to the trouble of creating a data set, you want some data in it.

Notes are the most innocuous messages that SAS writes in your SAS log. They simply inform you of the status of your program. Notes contain information such as the number of records input from an external file, or the number of observations written to a SAS data set. Don't be fooled by demure little notes; they are a critically important way of catching errors. These messages:

   NOTE: 29 records were read from the infile ' '.
         The minimum record length was 27.
         The maximum record length was ...
   NOTE: The data set has 14 observations and 3 variables.

tell you that while 29 records were read from a raw data file, the resulting SAS data set contains only 14 observations. If you were expecting only 14 observations, then this would be fine. But if you were expecting 29 observations, one observation for each input record, then this would tip you off that something went wrong.

Another type of note can help you write efficient programs. At the end of every step SAS prints a note similar to this:

   NOTE: The PROCEDURE PRINT used ...

If you are running a one-time report, you may not care, but if you run the same program over and over then you may want to check your notes to see which steps can benefit the most from streamlining.

Insect species data

The data for the next few examples appear in Table 1. Each observation contains data about one order in the class Insecta (La Plante, 1996).
The variables are the name of the order (ORDER), the number of species in that order found in North America (NASP), and the number of species found outside North America (OUTSP). A period marks a missing value.

Table 1 Species data.

   ORDER             NASP    OUTSP
   Thysanura           20      230
   Diplura             30      370
   Protura             30       70
   Collembola         325     1675
   Ephemeroptera      550      950
   Odonata            425     4575
   Plecoptera          34     1266
   Grylloblattodea      .        6
   Saltatoria         110    21890
   Phasmida             .        .
   Dictyptera           .        .
   Isoptera            45        .
   Dermaptera          20     1080
   Embioptera          10      140
   Psocoptera         150      950
   Zoraptera            2       17
   Mallophaga         320     2280
   Anoplura            65      285
   Thysanoptera       625     2375
   Hemiptera         8750    46250
   Neuroptera         350     4350
   Mecoptera           70      280
   Trichoptera        950     3550
   Lepidoptera      10500   189500
   Diptera          16700    68300
   Siphonaptera       250      850
   Hymenoptera      14600    90400
   Coleoptera       27000   530000
   Strepsiptera       120      180

The missing semicolon

Even the newest of SAS programmers knows that every SAS statement ends with a semicolon.
So it is ironic that one of the most common bugs is the missing semicolon.

While most SAS error messages are clear and easy to understand, the hallmark of a missing semicolon is confusion. Missing semicolons often produce a long stream of baffling messages. In the following example, the absence of a semicolon at the end of the DATA statement causes two error messages, three warnings, and a suspicious note.

   1    DATA species
   2       INFILE ' ';
           --------------
           200
   3       INPUT order $ 1-15 nasp outsp;
   4    RUN;

   ERROR 200-322: The symbol is not recognized.
   ERROR: No CARDS or INFILE statement.
   NOTE: The SAS System stopped processing this step because of errors.
   WARNING: The data set may be incomplete. When this step was stopped there were 0 observations and 3 variables.
   WARNING: Data set was not replaced because this step was stopped.
   WARNING: The data set may be incomplete. When this step was stopped there were 0 observations and 3 variables.

The message "No CARDS or INFILE statement" is especially odd since there obviously is an INFILE statement.
Without a semicolon, the DATA statement becomes concatenated with the INFILE statement. SAS then interprets the keyword INFILE as a data set name in the DATA statement, resulting in the warning that the data set may be incomplete. If you find that the messages in your log make no sense, check for missing semicolons.

Uninitialized variable and variable not found

These two related messages tell you that SAS was unable to find one of your variables. The first time you see one of these messages you will probably wonder what SAS is babbling about; after all, you remember creating the variable.

In the following SAS log, the INPUT statement reads the species data using the variable name NASP for the number of species in North America. Then a subsetting IF statement contains the misspelled variable name NASPEC.

   1    DATA species (KEEP = order worldsp);
   2       INFILE ' ';
   3       INPUT order $ 1-15 nasp outsp;
   4       IF naspec > 100;
   5       worldsp = nasp + outsp;
   6    RUN;

   NOTE: Variable NASPEC is uninitialized.

If SAS is unable to find a variable in a DATA step, SAS prints the variable-is-uninitialized message.
Then SAS creates the variable, sets its values to missing for all observations, and runs the DATA step. It's nice that SAS runs the DATA step, but you probably don't want the variable to have missing values for all observations.

A more serious problem ensues when SAS is unable to find a variable in a PROC step. In the following example, SAS cannot find the variable NASP. This variable did exist, but was accidentally dropped in the previous DATA step because it was not listed in the KEEP option. SAS prints the variable-not-found message and does not run the procedure at all.

   7    PROC PRINT DATA=species;
   8       VAR order nasp worldsp;
   ERROR: Variable NASP not found.
   9    RUN;

Another version of the variable-not-found message appears as a warning when the problem occurs in a less critical statement such as a LABEL statement. Since this is a warning, not an error, SAS runs the step.

Possible causes of the variable-is-uninitialized and variable-not-found messages include:

   A misspelled variable name.