Transcription of www.static99.org
Psychological Assessment

Communicating the Results of Criterion Referenced Prediction Measures: Risk Categories for the Static-99R and Static-2002R Sexual Offender Risk Assessment Tools

R. Karl Hanson, Kelly M. Babchishin, L. Maaike Helmus, David Thornton, and Amy Phenix

Online First Publication, September 12, 2016.

Hanson, R. K., Babchishin, K. M., Helmus, L. M., Thornton, D., & Phenix, A. (2016, September 12). Communicating the Results of Criterion Referenced Prediction Measures: Risk Categories for the Static-99R and Static-2002R Sexual Offender Risk Assessment Tools. Psychological Assessment. Advance online publication.

R. Karl Hanson, Public Safety Canada, Ottawa, Ontario, Canada
Kelly M. Babchishin, University of Ottawa
L. Maaike Helmus, Wandering Vagabond
David Thornton, Sand Ridge Secure Treatment Centre, Madison, Wisconsin
Amy Phenix, Morro Bay, California

This article describes principles for developing risk category labels for criterion referenced prediction measures, and demonstrates their utility by creating new risk categories for the Static-99R and Static-2002R sexual offender risk assessment tools. Currently, risk assessments in corrections and forensic mental health are typically summarized in 1 of 3 words: low, moderate, or high. Although these risk labels have strong influence on decision makers, they are interpreted differently across settings, even among trained professionals.
The current article provides a framework for standardizing risk communication by matching (a) the information contained in risk tools to (b) a broadly applicable classification of riskiness that is independent of any particular offender risk scale. We found that the new, common STATIC risk categories not only increase concordance of risk classification (from 51% to 72%) but they also allow evaluators to make the same inferences for offenders in the same category regardless of which instrument was used to assign category membership. More generally, we argue that the risk categories should be linked to the decisions at hand, and that risk communication can be improved by grounding these risk categories in evidence-based definitions.

Keywords: standards, criterion referenced tests, Static-99R, Static-2002R, risk communication

We are greatly at a loss for a standard whereby to measure cold.
The common instruments show us no more than the relative coldness of the air, but leave us in the dark as to the positive degree thereof; whence we cannot communicate the idea of any such degree to another person.
Robert Boyle (1665), quoted in Landsberg (1964, pp. 42-43)

Many of us involved with applied psychological assessment can empathize with Boyle's concerns. Boyle was writing at a time (17th century) when there were more than 35 different temperature scales in use, and whoever constructed a new type of thermometer simultaneously created a new scale to go along with it (Landsberg, 1964). Contemporary psychological assessment faces a similar challenge. Although there are a large number of measures that reliably rank individuals on constructs such as anxiety or antisocial traits, we have yet to establish consensus for communicating the results (Blanton & Jaccard, 2006).
The current study was motivated by our need, as test developers, to update the category labels for certain actuarial sexual offender risk assessment tools, specifically, Static-99R and Static-2002R (Hanson & Thornton, 2000; Helmus, Thornton, Hanson, & Babchishin, 2012). The primary purpose of these tools is to estimate the relative risk of sexual recidivism based on commonly available demographic and criminal history information. Like other empirically derived actuarial risk tools, norms for these tools are periodically updated as new and better research becomes available. The question was whether, and then how, we should revise their risk category labels.

R. Karl Hanson, Public Safety Canada, Ottawa, Ontario, Canada; Kelly M. Babchishin, Royal's Institute of Mental Health Research, University of Ottawa; L. Maaike Helmus, Wandering Vagabond; David Thornton, Sand Ridge Secure Treatment Centre, Madison, Wisconsin; Amy Phenix, Morro Bay, California. Funding for this project was provided in part by the Canadian Institute for Health Research (Banting Postdoctoral Fellowship). The views expressed are those of the authors and not necessarily those of Public Safety Canada or the Sand Ridge Secure Treatment Centre. R. Karl Hanson, Kelly Babchishin, L. Maaike Helmus, and David Thornton are coauthors of the Static-99R and Static-2002R risk tools. R. Karl Hanson, L. Maaike Helmus, David Thornton, and Amy Phenix are certified trainers for these instruments. The copyright for Static-99R and Static-2002R is held by the Government of Canada. We thank Anton Schweighofer and Ian Barsetti for helpful comments on previous versions of this article. Correspondence concerning this article should be addressed to R. Karl Hanson, Research Division, Community Safety and Countering Crime Branch, Public Safety Canada, 340 Laurier Avenue, West, Ottawa, Ontario, Canada K1A 0P8.
Psychological Assessment, 2016, Vol. 28, No. 8. © 2016 The Crown in Right of Canada.

There has been considerable discussion about how to communicate the results of norm referenced measures, for which scores can be interpreted as the position of an individual within a defined group (Crawford, Garthwaite, & Slick, 2009; Oosterhuis, van der Ark, & Sijtsma, 2016). Such an interpretation, however, poorly expresses the information contained in criterion referenced prediction measures, in which the goal of the assessment is to estimate the likelihood of a significant outcome, such as suicide (Berman & Silverman, 2014), major depression (King et al., 2008), success in law school (Thomas, 2003), or, in our case, recidivism by sexual offenders. Prediction measures are also different from diagnostic measures (e.g., x-rays for brain tumor), which can be evaluated in terms of diagnostic accuracy (e.g., false positives, false negatives, positive predictive value; Swets, 1988). For prognostic measures, in contrast, the outcome of interest is not present at the time of assessment and may never happen (e.g., risk of breast cancer; Moons, Royston, Vergouwe, Grobbee, & Altman, 2009; for review, see Helmus & Babchishin, in press).

There are no universal standards for labeling relative or absolute likelihoods of adverse events, nor do we expect there ever will be. A 10% chance of a hurricane is high risk (Monahan & Steadman, 1996); a 10% chance of rain is not.
A 10% chance of your car's brakes failing is catastrophic (for reviews, see Hilton, Scurich, & Helmus, 2015; Visschers, Meertens, Passchier, & de Vries, 2009). We believe, however, that there are certain common principles worth considering when developing risk category labels within any specific domain. Although we focus on offender risk assessment, some of these principles may also be helpful when considering category labels in other areas of applied psychological assessment. We argue that certain quantitative information (percentile ranks, risk ratios, estimates of the rates of outcomes) should inform the meanings ascribed to risk category labels.

The concept of risk is ubiquitous in applied decision making, and is a dominant concern of business and industry.
For example, the International Organization for Standardization (ISO 31000) defines risk as the "effect of uncertainty on objectives" (Gjerdrum & Peter, 2011). A very similar definition has been adopted by proponents of the structured professional judgment (SPJ) approach to violence risk assessment (Douglas & Ogloff, 2003). In the user guide for the HCR-20 V3, for example, risk is defined as "a threat or hazard that is incompletely understood, and thus whose occurrence can be forecast only with uncertainty" (Douglas, Hart, Webster, & Belfrage, 2013, p. 4). From this perspective, it makes little sense to associate risk category labels with precise, numeric estimates of recidivism risk. If risk is fundamentally uncertainty, then recidivism estimates that are not close to 0 or 1 are expressions of ignorance.