Abstract

We write in response to Matthew Large's letter to the editor, which was recently published in Australasian Psychiatry, 1 and acknowledge his comments about the lack of empirical research into the impact of the introduction of violence risk assessments on violent behaviour. It is important that more research is conducted that examines whether structured risk assessment instruments actually prevent violence. In this letter, we will respond to one of Matthew Large's criticisms, the description of statistics in our paper, 2 and justify why particular sensitivity and specificity scores were not reported. We will challenge the assertion that the instrument is inaccurate and comment on what we believe is a misguided criticism of structured risk assessment methodologies. We will also comment on Large's use of the term ‘dangerous’.
First, the issue of sensitivity and specificity. The receiver operating characteristic (ROC) statistic is used to assess data that comprise a continuous predictor variable (in this case, daily DASA assessments, which range from 0 to 7) and a dichotomous dependent measure (in this case, whether or not the patient was aggressive in the 24 hours after their DASA assessment). To establish a measure's overall level of accuracy, a ROC graph may be plotted by generating specificity and sensitivity scores for each point in the predictor variable (e.g. for each of the possible DASA scores). The area under the curve (AUC) of the ROC graph is an index of the overall accuracy of the predictor (i.e. the DASA). As the DASA is a continuous 7-point scale, all values are important; a 0 indicates that the level of risk for imminent aggression is low. The risk for violence increases as the total DASA score increases. As such, the type and necessity of intervention vary according to total score: the higher the score, the greater the level of risk. When the score is zero, there would be less need for additional treatments and management of the patient.
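To illustrate how sensitivity, specificity and the AUC relate to one another, the following is a minimal sketch in Python. The scores and outcomes are invented for illustration and are not taken from any DASA study; the function names are likewise our own and hypothetical.

```python
# Hypothetical data: one total score (0-7) per assessment, and whether
# aggression occurred in the following 24 hours (1 = yes, 0 = no).
# These numbers are invented for illustration only.
scores = [0, 0, 1, 1, 2, 3, 4, 5, 6, 7]
outcomes = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]


def roc_points(scores, outcomes, thresholds):
    """Return (threshold, sensitivity, specificity) for each cut-off.

    An assessment counts as 'test positive' when score >= threshold.
    """
    n_pos = sum(outcomes)
    n_neg = len(outcomes) - n_pos
    points = []
    for t in thresholds:
        true_pos = sum(1 for s, y in zip(scores, outcomes) if y == 1 and s >= t)
        true_neg = sum(1 for s, y in zip(scores, outcomes) if y == 0 and s < t)
        points.append((t, true_pos / n_pos, true_neg / n_neg))
    return points


def auc(scores, outcomes):
    """AUC as the probability that a randomly chosen aggressive case
    scores higher than a randomly chosen non-aggressive case
    (ties count as one half)."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


points = roc_points(scores, outcomes, range(0, 8))
overall_auc = auc(scores, outcomes)
```

Each cut-off yields its own sensitivity and specificity pair, whereas the AUC summarises performance across all cut-offs in a single figure, which is why it suits an instrument for which no single ‘at risk’ threshold is proposed.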
Though it is fair to say that there is a linear relationship between total DASA score and risk, we do not believe there should be a particular score indicating ‘at risk’ and we certainly do not suggest that labelling patients as ‘dangerous’, as Large has suggested, 1 is appropriate. However, we have previously argued that DASA scores of 2–3 would indicate a moderate level of risk and scores greater than 3 would suggest that the patient is at a high level of risk of imminent aggression. These suggestions are based on findings from the development study 3 in which odds ratios for different DASA scores were calculated. In that study, patients who scored 7 (as compared to patients who scored 0) were 29 times more likely to be violent in the 24 hours following assessment. Patients who scored 6 were 15.7 times more likely to be violent. The other odds ratios were: 3.17 for scores of 5, 4.48 for scores of 4, 2.79 for scores of 3, 2.69 for scores of 2, and 1.31 for scores of 1.
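For readers less familiar with the statistic, an odds ratio of this kind compares the odds of aggression at a given score with the odds at the reference score of 0. A minimal sketch in Python follows; the counts are invented for illustration, are not the data from the development study, and the function name is hypothetical.

```python
def odds_ratio(events_at_score, non_events_at_score,
               events_at_reference, non_events_at_reference):
    """Odds of the outcome at a given score divided by the odds
    of the outcome at the reference score."""
    odds_at_score = events_at_score / non_events_at_score
    odds_at_reference = events_at_reference / non_events_at_reference
    return odds_at_score / odds_at_reference


# Invented counts: 20 aggressive and 80 non-aggressive assessments at
# some score, against 5 aggressive and 95 non-aggressive at score 0.
example_or = odds_ratio(20, 80, 5, 95)
```

With these invented counts the odds at the given score are 0.25 and the odds at the reference score are about 0.053, giving an odds ratio of 4.75.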
We have previously argued that the DASA should be used “in conjunction with, and to improve upon, clinical judgments. Furthermore, the presence of characteristics measured by these risk assessment instruments should, with other considerations, indicate the need for intervention rather than determine a particular type of intervention. More assertive intervention is, however, indicated for higher total scores”. 4, p.13 Corresponding with these suggestions, the accuracy of the DASA is best measured with the AUC statistic rather than using seven specificity and seven sensitivity scores for each of the different types of aggressive behaviour studied. We accept that the DASA does not have perfect predictive power; no measure of human behaviour does. However, structured risk assessment methods are consistently stronger predictors than unstructured risk assessment methods. Furthermore, structured methods create a systematic structure that increases reliability and transparency. Finally, the DASA should not be used to justify punitive treatment. The first task following assessment, where the results indicate elevated risk, is to determine what has contributed to the high risk state and then to resolve these issues. Management and restriction should be secondary, albeit important, considerations.
Large's criticism of the accuracy of structured risk assessment measures appears to be biased; the ‘effect size’ for violence risk assessment is superior to that of many other medical and psychological activities that are used without the controversy that surrounds violence risk assessment research. It has long been established, in comparisons of a variety of prediction tasks, that the effect size for violence risk assessment was between 0.91 and 1.19 across various studies, which surpassed that of chemotherapy for breast cancer (d = 0.08 to 0.11), the effects of bypass surgery on angina (d = 0.80), psychotherapy in general (d = 0.76), and the effect of electroconvulsive therapy on depression (d = 0.80). 5,6
Given these observations, we must speculate as to the reasons why the ‘accuracy’ of violence risk assessment is singled out for special attention and criticism. Tony Maden, an eminent British forensic psychiatrist, has written rather eloquently about this issue. According to Maden: 7 “I have been surprised by the strength of feeling expressed by some opponents of standardised risk assessment. On the face of it, such opposition is a bizarre response to what amounts to nothing more than a special investigation. It is hard to imagine taking to the barricades in opposition to the Beck Depression Inventory, liver function tests or neuroimaging. The difference is that standardised risk assessment deals with violence and offending, so moral and emotional considerations intrude on scientific objectivity.” 7, p.201 Maden continues: “The stigmatisation issue relates to the novelty of the tests. Again, the analogy with IQ testing is useful. In the early days, clinicians gave too much weight to intelligence tests, compared with other aspects of a case. This led to a backlash in which psychologists would refuse to report IQ scores for fear of the damage that might be done to the patient's treatment. The area has now stabilised, with recognition of both the benefits and limitations of IQ testing. It is reasonable to hope that the same sense of perspective will emerge in standardised risk assessment, once the dust settles”. 7, p.202 Though some will argue that violence risk assessment is different because the outcome of the assessment (as compared with the outcome of some other medical procedures) may result in a restriction in liberty, we would argue that decisions about risk and the need for incapacitation are made regularly, often using less reliable and valid methods. There are many well-documented cases of unstructured approaches resulting in inaccurate and overly cautious assessments of risk. 8 To imply that structured methods are more punitive is wrong.
Finally, with reference to Large's introduction of the term ‘dangerousness’, we believe that labelling patients as ‘dangerous’ is unhelpful. We have never used this term and others in the field have assiduously warned against using the term ‘dangerousness’. 9 Risk should be expressed in terms of probabilities (not dangerousness).
The risk assessment field is controversial. However, risk assessments are necessary and important, and they are conducted regularly. When they are necessary, the task is for clinicians to use the most appropriate method. We argue that structured risk assessment methods, used properly, are the most accurate and best method to support violence risk assessments.
