Claiming these tests are ‘good’ or ‘bad’ because of their LR is misleading, since their clinical interpretation relies equally on the pre-test odds (except for LRs of 1, which are genuinely useless as they don’t alter the post-test odds at all). Beyond that, we can only really use these LR numbers in isolation to compare the utility of two different tests, ie, ‘how much better is this test than that test?’ Stating that the test is of ‘limited’ or ‘moderate’ utility without reference to the pre-test odds is essentially trying to describe whether some number (which can range from 0 to 1, or 1 to infinity; Altman and Bland, 1994) is ‘large’ or ‘small’. This paper has documented (very well in my opinion) LRs for these clinical tests, and I think this is how the data should have been presented.”
“We thank Dr Whiteley for his interest in our study. Dr Whiteley argues that likelihood ratios cannot
be used to make judgements about the accuracy of a diagnostic test because the post-test probability generated by a diagnostic test depends on the pre-test probability. Consequently he believes that our conclusion – that provocative wrist tests are of limited value for diagnosing wrist ligament injuries – misrepresents the data. Post-test probabilities do, of course, depend on pre-test probabilities (Herbert et al 2011). Likelihood ratios quantify the extent to which a diagnostic test modifies pre-test probabilities. Accurate diagnostic tests substantially modify pre-test probabilities, especially in cases of uncertainty (when pre-test probabilities are neither very low nor very high). In contrast, inaccurate tests (tests which carry little diagnostic information) have very little effect on pre-test probabilities. The descriptors that we used to describe test accuracy were based on those recommended by Portney and Watkins (2009). In our opinion these descriptors are, if anything,
a little too generous. By way of illustration, consider the best positive likelihood ratio we reported: MRI diagnosis of TFCC injuries had a positive likelihood ratio of 5.6, so it was classified as a ‘moderately useful’ test. If we were to use this test on a person for whom we felt completely ambivalent about the diagnosis of TFCC injury (ie, on a person for whom the pre-test probability was 50%) the test would change the estimated probability of TFCC injury to 84%, a change in probability of 34%. This test would aid diagnosis a bit but not much – with a post-test probability of 84% we would still not be confident that the person does have a TFCC injury. So a descriptor of ‘moderately useful’ seems, if anything, generous. The absolute change in probability produced by a test finding is always greatest for a pre-test probability of 50%, so in all other scenarios this test modifies the probability of the diagnosis by less than 34%.
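The arithmetic in this illustration can be reproduced with a short sketch. It assumes the standard odds form of Bayes' theorem (post-test odds = pre-test odds × likelihood ratio). The function name `post_test_probability` and the constant `LR_POSITIVE_MRI_TFCC` are hypothetical names introduced here for illustration; only the values 5.6 and 50% come from the reply.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability.

    Uses the odds form of Bayes' theorem:
        post-test odds = pre-test odds * likelihood ratio
    """
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre_test_probability must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")

    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Positive likelihood ratio quoted in the reply for MRI diagnosis of TFCC injury.
LR_POSITIVE_MRI_TFCC = 5.6

# The reply's scenario: complete uncertainty, ie, a pre-test probability of 50%.
post = post_test_probability(0.5, LR_POSITIVE_MRI_TFCC)
print(f"pre-test 50% -> post-test {post:.1%} (change {post - 0.5:.1%})")

# The same test applied across a range of pre-test probabilities.
for pre in (0.1, 0.3, 0.5, 0.7, 0.9):
    post = post_test_probability(pre, LR_POSITIVE_MRI_TFCC)
    print(f"pre-test {pre:.0%} -> post-test {post:.1%} (change {post - pre:.1%})")
```

With an LR of exactly 5.6 this gives a post-test probability of about 0.848, which the reply reports as 84% (the small difference presumably reflects rounding of the published LR). The loop also shows that, computed this way, the absolute change is somewhat larger at a pre-test probability of 30% (about 41 percentage points) than at 50% (about 35), so the closing statement about 50% is best read as a rule of thumb and not as an exact result.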