Clinicians are principally interested in the interpretation of a test result. The clinical utility of sensitivity and specificity is therefore limited, because you need outcome (gold standard) data to calculate them. Clinicians need to know how likely disease is, given the result of their test. Predictive values are useful here.
The positive predictive value (+PV) of a test is defined as the proportion of patients with positive test results who truly have the disease, or algebraically from Fig. 1, a/(a+b). In Example A in Fig. 2, the +PV of 30/35 (0.86) is very good: the likelihood of coronary artery disease is very high given a positive exercise ECG test. The negative predictive value (−PV) is the proportion of patients with a negative test who do not have the disease, calculated as d/(c+d). In Example A, the −PV is 45/65 (0.69).
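The two definitions above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `predictive_values` is hypothetical, and the cell counts for Example A (a = 30, b = 5, c = 20, d = 45) are inferred from the fractions quoted in the text rather than read from Fig. 2 itself.

```python
def predictive_values(a, b, c, d):
    """Return (+PV, -PV) from the cells of a 2x2 table.

    a: test positive, disease present (true positives)
    b: test positive, disease absent  (false positives)
    c: test negative, disease present (false negatives)
    d: test negative, disease absent  (true negatives)
    """
    positive_pv = a / (a + b)  # proportion of positive tests that are correct
    negative_pv = d / (c + d)  # proportion of negative tests that are correct
    return positive_pv, negative_pv


# Example A counts, inferred from 30/35 and 45/65 in the text (an assumption).
ppv_a, npv_a = predictive_values(a=30, b=5, c=20, d=45)
print(f"+PV = {ppv_a:.2f}, -PV = {npv_a:.2f}")
```

With these counts the sketch reproduces the values quoted for Example A (0.86 and 0.69).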
One important clinical issue is to be aware of the approximate prevalence of the problem in your practice population. The predictive values of a test vary with the underlying prevalence of the disease in the target population. This is illustrated in Fig. 2 by comparing Example B with Example A. In Example B, a community sample of 1000 was screened using an exercise ECG, and the true disease status was also determined by angiography. The sensitivity and specificity of the test are identical in both Examples A and B (Se = 0.6, Sp = 0.9). However, the positive predictive value, which was diagnostically useful at 0.86 in Example A, falls to 0.38 in the screened population in Example B (60/160). Thus, in the unselected community sample, where the underlying prevalence of disease was only 100/1000 (10%), only 38% of patients with a positive test truly had the disorder. In other words, in this sample, conducting the test did little to establish diagnostic certainty about the presence of atherosclerotic disease.
Even if the test were very specific, a low prevalence of disease in the underlying population would produce a low positive predictive value. Positive test results in this setting will be largely false positives (100/160 in Example B).
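The dependence of the +PV on prevalence can be made explicit by writing it in terms of sensitivity, specificity and prevalence (the standard Bayes' theorem form). The sketch below is illustrative: the function name `ppv_from_prevalence` and the list of prevalences are assumptions chosen for the example. Note that with exactly Se = 0.6 and Sp = 0.9 the formula gives about 0.40 at 10% prevalence, slightly above the 0.38 (60/160) quoted for Example B, whose 100 false positives among 900 non-diseased subjects imply a specificity nearer 0.89.

```python
def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value for a given disease prevalence.

    All three arguments are proportions between 0 and 1.
    """
    # Fraction of the whole population that is diseased and tests positive.
    true_positive_rate = sensitivity * prevalence
    # Fraction of the whole population that is disease-free but tests positive.
    false_positive_rate = (1 - specificity) * (1 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Same test characteristics, progressively lower prevalence (illustrative values).
for prevalence in (0.5, 0.1, 0.01):
    ppv = ppv_from_prevalence(sensitivity=0.6, specificity=0.9, prevalence=prevalence)
    print(f"prevalence = {prevalence:.2f}  ->  +PV = {ppv:.2f}")
```

At a prevalence of 0.5, as in Example A, the sketch returns the 0.86 quoted earlier; as prevalence falls, false positives make up a growing share of all positive results.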
Clinical judgment and examination increase the 'prevalence of disease'. Exercise ECGs applied to an unselected community would meet a low prevalence and yield a poor +PV. Clinical judgment can raise that prevalence: if one targeted middle-aged males who smoked and were hypertensive, then the yield from an exercise ECG test would be much higher (a +PV approaching that in Example A).
This process of 'increasing the likelihood of disease' (prevalence) selects subjects so that diagnostic tests are more useful. Clinical signs and a history of post-prandial pain will be needed before gastric endoscopy is recommended. More detailed test results will be essential before submitting patients to a liver biopsy to diagnose chronic active hepatitis: the preliminary tests, including LFTs and ultrasound results, increase the likelihood of hepatitis and make the (potentially hazardous) biopsy worthwhile.