Background: Cerebral white matter hyperintensities (WMHs) are common in older people. Their presence correlates with cognitive decline and vascular risk factors. Various scales have been developed to quantify the amount and type of WMH, but few studies have assessed their observer reliability. We evaluated several scales in different cohorts to determine their observer reliability. Methods: Two observers independently rated T2-weighted MR images from five groups (total n = 494: normal older subjects; patients with minor stroke; young insulin-dependent diabetics; maturity-onset diabetics; and patients with hepatic encephalopathy), using seven rating scales (Breteler, Fazekas, Longstreth, Mirsen, Shimada, Van Swieten and Wahlund). Inter-observer reliability was determined using kappa statistics. Results: Patients with maturity-onset diabetes had the most WMHs and young insulin-dependent diabetics the least. Inter-observer reliability varied with the amount of WMH. In maturity-onset diabetics (most WMHs) the weighted kappas were: Breteler 0.74; Fazekas 0.89 and 0.72; Van Swieten 0.76 and 0.88; and in young insulin-dependent diabetics (least WMHs): Breteler 0.3; Fazekas 0.2 and 0.24; Van Swieten 0.39 and 0.30. These findings were consistent across the groups. Conclusion: WMH rating scale performance varied with WMH prevalence, and hence with subject cohort. In patients with the most WMHs, the apparently better kappas may reflect a “ceiling effect” rather than truly better agreement. These factors should be considered in studies where risk factors for, or associations with, the early development of WMHs are being determined.
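As an illustration of the kappa statistic used above, the following is a minimal sketch of a weighted Cohen's kappa for two observers rating the same scans on an ordinal scale. It is not the authors' analysis code. The function name `weighted_kappa`, the choice between linear and quadratic disagreement weights, and the example ratings are all assumptions made for illustration, since the abstract does not state which weighting scheme was used. The second example uses hypothetical ratings to show how kappa can be low when almost all subjects fall in one category, even though raw agreement is high.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, n_categories, scheme="linear"):
    """Weighted Cohen's kappa for two raters on an ordinal scale 0..n_categories-1.

    Hypothetical helper for illustration only. `scheme` selects linear or
    quadratic disagreement weights; which one the study used is not stated.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of scans")
    if n_categories < 2:
        raise ValueError("need at least two categories")
    if scheme not in ("linear", "quadratic"):
        raise ValueError("scheme must be 'linear' or 'quadratic'")

    n = len(rater_a)
    power = 1 if scheme == "linear" else 2

    observed = Counter(zip(rater_a, rater_b))  # joint counts of (score A, score B)
    marginal_a = Counter(rater_a)              # how often rater A used each score
    marginal_b = Counter(rater_b)              # how often rater B used each score

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            # Disagreement weight: 0 on the diagonal, 1 for the furthest-apart scores.
            weight = (abs(i - j) / (n_categories - 1)) ** power
            observed_disagreement += weight * observed[(i, j)] / n
            expected_disagreement += weight * marginal_a[i] * marginal_b[j] / (n * n)

    if expected_disagreement == 0:
        # Both raters used a single identical category; kappa is undefined.
        return float("nan")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical ratings: identical scores from both observers give kappa = 1.
perfect = weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], n_categories=4)

# Hypothetical low-prevalence cohort: the observers agree on 18 of 20 scans
# (90% raw agreement), but nearly every scan is scored 0.
low_a = [0] * 18 + [1, 0]
low_b = [0] * 18 + [0, 1]
low_prevalence = weighted_kappa(low_a, low_b, n_categories=2)

print(f"perfect agreement: {perfect:.2f}")
print(f"low prevalence:    {low_prevalence:.2f}")
```

In the second example the chance-expected agreement is already about 90%, so the two disagreements are enough to push kappa slightly below zero. This is one commonly described reason why kappa depends on prevalence; it is offered here as background, not as the authors' explanation of their results.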