B. Mwangi, K. P. Ebmeier, K. Matthews, J. D. Steele


1460-2156 (Electronic) 0006-8950 (Linking)

Publication year




Periodical Number

Pt 5





Author Address

Division of Neuroscience, Medical Research Institute, University of Dundee, Ninewells Hospital and Medical School, Dundee, UK.

Full version

Quantitative abnormalities of brain structure in patients with major depressive disorder have been reported at a group level for decades. However, these structural differences appear subtle in comparison with conventional radiologically defined abnormalities, with considerable inter-subject variability. Consequently, it has not been possible to readily identify scans from patients with major depressive disorder at an individual level. Recently, machine learning techniques such as relevance vector machines and support vector machines have been applied to predictive classification of individual scans with variable success. Here we describe a novel hybrid method, which combines machine learning with feature selection and characterization, with the latter aimed at maximizing the accuracy of machine learning prediction. The method was tested using a multi-centre dataset of T(1)-weighted ‘structural’ scans. A total of 62 patients with major depressive disorder and matched controls were recruited from referred secondary care clinical populations in Aberdeen and Edinburgh, UK. The generalization ability and predictive accuracy of the classifiers was tested using data left out of the training process. High prediction accuracy was achieved (~90%). While feature selection was important for maximizing high predictive accuracy with machine learning, feature characterization contributed only a modest improvement to relevance vector machine-based prediction (~5%). Notably, while the only information provided for training the classifiers was T(1)-weighted scans plus a categorical label (major depressive disorder versus controls), both relevance vector machine and support vector machine ‘weighting factors’ (used for making predictions) correlated strongly with subjective ratings of illness severity. These results indicate that machine learning techniques have the potential to inform clinical practice and research, as they can make accurate predictions about brain scan data from individual subjects. Furthermore, machine learning weighting factors may reflect an objective biomarker of major depressive disorder illness severity, based on abnormalities of brain structure.