University of Chicago, Chicago, Illinois, United States
Disclosure(s):
Simar Narula: No financial relationships to disclose
Background and Aim Effective treatments for diabetes reduce morbidity and mortality, but several comorbid conditions can degrade the ability of hemoglobin A1c (A1c) and fasting plasma glucose (FPG) to accurately diagnose the disease. The two-hour oral glucose tolerance test (OGTT) is the third modality for diagnosing diabetes, but it is used infrequently. Determining who would benefit from an OGTT could increase the number of cases diagnosed. Previously, we trained a machine learning model to predict diabetes in National Health and Nutrition Examination Survey (NHANES) subjects using standard clinical features. Here, we aim to provide external validation of this model.
Methods We used a gradient-boosted machine learning model (XGBoost) to predict diabetes from standard clinical features, trained on 13,800 adult participants from NHANES cycles spanning 2005-2016 who had A1c, FPG, and OGTT data available. In the test set, we compared the model's area under the receiver operating characteristic curve (AUC) with the AUCs of A1c, FPG, and a combination of the two. We performed external validation by applying the NHANES-trained model to 3,049 subjects from the Diabetes Prevention Program Outcomes Study (DPPOS) with matching features available.
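As an illustration of the AUC comparison described in the Methods, the following is a minimal sketch, not the authors' code. It computes AUC as the probability that a randomly chosen case with diabetes scores higher than a randomly chosen case without it, and applies that to a single laboratory value and to a model score. The function name `auc` and all toy values are assumptions made for this example; none are NHANES or DPPOS data.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy values invented for illustration only (not study data).
toy_labels = [0, 0, 0, 1, 1, 1]                       # 1 = diabetes
toy_a1c = [5.2, 5.6, 6.1, 5.9, 6.6, 7.0]              # hypothetical A1c, %
toy_model = [0.05, 0.10, 0.30, 0.60, 0.80, 0.95]      # hypothetical model scores

print(f"A1c alone: AUC = {auc(toy_a1c, toy_labels):.3f}")
print(f"Model:     AUC = {auc(toy_model, toy_labels):.3f}")
```

The pairwise form is quadratic in sample size and is used here only because it makes the definition explicit; a rank-based routine from a statistics library would be the usual choice for cohorts of this size.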
Results The NHANES-trained XGBoost model had an AUC of 0.995 on internal NHANES validation and an AUC of 0.946 on external DPPOS validation. At Youden’s index, the model had a sensitivity of 0.91 and a specificity of 0.99 on internal validation, and a sensitivity of 0.85 and a specificity of 1.00 on external validation. In external validation, the model outperformed benchmark predictors, including logistic regression using A1c and FPG (AUC = 0.935), FPG alone (AUC = 0.881), and A1c alone (AUC = 0.858). The model’s sensitivity and specificity at Youden’s index were also superior, outperforming the logistic regression, which had a sensitivity of 0.91 and a specificity of 0.94. The model also outperformed A1c, which at Youden’s index (A1c = 6.25%) had a sensitivity of 0.85 and a specificity of 0.78, and FPG, which at Youden’s index (FPG = 116.5 mg/dL) had a sensitivity of 0.98 and a specificity of 0.81.
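For readers unfamiliar with Youden's index, the operating points above are the cutoffs that maximize J = sensitivity + specificity - 1. The short sketch below, which is an illustration and not part of the study's analysis, computes J from the sensitivity and specificity pairs reported in the Results. The names `youden_j`, `reported`, and `j_values` are introduced only for this example.

```python
def youden_j(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# (sensitivity, specificity) pairs as reported in the Results, to two decimals.
reported = {
    "model (NHANES internal)": (0.91, 0.99),
    "model (DPPOS external)": (0.85, 1.00),
    "logistic regression (A1c + FPG)": (0.91, 0.94),
    "A1c alone (6.25%)": (0.85, 0.78),
    "FPG alone (116.5 mg/dL)": (0.98, 0.81),
}

# Round to two decimals to match the precision of the reported inputs.
j_values = {name: round(youden_j(se, sp), 2) for name, (se, sp) in reported.items()}

for name, j in j_values.items():
    print(f"{name}: J = {j:.2f}")
```

Because the inputs are rounded to two decimals, differences in J smaller than about 0.01 cannot be resolved from these figures alone.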
Conclusion This model offers a promising approach for identifying individuals for whom A1c and FPG may not adequately diagnose diabetes and who may benefit from an OGTT. Compared with standard laboratory predictors of diabetes, the model achieved a higher AUC and stronger sensitivity and specificity at its Youden’s index cutoff. Future work will focus on real-world evaluation to assess clinical utility and implementation feasibility. These findings suggest that this machine learning model could eventually serve as a screening tool for diabetes risk in undiagnosed populations.