Discrimination.
Discrimination is the model’s ability to distinguish between participants who do and do not experience the event of interest (e.g., a disease outcome such as hypertension). A good prediction model accurately discriminates between those with and without the outcome [5]. The C-statistic, which for binary outcomes equals the area under the receiver operating characteristic (ROC) curve, is commonly used to assess discrimination. The ROC curve plots sensitivity against (1 – specificity) for consecutive cutoffs of the predicted probability of the outcome. The C-statistic is the probability that a randomly selected subject who experienced the outcome has a higher predicted probability of the outcome than a randomly selected subject who did not. The C-statistic typically ranges from 0.5 to 1, with higher values indicating better discrimination; values below 0.5 are possible and indicate predictions that are worse than chance. A C-statistic of 0.5 indicates that the model predicts the outcome no better than random chance, while a C-statistic of 1 indicates that the model perfectly distinguishes those who will experience the outcome from those who will not. In practice, the C-statistic of a prediction model usually falls between 0.6 and 0.85. A C-statistic of 0.70 to 0.80 is considered adequate, and 0.80 to 0.90 is considered excellent [6].
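The pairwise interpretation of the C-statistic can be illustrated with a minimal Python sketch. The function name `c_statistic` and the toy data are hypothetical and chosen only for illustration. Counting tied predictions as half-concordant is a common convention assumed here, not something specified in the text.

```python
def c_statistic(outcomes, predicted):
    """Illustrative C-statistic for a binary outcome (hypothetical helper).

    outcomes  : list of 0/1 values (1 = experienced the outcome)
    predicted : list of predicted probabilities, same length
    """
    cases = [p for y, p in zip(outcomes, predicted) if y == 1]
    controls = [p for y, p in zip(outcomes, predicted) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one subject with and one without the outcome")

    concordant = 0.0
    # Compare every subject with the outcome against every subject without it.
    for p_case in cases:
        for p_control in controls:
            if p_case > p_control:
                concordant += 1.0   # pair ranked correctly
            elif p_case == p_control:
                concordant += 0.5   # tie: counted as half (assumed convention)
    return concordant / (len(cases) * len(controls))


# Toy data: two subjects with the outcome, two without.
toy_outcomes = [1, 1, 0, 0]
toy_predicted = [0.9, 0.4, 0.5, 0.1]
toy_c = c_statistic(toy_outcomes, toy_predicted)
```

In the toy data, three of the four case-control pairs are ranked correctly, so the sketch returns 0.75. For realistic sample sizes a rank-based routine from a statistics library would normally be preferred over this quadratic loop.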
For survival data, an extension of the C-statistic called Harrell’s C-statistic is suggested. It is the proportion of all randomly selected pairs of subjects that can be ordered in which the subject who survived longer also has the longer predicted survival time. Although the C-statistic is insensitive to outcome incidence, one disadvantage is that its interpretation rests on an artificial situation in which we are presented with a pair of patients, one with and one without the outcome.
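Harrell’s C-statistic can be sketched in the same pairwise style. The function name `harrell_c` and the toy data below are hypothetical. The sketch assumes that a pair "can be ordered" when the subject with the shorter observed time actually had the event (was not censored), that pairs with equal observed times are skipped, and that tied predictions count as half-concordant. These are simplifying assumptions, and other tie-handling rules exist.

```python
def harrell_c(times, events, predicted_times):
    """Illustrative Harrell's C for survival data (hypothetical helper).

    times           : observed follow-up times
    events          : 1 if the event was observed, 0 if censored
    predicted_times : predicted survival times (longer = better prognosis)
    """
    n = len(times)
    concordant = 0.0
    usable = 0
    for i in range(n):
        for j in range(i + 1, n):
            if times[i] == times[j]:
                continue  # equal observed times: skipped in this sketch
            shorter, longer = (i, j) if times[i] < times[j] else (j, i)
            if not events[shorter]:
                continue  # shorter time is censored: pair cannot be ordered
            usable += 1
            if predicted_times[longer] > predicted_times[shorter]:
                concordant += 1.0   # longer survivor has longer prediction
            elif predicted_times[longer] == predicted_times[shorter]:
                concordant += 0.5   # tied prediction (assumed convention)
    if usable == 0:
        raise ValueError("no pairs can be ordered")
    return concordant / usable


# Toy data: the third subject is censored at time 6.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_predictions = [1, 5, 3, 9]
toy_harrell = harrell_c(toy_times, toy_events, toy_predictions)
```

With these toy values, five pairs can be ordered and four of them are concordant, giving 0.8. The pair formed by the third and fourth subjects is dropped because the shorter time is censored.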