6. Statistical methods
Continuous variables were presented as mean ± standard deviation (SD) when normally distributed and as median with interquartile range, P50 (P25, P75), otherwise. Categorical variables were expressed as frequency (%). Continuous variables were compared between groups using the unpaired Student’s t test or the Mann-Whitney U test, and categorical variables were compared using the Pearson chi-square test or Fisher’s exact test. The intraclass correlation coefficient (ICC) was used to assess the agreement of measurements between the two observers.
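As an illustration of the two reporting formats for continuous variables, a minimal Python sketch is given here. It is not the code used in the study, which was run in R. The function name `summarize_continuous` and the flag `normally_distributed` are hypothetical; the flag is supplied by the caller because the normality test is not specified in this section. The quartiles use the inclusive interpolation method of the Python standard library, which may differ slightly from other software defaults.

```python
import statistics


def summarize_continuous(values, normally_distributed):
    """Format a continuous variable as 'mean ± SD' or 'P50 (P25, P75)'.

    `normally_distributed` is decided by the caller (assumption: the
    normality check is performed elsewhere).
    """
    if normally_distributed:
        mean = statistics.mean(values)
        sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
        return f"{mean:.1f} ± {sd:.1f}"
    # Quartile cut points P25, P50, P75 (inclusive interpolation).
    p25, p50, p75 = statistics.quantiles(values, n=4, method="inclusive")
    return f"{p50:.1f} ({p25:.1f}, {p75:.1f})"
```

For example, `summarize_continuous(measurements, False)` would return the median followed by the 25th and 75th percentiles in parentheses, where `measurements` is any list of numeric values.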
A split-sample function automatically divided the original dataset into a modeling group and a verification group at a ratio of 75:25. Multivariable logistic regression was used to establish the prediction model, and the final model was selected by minimizing Akaike’s information criterion (AIC). The criterion for variable entry was P < 0.3. Independent variables were screened for collinearity using the variance inflation factor (VIF), and variables with VIF > 10 were excluded. Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated. A nomogram of the prediction model was plotted to display visually the predicted risk of post-CPVA recurrence for each PAF patient. In addition, a calibration curve was plotted to assess the prediction accuracy of the nomogram. The receiver operating characteristic (ROC) curve of the prediction model was plotted, and the area under the curve (AUC) and its 95% CI were obtained. The AUCs of the modeling group and the verification group were compared using the z statistic according to the Hanley and McNeil procedure[14]. All statistical analyses were performed using R version 3.4.3 (http://www.R-project.org). All statistical tests were two-sided, and P < 0.05 was considered statistically significant.
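The comparison of two AUCs by a z statistic can be sketched as follows. This is a minimal Python illustration, not the study code, and it rests on two assumptions: the standard error of each AUC is taken from the Hanley and McNeil (1982) approximation, and the modeling and verification groups are treated as independent samples, so no correlation term is included. The function names and argument names are hypothetical; `n_pos` and `n_neg` denote the numbers of patients with and without recurrence in a group.

```python
import math


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley and McNeil, 1982)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def compare_independent_aucs(auc1, n_pos1, n_neg1, auc2, n_pos2, n_neg2):
    """Return (z, two-sided P) for the difference between two AUCs.

    Assumes the two AUCs come from independent groups of patients.
    """
    se1 = hanley_mcneil_se(auc1, n_pos1, n_neg1)
    se2 = hanley_mcneil_se(auc2, n_pos2, n_neg2)
    z = (auc1 - auc2) / math.sqrt(se1 ** 2 + se2 ** 2)
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

A call such as `compare_independent_aucs(auc_model, n_pos_model, n_neg_model, auc_verif, n_pos_verif, n_neg_verif)` returns the z statistic and its two-sided P value; a P value of 0.05 or more would indicate no statistically significant difference in discrimination between the two groups under the threshold stated above.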