
Table 3 Performance of the scar prediction models trained using 5 different splits of the development dataset

From: Radiomics and deep learning for myocardial scar screening in hypertrophic cardiomyopathy

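| Model | Metric | CV split 1 | CV split 2 | CV split 3 | CV split 4 | CV split 5 | Mean ± SD |
|---|---|---|---|---|---|---|---|
| Radiomics | Sensitivity | 0.92 | 0.91 | 0.92 | 0.91 | 0.91 | 0.91 ± 0.01 |
| Radiomics | Specificity | 0.42 | 0.36 | 0.37 | 0.37 | 0.29 | 0.36 ± 0.05 |
| Radiomics | Recall | 0.85 | 0.81 | 0.84 | 0.82 | 0.77 | 0.82 ± 0.03 |
| Radiomics | Precision | 0.58 | 0.56 | 0.57 | 0.56 | 0.53 | 0.56 ± 0.02 |
| Radiomics | Accuracy | 0.65 | 0.62 | 0.63 | 0.62 | 0.58 | 0.62 ± 0.03 |
| Radiomics | AUC | 0.77 | 0.76 | 0.77 | 0.72 | 0.71 | 0.75 ± 0.03 |
| Deep learning | Sensitivity | 0.92 | 0.92 | 0.91 | 0.92 | 0.91 | 0.92 ± 0.01 |
| Deep learning | Specificity | 0.45 | 0.42 | 0.42 | 0.42 | 0.30 | 0.40 ± 0.06 |
| Deep learning | Recall | 0.86 | 0.85 | 0.83 | 0.85 | 0.78 | 0.83 ± 0.03 |
| Deep learning | Precision | 0.60 | 0.58 | 0.58 | 0.58 | 0.54 | 0.58 ± 0.02 |
| Deep learning | Accuracy | 0.67 | 0.65 | 0.65 | 0.65 | 0.58 | 0.64 ± 0.03 |
| Deep learning | AUC | 0.76 | 0.76 | 0.75 | 0.78 | 0.77 | 0.76 ± 0.01 |
| Deep learning–radiomics | Sensitivity | 0.91 | 0.91 | 0.91 | 0.91 | 0.91 | 0.91 ± 0.00 |
| Deep learning–radiomics | Specificity | 0.43 | 0.45 | 0.43 | 0.49 | 0.32 | 0.42 ± 0.06 |
| Deep learning–radiomics | Recall | 0.84 | 0.84 | 0.84 | 0.85 | 0.79 | 0.83 ± 0.02 |
| Deep learning–radiomics | Precision | 0.59 | 0.60 | 0.59 | 0.61 | 0.54 | 0.59 ± 0.03 |
| Deep learning–radiomics | Accuracy | 0.65 | 0.67 | 0.65 | 0.69 | 0.60 | 0.65 ± 0.03 |
| Deep learning–radiomics | AUC | 0.79 | 0.82*† | 0.83*† | 0.82* | 0.79* | 0.81 ± 0.02 |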

  1. Performance was evaluated using the internal testing dataset.
  2. *Statistically significant difference vs the radiomics model.
  3. †Statistically significant difference vs the deep learning (DL) model.
  4. All metrics were computed at an operating point corresponding to a sensitivity of at least 90%.
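
The footnote on the operating point can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `metrics_at_sensitivity`, the threshold search, and the toy scores and labels are all assumptions made for illustration. The idea is to lower the decision threshold until sensitivity reaches the target (here 90%), then report the remaining metrics at that threshold.

```python
def metrics_at_sensitivity(scores, labels, min_sensitivity=0.90):
    """Hypothetical helper: find the highest score threshold whose
    sensitivity is >= min_sensitivity and report metrics at that point.

    scores: predicted scar probabilities (higher means more likely scar)
    labels: ground truth, 1 = scar present, 0 = no scar
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")

    # Walk candidate thresholds from strictest to most lenient. Sensitivity
    # can only grow as the threshold drops, so the first threshold that meets
    # the target is the one with the highest specificity.
    for threshold in sorted(set(scores), reverse=True):
        predicted = [s >= threshold for s in scores]
        tp = sum(p and y == 1 for p, y in zip(predicted, labels))
        fp = sum(p and y == 0 for p, y in zip(predicted, labels))
        sensitivity = tp / positives
        if sensitivity >= min_sensitivity:
            tn = negatives - fp
            return {
                "threshold": threshold,
                "sensitivity": sensitivity,
                "specificity": tn / negatives,
                "precision": tp / (tp + fp),
                "accuracy": (tp + tn) / len(labels),
            }
    # Unreachable: the lowest score as threshold labels every case positive.
    raise RuntimeError("no threshold reached the requested sensitivity")


# Toy data, invented for illustration only (not from the study).
toy_scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
toy_labels = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]
result = metrics_at_sensitivity(toy_scores, toy_labels, min_sensitivity=0.90)
print(result)
```

Fixing sensitivity first explains why the sensitivity rows in the table are nearly constant across splits and models, while specificity, precision, and accuracy vary: those metrics are whatever the model achieves once the sensitivity target is met.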