Input feature | Type | Missing instances (Hong Kong Cohort only) | Handling technique for missing data |
---|---|---|---|
Age | Continuous | 0 | NA |
Sex | Boolean | 0 | NA |
Tobacco smoking | Boolean | 2 | Binarization of variables during feature engineering |
Alcohol drinking | Categorical (nominal) | 33 | Binarization of variables during feature engineering |
Risk habit indulgence following diagnosis | Categorical (nominal) | 0 | NA |
Previous malignancy | Categorical (nominal) | 0 | NA |
Charlson Comorbidity Index (CCI) | Continuous | 0 | NA |
Hypertension status | Boolean | 0 | NA |
Diabetes Mellitus status | Boolean | 0 | NA |
Hyperlipidemia status | Boolean | 0 | NA |
Autoimmune disease status | Boolean | 0 | NA |
Viral hepatitis status | Boolean | 0 | NA |
Type of lesion | Boolean | 0 | NA |
Clinical subtype of lichenoid lesion | Categorical (nominal) | 0 | NA |
Tongue/FOM involved | Boolean | 0 | NA |
Labial/buccal mucosa involved | Boolean | 0 | NA |
Retromolar area involved | Boolean | 0 | NA |
Gingiva involved | Boolean | 0 | NA |
Palate involved | Boolean | 0 | NA |
Number of lesions | Categorical (ordinal) | 0 | NA |
Presence of ulcers or erosions | Boolean | 0 | NA |
Presence of induration | Boolean | 0 | NA |
Treatment at diagnosis | Categorical (nominal) | 0 | NA |
Recurrence after surgical excision | Boolean | 0 | NA |
Number of recurrences | Categorical (ordinal) | 0 | NA |
Oral epithelial dysplasia at diagnosis | Categorical (nominal) | 0 | NA |
Oral epithelial dysplasia detected during follow-up | Categorical (nominal) | 0 | NA |
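
The table names "binarization of variables during feature engineering" as the handling technique for the two risk-habit features with missing entries but does not say how the binarization was done. The sketch below shows one possible reading, under the assumption that a missing record is grouped with the negative class so that no row needs to be dropped. The function name `binarize_habit` and the category labels (`"current"`, `"former"`, `"never"`) are hypothetical and do not come from the source.

```python
from typing import Optional

# Hypothetical labels; the source does not list the actual category levels.
EXPOSED_LABELS = {"current", "former"}


def binarize_habit(value: Optional[str]) -> int:
    """Map a nominal risk-habit record to 1 (exposure documented) or 0.

    Assumption: a missing record (None or an empty string) falls into the
    0 class, so the row is kept without a separate imputation step.
    """
    if value is None or value.strip() == "":
        return 0
    return 1 if value.strip().lower() in EXPOSED_LABELS else 0


# Example records, including two missing entries (None and "").
records = ["Current", "never", None, "former", ""]
encoded = [binarize_habit(r) for r in records]
print(encoded)  # [1, 0, 0, 1, 0]
```

Grouping missing entries with the negative class is only one option. If missingness itself might carry information, a separate indicator column would be an alternative design.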