From: The non-linear nature of the cost of comprehensibility
Metafeature name | Description |
---|---|
AttrConc (mean) | Concentration coefficient of each pair of distinct attributes |
AttrEnt (mean) | Shannon’s entropy for each predictive attribute |
AttrToInst | The ratio between the number of attributes and instances |
C1 | The entropy of class proportions |
C2 | The imbalance ratio |
CanCor (mean) | Canonical correlations of data |
CatToNum | The ratio between the number of categorical and numeric features |
ClassConc (mean) | Concentration coefficient between each attribute and class |
ClassEnt | Target attribute Shannon’s entropy |
ClsCoef | Clustering coefficient |
Cor (mean) | The absolute value of the correlation of distinct dataset column pairs |
Cov (mean) | The absolute value of the covariance of distinct dataset attribute pairs |
Density | Average density of the network |
Eigenvalues (mean) | Eigenvalues of covariance matrix from dataset |
EqNumAttr | Number of attributes equivalent for a predictive task |
F1 (mean) | Maximum Fisher’s discriminant ratio |
F1v (mean) | Directional-vector maximum Fisher’s discriminant ratio |
F2 (mean) | Volume of the overlapping region |
F3 (mean) | Feature maximum individual efficiency |
F4 (mean) | Collective feature efficiency |
FreqClass (mean) | Relative frequency of each distinct class |
Gmean (mean) | Geometric mean of each attribute |
Gravity | Distance between the centers of mass of the minority and majority classes |
Hmean (mean) | Harmonic mean of each attribute |
Hubs (mean) | Hub score |
InstToAttr | Ratio between the number of instances and attributes |
IqRange (mean) | Interquartile range (IQR) of each attribute |
JointEnt (mean) | Joint entropy between each attribute and class |
Kurtosis (mean) | Kurtosis of each attribute |
L1 (mean) | Sum of error distance by linear programming |
L2 (mean) | OVO subsets error rate of linear classifier |
L3 (mean) | Non-linearity of a linear classifier |
LhTrace | Lawley-Hotelling trace |
Lsc | Local set average cardinality |
Mad (mean) | Median Absolute Deviation (MAD) adjusted by a factor |
Max (mean) | Maximum value from each attribute |
Mean (mean) | Mean value of each attribute |
Median (mean) | Median value from each attribute |
Min (mean) | Minimum value from each attribute |
MutInf (mean) | Mutual information between each attribute and target |
N1 | Fraction of borderline points |
N2 (mean) | Ratio of intra and extra class nearest neighbor distance |
N3 (mean) | Error rate of the nearest neighbor classifier |
N4 (mean) | Non-linearity of the k-NN classifier |
NrAttr | Total number of attributes |
NrBin | Number of binary attributes |
NrCat | Number of categorical attributes |
NrClass | Number of distinct classes |
NrCorAttr | Number of distinct highly correlated pairs of attributes |
NrDisc | Number of canonical correlations between each attribute and class |
NrInst | Number of instances (rows) in the dataset |
NrNorm | Number of attributes normally distributed based on a given method |
NrNum | Number of numeric features |
NrOutliers | Number of attributes with at least one outlier value |
NsRatio | Noisiness of attributes |
NumToCat | The ratio between the number of numeric and categorical features |
Ptrace | Pillai’s trace |
Range (mean) | Range (max - min) of each attribute |
RoyRoot | Roy’s largest root |
Sd (mean) | Standard deviation of each attribute |
SdRatio | Statistical test for homogeneity of covariances |
Skewness (mean) | Skewness for each attribute |
Sparsity (mean) | (Possibly normalized) sparsity metric for each attribute |
T1 (mean) | Fraction of hyperspheres covering data |
T2 | Average number of features per dimension |
T3 | Average number of PCA dimensions per points |
T4 | Ratio of the PCA dimension to the original dimension |
TMean (mean) | Trimmed mean of each attribute |
Var (mean) | Variance of each attribute |
WLambda | Wilks’ Lambda value |
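
A few of the simpler entries in the table can be computed directly from a dataset, which may help make the definitions concrete. The following is a minimal sketch using only the Python standard library; it is not the implementation used in the paper. The function name `simple_metafeatures` and its arguments are hypothetical, and the choice of base-2 logarithm for the entropy is an assumption.

```python
import math
from collections import Counter


def simple_metafeatures(rows, labels):
    """Compute a handful of the simple metafeatures listed in the table.

    rows   -- list of equal-length feature vectors, one per instance
    labels -- list of class labels, one per instance
    (Hypothetical helper for illustration only.)
    """
    if not rows or not rows[0] or len(rows) != len(labels):
        raise ValueError("need a non-empty dataset with one label per row")

    n_inst = len(rows)       # NrInst: number of instances (rows)
    n_attr = len(rows[0])    # NrAttr: total number of attributes
    counts = Counter(labels)

    # FreqClass: relative frequency of each distinct class.
    freq_class = {c: k / n_inst for c, k in counts.items()}

    # ClassEnt: Shannon entropy of the target attribute.
    # Base 2 (bits) is assumed here; other bases only rescale the value.
    class_ent = -sum(p * math.log2(p) for p in freq_class.values())

    return {
        "NrInst": n_inst,
        "NrAttr": n_attr,
        "NrClass": len(counts),
        "AttrToInst": n_attr / n_inst,
        "InstToAttr": n_inst / n_attr,
        # "(mean)" entries summarize a per-item value by its mean.
        "FreqClass (mean)": sum(freq_class.values()) / len(freq_class),
        "ClassEnt": class_ent,
    }
```

For a balanced two-class dataset with four instances and two attributes, this sketch gives `AttrToInst` = 0.5, `InstToAttr` = 2.0 and `ClassEnt` = 1.0 bit.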