From: Optimizing classification efficiency with machine learning techniques for pattern matching
Algorithm | Time Complexity | Advantages | Disadvantages |
---|---|---|---|
KNN | O(n * d), where n: number of training instances, d: number of dimensions | 1. No training phase (lazy learner). 2. Simple to implement. | 1. Does not perform well on huge datasets. 2. Degrades in high dimensions. 3. Sensitive to missing and noisy data. 4. Requires feature scaling. |
SVM | O(s * d), where s: number of support vectors, d: data dimensionality | 1. Performs effectively in high dimensions. 2. Works best when classes are well separated. 3. Relatively robust to outliers. 4. Well suited to binary classification, even in extreme cases. | 1. Slow on larger datasets. 2. Performs poorly when classes overlap. 3. Choosing proper hyperparameters is critical. 4. Choosing the right kernel function can be difficult. |
Decision Tree | O(k), where k: depth of tree | 1. No data normalization or scaling required. 2. Handles missing values. 3. Performs automatic feature selection. | 1. Susceptible to overfitting. 2. Sensitive to the data: small changes in the data can alter the tree dramatically. 3. Training decision trees takes more time. |
Random Forest | O(k * m), where k: depth of tree, m: number of decision trees | 1. Reduces error by averaging many trees. 2. Excellent performance on unbalanced datasets. 3. Handles massive amounts of data. 4. Handles missing data effectively. 5. Outliers have little influence. | 1. Features must have some predictive power, or the forest will not work. 2. The individual tree predictions must be largely uncorrelated. |
Naive Bayes | O(n * d), where n: number of instances, d: number of dimensions | 1. Scales well to large datasets. 2. Insensitive to irrelevant features. 3. Effective for multi-class prediction. 4. Performs well in high dimensions. | 1. The assumption of feature independence rarely holds in practice. 2. Training data must accurately represent the population. |
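The KNN row above notes two of its defining traits: no training phase and O(n * d) prediction cost, since every query scans all n training instances across d dimensions. A minimal pure-Python sketch illustrating both properties (the function name, toy dataset, and choice of Euclidean distance are illustrative, not from the source):

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest
    training points. There is no training step: each prediction scans all
    n training instances across d dimensions, hence O(n * d) per query."""
    dists = []
    for x, label in zip(train_X, train_y):
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, query)))
        dists.append((d, label))
    dists.sort(key=lambda t: t[0])               # nearest first
    votes = Counter(lbl for _, lbl in dists[:k])  # vote among k nearest
    return votes.most_common(1)[0][0]

# Toy dataset: two well-separated clusters in 2-D
X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.8), (4.9, 5.1)]
y = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(X, y, (1.1, 0.9)))  # query near the first cluster -> "a"
```

The same scan also exposes the table's listed weaknesses: without feature scaling, one large-magnitude dimension dominates the distance, and noisy neighbors directly sway the vote.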