| Parameter | Value |
| --- | --- |
| Input dimensions | 300 × 1 |
| LSTM memory dimensions | 150 × 1 |
| Epochs | 15 |
| Mini-batch size | 25 |
| Learning rate | 1 × \(10^{-2}\) |
| Weight decay (regularization) | 2.25 × \(10^{-3}\) |
| Dropout | 0.2 |
| Loss function | Kullback–Leibler divergence |
| Optimizer | Adagrad |
| Learning rate scheduler | Stepwise decay |
| Scheduler step size | Every 2 epochs |
| Scheduler decay factor | 0.25 |
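A minimal PyTorch sketch of how this configuration could be wired together, assuming the loss, optimizer, and scheduler names in the table map to `nn.KLDivLoss`, `torch.optim.Adagrad`, and `StepLR`; the model architecture, number of output classes, and sequence length are hypothetical placeholders, while the numeric hyperparameters come from the table:

```python
import torch
import torch.nn as nn

# Hyperparameters from the table above
INPUT_DIM = 300       # input dimensions: 300 x 1
HIDDEN_DIM = 150      # LSTM memory dimensions: 150 x 1
EPOCHS = 15
BATCH_SIZE = 25
LR = 1e-2
WEIGHT_DECAY = 2.25e-3
DROPOUT = 0.2
STEP_SIZE = 2         # scheduler steps once every 2 epochs
GAMMA = 0.25          # stepwise decay factor

class LSTMModel(nn.Module):
    """Hypothetical single-layer LSTM classifier; the exact
    architecture is not specified by the table."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(INPUT_DIM, HIDDEN_DIM, batch_first=True)
        self.dropout = nn.Dropout(DROPOUT)
        self.fc = nn.Linear(HIDDEN_DIM, num_classes)

    def forward(self, x):
        _, (h_n, _) = self.lstm(x)  # final hidden state, (1, batch, hidden)
        # KLDivLoss expects log-probabilities as input
        return torch.log_softmax(self.fc(self.dropout(h_n[-1])), dim=-1)

model = LSTMModel()
criterion = nn.KLDivLoss(reduction="batchmean")
optimizer = torch.optim.Adagrad(model.parameters(), lr=LR,
                                weight_decay=WEIGHT_DECAY)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=STEP_SIZE,
                                            gamma=GAMMA)

# One illustrative training step on random data (shapes only)
x = torch.randn(BATCH_SIZE, 10, INPUT_DIM)  # (batch, seq_len, features)
target = torch.softmax(torch.randn(BATCH_SIZE, 2), dim=-1)  # target distribution
loss = criterion(model(x), target)
loss.backward()
optimizer.step()
scheduler.step()  # called once per epoch in a full loop
```

With this scheduler the learning rate is multiplied by 0.25 every 2 epochs, so it falls from 1e-2 to 2.5e-3 at epoch 2, 6.25e-4 at epoch 4, and so on.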