Table 8 Selected parameter settings for machine learning models

From: Time-aware domain-based social influence prediction

| Model | Parameter | Description | Value |
| --- | --- | --- | --- |
| Generalised linear model (GLM) | Family | Uses binomial for classification | Gaussian |
| | Solver | Used for optimisation | IRLSM |
| | Standardisation | Standardises numerical columns | Checked |
| | Maximum number of threads | Controls parallelism level of model building | 1 |
| Naive Bayes (NB) | Laplace correction | Prevents the occurrence of zero probabilities | True |
| Logistic regression (LR) | Solver | Used for optimisation | IRLSM |
| | Compute p-values | Requests computation of p-values | True |
| | Remove collinear columns | Removes some dependent columns | True |
| | Add intercept | Includes a constant term in the model | True |
| Deep learning (DL) | No. of epochs | Number of passes over the dataset | 50 |
| | Adaptive rate (ADADELTA) | Combines the benefits of momentum training and learning rate annealing | True |
| | Mean learning rate | A non-negative scalar indicating step size | 0.003772 |
| | Activation function | Function used by neurons in the hidden layers | Rectifier |
| | No. of hidden layers | Number of hidden layers in the model | 50 |
| | No. of neurons per layer | Size of each hidden layer | 50 |
| | L1 | Regularization (absolute value of the weights) | 1.0E-5 |
| | L2 | Regularization (sum of the squared weights) | 0.0 |
| | Loss function | Loss (error) function | Quadratic |
| Random forest tree (RFT) | No. of trees | Number of randomly generated trees | 100 |
| | Criterion | Criterion used to select the attribute to split on | gain_ratio |
| | Max_depth | Maximum depth of each tree | 10 |
| Gradient boosted tree (GBT) | No. of trees | Number of generated trees | 20 |
| | Maximum number of threads | Controls parallelism level of model building | 1 |
| | Max_depth | Maximum depth of each tree | 10 |
| Decision tree (DT) | Criterion | Criterion used to select the attribute to split on | gain_ratio |
| | Max_depth | Maximum depth of the tree | 20 |
| | Confidence | Confidence level used for the pessimistic error calculation of pruning | 0.1 |
| | Minimal gain | Minimum gain, calculated before splitting a node, required for the split to be made | 0.05 |
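
For readers who want to reuse these settings programmatically, the values in Table 8 can be transcribed into a plain configuration mapping. The sketch below is only a convenience. The key names (`max_threads`, `n_trees`, and so on) and the helper `get_setting` are hypothetical and do not correspond to the API of any particular toolkit, and the value "Checked" is assumed to mean a boolean `True`.

```python
# Settings transcribed from Table 8.
# Key names are illustrative assumptions, not the API of any specific library.
TABLE_8_SETTINGS = {
    "GLM": {
        "family": "Gaussian",
        "solver": "IRLSM",
        "standardisation": True,  # "Checked" in the table, assumed to mean True
        "max_threads": 1,
    },
    "NB": {"laplace_correction": True},
    "LR": {
        "solver": "IRLSM",
        "compute_p_values": True,
        "remove_collinear_columns": True,
        "add_intercept": True,
    },
    "DL": {
        "epochs": 50,
        "adaptive_rate": True,  # ADADELTA
        "mean_learning_rate": 0.003772,
        "activation": "Rectifier",
        "hidden_layers": 50,
        "neurons_per_layer": 50,
        "l1": 1.0e-5,
        "l2": 0.0,
        "loss": "Quadratic",
    },
    "RFT": {"n_trees": 100, "criterion": "gain_ratio", "max_depth": 10},
    "GBT": {"n_trees": 20, "max_threads": 1, "max_depth": 10},
    "DT": {
        "criterion": "gain_ratio",
        "max_depth": 20,
        "confidence": 0.1,
        "minimal_gain": 0.05,
    },
}


def get_setting(model: str, parameter: str):
    """Look up one Table 8 value; raises KeyError for unknown names."""
    return TABLE_8_SETTINGS[model][parameter]
```

A call such as `get_setting("DT", "minimal_gain")` returns the corresponding table value, which keeps experiment scripts in step with the reported configuration.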