Evaluation of maxout activations in deep learning across several big data domains

Journal of Big Data

Table 7 Best activation and its results per dataset

Dataset	Best activation	Average accuracy (%)	Average epochs	Average time per 100 batches (s)	Average training time (s)
CIFAR-10	ReLU6x	79.91	64	0.26	13.33
CIFAR-100	ReLU6x	50.44	60	0.33	18.76
Fashion MNIST	ReLU6x	92.48	89	0.22	17.37
MNIST	ReLU6x	99.46	35	0.22	7.89
LFW	ReLU6x	79.67	51	26.75	1418.12
MS-Celeb	SeLU	97.50	97	39.09	4202.31
All image and face datasets combined	ReLU6x	84.40	60	5.5	161.41
Amazon1M	Maxout 3-2	88.17	35	73.27	2124.86
Amazon4M	Maxout 6-1	93.73	26	57.32	1490.36
Sentiment140	ReLU2x	84.57	60	5.09	259.79
Yelp500K	ReLU2x	93.17	60	18.19	873.33
Yelp1M	ReLU	93.60	60	8.65	519.41
All text datasets combined	ReLU2x	90.41	40	15.57	594.15
Medicare Part B	SeLU	71.0	29	0.12	7.98
Medicare Part D	Maxout 2-1, Maxout 6-1	71.5	180	0.12	22.84
DMEPOS	SeLU	68.5	51	0.12	5.54
Combined CMS dataset	SeLU	74.0	160	0.12	21.05
All Medicare datasets combined	SeLU	69.7	107	0.12	13.01
Google Speech Commands	Maxout 3-2	91.93	45	50.17	2257.65
IRMAS	ReLU2x	67.59	180	10.14	1825.20
IDMT-SMT-Audio-Effects	SeLU	95.51	87	5.94	531.64
All sound datasets combined	Maxout 2-1	83.19	79	11.59	983.18
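
As a reading aid, the per-dataset rows of Table 7 can be tallied by activation family to see how often each family comes out on top. The sketch below is illustrative only. The grouping into three families (ReLU variants, SeLU, Maxout variants) and the names `BEST_ACTIVATION` and `family` are assumptions introduced here, not part of the paper. The "All ... combined" summary rows are left out so that each dataset is counted once, and the Medicare Part D tie is recorded under its first listed activation, since both tied activations are Maxout variants.

```python
from collections import Counter

# Best activation per dataset, copied from Table 7.
# Summary rows ("All ... combined") are intentionally excluded.
BEST_ACTIVATION = {
    "CIFAR-10": "ReLU6x",
    "CIFAR-100": "ReLU6x",
    "Fashion MNIST": "ReLU6x",
    "MNIST": "ReLU6x",
    "LFW": "ReLU6x",
    "MS-Celeb": "SeLU",
    "Amazon1M": "Maxout 3-2",
    "Amazon4M": "Maxout 6-1",
    "Sentiment140": "ReLU2x",
    "Yelp500K": "ReLU2x",
    "Yelp1M": "ReLU",
    "Medicare Part B": "SeLU",
    "Medicare Part D": "Maxout 2-1",  # tied with Maxout 6-1 in Table 7
    "DMEPOS": "SeLU",
    "Combined CMS dataset": "SeLU",
    "Google Speech Commands": "Maxout 3-2",
    "IRMAS": "ReLU2x",
    "IDMT-SMT-Audio-Effects": "SeLU",
}


def family(activation: str) -> str:
    """Map an activation name to a coarse family (hypothetical grouping)."""
    if activation.startswith("Maxout"):
        return "Maxout"
    if activation.startswith("ReLU"):
        return "ReLU"
    return activation  # SeLU stays as its own family


# Count how many datasets each family wins.
family_counts = Counter(family(a) for a in BEST_ACTIVATION.values())

for name, wins in family_counts.most_common():
    print(f"{name}: best on {wins} of {len(BEST_ACTIVATION)} datasets")
```

A tally like this only counts first places. It says nothing about the size of the accuracy gap between activations or about the training-time differences reported in the last two columns.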