
Table 2 Overview of Software handling Big Data

From: Detecting Denial of Service attacks using machine learning algorithms

Tools/Software

Advantages

Disadvantages

ORANGE

Advantages

• Well suited to demand forecasting when the patterns and trends in several years of historical data (for example, five years) are known

• Supports drawing further insights from the data and testing hypothesis models against the projections Orange produces

Disadvantages

• Unreliable with very large datasets; Orange may crash on datasets that plain Python handles without trouble

• As a result, it is best suited to smaller projects, teaching, or exploratory data analysis

RAPIDMINER

Advantages

• A robust data mining application that covers everything from data mining through model deployment and model operations

• Its end-to-end data science platform includes all of the data preparation and machine learning tools

Disadvantages

• The program tends to crash frequently, especially with neural networks and other complex algorithms. Some versions have limitations

• Even the student edition has a 10,000-row output restriction, so if you analyse a 12,000-point data set, 2,000 points will be excluded at random
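The effect of such a row cap can be illustrated with a minimal sketch. This is not RapidMiner's actual mechanism; `cap_rows` and its parameters are hypothetical names, and the sketch assumes only that rows beyond the limit are dropped uniformly at random, as the table describes.

```python
import random


def cap_rows(rows, limit=10_000, seed=None):
    """Return at most `limit` rows, dropping the excess uniformly at random.

    Hypothetical helper for illustration only; the surviving rows keep
    their original order.
    """
    if len(rows) <= limit:
        return list(rows)
    rng = random.Random(seed)
    # Pick which row positions survive, then restore the original order.
    kept_positions = sorted(rng.sample(range(len(rows)), limit))
    return [rows[i] for i in kept_positions]


data = list(range(12_000))      # a 12,000-point data set
kept = cap_rows(data, seed=0)   # apply the assumed 10,000-row cap
dropped = len(data) - len(kept)
print(len(kept), dropped)       # 10000 2000
```

Because the excluded points are chosen at random, any statistic computed on the capped output is an estimate from a sample of the data rather than a value for the full data set.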

KNIME

Advantages

• Provides access to current and future advances in data science, machine learning, and artificial intelligence

• Avoids the risk of price changes and of locking your data science IP into a proprietary format. Makes data science accessible to everyone, not just Windows users

Disadvantages

• The number of rows is unlimited, but the number of columns should not grow much beyond about 10,000

APACHE SPARK

Advantages

• Advanced analytics

• Dynamic in nature

• Multilingual (supports several programming languages)

• Powerful

Disadvantages

• Fewer algorithms

• Small files issue

• Limited window criteria

• Not well suited to multi-user environments

HADOOP

Advantages

• Minimal network traffic

• High throughput

• High speed

• Fault tolerance

Disadvantages

• Problems with small files

• Vulnerable

• Security issues

• Supports only batch processing
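The small-files problem listed above can be made concrete with a rough sketch. HDFS tracks every file and every block as metadata held by the NameNode, so the same volume of data costs far more metadata when it is split into many small files. The sketch assumes a 128 MiB block size and counts one metadata entry per file plus one per block; the function names are hypothetical and the count is an illustration, not a measurement of a real cluster.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # assumed HDFS block size: 128 MiB


def block_count(file_sizes, block_size=BLOCK_SIZE):
    """Total blocks needed; each file is split into blocks independently."""
    # Integer ceiling division: a 1 MiB file still occupies one block entry.
    return sum((size + block_size - 1) // block_size for size in file_sizes)


def metadata_entries(file_sizes, block_size=BLOCK_SIZE):
    """Assumed metadata cost: one entry per file plus one entry per block."""
    return len(file_sizes) + block_count(file_sizes, block_size)


MIB = 1024 * 1024
small_files = [1 * MIB] * 10_000      # 10,000 files of 1 MiB each
merged_file = [sum(small_files)]      # the same bytes stored as one file

small_entries = metadata_entries(small_files)
merged_entries = metadata_entries(merged_file)
print(small_entries, merged_entries)  # 20000 80
```

Under these assumptions the merged layout needs 250 times fewer metadata entries for identical data, which is why small files are commonly consolidated before being stored in Hadoop.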

TENSORFLOW

Advantages

• Good for data visualization

• Scalable

• Compatible

Disadvantages

• Inconsistent

• Lower speed

• Dependency

• Frequent updates