From: Clustering large datasets using K-means modified inter and intra clustering (KM-I2C) in Hadoop
Notation | Description |
---|---|
It | Number of iterations |
ic | Initial centroid |
D | Dataset |
k | Number of clusters |
oc | Previous centroid values |
nc | New cluster centroid values |
Result | Final result |
select() | Function for selecting data based on the k value |
input() | Function for uploading the data file |
job.mapper() | Map function |
job.reducer() | Reduce function |
write() | Function for writing centroid values to a file |
read() | Function for reading centroid values from a file |
update() | Function for testing whether the centroid values have been updated |
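
To make the notation concrete, the following is a minimal single-machine sketch of how these symbols might fit together in an ordinary iterative K-means loop. It is an illustration of plain K-means under stated assumptions, not the KM-I2C algorithm or its Hadoop implementation. The names `mapper` and `reducer` stand in for job.mapper() and job.reducer(); the `seed` and `tol` parameters and the random choice inside `select` are hypothetical; input(), write() and read() are left out because the centroids are kept in memory here rather than passed through a file between iterations.

```python
import random


def select(D, k, seed=0):
    """Pick k initial centroids (ic) from dataset D. Random choice is an assumption."""
    rng = random.Random(seed)
    return rng.sample(D, k)


def mapper(point, centroids):
    """Map step: return (index of the nearest centroid, point)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, cen)) for cen in centroids]
    return dists.index(min(dists)), point


def reducer(points):
    """Reduce step: average the points assigned to one cluster."""
    count = len(points)
    return tuple(sum(coord) / count for coord in zip(*points))


def update(oc, nc, tol=1e-9):
    """Return True if any centroid coordinate moved by more than tol."""
    return any(abs(a - b) > tol for old, new in zip(oc, nc) for a, b in zip(old, new))


def kmeans(D, k, It=100, seed=0):
    """Run at most It iterations and return the final centroids (Result)."""
    oc = select(D, k, seed)  # previous centroids start as the initial centroids
    for _ in range(It):
        groups = {}
        for point in D:
            idx, p = mapper(point, oc)
            groups.setdefault(idx, []).append(p)
        # A cluster that received no points keeps its previous centroid.
        nc = [reducer(groups[i]) if i in groups else oc[i] for i in range(k)]
        changed = update(oc, nc)
        oc = nc
        if not changed:
            break
    return oc


if __name__ == "__main__":
    D = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(sorted(kmeans(D, k=2)))
```

In a Hadoop setting each iteration would typically be a separate MapReduce job, with the centroids written to and read back from a shared file between jobs; the in-memory `oc` and `nc` lists above play that role only for illustration.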