reduceByKey {SparkR} | R Documentation |
This function operates on RDDs where every element is of the form list(K, V) or c(K, V), and merges the values for each key using an associative reduce function.
reduceByKey(rdd, combineFunc, numPartitions)
## S4 method for signature 'RDD,ANY,integer'
reduceByKey(rdd, combineFunc, numPartitions)
rdd | The RDD to reduce by key. Should be an RDD where each element is list(K, V) or c(K, V). |
combineFunc | The associative reduce function to use. |
numPartitions | Number of partitions to create. |
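The requirement that combineFunc be associative matters when the values for one key are merged in pieces (for example, within separate partitions) and the partial results are merged afterwards. How SparkR schedules this internally is not stated on this page, so the following is only a base R sketch that does not use SparkR; the helper name `reduceInChunks` and the chunk sizes are hypothetical. It illustrates that an associative function gives the same answer however the values are split, while a non-associative one may not.

```r
# Hypothetical helper: reduce each chunk separately, then reduce the partial results.
reduceInChunks <- function(values, f, chunkSizes) {
  # Label each value with the index of the chunk it belongs to.
  chunkIds <- rep(seq_along(chunkSizes), times = chunkSizes)
  chunks <- split(values, chunkIds)
  partials <- lapply(chunks, function(chunk) Reduce(f, chunk))
  Reduce(f, partials)
}

values <- c(2, 4, 8, 16)

# "+" is associative: both chunkings give the same total.
sumOneChunk  <- reduceInChunks(values, `+`, c(4))
sumTwoChunks <- reduceInChunks(values, `+`, c(1, 3))

# "-" is not associative: the result depends on the chunking.
diffOneChunk  <- reduceInChunks(values, `-`, c(4))
diffTwoChunks <- reduceInChunks(values, `-`, c(1, 3))
```

Here `sumOneChunk` and `sumTwoChunks` agree, whereas `diffOneChunk` and `diffTwoChunks` differ, which is why a function such as "-" is not a safe choice for combineFunc.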
An RDD where each element is list(K, V'), where V' is the merged value for key K.
groupByKey
## Not run:
sc <- sparkR.init()
pairs <- list(list(1, 2), list(1.1, 3), list(1, 4))
rdd <- parallelize(sc, pairs)
parts <- reduceByKey(rdd, "+", 2L)
reduced <- collect(parts)
reduced[[1]] # Should be a list(1, 6)
## End(Not run)
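For readers without a Spark context, the effect of the example above can be approximated in plain R. The sketch below is not SparkR's implementation: `localReduceByKey` is a hypothetical helper that groups a list of list(K, V) pairs by key and folds each group's values with the reduce function, ignoring partitioning entirely.

```r
# Hypothetical local stand-in for reduceByKey: no Spark, no partitions.
localReduceByKey <- function(pairs, combineFunc) {
  f <- match.fun(combineFunc)  # accepts a function or a name such as "+"
  keys <- lapply(pairs, function(p) p[[1]])
  uniqueKeys <- unique(keys)   # keeps keys in order of first appearance
  lapply(uniqueKeys, function(k) {
    # Select the pairs whose key equals k, then fold their values.
    matching <- Filter(function(p) identical(p[[1]], k), pairs)
    values <- lapply(matching, function(p) p[[2]])
    list(k, Reduce(f, values))
  })
}

pairs <- list(list(1, 2), list(1.1, 3), list(1, 4))
reduced <- localReduceByKey(pairs, "+")
# Keys 1 and 1.1 are distinct, so only the values 2 and 4 are summed for key 1.
```

In this local version the result order follows the first appearance of each key, which is a property of the sketch only and should not be assumed of the RDD returned by reduceByKey.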