reduceByKey {SparkR}    R Documentation

Merge values by key

Description

This function operates on RDDs where every element is of the form list(K, V) or c(K, V), and merges the values for each key using an associative reduce function.
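
The merge-by-key behavior described above can be sketched locally in base R, without Spark. This only illustrates the semantics and is not how SparkR implements the function; the helper name local_reduce_by_key is hypothetical, and comparing keys by their character form is an assumption made for this sketch.

```r
# Local, base-R sketch of merge-by-key; it does not use Spark or SparkR.
# `local_reduce_by_key` is a hypothetical helper, not part of any package.
local_reduce_by_key <- function(pairs, combine_func) {
  combine_func <- match.fun(combine_func)  # accept "+" or a function object
  # Assumption for this sketch: keys are compared by their character form.
  keys <- vapply(pairs, function(p) as.character(p[[1]]), character(1))
  lapply(unique(keys), function(k) {
    group <- pairs[keys == k]                      # all pairs sharing key k
    values <- lapply(group, function(p) p[[2]])    # their values, in order
    list(group[[1]][[1]], Reduce(combine_func, values))
  })
}

pairs <- list(list(1, 2), list(1.1, 3), list(1, 4))
merged <- local_reduce_by_key(pairs, "+")
str(merged)
```

Here the keys 1 and 1.1 are distinct, so the values 2 and 4 are summed under key 1 and the value 3 stays alone under key 1.1.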

Usage

reduceByKey(rdd, combineFunc, numPartitions)

## S4 method for signature 'RDD,ANY,integer'
reduceByKey(rdd, combineFunc, numPartitions)

Arguments

rdd

The RDD to reduce by key. Should be an RDD where each element is list(K, V) or c(K, V).

combineFunc

The associative reduce function to use.

numPartitions

Number of partitions to create, given as an integer (for example, 2L).
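
Why combineFunc should be associative can be seen in a small local sketch. It assumes, as is typical for a distributed reduce, that values may be combined within each partition first and the partial results merged afterwards; the helper reduce_in_partitions is hypothetical and runs in plain R, not on an RDD.

```r
# Hypothetical local helper: reduce each partition, then merge the partials.
# `splits` assigns each value to a partition; this only mimics partitioning.
reduce_in_partitions <- function(values, combine_func, splits) {
  partials <- lapply(split(values, splits),
                     function(part) Reduce(combine_func, part))
  Reduce(combine_func, partials)
}

values <- c(10, 4, 3, 2)

# "+" is associative: the partition layout does not change the result.
sum_a <- reduce_in_partitions(values, `+`, c(1, 1, 2, 2))  # (10 + 4) + (3 + 2) = 19
sum_b <- reduce_in_partitions(values, `+`, c(1, 2, 2, 2))  # 10 + (4 + 3 + 2) = 19

# "-" is not associative: the result depends on the partition layout.
diff_a <- reduce_in_partitions(values, `-`, c(1, 1, 2, 2))  # (10 - 4) - (3 - 2) = 5
diff_b <- reduce_in_partitions(values, `-`, c(1, 2, 2, 2))  # 10 - ((4 - 3) - 2) = 11
```

With a non-associative function, changing the partition layout can change the merged value, so the result would not be well defined.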

Value

An RDD where each element is list(K, V'), where V' is the merged value.

See Also

groupByKey

Examples

## Not run: 
##D sc <- sparkR.init()
##D pairs <- list(list(1, 2), list(1.1, 3), list(1, 4))
##D rdd <- parallelize(sc, pairs)
##D parts <- reduceByKey(rdd, "+", 2L)
##D reduced <- collect(parts)
##D reduced[[1]] # Should be a list(1, 6)
## End(Not run)

[Package SparkR version 0.1 Index]