R: Group values by key

groupByKey {SparkR}

R Documentation

Group values by key

Description

This function operates on RDDs where every element is of the form list(K, V) or c(K, V), and groups the values for each key in the RDD into a single sequence.

Usage

groupByKey(rdd, numPartitions)

## S4 method for signature 'RDD,integer'
groupByKey(rdd, numPartitions)

Arguments

`rdd`	The RDD to group. Should be an RDD where each element is list(K, V) or c(K, V).
`numPartitions`	Number of partitions to create. Must be an integer, for example 2L.

Value

An RDD where each element is list(K, list(V)).
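
The list(K, list(V)) shape can be illustrated without a Spark context. The following is a minimal local sketch in plain R, not the SparkR implementation: `group_by_key_local` is a hypothetical helper introduced only for illustration, and it compares keys with `identical()`, which is an assumption about how key equality behaves. It keeps keys in order of first appearance, a property of this sketch only; a distributed RDD may return elements in a different order.

```r
# Local, non-distributed sketch of the grouping semantics (no Spark required).
# `group_by_key_local` is a hypothetical helper, not part of SparkR.
group_by_key_local <- function(pairs) {
  keys <- list()    # distinct keys, in order of first appearance
  groups <- list()  # groups[[i]] holds the values seen for keys[[i]]
  for (pair in pairs) {
    key <- pair[[1]]
    value <- pair[[2]]
    # Find the position of an already-seen identical key, if any.
    idx <- which(vapply(keys, function(k) identical(k, key), logical(1)))
    if (length(idx) == 0) {
      # First time this key is seen: start a new group.
      keys[[length(keys) + 1]] <- key
      groups[[length(groups) + 1]] <- list(value)
    } else {
      # Key already seen: append the value to its group.
      groups[[idx]] <- c(groups[[idx]], list(value))
    }
  }
  # Assemble one list(K, list(V)) element per distinct key.
  Map(function(k, vs) list(k, vs), keys, groups)
}

pairs <- list(list(1, 2), list(1.1, 3), list(1, 4))
grouped <- group_by_key_local(pairs)
str(grouped)
```

Here the keys 1 and 1.1 are distinct, so the result has two elements: one holding the values 2 and 4 for key 1, and one holding the value 3 for key 1.1.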

Examples

## Not run: 
##D sc <- sparkR.init()
##D pairs <- list(list(1, 2), list(1.1, 3), list(1, 4))
##D rdd <- parallelize(sc, pairs)
##D parts <- groupByKey(rdd, 2L)
##D grouped <- collect(parts)
##D grouped[[1]] # Should be a list(1, list(2, 4))
## End(Not run)