groupByKey {SparkR}    R Documentation

Group values by key

Description

This function operates on RDDs where every element is of the form list(K, V) or c(K, V), and groups the values for each key in the RDD into a single sequence.
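
The grouping semantics can be illustrated locally in base R, without a Spark context. This is only a sketch of the behaviour described above, not how SparkR implements it: group_by_key_local is a hypothetical helper name, and comparing keys with identical() is an assumption made for the illustration.

```r
# Local, base-R emulation of the grouping semantics (no Spark required).
# group_by_key_local is a hypothetical helper, not part of SparkR.
group_by_key_local <- function(pairs) {
  keys <- list()    # distinct keys, in order of first appearance
  groups <- list()  # groups[[i]] holds the values seen for keys[[i]]
  for (p in pairs) {
    k <- p[[1]]
    v <- p[[2]]
    # Look for an already-seen key that is identical to k (assumed key test)
    idx <- which(vapply(keys, function(x) identical(x, k), logical(1)))
    if (length(idx) == 0) {
      # First occurrence of this key: start a new group
      keys[[length(keys) + 1]] <- k
      groups[[length(groups) + 1]] <- list(v)
    } else {
      # Key already seen: append the value to its group
      groups[[idx]] <- c(groups[[idx]], list(v))
    }
  }
  # Pair each key with its list of values, giving list(K, list(V))
  Map(function(k, vs) list(k, vs), keys, groups)
}

pairs <- list(list(1, 2), list(1.1, 3), list(1, 4))
grouped <- group_by_key_local(pairs)
str(grouped[[1]])
```

Here the keys 1 and 1.1 are distinct, so the values 2 and 4 end up together under key 1 while 3 stays alone under key 1.1. The local sketch keeps keys and values in order of appearance; the distributed version may not preserve that ordering.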

Usage

groupByKey(rdd, numPartitions)

## S4 method for signature 'RDD,integer'
groupByKey(rdd, numPartitions)

Arguments

rdd

The RDD to group. Should be an RDD where each element is list(K, V) or c(K, V).

numPartitions

Number of partitions to create, given as an integer (for example 2L).
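
The method signature 'RDD,integer' suggests that numPartitions has to be an R integer rather than a plain numeric value, which would explain the 2L in the example. The following base-R sketch, independent of SparkR, shows the difference; the variable name n is only illustrative.

```r
# A bare numeric literal in R is a double; the L suffix makes it an integer.
is.integer(2)    # FALSE
is.integer(2L)   # TRUE

# A computed partition count can be converted explicitly before use.
n <- as.integer(4)
is.integer(n)    # TRUE
```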

Value

An RDD where each element is list(K, list(V)).

See Also

reduceByKey

Examples

## Not run: 
##D sc <- sparkR.init()
##D pairs <- list(list(1, 2), list(1.1, 3), list(1, 4))
##D rdd <- parallelize(sc, pairs)
##D parts <- groupByKey(rdd, 2L)
##D grouped <- collect(parts)
##D grouped[[1]] # Should be a list(1, list(2, 4))
## End(Not run)

[Package SparkR version 0.1 Index]