cogroup {SparkR} | R Documentation |
For each key k in several RDDs, return a resulting RDD whose values are a list of the values for that key in each of the RDDs.
cogroup(..., numPartitions)

## S4 method for signature 'RDD'
cogroup(..., numPartitions)
... | Several RDDs. |
numPartitions | Number of partitions to create. |
A new RDD in which each element pairs a key with a list holding, for each input RDD, the list of values for that key.
## Not run:
sc <- sparkR.init()
rdd1 <- parallelize(sc, list(list(1, 1), list(2, 4)))
rdd2 <- parallelize(sc, list(list(1, 2), list(1, 3)))
cogroup(rdd1, rdd2, numPartitions = 2L)
# list(list(1, list(list(1), list(2, 3))), list(2, list(list(4), list())))
## End(Not run)
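
The grouping described above can be tried without a Spark context. The following is a minimal sketch in base R that emulates the result shape on plain lists of key-value pairs. `cogroup_local` is a hypothetical helper written for illustration and is not part of SparkR; it ignores partitioning, and it assumes keys are atomic values of a single type.

```r
# Local emulation of the cogroup result shape on plain R lists.
# NOTE: cogroup_local is a hypothetical helper, not a SparkR function.
cogroup_local <- function(...) {
  inputs <- list(...)

  # Distinct keys across all inputs, in order of first appearance.
  # Assumes atomic keys of one type so that unlist() does not coerce them.
  keys <- unique(unlist(lapply(inputs, function(pairs) {
    lapply(pairs, function(p) p[[1]])
  })))

  lapply(keys, function(k) {
    # For each input, collect the values stored under key k.
    # An input without that key contributes an empty list.
    values <- lapply(inputs, function(pairs) {
      hits <- Filter(function(p) identical(p[[1]], k), pairs)
      lapply(hits, function(p) p[[2]])
    })
    list(k, values)
  })
}

# Same data as the example above, held as ordinary lists.
pairs1 <- list(list(1, 1), list(2, 4))
pairs2 <- list(list(1, 2), list(1, 3))
result <- cogroup_local(pairs1, pairs2)
str(result)
```

Key 2 appears only in the first input, so its entry carries an empty list in the second position, which mirrors the `list()` in the example output.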