zipWithUniqueId {SparkR}    R Documentation
Description

Items in the kth partition will get ids k, n+k, 2*n+k, ..., where n is the number of partitions. Gaps may therefore exist in the resulting ids, but unlike zipWithIndex, this method does not trigger a Spark job.
Usage

zipWithUniqueId(x)

## S4 method for signature 'RDD'
zipWithUniqueId(x)
Arguments

x    An RDD to be zipped.
Value

An RDD in which each item is zipped with its unique id, i.e. a list of list(item, id) pairs.
See Also

zipWithIndex
Examples

## Not run:
sc <- sparkR.init()
rdd <- parallelize(sc, list("a", "b", "c", "d", "e"), 3L)
collect(zipWithUniqueId(rdd))
# list(list("a", 0), list("b", 3), list("c", 1), list("d", 4), list("e", 2))
## End(Not run)
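
The following sketch illustrates the id numbering described above in plain base R; it is not a SparkR call, and the partitioning of the five items across three partitions is an assumption chosen to match the example output.

# Plain base R illustration of the unique-id formula (not part of the SparkR API).
# Assumed partitioning of the example's five items across n = 3 partitions:
parts <- list(c("a", "b"), c("c", "d"), c("e"))
n <- length(parts)

result <- list()
for (k in seq_along(parts) - 1) {      # k is the 0-based partition index
  items <- parts[[k + 1]]
  for (i in seq_along(items) - 1) {    # i is the 0-based position within partition k
    # the i-th item of partition k gets id k + i * n
    result[[length(result) + 1]] <- list(items[[i + 1]], k + i * n)
  }
}
result
# list(list("a", 0), list("b", 3), list("c", 1), list("d", 4), list("e", 2))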