zipWithIndex {SparkR}    R Documentation
Items are ordered first by partition index and then by their position within each partition. The first item in the first partition gets index 0, and the last item in the last partition gets the largest index.
zipWithIndex(x)

## S4 method for signature 'RDD'
zipWithIndex(x)
x    An RDD to be zipped.
This method needs to trigger a Spark job when this RDD contains more than one partition.
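The reason a job is needed can be sketched in plain R, without Spark: an item's index depends on how many items sit in all earlier partitions, so the partition sizes must be known before any index can be assigned. The sketch below is an illustration under assumptions, not SparkR's actual implementation. The `partitions` list is a hypothetical stand-in for an RDD, its split of the five items across three partitions is made up, and `zip_partition` is a helper name introduced here.

```r
# Plain R stand-in for an RDD: a list of partitions, each a list of items.
# (Hypothetical data and layout; no Spark is involved.)
partitions <- list(list("a"), list("b", "c"), list("d", "e"))

# Step 1: count the items in each partition. On a distributed RDD with more
# than one partition, gathering these counts is what would require a job.
sizes <- vapply(partitions, length, integer(1))

# Step 2: each partition's starting index is the total size of all
# partitions that come before it.
offsets <- c(0L, cumsum(sizes)[-length(sizes)])

# Step 3: pair each item with its partition offset plus its zero-based
# position inside the partition. (zip_partition is a hypothetical helper.)
zip_partition <- function(items, offset) {
  lapply(seq_along(items), function(i) list(items[[i]], offset + i - 1L))
}

# Flatten the per-partition results into a single list of (item, index) pairs.
zipped <- unlist(Map(zip_partition, partitions, offsets), recursive = FALSE)
```

With a single partition, every offset is 0 and step 1 is unnecessary, which is consistent with a job being needed only when there is more than one partition.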
An RDD with zipped items.
zipWithUniqueId
## Not run:
sc <- sparkR.init()
rdd <- parallelize(sc, list("a", "b", "c", "d", "e"), 3L)
collect(zipWithIndex(rdd))
# list(list("a", 0), list("b", 1), list("c", 2), list("d", 3), list("e", 4))
## End(Not run)