takeSample {SparkR}    R Documentation
Return a list of elements randomly sampled from the given RDD.
takeSample(rdd, withReplacement, num, seed)

## S4 method for signature 'RDD,logical,integer,integer'
takeSample(rdd, withReplacement, num, seed)
rdd              The RDD to sample elements from.
withReplacement  Logical; whether to sample with replacement (TRUE) or without (FALSE).
num              Integer; number of elements to return.
seed             Integer; seed for the random number generator.
## Not run:
library(SparkR)
sc <- sparkR.init()
rdd <- parallelize(sc, 1:100)
# exactly 5 elements sampled, which may not be distinct
takeSample(rdd, TRUE, 5L, 1618L)
# exactly 5 distinct elements sampled
takeSample(rdd, FALSE, 5L, 16181618L)
## End(Not run)
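The difference between sampling with and without replacement, and the effect of fixing a seed, can be tried locally without a Spark context. The sketch below is only an analogy: it uses base R's sample() and set.seed() on an ordinary vector, and it does not show how takeSample is implemented or which elements takeSample would return for a given seed.

```r
# Local analogy only: base R sample() on an ordinary vector, not SparkR code.
set.seed(1618)                      # fixing the seed makes the draw repeatable
x <- 1:100

with_repl    <- sample(x, 5, replace = TRUE)   # 5 values, duplicates possible
without_repl <- sample(x, 5, replace = FALSE)  # 5 values, all distinct

length(with_repl)                   # 5
anyDuplicated(without_repl) == 0    # TRUE

# Re-seeding and repeating the first draw reproduces it exactly
set.seed(1618)
again <- sample(x, 5, replace = TRUE)
identical(with_repl, again)         # TRUE
```

As in the takeSample examples above, the with-replacement draw always has the requested length but may repeat values, while the without-replacement draw never does.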