cache |
Persist an RDD |
cache-method |
Persist an RDD |
checkpoint |
Checkpoint an RDD |
checkpoint-method |
Checkpoint an RDD |
coalesce |
Return a new RDD that is reduced into numPartitions partitions. |
coalesce,RDD |
Return a new RDD that is reduced into numPartitions partitions. |
coalesce-method |
Return a new RDD that is reduced into numPartitions partitions. |
cogroup |
For each key k in several RDDs, return an RDD whose values are lists of the values for that key in all RDDs. |
cogroup-method |
For each key k in several RDDs, return an RDD whose values are lists of the values for that key in all RDDs. |
collect |
Collect elements of an RDD |
collect-method |
Collect elements of an RDD |
collectAsMap |
Collect elements of an RDD |
collectAsMap-method |
Collect elements of an RDD |
collectPartition |
Collect elements of an RDD |
collectPartition-method |
Collect elements of an RDD |
combineByKey |
Combine values by key |
combineByKey-method |
Combine values by key |
count |
Return the number of elements in the RDD. |
count-method |
Return the number of elements in the RDD. |
countByKey |
Count the number of elements for each key, and return the result to the master as lists of (key, count) pairs. |
countByKey-method |
Count the number of elements for each key, and return the result to the master as lists of (key, count) pairs. |
countByValue |
Return the count of each unique value in this RDD as a list of (value, count) pairs. |
countByValue-method |
Return the count of each unique value in this RDD as a list of (value, count) pairs. |
Filter |
This function returns a new RDD containing only the elements that satisfy a predicate (i.e. those for which a given logical function returns TRUE). The same as 'filter()' in Spark. |
Filter-method |
This function returns a new RDD containing only the elements that satisfy a predicate (i.e. those for which a given logical function returns TRUE). The same as 'filter()' in Spark. |
filterRDD |
This function returns a new RDD containing only the elements that satisfy a predicate (i.e. those for which a given logical function returns TRUE). The same as 'filter()' in Spark. |
filterRDD-method |
This function returns a new RDD containing only the elements that satisfy a predicate (i.e. those for which a given logical function returns TRUE). The same as 'filter()' in Spark. |
first |
Return the first element of an RDD. |
first,RDD |
Return the first element of an RDD. |
first-method |
Return the first element of an RDD. |
flatMap |
Flatten results after applying a function to all elements |
flatMap-method |
Flatten results after applying a function to all elements |
flatMapValues |
Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning. |
flatMapValues-method |
Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning. |
fold |
Fold an RDD using a given associative function and a neutral "zero value". |
fold-method |
Fold an RDD using a given associative function and a neutral "zero value". |
foldByKey |
Fold a pair RDD by each key. |
foldByKey-method |
Fold a pair RDD by each key. |
foreach |
Applies a function to all elements in an RDD, and forces evaluation. |
foreach-method |
Applies a function to all elements in an RDD, and forces evaluation. |
foreachPartition |
Applies a function to all elements in an RDD, and forces evaluation. |
foreachPartition-method |
Applies a function to all elements in an RDD, and forces evaluation. |
fullOuterJoin |
Join two RDDs |
fullOuterJoin-method |
Join two RDDs |
map |
Apply a function to all elements |
map-method |
Apply a function to all elements |
mapPartitions |
Apply a function to each partition of an RDD |
mapPartitions-method |
Apply a function to each partition of an RDD |
mapPartitionsWithIndex |
Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition. |
mapPartitionsWithIndex-method |
Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition. |
mapValues |
Applies a function to all values of the elements, without modifying the keys. |
mapValues-method |
Applies a function to all values of the elements, without modifying the keys. |
maximum |
Get the maximum element of an RDD. |
maximum,RDD |
Get the maximum element of an RDD. |
maximum-method |
Get the maximum element of an RDD. |
minimum |
Get the minimum element of an RDD. |
minimum,RDD |
Get the minimum element of an RDD. |
minimum-method |
Get the minimum element of an RDD. |
RDD |
S4 class that represents an RDD |
RDD-class |
S4 class that represents an RDD |
reduce |
Reduce across elements of an RDD. |
reduce-method |
Reduce across elements of an RDD. |
reduceByKey |
Merge values by key |
reduceByKey-method |
Merge values by key |
reduceByKeyLocally |
Merge values by key locally |
reduceByKeyLocally-method |
Merge values by key locally |
repartition |
Return a new RDD that has exactly numPartitions partitions. Can increase or decrease the level of parallelism in this RDD. Internally, this uses a shuffle to redistribute data. If you are decreasing the number of partitions in this RDD, consider using coalesce, which can avoid performing a shuffle. |
repartition,RDD |
Return a new RDD that has exactly numPartitions partitions. Can increase or decrease the level of parallelism in this RDD. Internally, this uses a shuffle to redistribute data. If you are decreasing the number of partitions in this RDD, consider using coalesce, which can avoid performing a shuffle. |
repartition-method |
Return a new RDD that has exactly numPartitions partitions. Can increase or decrease the level of parallelism in this RDD. Internally, this uses a shuffle to redistribute data. If you are decreasing the number of partitions in this RDD, consider using coalesce, which can avoid performing a shuffle. |
rightOuterJoin |
Join two RDDs |
rightOuterJoin-method |
Join two RDDs |
take |
Take elements from an RDD. |
take-method |
Take elements from an RDD. |
takeOrdered |
Returns the first N elements from an RDD in ascending order. |
takeOrdered-method |
Returns the first N elements from an RDD in ascending order. |
takeSample |
Return a list of the elements that are a sampled subset of the given RDD. |
takeSample,RDD |
Return a list of the elements that are a sampled subset of the given RDD. |
takeSample-method |
Return a list of the elements that are a sampled subset of the given RDD. |
textFile |
Create an RDD from a text file. |
top |
Returns the top N elements from an RDD. |
top-method |
Returns the top N elements from an RDD. |