R frontend for Spark


Documentation for package ‘SparkR’ version 0.1

Help Pages

A B C D F G H I J K L M N O P R S T U V Z

-- A --

aggregateByKey Aggregate a pair RDD by each key.
aggregateByKey-method Aggregate a pair RDD by each key.
aggregateRDD Aggregate an RDD using the given combine functions and a neutral "zero value".
aggregateRDD-method Aggregate an RDD using the given combine functions and a neutral "zero value".

-- B --

Broadcast S4 class that represents a Broadcast variable
broadcast Broadcast a variable to all workers
Broadcast-class S4 class that represents a Broadcast variable

-- C --

cache Persist an RDD
cache-method Persist an RDD
checkpoint Checkpoint an RDD
checkpoint-method Checkpoint an RDD
coalesce Return a new RDD that is reduced into numPartitions partitions.
coalesce,RDD Return a new RDD that is reduced into numPartitions partitions.
coalesce-method Return a new RDD that is reduced into numPartitions partitions.
cogroup For each key k in several RDDs, return an RDD whose values are a list of the values for that key in all RDDs.
cogroup-method For each key k in several RDDs, return an RDD whose values are a list of the values for that key in all RDDs.
collect Collect elements of an RDD
collect-method Collect elements of an RDD
collectAsMap Collect elements of an RDD
collectAsMap-method Collect elements of an RDD
collectPartition Collect elements of an RDD
collectPartition-method Collect elements of an RDD
combineByKey Combine values by key
combineByKey-method Combine values by key
count Return the number of elements in the RDD.
count-method Return the number of elements in the RDD.
countByKey Count the number of elements for each key, and return the result to the master as lists of (key, count) pairs.
countByKey-method Count the number of elements for each key, and return the result to the master as lists of (key, count) pairs.
countByValue Return the count of each unique value in this RDD as a list of (value, count) pairs.
countByValue-method Return the count of each unique value in this RDD as a list of (value, count) pairs.

-- D --

distinct Removes the duplicates from an RDD.
distinct-method Removes the duplicates from an RDD.

-- F --

Filter Returns a new RDD containing only the elements that satisfy a predicate (i.e., those for which the given logical function returns TRUE). The same as 'filter()' in Spark.
Filter-method Returns a new RDD containing only the elements that satisfy a predicate (i.e., those for which the given logical function returns TRUE). The same as 'filter()' in Spark.
filterRDD Returns a new RDD containing only the elements that satisfy a predicate (i.e., those for which the given logical function returns TRUE). The same as 'filter()' in Spark.
filterRDD-method Returns a new RDD containing only the elements that satisfy a predicate (i.e., those for which the given logical function returns TRUE). The same as 'filter()' in Spark.
first Return the first element of an RDD.
first,RDD Return the first element of an RDD.
first-method Return the first element of an RDD.
flatMap Flatten results after applying a function to all elements
flatMap-method Flatten results after applying a function to all elements
flatMapValues Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning.
flatMapValues-method Pass each value in the key-value pair RDD through a flatMap function without changing the keys; this also retains the original RDD's partitioning.
fold Fold an RDD using a given associative function and a neutral "zero value".
fold-method Fold an RDD using a given associative function and a neutral "zero value".
foldByKey Fold a pair RDD by each key.
foldByKey-method Fold a pair RDD by each key.
foreach Applies a function to all elements in an RDD, and forces evaluation.
foreach-method Applies a function to all elements in an RDD, and forces evaluation.
foreachPartition Applies a function to all elements in an RDD, and forces evaluation.
foreachPartition-method Applies a function to all elements in an RDD, and forces evaluation.
fullOuterJoin Join two RDDs
fullOuterJoin-method Join two RDDs

-- G --

glom Coalesce all elements within each partition of an RDD into a list.
glom,RDD Coalesce all elements within each partition of an RDD into a list.
glom-method Coalesce all elements within each partition of an RDD into a list.
groupByKey Group values by key
groupByKey-method Group values by key

-- H --

hashCode Compute the hashCode of an object

-- I --

includePackage Include the specified package on all workers

-- J --

join Join two RDDs
join-method Join two RDDs

-- K --

keyBy Creates tuples of the elements in this RDD by applying a function.
keyBy,RDD Creates tuples of the elements in this RDD by applying a function.
keyBy-method Creates tuples of the elements in this RDD by applying a function.
keys Return an RDD with the keys of each tuple.
keys,RDD Return an RDD with the keys of each tuple.
keys-method Return an RDD with the keys of each tuple.

-- L --

lapply Apply a function to all elements
lapply-method Apply a function to all elements
lapplyPartition Apply a function to each partition of an RDD
lapplyPartition-method Apply a function to each partition of an RDD
lapplyPartitionsWithIndex Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
lapplyPartitionsWithIndex-method Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
leftOuterJoin Join two RDDs
leftOuterJoin-method Join two RDDs
length-method Return the number of elements in the RDD.
lookup Look up elements of a key in an RDD
lookup-method Look up elements of a key in an RDD

-- M --

map Apply a function to all elements
map-method Apply a function to all elements
mapPartitions Apply a function to each partition of an RDD
mapPartitions-method Apply a function to each partition of an RDD
mapPartitionsWithIndex Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
mapPartitionsWithIndex-method Return a new RDD by applying a function to each partition of this RDD, while tracking the index of the original partition.
mapValues Applies a function to all values of the elements, without modifying the keys.
mapValues-method Applies a function to all values of the elements, without modifying the keys.
maximum Get the maximum element of an RDD.
maximum,RDD Get the maximum element of an RDD.
maximum-method Get the maximum element of an RDD.
minimum Get the minimum element of an RDD.
minimum,RDD Get the minimum element of an RDD.
minimum-method Get the minimum element of an RDD.

-- N --

name Return an RDD's name.
name,RDD Return an RDD's name.
name-method Return an RDD's name.
numPartitions Gets the number of partitions of an RDD
numPartitions-method Gets the number of partitions of an RDD

-- O --

objectFile Load an RDD saved as a SequenceFile containing serialized objects.

-- P --

parallelize Create an RDD from a homogeneous list or vector.
partitionBy Partition an RDD by key
partitionBy-method Partition an RDD by key
persist Persist an RDD
persist-method Persist an RDD
pipeRDD Pipes elements to a forked external process.
pipeRDD-method Pipes elements to a forked external process.
print.jobj Print a JVM object reference.

-- R --

RDD S4 class that represents an RDD
RDD-class S4 class that represents an RDD
reduce Reduce across elements of an RDD.
reduce-method Reduce across elements of an RDD.
reduceByKey Merge values by key
reduceByKey-method Merge values by key
reduceByKeyLocally Merge values by key locally
reduceByKeyLocally-method Merge values by key locally
repartition Return a new RDD that has exactly numPartitions partitions. Can increase or decrease the level of parallelism in this RDD. Internally, this uses a shuffle to redistribute data. If you are decreasing the number of partitions in this RDD, consider using coalesce, which can avoid performing a shuffle.
repartition,RDD Return a new RDD that has exactly numPartitions partitions. Can increase or decrease the level of parallelism in this RDD. Internally, this uses a shuffle to redistribute data. If you are decreasing the number of partitions in this RDD, consider using coalesce, which can avoid performing a shuffle.
repartition-method Return a new RDD that has exactly numPartitions partitions. Can increase or decrease the level of parallelism in this RDD. Internally, this uses a shuffle to redistribute data. If you are decreasing the number of partitions in this RDD, consider using coalesce, which can avoid performing a shuffle.
rightOuterJoin Join two RDDs
rightOuterJoin-method Join two RDDs

-- S --

sampleRDD Return an RDD that is a sampled subset of the given RDD.
sampleRDD,RDD Return an RDD that is a sampled subset of the given RDD.
sampleRDD-method Return an RDD that is a sampled subset of the given RDD.
saveAsObjectFile Save this RDD as a SequenceFile of serialized objects.
saveAsObjectFile,RDD Save this RDD as a SequenceFile of serialized objects.
saveAsObjectFile-method Save this RDD as a SequenceFile of serialized objects.
saveAsTextFile Save this RDD as a text file, using string representations of elements.
saveAsTextFile,RDD Save this RDD as a text file, using string representations of elements.
saveAsTextFile-method Save this RDD as a text file, using string representations of elements.
setBroadcastValue Internal function to set values of a broadcast variable.
setCheckpointDir Set the checkpoint directory, i.e. the directory under which RDDs are going to be checkpointed. The directory must be an HDFS path if running on a cluster.
setName Set an RDD's name.
setName,RDD Set an RDD's name.
setName-method Set an RDD's name.
sortBy Sort an RDD by the given key function.
sortBy-method Sort an RDD by the given key function.
sortByKey Sort a (k, v) pair RDD by k.
sortByKey-method Sort a (k, v) pair RDD by k.
sparkR.init Initialize a new Spark Context.

-- T --

take Take elements from an RDD.
take-method Take elements from an RDD.
takeOrdered Returns the first N elements from an RDD in ascending order.
takeOrdered-method Returns the first N elements from an RDD in ascending order.
takeSample Return a list of the elements that are a sampled subset of the given RDD.
takeSample,RDD Return a list of the elements that are a sampled subset of the given RDD.
takeSample-method Return a list of the elements that are a sampled subset of the given RDD.
textFile Create an RDD from a text file.
top Returns the top N elements from an RDD.
top-method Returns the top N elements from an RDD.

-- U --

unionRDD Return the union RDD of two RDDs. The same as union() in Spark.
unionRDD-method Return the union RDD of two RDDs. The same as union() in Spark.
unpersist Unpersist an RDD
unpersist-method Unpersist an RDD

-- V --

value Broadcast a variable to all workers
value-method Broadcast a variable to all workers
values Return an RDD with the values of each tuple.
values,RDD Return an RDD with the values of each tuple.
values-method Return an RDD with the values of each tuple.

-- Z --

zipRDD Zip an RDD with another RDD.
zipRDD,RDD Zip an RDD with another RDD.
zipRDD-method Zip an RDD with another RDD.
zipWithIndex Zip an RDD with its element indices.
zipWithIndex,RDD Zip an RDD with its element indices.
zipWithIndex-method Zip an RDD with its element indices.
zipWithUniqueId Zip an RDD with generated unique Long IDs.
zipWithUniqueId,RDD Zip an RDD with generated unique Long IDs.
zipWithUniqueId-method Zip an RDD with generated unique Long IDs.