diff --git a/docs/PartitionEvaluator.md b/docs/PartitionEvaluator.md new file mode 100644 index 000000000..c59dc8977 --- /dev/null +++ b/docs/PartitionEvaluator.md @@ -0,0 +1,28 @@ +--- +tags: + - DeveloperApi +--- + +# PartitionEvaluator + +`PartitionEvaluator[T, U]` is an [abstraction](#contract) of [partition evaluators](#implementations) that can [compute (_evaluate_) one or more RDD partitions](#eval). + +## Contract + +### Evaluate Partitions { #eval } + +```scala +eval( + partitionIndex: Int, + inputs: Iterator[T]*): Iterator[U] +``` + +Used when: + +* `MapPartitionsWithEvaluatorRDD` is requested to [compute a partition](rdd/MapPartitionsWithEvaluatorRDD.md#compute) +* `ZippedPartitionsWithEvaluatorRDD` is requested to [compute a partition](rdd/ZippedPartitionsWithEvaluatorRDD.md#compute) + +## Implementations + +!!! note + No built-in implementations available in Spark Core (but [Spark SQL]({{ book.spark_sql }})). diff --git a/docs/PartitionEvaluatorFactory.md b/docs/PartitionEvaluatorFactory.md new file mode 100644 index 000000000..e64473b96 --- /dev/null +++ b/docs/PartitionEvaluatorFactory.md @@ -0,0 +1,30 @@ +--- +tags: + - DeveloperApi +--- + +# PartitionEvaluatorFactory + +`PartitionEvaluatorFactory[T, U]` is an [abstraction](#contract) of [PartitionEvaluator factories](#implementations). + +`PartitionEvaluatorFactory` is a `Serializable` ([Java]({{ java.api }}/java/io/Serializable.html)). + +## Contract + +### Creating PartitionEvaluator { #createEvaluator } + +```scala +createEvaluator(): PartitionEvaluator[T, U] +``` + +Creates a [PartitionEvaluator](PartitionEvaluator.md) + +Used when: + +* `MapPartitionsWithEvaluatorRDD` is requested to [compute a partition](rdd/MapPartitionsWithEvaluatorRDD.md#compute) +* `ZippedPartitionsWithEvaluatorRDD` is requested to [compute a partition](rdd/ZippedPartitionsWithEvaluatorRDD.md#compute) + +## Implementations + +!!! note + No built-in implementations available in Spark Core (but [Spark SQL]({{ book.spark_sql }})). diff --git a/docs/rdd/MapPartitionsWithEvaluatorRDD.md b/docs/rdd/MapPartitionsWithEvaluatorRDD.md new file mode 100644 index 000000000..1833fed69 --- /dev/null +++ b/docs/rdd/MapPartitionsWithEvaluatorRDD.md @@ -0,0 +1,33 @@ +# MapPartitionsWithEvaluatorRDD + +`MapPartitionsWithEvaluatorRDD` is an [RDD](RDD.md). + +## Creating Instance + +`MapPartitionsWithEvaluatorRDD` takes the following to be created: + +* Previous [RDD](RDD.md) +* [PartitionEvaluatorFactory](../PartitionEvaluatorFactory.md) + +`MapPartitionsWithEvaluatorRDD` is created when: + +* [RDD.mapPartitionsWithEvaluator](RDD.md#mapPartitionsWithEvaluator) operator is used +* [RDDBarrier.mapPartitionsWithEvaluator](../barrier-execution-mode/RDDBarrier.md#mapPartitionsWithEvaluator) operator is used + +## Computing Partition { #compute } + +??? note "RDD" + + ```scala + compute( + split: Partition, + context: TaskContext): Iterator[U] + ``` + + `compute` is part of the [RDD](RDD.md#compute) abstraction. + +`compute` requests the [PartitionEvaluatorFactory](#evaluatorFactory) to [create a PartitionEvaluator](../PartitionEvaluatorFactory.md#createEvaluator). + +`compute` requests the [first parent RDD](RDD.md#firstParent) to [iterator](RDD.md#iterator). + +In the end, `compute` requests the [PartitionEvaluator](../PartitionEvaluator.md) to [evaluate the partition](../PartitionEvaluator.md#eval). diff --git a/docs/rdd/RDD.md b/docs/rdd/RDD.md index ca8afe3d1..31f809267 100644 --- a/docs/rdd/RDD.md +++ b/docs/rdd/RDD.md @@ -445,6 +445,25 @@ withScope[U]( !!! note `withScope` is used for most (if not all) `RDD` API operators. +## mapPartitionsWithEvaluator { #mapPartitionsWithEvaluator } + +```scala +mapPartitionsWithEvaluator[U: ClassTag]( + evaluatorFactory: PartitionEvaluatorFactory[T, U]): RDD[U] +``` + +`mapPartitionsWithEvaluator` creates a [MapPartitionsWithEvaluatorRDD](MapPartitionsWithEvaluatorRDD.md) for this `RDD` and the given [PartitionEvaluatorFactory](../PartitionEvaluatorFactory.md). + +## zipPartitionsWithEvaluator { #zipPartitionsWithEvaluator } + +```scala +zipPartitionsWithEvaluator[U: ClassTag]( + rdd2: RDD[T], + evaluatorFactory: PartitionEvaluatorFactory[T, U]): RDD[U] +``` + +`zipPartitionsWithEvaluator` creates a [ZippedPartitionsWithEvaluatorRDD](ZippedPartitionsWithEvaluatorRDD.md) for this `RDD` and the given `RDD` and the [PartitionEvaluatorFactory](../PartitionEvaluatorFactory.md). +