coalesce(numPartitions) example
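- Purpose: reduces the number of partitions. It is typically applied after filtering a large dataset, so that the smaller result is not spread across many nearly empty partitions and runs more efficiently.
- Requirement: create an RDD with 4 partitions, then reduce its partition count.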
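package com.dark.spark.SparkStudent.Spark_RDD

import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

object Spark31 {
  // An explicit main method is used instead of `extends App`, which the Spark docs advise against.
  def main(args: Array[String]): Unit = {
    // Run locally, with as many worker threads as there are cores.
    val config: SparkConf = new SparkConf().setMaster("local[*]").setAppName("CoalesceExample")
    val sc = new SparkContext(config)

    // Create an RDD of the integers 1 to 16, split across 4 partitions.
    val listRDD: RDD[Int] = sc.makeRDD(1 to 16, 4)
    println("Partitions before coalesce = " + listRDD.getNumPartitions)

    // Merge the 4 partitions into 3; coalesce does not shuffle by default.
    val coalesceRDD: RDD[Int] = listRDD.coalesce(3)
    println("Partitions after coalesce = " + coalesceRDD.getNumPartitions)

    sc.stop()
  }
}

Output:
Partitions before coalesce = 4
Partitions after coalesce = 3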
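
The partition counts alone do not show which elements end up together, or what happens when a larger count is requested. The following is a minimal sketch, not part of the original example: the object name `CoalesceShuffleSketch` is hypothetical, and it assumes the same local Spark setup as above. It uses `glom()` to print the contents of each coalesced partition, then compares `coalesce(8)` with and without `shuffle = true`.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical object name, chosen for this sketch only.
object CoalesceShuffleSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("CoalesceShuffleSketch")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.makeRDD(1 to 16, 4)

      // glom() turns each partition into an array, which makes the grouping visible.
      rdd.coalesce(3).glom().collect().zipWithIndex.foreach { case (part, i) =>
        println(s"partition $i: ${part.mkString(", ")}")
      }

      // Without a shuffle, coalesce cannot raise the partition count; it stays at 4.
      val noShuffle = rdd.coalesce(8)
      println("coalesce(8) = " + noShuffle.getNumPartitions)

      // With shuffle = true the data is redistributed, so the count can grow to 8.
      val withShuffle = rdd.coalesce(8, shuffle = true)
      println("coalesce(8, shuffle = true) = " + withShuffle.getNumPartitions)
    } finally {
      sc.stop()
    }
  }
}
```

Since `repartition(n)` is defined as `coalesce(n, shuffle = true)`, plain `coalesce` is usually the cheaper choice when only shrinking the partition count, because it merges existing partitions without moving data across the cluster.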