What is the difference between an RDD's map and mapPartitions methods? And does flatMap behave like map or like mapPartitions? Thanks.

(edit) That is, what is the difference, either semantically or in terms of execution, between

    def map[A, B](rdd: RDD[A], fn: (A => B))
                 (implicit a: Manifest[A], b: Manifest[B]): RDD[B] = {
      rdd.mapPartitions({ iter: Iterator[A] => for (i <- iter) yield fn(i) },
        preservesPartitioning = true)
    }

and:

    def map[A, B](rdd: RDD[A], fn: (A => B))
                 (implicit a: Manifest[A], b: Manifest[B]): RDD[B] = {
      rdd.map(fn)
    }
Apache Spark: map vs. mapPartitions?
慕的地6264312
2019-11-22 11:15:27
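
To make the two definitions in the question concrete, here is a minimal sketch in plain Scala, with no Spark dependency. `ToyRdd` and everything in it are hypothetical names made up for this illustration; it only models an RDD as a list of partitions and is not Spark's implementation. Under that assumption, it shows that a function passed to mapPartitions is called once per partition with an iterator, while map and flatMap are element-wise and can both be expressed on top of mapPartitions. As far as I know, Spark's own `RDD.map` and `RDD.flatMap` are built in a similar way, by applying `iter.map` / `iter.flatMap` inside each partition.

```scala
// Toy model only: an "RDD" is a vector of partitions.
// ToyRdd is a hypothetical class for illustration, NOT part of Spark.
final case class ToyRdd[A](partitions: Vector[Vector[A]]) {

  // f is invoked once per partition and sees that partition's whole iterator.
  def mapPartitions[B](f: Iterator[A] => Iterator[B]): ToyRdd[B] =
    ToyRdd(partitions.map(p => f(p.iterator).toVector))

  // Element-wise: one output per input, written via mapPartitions
  // in the same way as the first definition in the question.
  def map[B](f: A => B): ToyRdd[B] =
    mapPartitions(iter => iter.map(f))

  // Also element-wise: each input may yield zero or more outputs.
  def flatMap[B](f: A => Seq[B]): ToyRdd[B] =
    mapPartitions(iter => iter.flatMap(f))

  // Concatenate all partitions in order.
  def collect: Vector[A] = partitions.flatten
}

object MapVsMapPartitions {
  def main(args: Array[String]): Unit = {
    // Two partitions holding five elements in total.
    val rdd = ToyRdd(Vector(Vector(1, 2, 3), Vector(4, 5)))

    var setupCalls = 0
    val viaMapPartitions = rdd.mapPartitions { iter =>
      setupCalls += 1 // stands in for per-partition setup work
      iter.map(_ * 10)
    }
    val viaMap = rdd.map(_ * 10)

    println(viaMap == viaMapPartitions)           // same elements, same partitions
    println(setupCalls)                           // one call per partition
    println(rdd.flatMap(x => Seq(x, x)).collect)  // each element emitted twice
  }
}
```

In this model the two versions produce the same result, so the practical difference is where per-call work happens: anything expensive done at the top of the mapPartitions function (the `setupCalls += 1` line here) runs once per partition rather than once per element. flatMap sits on the map side of that split, since its function also sees one element at a time.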