Codegen refers to dynamic bytecode generation. What does that mean in practice? Let's start with some code, taking Sort as the SparkPlan:
case class Sort(
    sortOrder: Seq[SortOrder],
    global: Boolean,
    child: SparkPlan)
  extends UnaryNode {
  override def requiredChildDistribution: Seq[Distribution] =
    if (global) OrderedDistribution(sortOrder) :: Nil else UnspecifiedDistribution :: Nil

  protected override def doExecute(): RDD[Row] = attachTree(this, "sort") {
    child.execute().mapPartitions( { iterator =>
      val ordering = newOrdering(sortOrder, child.output)
      iterator.map(_.copy()).toArray.sorted(ordering).iterator
    }, preservesPartitioning = true)
  }

  override def output: Seq[Attribute] = child.output

  override def outputOrdering: Seq[SortOrder] = sortOrder
}

abstract class SparkPlan extends QueryPlan[SparkPlan] with Logging with Serializable {
  protected def newOrdering(order: Seq[SortOrder], inputSchema: Seq[Attribute]): Ordering[Row] = {
    if (codegenEnabled) { // codegen enabled: generate a specialized ordering at runtime
      GenerateOrdering.generate(order, inputSchema)
    } else { // codegen disabled: fall back to the interpreted RowOrdering
      new RowOrdering(order, inputSchema)
    }
  }
}
So for the Sort SparkPlan there are two paths, depending on whether codegen is enabled. When it is disabled, the compare method looks like this:
class RowOrdering(ordering: Seq[SortOrder]) extends Ordering[Row] {
  def this(ordering: Seq[SortOrder], inputSchema: Seq[Attribute]) =
    this(ordering.map(BindReferences.bindReference(_, inputSchema)))

  def compare(a: Row, b: Row): Int = {
    var i = 0
    while (i < ordering.size) {
      val order = ordering(i)
      val left = order.child.eval(a) // virtual call; the result comes back boxed
      val right = order.child.eval(b) // virtual call; the result comes back boxed
      if (left == null && right == null) {
        // Both null, continue looking.
      } else if (left == null) {
        return if (order.direction == Ascending) -1 else 1
      } else if (right == null) {
        return if (order.direction == Ascending) 1 else -1
      } else {
        val comparison = order.dataType match {
          case n: AtomicType if order.direction == Ascending =>
            n.ordering.asInstanceOf[Ordering[Any]].compare(left, right) // dispatches to the concrete type's compare
          case n: AtomicType if order.direction == Descending =>
            n.ordering.asInstanceOf[Ordering[Any]].reverse.compare(left, right) // dispatches to the concrete type's compare
          case other => sys.error(s"Type $other does not support ordered operations")
        }
        if (comparison != 0) return comparison
      }
      i += 1
    }
    return 0
  }
}
This path involves virtual method calls and boxing, and a virtual call is more expensive than an ordinary, statically bound call.
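To make the cost of the interpreted path easier to see in isolation, here is a minimal, self-contained sketch of the same idea. It is not Spark code: SimpleRow, Expr, Column and InterpretedOrdering are hypothetical stand-ins for Row, Expression, a bound column reference and RowOrdering, and null handling and sort direction are left out for brevity.

```scala
// Hypothetical, simplified stand-ins for Spark's Row / Expression / RowOrdering.
object InterpretedOrderingSketch {
  type SimpleRow = Array[Any]

  // eval returns Any, so an Int column value always comes back boxed.
  trait Expr { def eval(row: SimpleRow): Any }

  final case class Column(index: Int) extends Expr {
    def eval(row: SimpleRow): Any = row(index)
  }

  // Interpreted ordering: one virtual eval call per side per sort column,
  // followed by a comparison through a type-erased Ordering[Any].
  final class InterpretedOrdering(columns: Seq[(Expr, Ordering[Any])])
    extends Ordering[SimpleRow] {
    def compare(a: SimpleRow, b: SimpleRow): Int = {
      var i = 0
      while (i < columns.size) {
        val (expr, ord) = columns(i)
        val c = ord.compare(expr.eval(a), expr.eval(b))
        if (c != 0) return c
        i += 1
      }
      0
    }
  }

  // Same cast that RowOrdering performs on n.ordering.
  val intOrdering: Ordering[Any] = Ordering.Int.asInstanceOf[Ordering[Any]]

  // Sort rows by their first column, which is assumed to hold Int values.
  val byFirstColumn = new InterpretedOrdering(Seq((Column(0), intOrdering)))
}
```

Every comparison here goes through Expr.eval and Ordering[Any].compare, neither of which knows the column type statically; that is the overhead the generated ordering is meant to remove.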
When codegen is enabled, the compare method is produced as follows:
object GenerateOrdering extends CodeGenerator[Seq[SortOrder], Ordering[Row]] with Logging {
  import scala.reflect.runtime.{universe => ru}
  import scala.reflect.runtime.universe._

  protected def canonicalize(in: Seq[SortOrder]): Seq[SortOrder] =
    in.map(ExpressionCanonicalizer.execute(_).asInstanceOf[SortOrder])

  protected def bind(in: Seq[SortOrder], inputSchema: Seq[Attribute]): Seq[SortOrder] =
    in.map(BindReferences.bindReference(_, inputSchema))

  protected def create(ordering: Seq[SortOrder]): Ordering[Row] = {
    val a = newTermName("a")
    val b = newTermName("b")
    val comparisons = ordering.zipWithIndex.map { case (order, i) =>
      val evalA = expressionEvaluator(order.child)
      val evalB = expressionEvaluator(order.child)
      val compare = order.child.dataType match {
        case BinaryType =>
          q"""
            val x = ${if (order.direction == Ascending) evalA.primitiveTerm else evalB.primitiveTerm} // type is fixed at generation time, so no virtual call
            val y = ${if (order.direction != Ascending) evalB.primitiveTerm else evalA.primitiveTerm} // type is fixed at generation time, so no virtual call
            var i = 0
            while (i < x.length && i < y.length) {
              val res = x(i).compareTo(y(i))
              if (res != 0) return res
              i = i+1
            }
            return x.length - y.length
          """
        case _: NumericType =>
          q"""
            val comp = ${evalA.primitiveTerm} - ${evalB.primitiveTerm} // type is fixed at generation time
            if(comp != 0) {
              return ${if (order.direction == Ascending) q"comp.toInt" else q"-comp.toInt"}
            }
          """
        case StringType =>
          if (order.direction == Ascending) {
            q"""return ${evalA.primitiveTerm}.compare(${evalB.primitiveTerm})""" // type is fixed at generation time, so no virtual call
          } else {
            q"""return ${evalB.primitiveTerm}.compare(${evalA.primitiveTerm})"""
          }
      }
      q"""
        i = $a
        ..${evalA.code}
        i = $b
        ..${evalB.code}
        if (${evalA.nullTerm} && ${evalB.nullTerm}) {
          // Nothing
        } else if (${evalA.nullTerm}) {
          return ${if (order.direction == Ascending) q"-1" else q"1"}
        } else if (${evalB.nullTerm}) {
          return ${if (order.direction == Ascending) q"1" else q"-1"}
        } else {
          $compare
        }
      """
    }
    val q"class $orderingName extends $orderingType { ..$body }" = reify {
      class SpecificOrdering extends Ordering[Row] {
        val o = ordering
      }
    }.tree.children.head
    val code = q"""
      class $orderingName extends $orderingType {
        ..$body
        def compare(a: $rowType, b: $rowType): Int = {
          var i: $rowType = null // Holds current row being evaluated.
          ..$comparisons
          return 0
        }
      }
      new $orderingName()
    """
    logDebug(s"Generated Ordering: $code")
    toolBox.eval(code).asInstanceOf[Ordering[Row]]
  }
}
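The generator above ends by handing a quasiquote tree to toolBox.eval. As a rough, standalone illustration of that mechanism (not of Spark's generator itself), the sketch below builds a small comparator tree and compiles it at runtime. ToolBoxSketch and generateIntOrdering are hypothetical names; the sketch assumes Scala 2.11 or later (Scala 2 only) with scala-reflect and scala-compiler on the classpath.

```scala
// Assumption: Scala 2.11+ with scala-reflect and scala-compiler available.
import scala.reflect.runtime.currentMirror
import scala.reflect.runtime.universe._
import scala.tools.reflect.ToolBox

object ToolBoxSketch {
  // A ToolBox compiles syntax trees into classes at runtime.
  private val toolBox = currentMirror.mkToolBox()

  // Hypothetical helper: the sort direction is decided while the tree is built,
  // so the generated compare method contains no direction check at all.
  def generateIntOrdering(ascending: Boolean): Ordering[Int] = {
    val body =
      if (ascending) q"java.lang.Integer.compare(a, b)"
      else q"java.lang.Integer.compare(b, a)"
    val tree = q"""
      new scala.math.Ordering[Int] {
        def compare(a: Int, b: Int): Int = $body
      }
    """
    // Compile the tree and instantiate the resulting anonymous class.
    toolBox.eval(tree).asInstanceOf[Ordering[Int]]
  }
}
```

A call such as ToolBoxSketch.generateIntOrdering(ascending = false) returns an ordinary Ordering[Int] that can be passed to sorted. Compiling a tree is slow compared with running it, so a generator like this is only worthwhile when the result is reused across many rows.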
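As you can see, the generated code avoids the virtual eval calls. Under the hood it relies on Scala's runtime reflection: quasiquotes build the syntax tree, and toolBox.eval compiles it at runtime. As for why virtual calls are expensive: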
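Take the concrete SQL statement select a+b from table as an example. This is how the expression is evaluated for each row:
1. Call the virtual function Add.eval(), which must determine the data types of both operands
2. Call the virtual function a.eval(), which must determine the data type of a
3. a is found to be an int; box it
4. Call the virtual function b.eval(), which must determine the data type of b
5. b is found to be an int; box it
6. Call the int-specific add
7. Return the boxed result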
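The steps above can be sketched with a toy expression tree. This is a simplified model rather than Spark's actual Expression classes: SimpleRow, Expr, IntColumn, Add and addSpecialized are hypothetical names, and only int operands are handled.

```scala
// Toy model of interpreted evaluation of a + b; names are hypothetical.
object AddEvalSketch {
  type SimpleRow = Array[Any]

  trait Expr { def eval(row: SimpleRow): Any }

  final case class IntColumn(index: Int) extends Expr {
    // Steps 2-5: a virtual call whose Int result is returned boxed as Any.
    def eval(row: SimpleRow): Any = row(index)
  }

  final case class Add(left: Expr, right: Expr) extends Expr {
    // Step 1: virtual call on Add; operand types are only known after evaluating them.
    def eval(row: SimpleRow): Any = {
      val l = left.eval(row)
      val r = right.eval(row)
      (l, r) match {
        // Step 6: int addition after unboxing; step 7: the result is boxed again on return.
        case (x: Int, y: Int) => x + y
        case other => sys.error(s"Unsupported operand types: $other")
      }
    }
  }

  // What code specialized for two int columns boils down to:
  // no virtual calls, no type checks, no boxing.
  def addSpecialized(a: Int, b: Int): Int = a + b

  val interpreted: Expr = Add(IntColumn(0), IntColumn(1))
}
```

Evaluating interpreted.eval(Array[Any](1, 2)) walks the tree and performs three virtual eval calls for a single addition, whereas addSpecialized(1, 2) is a plain primitive add.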
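As these steps show, evaluating a single SQL expression requires several virtual function calls, and virtual calls are known to hurt performance considerably. But why do they?
A common answer is that a virtual call adds one level of indirection. Does that single indirect lookup really slow things down significantly? Clearly not.
The real cost comes from disrupting the CPU pipeline.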
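A virtual call is resolved through runtime polymorphism, which means that at compile time you cannot know which implementation will actually be invoked. For a non-virtual function, the relative address is fixed at compile time, so the compiler can emit a direct jmp/call instruction. For a virtual function, the extra vtable lookup is only a minor cost; the key point is that the target address is known only at run time. If, say, the address ends up in eax, then whenever the CPU mispredicts the target of call eax, all the instructions already prefetched into the pipeline after it are discarded. The longer the pipeline, the more a single misprediction costs, as shown below: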
pf->test
011E146D  mov   eax,dword ptr [pf]
011E1470  mov   edx,dword ptr [eax]
011E1472  mov   esi,esp
011E1474  mov   ecx,dword ptr [pf]
011E1477  mov   eax,dword ptr [edx]
011E1479  call  eax    <----------------------- branch misprediction
011E147B  cmp   esi,esp
011E147D  call  @ILT+355(__RTC_CheckEsp) (11E1168h)