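__vectorcall is a calling convention that Microsoft added in Visual Studio 2013: https://msdn.microsoft.com/zh-cn/library/dn375768.aspx
Its purpose is to speed up floating-point and vector code. Intel processors have many floating-point vector registers (xmm0, xmm1, ...), but the traditional 32-bit x86 calling conventions (stdcall, cdecl, fastcall, thiscall) pass arguments only on the stack or, for fastcall and thiscall, in the general-purpose registers ecx and edx. Floating-point arguments therefore have to be written to the stack by the caller and read back by the callee. (The default x64 convention passes integers in rcx, rdx, r8, r9 and already passes the first four floating-point arguments in xmm0 through xmm3, so the listings below are all 32-bit x86.) For example: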
float __stdcall addf0(float a, float b, float c)
{
    return a + b + c;
}
The call float c = addf0(1, 2, 3); compiles to:
float c = addf0(1, 2, 3);
00076EE0 push ecx
00076EE1 movss xmm0,dword ptr [__real@40400000 (080CDCh)]
00076EE9 movss dword ptr [esp],xmm0
00076EEE push ecx
00076EEF movss xmm0,dword ptr [__real@40000000 (080CC8h)]
00076EF7 movss dword ptr [esp],xmm0
00076EFC push ecx
00076EFD movss xmm0,dword ptr [__real@3f800000 (080CC4h)]
00076F05 movss dword ptr [esp],xmm0
00076F0A call addf0 (0711A4h)
00076F0F fstp dword ptr [c]
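The arguments are clearly passed on the stack: each push ecx only reserves a 4-byte slot (the value pushed is irrelevant), and the movss that follows writes the argument to the top of the stack. The arguments are stored right to left (3, 2, 1), and the result comes back on the x87 stack, hence the fstp. Now look at vectorcall: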
float __vectorcall addf3v(float a, float b, float c)
{
    return a + b + c;
}
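The assembly for the call (the arguments go straight into xmm0, xmm1 and xmm2 and the result comes back in xmm0, which avoids the stores to and loads from the stack):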
float c = addf3v(1, 2, 3);
003AB550 movss xmm2,dword ptr [__real@40400000 (03B0CDCh)]
003AB558 movss xmm1,dword ptr [__real@40000000 (03B0CC8h)]
003AB560 movss xmm0,dword ptr [__real@3f800000 (03B0CC4h)]
003AB568 call addf3v (03A170Dh)
003AB56D movss dword ptr [c],xmm0
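To reproduce the two listings side by side, both functions can live in one source file. The following is only a sketch: the STDCALL and VECTORCALL macros and the check_conventions_agree helper are names made up for this example, and the version check assumes that _MSC_VER 1800 corresponds to Visual Studio 2013. On compilers other than MSVC the macros expand to nothing, so the file still builds but both functions use the platform default convention.

```c
#include <assert.h>

/* __stdcall and __vectorcall are Microsoft-specific keywords.
   STDCALL and VECTORCALL are hypothetical macros for this sketch:
   they map to the keywords on MSVC for x86/x64 and to nothing elsewhere. */
#if defined(_MSC_VER) && _MSC_VER >= 1800 && (defined(_M_IX86) || defined(_M_X64))
#  define STDCALL    __stdcall
#  define VECTORCALL __vectorcall
#else
#  define STDCALL
#  define VECTORCALL
#endif

/* Same body as in the article, stack-based convention. */
float STDCALL addf0(float a, float b, float c)
{
    return a + b + c;
}

/* Same body, register-based convention. */
float VECTORCALL addf3v(float a, float b, float c)
{
    return a + b + c;
}

/* The calling convention changes how arguments travel, not the result. */
void check_conventions_agree(void)
{
    assert(addf0(1.0f, 2.0f, 3.0f) == addf3v(1.0f, 2.0f, 3.0f));
}
```

Building this as a 32-bit target with MSVC and looking at the disassembly of the two calls should show the same difference as above: stack stores for addf0, xmm loads for addf3v.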
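Microsoft's explanation is, roughly, that vectorcall builds on fastcall: integer arguments are still passed according to the fastcall rules, while floating-point and vector arguments are passed in the xmm registers.
For example: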
float __vectorcall addvi3f3(int a, float b, int c, float d, int e, float f)
{
    return a + b + c + d + e + f;
}
The call float c = addvi3f3(1, 2, 3, 4, 5, 6); compiles to:
float c = addvi3f3(1, 2, 3, 4, 5, 6);
0023B5C0 push 5
0023B5C2 movss xmm2,dword ptr [__real@40c00000 (0240CE8h)]
0023B5CA movss xmm1,dword ptr [__real@40800000 (0240CE0h)]
0023B5D2 mov edx,3
0023B5D7 movss xmm0,dword ptr [__real@40000000 (0240CC8h)]
0023B5DF mov ecx,1
0023B5E4 call addvi3f3 (0231712h)
0023B5E9 movss dword ptr [c],xmm0
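Here the integers 1 and 3 go in ecx and edx and 5 is pushed on the stack, exactly as fastcall would do, while the floats 2, 4 and 6 go in xmm0, xmm1 and xmm2. For functions that take floating-point or vector arguments, vectorcall therefore avoids the stack round trip that the traditional conventions need, which makes such calls cheaper.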
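The register assignment seen in this listing can be written down as a small model. This is a sketch, not a compiler API: assign_x86_vectorcall and print_assignment are hypothetical helpers, and the model covers only the 32-bit x86 case with 32-bit integers and scalar floats, assuming two integer registers (ecx, edx) and six vector registers (xmm0 through xmm5), with everything else going on the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of 32-bit x86 __vectorcall argument placement.
   kinds: one character per argument, 'i' = 32-bit integer, 'f' = float.
   out:   receives a register name or "stack" for each argument. */
void assign_x86_vectorcall(const char *kinds, const char **out)
{
    static const char *const int_regs[] = { "ecx", "edx" };
    static const char *const xmm_regs[] = {
        "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5"
    };
    size_t next_int = 0, next_xmm = 0, i;

    assert(kinds != NULL && out != NULL);
    for (i = 0; kinds[i] != '\0'; ++i) {
        assert(kinds[i] == 'i' || kinds[i] == 'f');
        if (kinds[i] == 'i' && next_int < 2)
            out[i] = int_regs[next_int++];   /* fastcall rule for integers */
        else if (kinds[i] == 'f' && next_xmm < 6)
            out[i] = xmm_regs[next_xmm++];   /* floats take the next free xmm */
        else
            out[i] = "stack";                /* registers exhausted */
    }
}

/* Prints one line per argument, e.g. for "ififif". */
void print_assignment(const char *kinds)
{
    const char *where[16];
    size_t i, n = strlen(kinds);

    assert(n <= 16);
    assign_x86_vectorcall(kinds, where);
    for (i = 0; i < n; ++i)
        printf("arg %u (%c) -> %s\n", (unsigned)(i + 1), kinds[i], where[i]);
}
```

Calling print_assignment("ififif") describes the signature of addvi3f3 and yields ecx, xmm0, edx, xmm1, stack, xmm2, which matches the listing above.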