ARM inline assembly for 32-bit word rotate
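I'm trying to craft some inline assembly to test the performance of rotate on ARM. The code is part of a C++ code base, so the rotates are template specializations. The code is below, but it produces messages that don't make much sense to me.
According to ARM Assembly Language, the instructions are roughly as follows: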
# rotate - rotate instruction
# dst - output operand
# lhs - value to be rotated
# rhs - rotate amount (immediate or register)
<rotate> <dst>, <lhs>, <rhs>
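The messages don't make much sense to me because, for example, I use g to constrain the output register, and per Simple Constraints that is just a general purpose register. ARM is supposed to have plenty of them, and Machine Specific Constraints does not appear to change the behavior of the constraint.
I'm not sure of the best way to approach this, so I'm going to ask three questions: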
arm-linux-androideabi-g++ -DNDEBUG -g2 -Os -pipe -fPIC -mfloat-abi=softfp
-mfpu=vfpv3-d16 -mthumb --sysroot=/opt/android-ndk-r10e/platforms/android-21/arch-arm
-I/opt/android-ndk-r10e/sources/cxx-stl/stlport/stlport/ -c camellia.cpp
In file included from seckey.h:9:0,
from camellia.h:9,
from camellia.cpp:14:
misc.h: In function 'T CryptoPP::rotlFixed(T, unsigned int) [with T = unsigned int]':
misc.h:1121:71: error: matching constraint not valid in output operand
__asm__ ("rol %2, %0, %1" : "=g2" (z) : "g0" (x), "M1" ((int)(y%32)));
^
misc.h:1121:71: error: matching constraint references invalid operand number
misc.h: In function 'T CryptoPP::rotrFixed(T, unsigned int) [with T = unsigned int]':
misc.h:1129:71: error: matching constraint not valid in output operand
__asm__ ("ror %2, %0, %1" : "=g2" (z) : "g0" (x), "M1" ((int)(y%32)));
^
misc.h:1129:71: error: matching constraint references invalid operand number
misc.h: In function 'T CryptoPP::rotlVariable(T, unsigned int) [with T = unsigned int]':
misc.h:1137:72: error: matching constraint not valid in output operand
__asm__ ("rol %2, %0, %1" : "=g2" (z) : "g0" (x), "g1" ((int)(y%32)));
^
misc.h:1137:72: error: matching constraint references invalid operand number
misc.h: In function 'T CryptoPP::rotrVariable(T, unsigned int) [with T = unsigned int]':
misc.h:1145:72: error: matching constraint not valid in output operand
__asm__ ("ror %2, %0, %1" : "=g2" (z) : "g0" (x), "g1" ((int)(y%32)));
^
misc.h:1145:72: error: matching constraint references invalid operand number
misc.h: In function 'T CryptoPP::rotrFixed(T, unsigned int) [with T = unsigned int]':
misc.h:1129:71: error: matching constraint not valid in output operand
__asm__ ("ror %2, %0, %1" : "=g2" (z) : "g0" (x), "M1" ((int)(y%32)));
^
misc.h:1129:71: error: invalid lvalue in asm output 0
misc.h:1129:71: error: matching constraint references invalid operand number
misc.h: In function 'T CryptoPP::rotlFixed(T, unsigned int) [with T = unsigned int]':
misc.h:1121:71: error: matching constraint not valid in output operand
__asm__ ("rol %2, %0, %1" : "=g2" (z) : "g0" (x), "M1" ((int)(y%32)));
^
misc.h:1121:71: error: invalid lvalue in asm output 0
misc.h:1121:71: error: matching constraint references invalid operand number
// ROL #n Rotate left immediate
template<> inline word32 rotlFixed<word32>(word32 x, unsigned int y)
{
    int z;
    __asm__ ("rol %2, %0, %1" : "=g2" (z) : "g0" (x), "M1" ((int)(y%32)));
    return static_cast<word32>(z);
}
// ROR #n Rotate right immediate
template<> inline word32 rotrFixed<word32>(word32 x, unsigned int y)
{
    int z;
    __asm__ ("ror %2, %0, %1" : "=g2" (z) : "g0" (x), "M1" ((int)(y%32)));
    return static_cast<word32>(z);
}
// ROR rn Rotate left by a register
template<> inline word32 rotlVariable<word32>(word32 x, unsigned int y)
{
    int z;
    __asm__ ("rol %2, %0, %1" : "=g2" (z) : "g0" (x), "g1" ((int)(y%32)));
    return static_cast<word32>(z);
}
// ROR rn Rotate right by a register
template<> inline word32 rotrVariable<word32>(word32 x, unsigned int y)
{
    int z;
    __asm__ ("ror %2, %0, %1" : "=g2" (z) : "g0" (x), "g1" ((int)(y%32)));
    return static_cast<word32>(z);
}
template<> inline word32 rotlMod<word32>(word32 x, unsigned int y)
{
    return rotlVariable<word32>(x, y);
}
template<> inline word32 rotrMod<word32>(word32 x, unsigned int y)
{
    return rotrVariable<word32>(x, y);
}
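First, ARM does not have a rotate left (ROL); you need to emulate it with ROR.
Second, for some reason the M constraint accepts 0 to 32, but ROR only accepts 0 to 31 when dealing with immediates.
Third, the g constraint is too generic, because it also allows memory operands, which ROR does not accept. It is better to use r instead.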
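The first point rests on the identity that rotating a 32-bit word left by n gives the same result as rotating it right by 32 - n. A minimal portable sketch of that identity, with no inline assembly, might look like this. The names rotr_ref and rotl_via_rotr are made up for this illustration and are not part of the code base.

```cpp
#include <cstdint>

// Portable rotate right (hypothetical reference helper).
// Both shift counts are masked to 0..31, so a shift by 32,
// which would be undefined behaviour, can never occur.
inline std::uint32_t rotr_ref(std::uint32_t x, unsigned int n)
{
    n &= 31;
    return (x >> n) | (x << ((32 - n) & 31));
}

// Rotate left expressed purely through rotate right:
// rotl(x, n) == rotr(x, 32 - n), with the amount reduced mod 32.
inline std::uint32_t rotl_via_rotr(std::uint32_t x, unsigned int n)
{
    return rotr_ref(x, (32 - (n & 31)) & 31);
}
```

A reference like this can also serve as a check for the asm versions: both should agree for every rotate amount, including 0 and 32.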
Here is what I came up with:
// Rotate right
inline word32 rotr(word32 x, unsigned int y)
{
    word32 z = x; // result when the rotate amount is 0 (mod 32) and no asm runs
    if (__builtin_constant_p(y))
    {
        y &= 31;
        if (y != 0) // this should be optimized away by the compiler
        {
            __asm__ ("ror %0, %1, %2" : "=r" (z) : "r" (x), "M" (y));
        }
    } else {
        __asm__ ("ror %0, %1, %2" : "=r" (z) : "r" (x), "r" (y));
    }
    return z;
}
// Rotate left
inline word32 rotl(word32 x, unsigned int y)
{
    word32 z = x; // result when the rotate amount is 0 (mod 32) and no asm runs
    if (__builtin_constant_p(y))
    {
        y &= 31;
        if (y != 0) // this should be optimized away by the compiler
        {
            __asm__ ("ror %0, %1, %2" : "=r" (z) : "r" (x), "M" (32 - y));
        }
    } else {
        __asm__ ("ror %0, %1, %2" : "=r" (z) : "r" (x), "r" (32 - y));
    }
    return z;
}
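Because the ror mnemonic only exists on ARM, one possible way to keep the file building on other targets is to guard the asm and fall back to plain C++. The following is only a sketch under stated assumptions: word32 is taken to be a 32-bit unsigned type, __arm__ is the macro GCC and Clang predefine for 32-bit ARM targets, the three-operand register form of ror is assumed to be available (ARM or Thumb-2 mode), and the name rotr_guarded is hypothetical.

```cpp
#include <cstdint>

typedef std::uint32_t word32; // assumption: word32 is a 32-bit unsigned type

// Rotate right by a run-time amount (hypothetical helper for illustration).
inline word32 rotr_guarded(word32 x, unsigned int y)
{
#if defined(__GNUC__) && defined(__arm__)
    word32 z;
    // Register form of ROR; only the low byte of y is used by the
    // hardware, which is equivalent to rotating by y mod 32.
    __asm__ ("ror %0, %1, %2" : "=r" (z) : "r" (x), "r" (y));
    return z;
#else
    // Portable fallback; masking keeps both shift counts in 0..31.
    y &= 31;
    return (x >> y) | (x << ((32 - y) & 31));
#endif
}
```

With this layout the same call site works on every target, and the two paths can be compared against each other when measuring whether the asm is actually faster than what the compiler generates from the fallback.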