How can I multiply two 64-bit numbers using x86 assembly language?

How would I go about...

  • multiplying two 64-bit numbers

  • multiplying two 16-digit hexadecimal numbers

...using assembly language?

I'm only allowed to use the registers %eax, %ebx, %ecx, %edx, and the stack.

EDIT: Oh, I'm using AT&T syntax on x86.
EDIT2: Not allowed to decompile into assembly...

Use what should probably be your course textbook, Randall Hyde's "The Art of Assembly Language".

See 4.2.4 - Extended Precision Multiplication

Although an 8x8, 16x16, or 32x32 multiply is usually sufficient, there are times when you may want to multiply larger values together. You will use the x86 single operand MUL and IMUL instructions for extended precision multiplication...

Probably the most important thing to remember when performing an extended precision multiplication is that you must also perform a multiple precision addition at the same time. Adding up all the partial products requires several additions that will produce the result. The following listing demonstrates the proper way to multiply two 64 bit values on a 32 bit processor...

(See the link for full assembly listing and illustrations.)
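The interplay the quoted passage describes (every partial product is immediately folded in with a multi-precision addition) can be sketched in C. This is not the book's listing; the function name `mp_mul` and the little-endian array-of-32-bit-words layout are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Schoolbook multiplication on little-endian arrays of 32-bit words.
 * out must have room for na + nb words and must not alias a or b.
 * (Hypothetical helper, named here for illustration.)
 */
void mp_mul(uint32_t *out, const uint32_t *a, size_t na,
            const uint32_t *b, size_t nb)
{
    assert(out != a && out != b);

    for (size_t k = 0; k < na + nb; k++)
        out[k] = 0;

    for (size_t i = 0; i < na; i++) {
        uint64_t carry = 0;
        for (size_t j = 0; j < nb; j++) {
            /* One 32x32 -> 64 partial product (what a single MUL yields),
               plus the word already accumulated, plus the running carry.
               The sum is at most 2^64 - 1, so it cannot overflow. */
            uint64_t t = (uint64_t)a[i] * b[j] + out[i + j] + carry;
            out[i + j] = (uint32_t)t;   /* low half stays in this word */
            carry = t >> 32;            /* high half moves to the next word */
        }
        out[i + nb] = (uint32_t)carry;  /* this word is still zero here */
    }
}
```

For two 64-bit operands, call it with `na = nb = 2`; the four output words then hold the full 128-bit product, lowest word first.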

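If this were x86-64:

# void function(uint64_t x, uint64_t y, uint64_t *lower, uint64_t *higher)
# Assumes the Microsoft x64 calling convention, which matches the use of
# %r8 and %r9 below: x in %rcx, y in %rdx, lower in %r8, higher in %r9.
function:
    movq %rcx,%rax    # Copy x into %rax
    mulq %rdx         # Unsigned multiply: %rdx:%rax = %rax * y
    movq %rax,(%r8)   # Store the low 64 bits into *lower
    movq %rdx,(%r9)   # Store the high 64 bits into *higher
    ret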
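For comparison, the same full-width product can be written in C, assuming a compiler that provides the `unsigned __int128` extension (GCC and Clang do on 64-bit targets; it is not standard C). The name `mul64_full` is made up here, and the parameter order simply mirrors the pseudo-signature above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Full 64x64 -> 128-bit unsigned product, split into two 64-bit halves. */
void mul64_full(uint64_t x, uint64_t y, uint64_t *lower, uint64_t *higher)
{
    assert(lower != NULL && higher != NULL);

    unsigned __int128 p = (unsigned __int128)x * y;
    *lower  = (uint64_t)p;          /* the half mulq leaves in %rax */
    *higher = (uint64_t)(p >> 64);  /* the half mulq leaves in %rdx */
}
```

On x86-64, compilers generally turn this into a single widening multiply, so it is a convenient reference for checking hand-written assembly.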

This code assumes you want x86 (not x64) code, that you probably only want a 64-bit product, and that you don't care about overflow or signed numbers. (A signed version is similar.)

MUL64_MEMORY:
     mov edi, val1high
     mov esi, val1low
     mov ecx, val2high
     mov ebx, val2low
MUL64_EDIESI_ECXEBX:
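     mov eax, edi
     mul ebx        ; EDX:EAX = val1high * val2low
     xchg eax, ebx  ; EBX = partial product for the top 32 bits, EAX = val2low
     mul esi        ; EDX:EAX = val2low * val1low
     xchg esi, eax  ; ESI = lower 32 bits of the product, EAX = val1low
     add ebx, edx   ; add the high half of the low partial product
     mul ecx        ; EDX:EAX = val1low * val2high
     add ebx, eax   ; final upper 32 bits
; answer here in EBX:ESI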

This doesn't honor the exact register constraints of the OP, but the result fits entirely in the registers offered by the x86. (This code is untested, but I think it's right.)

[Note: I transferred this answer (mine) from another question that got closed, because NONE of the other "answers" here directly answered the question.]

Since you're on x86 you need 4 mull instructions. Split each 64-bit quantity into two 32-bit words. Multiply the two low words; that product goes into the lowest and 2nd lowest words of the result. Then multiply each low word by the high word of the other number; those two products go into the 2nd and 3rd lowest words. Finally, multiply the two high words; that product goes into the 2 highest words. Add them all together, not forgetting to deal with the carry. You didn't specify the memory layout of the inputs and outputs, so it's impossible to write sample code.
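For illustration only, and assuming a layout the question never specified (operands passed by value, result returned as four 32-bit words with the lowest first), the four partial products and their carries could be combined like this in C. The type `u128_words` and the function `mul64x64` are hypothetical names.

```c
#include <stdint.h>

typedef struct { uint32_t w[4]; } u128_words;   /* w[0] is the lowest word */

u128_words mul64x64(uint64_t x, uint64_t y)
{
    uint32_t xl = (uint32_t)x, xh = (uint32_t)(x >> 32);
    uint32_t yl = (uint32_t)y, yh = (uint32_t)(y >> 32);

    /* The four 32x32 -> 64 partial products (one mull each). */
    uint64_t ll = (uint64_t)xl * yl;   /* lands in words 0 and 1 */
    uint64_t lh = (uint64_t)xl * yh;   /* lands in words 1 and 2 */
    uint64_t hl = (uint64_t)xh * yl;   /* lands in words 1 and 2 */
    uint64_t hh = (uint64_t)xh * yh;   /* lands in words 2 and 3 */

    u128_words r;
    r.w[0] = (uint32_t)ll;

    /* Word 1: three 32-bit pieces; the upper half of the sum is the carry. */
    uint64_t t = (ll >> 32) + (uint32_t)lh + (uint32_t)hl;
    r.w[1] = (uint32_t)t;

    /* Word 2: carry from word 1 plus three more 32-bit pieces. */
    t = (t >> 32) + (lh >> 32) + (hl >> 32) + (uint32_t)hh;
    r.w[2] = (uint32_t)t;

    /* Word 3: carry from word 2 plus the top half of the high product. */
    r.w[3] = (uint32_t)((t >> 32) + (hh >> 32));
    return r;
}
```

In assembly the same carries would be handled with add/adc chains; the 64-bit temporaries here just make the carry visible in C.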

It depends on which assembly language you are using. From what I remember from learning MIPS assembly, there are Move From Hi and Move From Lo instructions, mfhi and mflo. After a multiply, mfhi reads the upper half of the product and mflo reads the lower half (each half is 32 bits wide after a 32-bit mult, or 64 bits wide after a 64-bit dmult on MIPS64).

Ah, assembly; it's been a while since I've used it. So I'm assuming the real problem here is that the microcontroller you're working on (which is what I used to write assembly for, anyway) doesn't have 64-bit registers? If that's the case, you're going to have to break the numbers you're working with apart and perform multiple multiplications with the pieces.

This sounds like a homework assignment from the way you've worded it, so I'm not going to spell it out much further :P

Just do normal long multiplication, as if you were multiplying a pair of 2-digit numbers, except each "digit" is really a 32-bit integer. If you're multiplying two numbers at addresses X and Y and storing the result in Z, then what you want to do (in pseudocode) is:

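P[0..7] = X[0..3] * Y[0..3]                                 (full 64-bit product of the low words)
Z[0..3] = P[0..3]
Z[4..7] = P[4..7] + X[0..3] * Y[4..7] + X[4..7] * Y[0..3]   (keep only the low 32 bits of the sum)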

Note that we're discarding the upper 64 bits of the result (since a 64-bit number times a 64-bit number is a 128-bit number). Also note that this is assuming little-endian. Also, be careful about a signed versus an unsigned multiply.
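A minimal C sketch of this truncated multiply, assuming X, Y, and Z are each stored as two little-endian 32-bit words (index 0 is the low word). The function name `mul64_low` is invented for the example. It also includes the high half of the low-word product in the upper word, which is needed for the upper word to come out right.

```c
#include <assert.h>
#include <stdint.h>

/* z = low 64 bits of x * y; each operand is {low word, high word}.
   z must not alias x or y. */
void mul64_low(uint32_t z[2], const uint32_t x[2], const uint32_t y[2])
{
    assert(z != x && z != y);

    uint64_t low = (uint64_t)x[0] * y[0];   /* full 32x32 -> 64 product */
    z[0] = (uint32_t)low;

    /* Upper word: high half of the low product plus the low halves of the
       two cross products. 32-bit unsigned arithmetic wraps modulo 2^32,
       which is exactly the discarding of everything above bit 63. */
    z[1] = (uint32_t)(low >> 32) + x[0] * y[1] + x[1] * y[0];
}
```

On the signed-versus-unsigned point: for this truncated result the bit pattern is the same either way under two's complement. For example, feeding in the words of -3 (0xFFFFFFFFFFFFFFFD) and 5 yields the words of -15 (0xFFFFFFFFFFFFFFF1). The distinction matters once you keep the upper 64 bits of the 128-bit product.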

Find a C compiler that supports 64-bit integers (GCC does, IIRC), compile a program that does just that, then look at the disassembly. GCC can emit the assembly on its own (the -S option), and you can also get it out of an object file with the right tools.

OTOH, there is a 32b x 32b = 64b op on x86:

a:b * c:d = e:f
// goes to
e:f = b*d;
// y is the low half of each cross product; the high half x overflows
x:y = a*d;  e += y;
x:y = b*c;  e += y;

Everything else overflows.

(untested)

Edit: Unsigned only.

I'm betting you're a student, so see if you can make this work: Do it word by word, and use bit shifts. Think up the most efficient solution. Beware of the sign bit.
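One way to read the bit-shift hint is the classic shift-and-add method. The sketch below is only an illustration of that idea in C (the answer gives no code, and `mul_shift_add` is a made-up name). It works on whole 64-bit values for brevity; on a 32-bit machine each shift and add would itself span two words.

```c
#include <stdint.h>

/* Shift-and-add multiply: for every set bit i of y, add (x << i).
   Returns the low 64 bits of the product. */
uint64_t mul_shift_add(uint64_t x, uint64_t y)
{
    uint64_t result = 0;
    while (y != 0) {
        if (y & 1)
            result += x;   /* this bit of y contributes the shifted x */
        x <<= 1;           /* the next bit of y is worth twice as much */
        y >>= 1;           /* unsigned, so zeros are shifted in */
    }
    return result;
}
```

The sign-bit warning shows up here as the choice of unsigned types: an arithmetic right shift of a negative value keeps shifting in ones, so `y` would never reach zero.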

If you want a 128-bit version, try this (GCC inline assembly for x86-64):

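__uint128_t AES::XMULTX(__uint128_t TA,__uint128_t TB)
{
    // View a 128-bit value as two 64-bit halves. On little-endian x86-64,
    // LWORDS[0] is the low half. (Union type punning relies on GCC behavior.)
    union
    {
        __uint128_t WHOLE;
        struct
        {
            unsigned long long int LWORDS[2];
        } SPLIT;
    } KEY;
    // `register` dropped: the "m" constraints below need addressable variables.
    unsigned long long int __XRBX,__XRCX,__XRSI,__XRDI;
    __uint128_t RESULT;

    KEY.WHOLE=TA;
    __XRSI=KEY.SPLIT.LWORDS[0];     // TA low
    __XRDI=KEY.SPLIT.LWORDS[1];     // TA high
    KEY.WHOLE=TB;
    __XRBX=KEY.SPLIT.LWORDS[0];     // TB low
    __XRCX=KEY.SPLIT.LWORDS[1];     // TB high
    __asm__ __volatile__(
                 // Outputs are %0-%1, so the four inputs are %2-%5.
                 "movq          %2,             %%rsi           \n\t"   // rsi = TA low
                 "movq          %3,             %%rdi           \n\t"   // rdi = TA high
                 "movq          %4,             %%rbx           \n\t"   // rbx = TB low
                 "movq          %5,             %%rcx           \n\t"   // rcx = TB high
                 "movq          %%rdi,          %%rax           \n\t"
                 "mulq          %%rbx                           \n\t"   // rdx:rax = TA high * TB low
                 "xchgq         %%rbx,          %%rax           \n\t"   // rbx = low half of that, rax = TB low
                 "mulq          %%rsi                           \n\t"   // rdx:rax = TB low * TA low
                 "xchgq         %%rax,          %%rsi           \n\t"   // rsi = result low, rax = TA low
                 "addq          %%rdx,          %%rbx           \n\t"   // add high half of the low product
                 "mulq          %%rcx                           \n\t"   // rdx:rax = TA low * TB high
                 "addq          %%rax,          %%rbx           \n\t"   // rbx = result high
                 "movq          %%rsi,          %0              \n\t"
                 "movq          %%rbx,          %1              \n\t"
                 : "=m" (__XRSI), "=m" (__XRBX)
                 : "m" (__XRSI),  "m" (__XRDI), "m" (__XRBX), "m" (__XRCX)
                 : "rax","rbx","rcx","rdx","rsi","rdi","cc"
                 );
    KEY.SPLIT.LWORDS[0]=__XRSI;
    KEY.SPLIT.LWORDS[1]=__XRBX;
    RESULT=KEY.WHOLE;
    return RESULT;
}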

If you want 128-bit multiplication, then this should work. The inline assembly is in AT&T syntax.

__uint128_t FASTMUL128(const __uint128_t TA,const __uint128_t TB)
{
    // View a 128-bit value as two 64-bit halves (LWORDS[0] is the low half
    // on little-endian x86-64).
    union
    {
        __uint128_t WHOLE;
        struct
        {
            unsigned long long int LWORDS[2];
        } SPLIT;
    } KEY;
    // `register` dropped: the "m" constraints below need addressable variables.
    unsigned long long int __RAX,__RDX,__RSI,__RDI;
    __uint128_t RESULT;

    KEY.WHOLE=TA;
    __RAX=KEY.SPLIT.LWORDS[0];      // TA low
    __RDX=KEY.SPLIT.LWORDS[1];      // TA high
    KEY.WHOLE=TB;
    __RSI=KEY.SPLIT.LWORDS[0];      // TB low
    __RDI=KEY.SPLIT.LWORDS[1];      // TB high
    __asm__ __volatile__(
        // Outputs are %0-%1, so the four inputs are %2-%5.
        "movq           %2,             %%rsi           \n\t"   // rsi = TA low
        "movq           %3,             %%rdi           \n\t"   // rdi = TA high
        "movq           %4,             %%rbx           \n\t"   // rbx = TB low
        "movq           %5,             %%rcx           \n\t"   // rcx = TB high
        "movq           %%rdi,          %%rax           \n\t"
        "mulq           %%rbx                           \n\t"   // rdx:rax = TA high * TB low
        "xchgq          %%rbx,          %%rax           \n\t"   // rbx = low half of that, rax = TB low
        "mulq           %%rsi                           \n\t"   // rdx:rax = TB low * TA low
        "xchgq          %%rax,          %%rsi           \n\t"   // rsi = result low, rax = TA low
        "addq           %%rdx,          %%rbx           \n\t"   // add high half of the low product
        "mulq           %%rcx                           \n\t"   // rdx:rax = TA low * TB high
        "addq           %%rax,          %%rbx           \n\t"   // rbx = result high
        "movq           %%rsi,          %0              \n\t"   // store result low
        "movq           %%rbx,          %1              \n\t"   // store result high
        : "=m"(__RAX),"=m"(__RDX)
        :  "m"(__RAX), "m"(__RDX), "m"(__RSI), "m"(__RDI)
        : "rax","rbx","rcx","rdx","rsi","rdi","cc"
    );
    KEY.SPLIT.LWORDS[0]=__RAX;
    KEY.SPLIT.LWORDS[1]=__RDX;
    RESULT=KEY.WHOLE;
    return RESULT;
}
