
Conversion from size_t to int

Following this thread ...

For this piece of code:

#include <stdio.h>

int main(void)
{
    int i;
    size_t u;

    for (i = 0; i < 10; i++) {
        u = (size_t)i;
        printf("i = %d, u = %zu\n", i, u);
    }
    return 0;
}

The output in assembly is:

EDIT: Compiled with -O2

    .file   "demo.c"
    .section    .rodata.str1.1,"aMS",@progbits,1
.LC0:
    .string "i = %d, u = %zu\n"
    .section    .text.startup,"ax",@progbits
    .p2align 4,,15
    .globl  main
    .type   main, @function
main:
.LFB3:
    .cfi_startproc
    pushq   %rbx
    .cfi_def_cfa_offset 16
    .cfi_offset 3, -16
    xorl    %ebx, %ebx
    .p2align 4,,10
    .p2align 3
.L2:
    movq    %rbx, %rdx
    movl    %ebx, %esi
    xorl    %eax, %eax
    movl    $.LC0, %edi
    addq    $1, %rbx
    call    printf
    cmpq    $10, %rbx
    jne .L2
    xorl    %eax, %eax
    popq    %rbx
    .cfi_def_cfa_offset 8
    ret
    .cfi_endproc
.LFE3:
    .size   main, .-main
    .ident  "GCC: (Debian 4.7.2-5) 4.7.2"
    .section    .note.GNU-stack,"",@progbits

Is the conversion u = (size_t)i; consuming extra cycles?

Yes, as the code is posted, certainly. Your conversion is here:

movl    -4(%rbp), %eax
cltq
movq    %rax, -16(%rbp)

Of course, this code is unoptimized, so it's not a very fair comparison. If you compile it with optimization, the compiler may realize that the values are never negative and just do a single move from whatever register holds i to %rdx, which holds the third argument.

Edit:

As suspected, there is essentially no overhead in the optimized code. In this case, the compiler has converted the loop to count up u and derive i from u instead of the other way around. So %rbx is used for the loop counter, and the value of i is just %ebx, which is the lower 32 bits of %rbx. There is therefore no overhead in this example. I emphasise this because there may well be other cases where converting from int to size_t does carry a penalty. It depends entirely on the circumstances.
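The transformation described above can be written out at source level. The following is only a sketch: the function names are hypothetical and are not taken from the question, and a compiler is free to generate the same code for both versions. The first function converts an int counter on every pass; the second counts in size_t and derives the int from it, which is roughly what the -O2 listing does with %rbx and %ebx.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: int counter, converted to size_t on every pass. */
size_t sum_from_int_counter(int n)
{
    size_t total = 0;
    for (int i = 0; i < n; i++) {
        size_t u = (size_t)i;   /* the conversion from the question */
        total += u;
    }
    return total;
}

/* Hypothetical helper: size_t counter, int derived from its low bits. */
size_t sum_from_size_counter(int n)
{
    size_t total = 0;
    size_t limit = (n > 0) ? (size_t)n : 0;
    for (size_t u = 0; u < limit; u++) {
        int i = (int)u;         /* u < n <= INT_MAX, so the value fits */
        assert(i >= 0);
        total += u;
    }
    return total;
}
```

Both functions return the same result for any n. Whether either form is faster depends on the compiler and target, so it is worth inspecting the generated assembly rather than assuming.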

Yes, it does, as it changes the internal representation from 32-bit to 64-bit. Specifically,

.L3:
    movl    -4(%rbp), %eax
    cltq
    movq    %rax, -16(%rbp)
    movq    -16(%rbp), %rdx

reads i, performs sign extension, and copies the result to %rdx. I'm unsure why this value has to pass through the stack; as Mats pointed out, this looks like code from a non-optimizing compiler run.

EDIT

In the optimized assembly code, the loop counter is maintained as the wider data type. AFAIR, movs between registers don't differ in run-time cycles between quad and dword operands (indeed they don't: see table C-16 in Intel's pertinent doc, referenced by this SO post).
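To see why a sign extension is needed at all, recall that C converts a negative int to size_t by wrapping modulo SIZE_MAX + 1, so when size_t is wider than int the upper bits have to be filled in. The sketch below illustrates this; widen and widen_checked are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper wrapping the cast from the question. */
size_t widen(int i)
{
    /* Non-negative values are preserved; negative values wrap
       modulo SIZE_MAX + 1, as the C standard requires. */
    return (size_t)i;
}

/* Hypothetical helper: rejects negative inputs instead of wrapping. */
size_t widen_checked(int i)
{
    assert(i >= 0);
    return (size_t)i;
}
```

For instance, widen(7) yields 7, while widen(-1) yields SIZE_MAX. If the compiler can prove the value is never negative, as in the loop from the question, it has more freedom in how it produces the wider value.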

I'm not sure the conversion is what's consuming cycles for you. I believe it is the assignment that is consuming cycles.

For example, look at this t1.c:

#include <stdio.h>

int main(void)
{
    int i;
    size_t u;

    for (i = 0; i < 10; i++) {
        printf("i = %d, u = %zu\n", i, u);
    }
    return 0;
}

and the assembly for t1.c:

        .file   "t1.c"
        .section        .rodata
.LC0:
        .string "i = %d, u = %zu\n"
        .text
.globl main
        .type   main, @function
main:
        pushl   %ebp
        movl    %esp, %ebp
        andl    $-16, %esp
        subl    $32, %esp
        movl    $0, 24(%esp)
        jmp     .L2
.L3:
        movl    $.LC0, %eax
        movl    28(%esp), %edx
        movl    %edx, 8(%esp)
        movl    24(%esp), %edx
        movl    %edx, 4(%esp)
        movl    %eax, (%esp)
        call    printf
        addl    $1, 24(%esp)
.L2:
        cmpl    $9, 24(%esp)
        jle     .L3
        movl    $0, %eax
        leave
        ret
        .size   main, .-main
        .ident  "GCC: (GNU) 4.4.6 20110731 (Red Hat 4.4.6-3)"
        .section        .note.GNU-stack,"",@progbits

In the above case there is no assignment at all, so it is fine for now.

Second case, t2.c:

#include <stdio.h>

int main(void)
{
    int i;
    size_t u;

    for (i = 0; i < 10; i++) {
        i = (size_t) u;
        printf("i = %d, u = %zu\n", i, u);
    }
    return 0;
}

and the subsequent assembly:

        .file   "t2.c"
        .section        .rodata
.LC0:
        .string "i = %d, u = %zu\n"
        .text
.globl main
        .type   main, @function
main:
        pushl   %ebp
        movl    %esp, %ebp
        andl    $-16, %esp
        subl    $32, %esp
        movl    $0, 24(%esp)
        jmp     .L2
.L3:
        movl    28(%esp), %eax
        movl    %eax, 24(%esp)
        movl    $.LC0, %eax
        movl    28(%esp), %edx
        movl    %edx, 8(%esp)
        movl    24(%esp), %edx
        movl    %edx, 4(%esp)
        movl    %eax, (%esp)
        call    printf
        addl    $1, 24(%esp)
.L2:
        cmpl    $9, 24(%esp)
        jle     .L3
        movl    $0, %eax
        leave
        ret
        .size   main, .-main
        .ident  "GCC: (GNU) 4.4.6 20110731 (Red Hat 4.4.6-3)"
        .section        .note.GNU-stack,"",@progbits

Check the statements above:

movl    28(%esp), %eax
movl    %eax, 24(%esp)

Now for the last example, t3.c:

#include <stdio.h>

int main(void)
{
    int i;
    int u;

    for (i = 0; i < 10; i++) {
        i = u;
        printf("i = %d, u = %zu\n", i, u);
    }
    return 0;
}

and the subsequent assembly:

        .file   "t3.c"
        .section        .rodata
.LC0:
        .string "i = %d, u = %zu\n"
        .text
.globl main
        .type   main, @function
main:
        pushl   %ebp
        movl    %esp, %ebp
        andl    $-16, %esp
        subl    $32, %esp
        movl    $0, 24(%esp)
        jmp     .L2
.L3:
        movl    28(%esp), %eax
        movl    %eax, 24(%esp)
        movl    $.LC0, %eax
        movl    28(%esp), %edx
        movl    %edx, 8(%esp)
        movl    24(%esp), %edx
        movl    %edx, 4(%esp)
        movl    %eax, (%esp)
        call    printf
        addl    $1, 24(%esp)
.L2:
        cmpl    $9, 24(%esp)
        jle     .L3
        movl    $0, %eax
        leave
        ret
        .size   main, .-main
        .ident  "GCC: (GNU) 4.4.6 20110731 (Red Hat 4.4.6-3)"
        .section        .note.GNU-stack,"",@progbits

Now you can compare t2 and t3. In this 32-bit build the two listings are identical apart from the file name, so the cast adds no instructions here, but this really varies from architecture to architecture.
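One quick way to judge whether the conversion can cost anything on a given target is to compare the widths of the two types. This is a minimal sketch with hypothetical helper names; it reports storage widths only and says nothing about actual cycle counts.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: storage width of int in bits. */
unsigned int_bits(void)
{
    return (unsigned)(sizeof(int) * CHAR_BIT);
}

/* Hypothetical helper: storage width of size_t in bits. */
unsigned size_t_bits(void)
{
    return (unsigned)(sizeof(size_t) * CHAR_BIT);
}

/* Returns 1 when int -> size_t is a widening conversion on this target,
   0 when both types have the same (or a smaller) storage width. */
int conversion_widens(void)
{
    /* Sanity check: C guarantees at least 16 bits for both types. */
    assert(int_bits() >= 16 && size_t_bits() >= 16);
    return size_t_bits() > int_bits();
}
```

On a target where both types are 32 bits wide, as the t1 to t3 listings suggest, conversion_widens() returns 0 and the cast has nothing to extend. On the x86-64 build from the question, size_t is 64 bits, so a widening step is possible unless the optimizer removes it.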
