How do I push the equivalent of NULL in C to the stack in assembly?

I'm writing a bubble sort for string sorting in assembly language and I'm using strtok() to tokenize the string. However, after the first call strtok(str," "), I need to pass NULL as a parameter, i.e. strtok(NULL," ").

I've tried NULL equ 0 in the .bss segment but this doesn't do anything.

[SECTION .data]

[SECTION .bss]

string resb 64
NULL equ 0

[SECTION .text]

extern fscanf
extern stdin
extern strtok

global main

main:

    push ebp        ; Set up stack frame for debugger
    mov ebp,esp
    push ebx        ; Program must preserve ebp, ebx, esi, & edi
    push esi
    push edi

    push cadena
    push frmt
    push dword [stdin]      ;Read string from stdin
    call fscanf
    add esp,12              ;clean stack

    push delim
    push string             ;this works
    call strtok
    add esp,8               ;clean stack

    ;after this step, the return value in eax points to the first word 

    push string             ;this does not
    push NULL
    call strtok
    add esp,8               ;clean stack

    ;after this step, eax points to 0x0

    pop edi         ; Restore saved registers
    pop esi
    pop ebx
    mov esp,ebp     ; Destroy stack frame before returning
    pop ebp
    ret         ;return control to linux

I've read that in "most implementations" NULL points to 0, whatever that means. Why is there ambiguity? What is the equivalent to NULL in x86 instruction set?

 push NULL 
 push string 
 call strtok

This is calling strtok(string, NULL). You want strtok(NULL, " "), so presuming that delim contains " ":

 push delim
 push NULL
 call strtok

In the cdecl calling convention, parameters go onto the stack in reverse (right-to-left) order, so the last argument is pushed first.
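For comparison, here is a minimal C sketch of the call sequence the assembly is meant to reproduce. The helper name count_tokens is made up for illustration and is not part of the question's code; the comments map each strtok call to the cdecl push order described above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits buf in place on spaces and returns the
   number of tokens. buf must be writable, because strtok overwrites
   each delimiter it finds with '\0'. */
size_t count_tokens(char *buf)
{
    size_t n = 0;

    /* First call passes the string. cdecl asm: push delim, then push buf. */
    char *tok = strtok(buf, " ");

    while (tok != NULL) {
        n++;
        /* Later calls pass NULL first. cdecl asm: push delim, then push 0. */
        tok = strtok(NULL, " ");
    }
    return n;
}
```

Calling count_tokens on a writable array holding "pear apple fig" should report three tokens.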


For the other part of your question (is NULL always zero), see: Is NULL always zero in C?

I've read that in "most implementations" NULL points to 0, whatever that means.

No, it is 0; it's not a pointer to anything. So yes, NULL equ 0 is correct, or just push 0.

In C source, (void*)0 is always NULL, but implementations are allowed to internally use a different non-zero bit-pattern for the object-representation of int *p = NULL;. Implementations that choose a non-zero bit-pattern need to translate at compile time. (And the translation only works at compile time for compile-time integer constant expressions with value zero that appear in a pointer context, not for memset or whatever.) The C++ FAQ has a whole section on NULL pointers. (Which also applies to C in this case.)

(It's legal in C to access the bit-pattern of an object with memcpy into an integer, or with (char*) aliasing onto it, so it is possible to detect this in a well-formed program that's free from undefined behaviour. Or of course by looking at the asm or memory contents with a debugger! In practice you can easily check what the right asm for a NULL is by compiling int*foo(){return NULL;} )
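As a rough sketch of the memcpy approach just described, the function below copies the object representation of a null pointer into a byte array and checks whether every byte is zero. The name null_is_all_bits_zero is invented for this example. On the usual x86 and x86-64 ABIs it is expected to return 1; the C standard does not promise that on every platform.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if this implementation stores a null
   int* as all-zero bytes, 0 otherwise. */
int null_is_all_bits_zero(void)
{
    int *p = NULL;
    unsigned char bytes[sizeof p];

    /* Copying an object's bytes with memcpy is well-defined in C. */
    memcpy(bytes, &p, sizeof p);

    for (size_t i = 0; i < sizeof p; i++) {
        if (bytes[i] != 0)
            return 0;
    }
    return 1;
}
```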

See also "Why is address zero used for the null pointer?" for some more background.

Why is there ambiguity? What is the equivalent to NULL in x86 instruction set?

In all x86 calling conventions / ABIs, the asm bit-pattern for NULL pointers is integer 0.

So push 0 or xor edi,edi (RDI=0) is always what you want on x86 / x86-64. (Modern calling conventions, including all x86-64 conventions, pass args in registers.) Windows x64 passes the first arg in RCX, not RDI.


@J...'s answer shows how to push args in right-to-left order for the calling convention you're using, resulting in the first (left-most) arg at the lowest address.

Really you can store them to the stack however you like (e.g. with mov) as long as they end up in the right place when call runs.


The C standard allows the bit-pattern to be different because C implementations on some hardware might want to use something else, e.g. a special bit-pattern that always faults when dereferenced, regardless of context. Or if 0 was a valid address value in real programs, it's better if p==NULL is always false for valid pointers. Or any other arcane hardware-specific reason.

So yes, there could have been some C implementations for x86 where (void*)0 in the C source turns into a non-zero integer in the asm. But in practice there aren't. (And most programmers are happy that memset(array_of_pointers, 0, size) actually sets them to NULL, which relies on the bit-pattern being 0, because some code makes that assumption without considering that it isn't guaranteed to be portable.)
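The memset assumption mentioned above can be checked with a small sketch like the following. The function name memset_yields_null and the array size are arbitrary choices for illustration. It relies on the platform using an all-zero null pointer, which holds on x86 but is not something the C standard guarantees.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if zero-filling an array of pointers
   with memset leaves every element comparing equal to NULL. */
int memset_yields_null(void)
{
    char *ptrs[4];

    /* Writes all-zero bytes; this is only a null pointer where the
       implementation represents NULL as all bits zero. */
    memset(ptrs, 0, sizeof ptrs);

    for (size_t i = 0; i < sizeof ptrs / sizeof ptrs[0]; i++) {
        if (ptrs[i] != NULL)
            return 0;
    }
    return 1;
}
```

Portable code would assign NULL to each element in a loop instead, which lets the compiler emit whatever bit-pattern the target uses.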

This is not done on x86 in any of the standard C ABIs. (An ABI is a set of implementation choices that compilers all follow so their code can call each other, e.g. agreeing on struct layout, calling conventions, and what p == NULL means.)

I'm not aware of any modern C implementations that use non-zero NULL on other 32 or 64-bit CPUs either; virtual memory makes it easy to avoid address 0.

http://c-faq.com/null/machexamp.html has some historical examples:

The Prime 50 series used segment 07777, offset 0 for the null pointer, at least for PL/I. Later models used segment 0, offset 0 for null pointers in C, necessitating new instructions such as TCNP (Test C Null Pointer), evidently as a sop to [footnote] all the extant poorly-written C code which made incorrect assumptions. Older, word-addressed Prime machines were also notorious for requiring larger byte pointers (char *) than word pointers (int *).

... see the link for more machines, and the footnote from this paragraph.

https://www.quora.com/On-which-actual-architectures-is-Cs-null-pointer-not-a-binary-zero-all-bits-zero reports finding a non-zero NULL on 286 Xenix, I guess using segmented pointers.


Modern x86 OSes make sure processes can't map anything into the lowest page of virtual address space, so NULL pointer dereference always faults noisily to make debugging easier.

E.g. Linux by default reserves the low 64 KiB of address space (vm.mmap_min_addr). This helps whether the zero came from a NULL pointer in the source, or whether some other bug zeroed a pointer with integer zeros. Reserving 64 KiB instead of just the low 4 KiB page also catches indexing a NULL pointer as an array, like p[i] with small to medium i values.

Fun fact: Windows 95 mapped the lowest pages of user-space virtual address space to the first 64 KiB of physical memory to work around a 386 B1 stepping erratum. But fortunately it was able to set things up so access from a normal 32-bit process did fault. Still, 16-bit code running in DOS compat mode could trash the whole machine very easily.

See https://devblogs.microsoft.com/oldnewthing/20141003-00/?p=43923 and https://news.ycombinator.com/item?id=13263976

You are actually asking two questions:

Question 1

I've read that ... NULL points to 0, whatever that means.

This means that nearly all C compilers define NULL as (void *)0.

This means that a NULL pointer is a pointer to the memory location with the address zero.

I've read that in "most implementations" ...

"Most" means that before the introduction of ISO C and ANSI C in the late 1980s, there were C compilers that defined NULL in a different way.

Maybe a few non-standard C compilers still exist that do not recognize the address 0 as NULL.

However, you can assume that your C compiler and the C library you use in your assembly project define NULL as a pointer to the address 0.

Question 2

How do I push the equivalent of NULL in C to the stack in assembly?

A pointer is an address.

Unlike some other CPUs, x86 CPUs don't distinguish between integers and addresses:

You push a NULL pointer by pushing the integer value 0.

 NULL equ 0
 push NULL

Unfortunately, you did not write which assembler you use. (Other users assume it is NASM.)

In this context, the instruction push NULL may be interpreted in two different ways by different assemblers:

  • Some assemblers would interpret this as: "Push the value 0".

    This would be correct.

  • Other assemblers would interpret this as: "Read the memory at memory location 0 and push that value".

    This would be equal to someFunction(*(int *)NULL) in C and therefore cause an exception (NULL pointer access).
