
Passing a pointer from C to assembly

I want to use a "_test_and_set lock" assembly-language implementation, built on the atomic swap (xchg) instruction, in my C/C++ program.

class LockImpl 
{
  public:
  static void lockResource(DWORD resourceLock )
  {
    __asm 
    {
      InUseLoop:  mov     eax, 0;0=In Use
                  xchg    eax, resourceLock
                  cmp     eax, 0
                  je      InUseLoop
    }

  }

  static void unLockResource(DWORD resourceLock )
  {
    __asm 
    {
      mov resourceLock , 1 
    }   

  }
};

This works, but there is a bug in it.

The problem is that I want to pass DWORD *resourceLock instead of DWORD resourceLock.

So the question is: how do I pass a pointer from C/C++ to assembly and get it back?

Thanks in advance.

Regards, -Jay.

P.S. This is done to avoid context switches between user space and kernel space.

If you're writing this for Windows, you should seriously consider using a critical section object. The critical section API functions are optimised so that they don't transition into kernel mode unless they really need to, so the normal, uncontended case has very little overhead.

The biggest problem with your spin lock is that on a single-CPU system, a thread waiting for the lock burns every cycle it is given, and whatever holds the lock doesn't even get a chance to run until the waiter's timeslice is up and the kernel preempts it.

Using a critical section will be more successful than trying to roll your own user-mode spin lock.

As for your actual question, it's pretty simple: change the function headers to take volatile DWORD *resourceLock, and change the assembly lines that touch resourceLock to use indirection:

mov ecx, dword ptr [resourceLock]
xchg eax, dword ptr [ecx]

and

mov ecx, dword ptr [resourceLock]
mov dword ptr [ecx], 1    ; plain store: the LOCK prefix is not valid on mov, and an aligned dword store is already atomic on x86
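For comparison, here is a minimal portable sketch of the same pointer-based test-and-set lock. It is not the MSVC inline assembly from the question: it swaps in C++11 std::atomic, and the LockWord alias is an assumption made for illustration. The function names mirror the question's, and on x86 the exchange() call typically compiles down to an xchg against memory.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Lock word convention from the question: 0 = in use, 1 = free.
// LockWord is a hypothetical alias standing in for the question's DWORD.
using LockWord = std::atomic<std::uint32_t>;

// Takes the ADDRESS of the lock word, so the swap hits the caller's variable
// rather than a by-value copy on the stack.
inline void lockResource(LockWord* resourceLock) {
    // exchange() is an atomic swap that returns the previous value.
    while (resourceLock->exchange(0, std::memory_order_acquire) == 0) {
        // Previous value was 0: someone else holds the lock, so try again.
    }
}

inline void unLockResource(LockWord* resourceLock) {
    // Debug-only sanity check: the lock should be held when it is released.
    assert(resourceLock->load(std::memory_order_relaxed) == 0);
    resourceLock->store(1, std::memory_order_release);
}
```

A caller would declare the word as LockWord word{1}; and pass &word to both functions.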

However, note that you've got a couple of other problems looming:

  • You say you're developing this on Windows but want to switch to Linux. However, you're using MSVC-specific inline assembly, which will have to be ported to gcc style when you move to Linux (in particular, that involves switching from Intel syntax to AT&T syntax). You will be much better off developing with gcc even on Windows; that will minimise the pain of migration (see mingw for gcc for Windows).

  • Greg Hewgill is absolutely right about spinning uselessly and stopping the lock holder from getting the CPU. Consider yielding the CPU if you've been spinning for too long.

  • On a multiprocessor x86, you might well have a problem with memory loads and stores being re-ordered around your lock; mfence instructions in the lock and unlock procedures might be necessary.
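The last two points can be sketched portably. The following is only an illustration under stated assumptions, not the code from the question: it uses std::atomic with acquire/release ordering in place of hand-written fences, and std::this_thread::yield() to give up the CPU. The class name, the spin limit of 100, and the countWithLock helper are all hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

class YieldingSpinLock {
public:
    void lock() {
        int spins = 0;
        // acquire: accesses inside the critical section cannot move above this swap.
        while (free_.exchange(0, std::memory_order_acquire) == 0) {
            if (++spins >= kSpinLimit) {
                std::this_thread::yield();  // let the lock holder run
                spins = 0;
            }
        }
    }

    void unlock() {
        assert(free_.load(std::memory_order_relaxed) == 0);  // must be held
        // release: accesses inside the critical section cannot move below this store.
        free_.store(1, std::memory_order_release);
    }

private:
    static constexpr int kSpinLimit = 100;  // arbitrary value chosen for illustration
    std::atomic<int> free_{1};              // 1 = free, 0 = in use
};

// Hypothetical demo helper: several threads increment a plain (non-atomic)
// counter, relying on the lock for both mutual exclusion and ordering.
inline long countWithLock(int threads, int itersPerThread) {
    YieldingSpinLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < itersPerThread; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

If the ordering or the mutual exclusion were broken, the final count would come up short of threads * itersPerThread.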


Really, if you're worrying about locking, that means you're using threading, which probably means you're already using the platform-specific threading APIs. So use the native synchronisation primitives, and switch to the pthreads versions when you move to Linux.

Apparently, you are compiling with MSVC and using inline assembly blocks in your C++ code.

As a general remark, you should really use compiler intrinsics, as inline assembly has no future: MS compilers no longer support it when compiling for x64.

If you need functions fine-tuned in assembly, you will have to implement them in separate files.

You should be using something like this:

volatile LONG resourceLock = 1;

if(InterlockedCompareExchange(&resourceLock, 0, 1) == 1) {
    // success!
    // do something, and then
    resourceLock = 1;
} else {
    // failed, try again later
}

See InterlockedCompareExchange.
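For readers off Windows, the same try-lock pattern can be sketched with std::atomic. This swaps in compare_exchange_strong for InterlockedCompareExchange and is an assumption-laden illustration; tryLockResource and unLockResource are hypothetical names.

```cpp
#include <atomic>
#include <cassert>

// Same convention as above: 1 = free, 0 = in use.
// Returns true if the lock was free and is now owned by the caller.
inline bool tryLockResource(std::atomic<long>& resourceLock) {
    long expected = 1;  // only succeed if the lock is currently free
    return resourceLock.compare_exchange_strong(
        expected, 0,
        std::memory_order_acquire,   // ordering on success
        std::memory_order_relaxed);  // ordering on failure
}

inline void unLockResource(std::atomic<long>& resourceLock) {
    assert(resourceLock.load(std::memory_order_relaxed) == 0);  // must be held
    resourceLock.store(1, std::memory_order_release);
}
```

Unlike the spinning versions, this never blocks: a false return means "failed, try again later", exactly as in the else branch above.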

The main problem with the original version in the question is that it needs to use register-indirect addressing and take a reference (or pointer parameter) rather than a by-value parameter for the lock DWORD.

Here's a working solution for Visual C++. EDIT: I have worked offline with the author, and we have verified that the code in this answer works correctly in his test harness.

But if you're using Windows, you should really be using the Interlocked API (i.e. InterlockedExchange).

Edit: As noted by caf, lock xchg is not required, because xchg with a memory operand automatically asserts a bus lock.

I also added a faster version that does a non-locking read before attempting the xchg. This significantly reduces bus-lock contention on the memory interface. The algorithm can be sped up quite a bit more (in a contended multithreaded case) by backing off (yield, then sleep) when a lock is held for a long time. On a single-CPU machine, an OS lock that sleeps immediately when the lock is held will be fastest.

class LockImpl
{
    // This is a simple SpinLock
    //  0 - in use / busy
    //  1 - free / available
public:
    static void lockResource(volatile DWORD &resourceLock )
    {
        __asm 
        {
            mov     ebx, resourceLock
InUseLoop:
            mov     eax, 0           ;0=In Use
            xchg    eax, [ebx]
            cmp     eax, 0
            je      InUseLoop
        }

    }

    static void lockResource_FasterVersion(volatile DWORD &resourceLock )
    {
        __asm 
        {
            mov     ebx, resourceLock
InUseLoop:
            mov     eax, [ebx]    ;// Read without BusLock 
            cmp     eax, 0
            je      InUseLoop     ;// Retry Read if Busy

            mov     eax, 0
            xchg    eax, [ebx]    ;// XCHG with BusLock
            cmp     eax, 0
            je      InUseLoop     ;// Retry if Busy
        }
    }

    static void unLockResource(volatile DWORD &resourceLock)
    {
        __asm 
        {
            mov     ebx, resourceLock
            mov     dword ptr [ebx], 1   ;// explicit operand size for the store
        }       

    }
};

// A little testing code here
volatile DWORD aaa=1;
void test()
{
 LockImpl::lockResource(aaa);
 LockImpl::unLockResource(aaa);
}
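The backoff idea mentioned before the listing (read without locking, then yield, then sleep) could look roughly like this in portable C++. This is a sketch under assumptions, not part of the verified answer: it uses std::atomic instead of inline assembly, and the class name and every threshold in it are arbitrary.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

class BackoffSpinLock {
public:
    bool try_lock() {
        // Cheap read first; only attempt the locked swap if the lock looks free.
        return free_.load(std::memory_order_relaxed) != 0 &&
               free_.exchange(0, std::memory_order_acquire) != 0;
    }

    void lock() {
        // All limits below are arbitrary values chosen for illustration.
        const int spinLimit = 64;
        const int yieldLimit = 16;
        const std::chrono::microseconds maxDelay{1024};
        std::chrono::microseconds delay{1};
        int attempts = 0;
        while (!try_lock()) {
            ++attempts;
            if (attempts < spinLimit) {
                continue;                           // stage 1: spin on plain reads
            }
            if (attempts < spinLimit + yieldLimit) {
                std::this_thread::yield();          // stage 2: give up the timeslice
                continue;
            }
            std::this_thread::sleep_for(delay);     // stage 3: sleep, doubling each time
            delay = std::min(delay * 2, maxDelay);
        }
    }

    void unlock() {
        assert(free_.load(std::memory_order_relaxed) == 0);  // must be held
        free_.store(1, std::memory_order_release);
    }

private:
    std::atomic<int> free_{1};  // 1 = free, 0 = in use
};
```

Because it exposes lock(), try_lock() and unlock(), the class should also work with std::lock_guard and std::unique_lock.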

Look at your compiler documentation to find out how to print the generated assembly language for functions.

Print the assembly language for this function:

static void unLockResource(DWORD resourceLock )
{
  resourceLock = 0;
  return;
}

This may not work, because the compiler can optimize the function and remove all of its code. You should change the above function to take a pointer to resourceLock and then have the function set the lock through that pointer. Print the assembly of this working function.
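The difference between the two signatures can be seen in a few lines. This sketch only demonstrates parameter passing, not thread safety (a plain volatile store is not a substitute for an atomic operation); LockWord and both function names are hypothetical stand-ins.

```cpp
#include <cassert>

// Hypothetical stand-in for the Windows DWORD type.
using LockWord = unsigned long;

// By value: the function receives a private copy, so the caller's lock word
// never changes and an optimizer is free to discard the store entirely.
inline void unlockByValue(LockWord resourceLock) {
    resourceLock = 1;
    (void)resourceLock;  // silence "set but not used" warnings
}

// By pointer: the store goes through the address to the caller's variable.
// volatile keeps the compiler from eliding the write.
inline void unlockByPointer(volatile LockWord* resourceLock) {
    assert(resourceLock != nullptr);
    *resourceLock = 1;
}
```

Comparing the generated assembly for the two functions shows the same thing: only the pointer version contains a store to memory the caller can observe.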

I already provided a working version that answered the original poster's question, both on how to get the parameters passed into ASM and on how to get his lock working correctly.

Many other answers have questioned the wisdom of using ASM at all and mentioned that either intrinsics or C OS calls should be used. The following works as well and is a C++ version of my ASM answer. There is a snippet of ASM in there that only needs to be used if your platform does not support InterlockedExchange().

class LockImpl
{
    // This is a simple SpinLock
    //  0 - in use / busy
    //  1 - free / available
public:
#if 1
    static DWORD MyInterlockedExchange(volatile DWORD *variable,DWORD newval)
    {
        // InterlockedExchange() uses LONG / He wants to use DWORD
        return((DWORD)InterlockedExchange(
            (volatile LONG *)variable,(LONG)newval));
    }
#else
    // You can use this if you don't have InterlockedExchange()
    // on your platform. Otherwise no ASM is required.
    static DWORD MyInterlockedExchange(volatile DWORD *variable,DWORD newval)
    {
        DWORD old;
        __asm 
        {
            mov     ebx, variable
            mov     eax, newval
            xchg    eax, [ebx]  ;// XCHG with BusLock
            mov     old, eax
        }
        return(old);
    }
#endif
    static void lockResource(volatile DWORD &resourceLock )
    {
        DWORD oldval;
        do 
        {
            while(0==resourceLock)
            {
                // Could have a yield, spin count, exponential 
                // backoff, OS CS fallback, etc. here
            }
            oldval=MyInterlockedExchange(&resourceLock,0);
        } while (0==oldval);
    }
    static void unLockResource(volatile DWORD &resourceLock)
    {
        // _ReadWriteBarrier() is a VC++ intrinsic that generates
        // no instructions / only prevents compiler reordering.
        // GCC uses __sync_synchronize() or __asm__ ( :::"memory" )
        _ReadWriteBarrier();
        resourceLock=1;
    }
};
