
Passing a pointer from C to assembly

I want to use a "test-and-set lock", implemented in assembly language with the atomic swap (xchg) instruction, in my C/C++ program.

class LockImpl 
{
  public:
  static void lockResource(DWORD resourceLock )
  {
    __asm 
    {
      InUseLoop:  mov     eax, 0;0=In Use
                  xchg    eax, resourceLock
                  cmp     eax, 0
                  je      InUseLoop
    }

  }

  static void unLockResource(DWORD resourceLock )
  {
    __asm 
    {
      mov resourceLock , 1 
    }   

  }
};

This compiles and runs, but there is a bug in it.

The problem is that I want to pass DWORD *resourceLock instead of DWORD resourceLock.

So the question is: how do I pass a pointer from C/C++ to assembly and get it back?

thanks in advance.

Regards, -Jay.

PS this is done to avoid context switches between user space and kernel space.

If you're writing this for Windows, you should seriously consider using a critical section object. The critical section API functions are optimised such that they won't transition into kernel mode unless they really need to, so the normal case of no contention has very little overhead.

The biggest problem with your spin lock is that if you're on a single CPU system and you're waiting for the lock, then you're using all the cycles you can and whatever is holding the lock won't even get a chance to run until your timeslice is up and the kernel preempts your thread.

Using a critical section will be more successful than trying to roll your own user mode spin lock.

In terms of your actual question, it's pretty simple: change the function signatures to take volatile DWORD *resourceLock, and change the assembly lines that touch resourceLock to go through the pointer:

mov ecx, dword ptr [resourceLock]
xchg eax, dword ptr [ecx]

and

mov ecx, dword ptr [resourceLock]
mov dword ptr [ecx], 1   ; plain store: the lock prefix is not valid on mov

However, note that you've got a couple of other problems looming:

  • You say you're developing this on Windows, but want to switch to Linux. However, you're using MSVC-specific inline assembly - this will have to be ported to gcc-style when you move to Linux (in particular that involves switching from Intel syntax to AT&T syntax). You will be much better off developing with gcc even on Windows; that will minimise the pain of migration (see mingw for gcc for Windows).

  • Greg Hewgill is absolutely right about spinning uselessly, stopping the lock-holder from getting CPU. Consider yielding the CPU if you've been spinning for too long.

  • On a multiprocessor x86, you might well have a problem with memory loads and stores being re-ordered around your lock - mfence instructions in the lock and unlock procedures might be necessary.
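
As a rough illustration of the first point, this is how the pointer-taking exchange might look in GCC extended inline assembly (AT&T syntax). It is only a sketch: the names atomic_xchg32, lock_resource, unlock_resource and demo_lock are made up here, std::uint32_t stands in for DWORD, and the non-x86 branch falls back to GCC's __atomic_exchange_n builtin so the snippet still compiles elsewhere. The "memory" clobber only stops the compiler from moving loads and stores across the exchange.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: atomically swap *lock with newval, return the old value.
inline std::uint32_t atomic_xchg32(volatile std::uint32_t* lock,
                                   std::uint32_t newval)
{
#if defined(__i386__) || defined(__x86_64__)
    // AT&T syntax. xchg with a memory operand is implicitly locked on x86.
    // "+r" = register that is read and written, "+m" = memory that is read
    // and written; the pointer is dereferenced in the constraint, not in asm.
    __asm__ __volatile__("xchgl %0, %1"
                         : "+r"(newval), "+m"(*lock)
                         :
                         : "memory");
    return newval;
#else
    // Assumption: GCC/Clang builtin used as a stand-in on other targets.
    return __atomic_exchange_n(lock, newval, __ATOMIC_SEQ_CST);
#endif
}

// Same convention as the question: 0 = in use, 1 = free.
inline void lock_resource(volatile std::uint32_t* lock)
{
    while (atomic_xchg32(lock, 0) == 0) {
        // previous value was 0 (in use): keep spinning
    }
}

inline void unlock_resource(volatile std::uint32_t* lock)
{
    // Release store: earlier writes are not moved past the unlock.
    __atomic_store_n(lock, 1u, __ATOMIC_RELEASE);
}

// Usage sketch (single-threaded, only to show the calling convention).
inline void demo_lock()
{
    volatile std::uint32_t resource = 1;   // starts free
    lock_resource(&resource);
    assert(resource == 0);
    unlock_resource(&resource);
    assert(resource == 1);
}
```

The caller passes the address of the lock word, exactly as with the MSVC version; the difference is that GCC is told about the operands through constraints instead of naming C variables inside the assembly text.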


Really, if you're worrying about locking that means you're using threading, which probably means you're using the platform-specific threading APIs already. So use the native synchronisation primitives, and switch out to the pthreads versions when you switch to Linux.

Apparently, you are compiling with MSVC using inline assembly blocks in your C++ code.

As a general remark, you should really use compiler intrinsics, as inline assembly has no future: it is no longer supported by the MS compilers when compiling for x64.

If you need to have functions fine tuned in assembly, you will have to implement them in separate files.

You should be using something like this:

volatile LONG resourceLock = 1;

if(InterlockedCompareExchange(&resourceLock, 0, 1) == 1) {
    // success!
    // do something, and then
    resourceLock = 1;
} else {
    // failed, try again later
}

See InterlockedCompareExchange.
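
For readers without the Windows API at hand, a portable sketch of the same try-lock idea follows. It swaps in C++11 std::atomic and compare_exchange_strong in place of InterlockedCompareExchange; the TryLock struct and demo_try_lock function are hypothetical names, and the 1 = free / 0 = busy convention is kept from the snippet above.

```cpp
#include <atomic>
#include <cassert>

struct TryLock {
    std::atomic<long> state{1};   // 1 = free, 0 = busy

    // Returns true if the lock was free and now belongs to the caller.
    bool try_lock() {
        long expected = 1;
        // Writes 0 only if the current value equals `expected` (1).
        return state.compare_exchange_strong(expected, 0,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed);
    }

    void unlock() { state.store(1, std::memory_order_release); }
};

// Usage sketch.
inline void demo_try_lock()
{
    TryLock l;
    if (l.try_lock()) {
        // success: do something, and then
        l.unlock();
    } else {
        // failed, try again later
    }
    assert(l.state.load() == 1);
}
```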

The main problem with the original version in the question is that it needs to use register-indirect addressing and take a reference (or pointer) parameter rather than a by-value parameter for the lock DWORD.

Here's a working solution for Visual C++. EDIT: I have worked offline with the author and we have verified the code in this answer works in his test harness correctly.

But if you're using Windows, you should really be using the Interlocked API (i.e. InterlockedExchange).

Edit: As noted by CAF, lock xchg is not required because xchg automatically asserts a BusLock.

I also added a faster version that does a non-locking read before attempting the xchg. This significantly reduces BusLock contention on the memory interface. The algorithm can be sped up quite a bit more (in a heavily contended multithreaded case) by backing off (yield, then sleep) when a lock is held for a long time. On a CPU with a single hardware thread, using an OS lock that sleeps immediately when the lock is held will be fastest.

class LockImpl
{
    // This is a simple SpinLock
    //  0 - in use / busy
    //  1 - free / available
public:
    static void lockResource(volatile DWORD &resourceLock )
    {
        __asm 
        {
            mov     ebx, resourceLock
InUseLoop:
            mov     eax, 0           ;0=In Use
            xchg    eax, [ebx]
            cmp     eax, 0
            je      InUseLoop
        }

    }

    static void lockResource_FasterVersion(volatile DWORD &resourceLock )
    {
        __asm 
        {
            mov     ebx, resourceLock
InUseLoop:
            mov     eax, [ebx]    ;// Read without BusLock 
            cmp     eax, 0
            je      InUseLoop     ;// Retry Read if Busy

            mov     eax, 0
            xchg    eax, [ebx]    ;// XCHG with BusLock
            cmp     eax, 0
            je      InUseLoop     ;// Retry if Busy
        }
    }

    static void unLockResource(volatile DWORD &resourceLock)
    {
        __asm 
        {
            mov     ebx, resourceLock
            mov     dword ptr [ebx], 1   ;// operand size must be given for an immediate store
        }       

    }
};

// A little testing code here
volatile DWORD aaa=1;
void test()
{
 LockImpl::lockResource(aaa);
 LockImpl::unLockResource(aaa);
}
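
The backoff idea mentioned above (spin, then yield, then sleep) could look roughly like this. It is a sketch written with C++11 std::atomic and std::this_thread rather than inline assembly; the class name BackoffSpinLock, the thresholds of 100 and 1000 spins and the 50 microsecond sleep are arbitrary assumptions, not tuned values.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

class BackoffSpinLock {
    std::atomic<unsigned> state_{1};   // 1 = free, 0 = busy
public:
    // Plain read first, so waiters do not hammer the bus with exchanges;
    // only attempt the atomic exchange when the lock looks free.
    bool try_lock() {
        return state_.load(std::memory_order_relaxed) != 0 &&
               state_.exchange(0, std::memory_order_acquire) != 0;
    }

    void lock() {
        unsigned spins = 0;
        while (!try_lock()) {
            if (spins < 1000) ++spins;
            if (spins < 100) {
                // short wait expected: busy-spin
            } else if (spins < 1000) {
                std::this_thread::yield();   // let the lock holder run
            } else {
                std::this_thread::sleep_for(std::chrono::microseconds(50));
            }
        }
    }

    void unlock() { state_.store(1, std::memory_order_release); }
};

// Usage sketch.
inline void demo_backoff()
{
    BackoffSpinLock l;
    l.lock();
    assert(!l.try_lock());   // already held
    l.unlock();
}
```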

Look at your compiler documentation to find out how to print the generated assembly language for functions.

Print the assembly language for this function:

static void unLockResource(DWORD resourceLock )
{
  resourceLock = 0;
  return;
}

This may not do what you expect: resourceLock is passed by value, so the assignment only changes a local copy, and the compiler is free to optimize the function down to nothing. Change the function to take a pointer to resourceLock and set the lock through it, then print the assembly of that working version.
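
To make the difference concrete, here is a small sketch contrasting a by-value and a by-pointer version. DWORD is defined locally as a stand-in for the Windows type, the function names are hypothetical, and the value 1 follows the question's convention (1 = free).

```cpp
#include <cassert>
#include <cstdint>

using DWORD = std::uint32_t;   // stand-in for the Windows typedef

// By value: only the function's private copy changes, so the caller's
// lock word is untouched and the compiler may drop the store entirely.
inline void unlockByValue(DWORD resourceLock)
{
    resourceLock = 1;
    (void)resourceLock;        // silence "set but not used" warnings
}

// By pointer: the store goes to the caller's variable. volatile keeps the
// compiler from eliding it, but does not make it atomic or ordered.
inline void unlockByPointer(volatile DWORD* resourceLock)
{
    *resourceLock = 1;
}

// Usage sketch.
inline void demo_unlock()
{
    volatile DWORD lockWord = 0;   // 0 = in use
    unlockByValue(lockWord);
    assert(lockWord == 0);         // unchanged: the bug from the question
    unlockByPointer(&lockWord);
    assert(lockWord == 1);         // now released
}
```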

I already provided a working version which answered the original poster's question both on how to get the parameters passed in ASM and how to get his lock working correctly.

Many other answers have questioned the wisdom of using ASM at all and mentioned that either intrinsics or OS API calls should be used. The following works as well and is a C++ version of my ASM answer. There is a snippet of ASM in there that only needs to be used if your platform does not support InterlockedExchange().

class LockImpl
{
    // This is a simple SpinLock
    //  0 - in use / busy
    //  1 - free / available
public:
#if 1
    static DWORD MyInterlockedExchange(volatile DWORD *variable,DWORD newval)
    {
        // InterlockedExchange() uses LONG / He wants to use DWORD
        return((DWORD)InterlockedExchange(
            (volatile LONG *)variable,(LONG)newval));
    }
#else
    // You can use this if you don't have InterlockedExchange()
    // on your platform. Otherwise no ASM is required.
    static DWORD MyInterlockedExchange(volatile DWORD *variable,DWORD newval)
    {
        DWORD old;
        __asm 
        {
            mov     ebx, variable
            mov     eax, newval
            xchg    eax, [ebx]  ;// XCHG with BusLock
            mov     old, eax
        }
        return(old);
    }
#endif
    static void lockResource(volatile DWORD &resourceLock )
    {
        DWORD oldval;
        do 
        {
            while(0==resourceLock)
            {
                // Could have a yield, spin count, exponential 
                // backoff, OS CS fallback, etc. here
            }
            oldval=MyInterlockedExchange(&resourceLock,0);
        } while (0==oldval);
    }
    static void unLockResource(volatile DWORD &resourceLock)
    {
        // _ReadWriteBarrier() is a VC++ intrinsic that generates
        // no instructions / only prevents compiler reordering.
        // The GCC compiler-only equivalent is __asm__ __volatile__("" ::: "memory");
        // __sync_synchronize() is stronger (a full hardware memory barrier).
        _ReadWriteBarrier();
        resourceLock=1;
    }
};
