How to loop over a list in cython pure mode

In an attempt to speed up struct.pack(), I have the following to pack an int to bytes:

import cython as c
from cython import nogil, compile, returns, locals, cfunc, pointer, address

int_bytes_buffer = c.declare(c.char[400], [0] * 400)


@locals(i = c.int, num = c.int)
@returns(c.int)
@cfunc
@nogil
@compile
def int_to_bytes(num):
    i = 0
    while num >0:
        int_bytes_buffer[i] = num%256
        num//=256
        i+=1

    return int_bytes_buffer[0]


int_to_bytes(259)

I'm trying to get this to work on a list of ints, with the following bad code:

@locals(i = c.int, ints_p = pointer(c.int[100]), num = c.int)
@returns(c.int)
@cfunc
@nogil
@compile
def int_to_bytes(num):
    i = 0
    for num in ints_p:
        while num >0:
            int_bytes_buffer[i] = num%256
            num//=256
            i+=1

    return int_bytes_buffer[0]

ints = c.declare(c.int[100],  [259]*100)
int_to_bytes(address(ints))

which gives me:

    for num in ints_p:
              ^
----------------------------------------------------------

 Accessing Python global or builtin not allowed without gil

Evidently I shouldn't be using "in", or looping over a pointer.

How can I loop over the list-made-array inside the function?

EDIT:

I'm trying to pass a pointer to an array of ints to the function, and have it work without the gil so it can be parallelized.

The parameter to the function should've been ints_p:

@locals(ints_p = pointer(c.int[100]), i = c.int, num = c.int)
@returns(c.int)
@cfunc
@nogil
@compile
def int_to_bytes(ints_p):
    i = 0
    for num in (*ints_p):
        while num >0:
            int_bytes_buffer[i] = num%256
            num//=256
            i+=1

    return int_bytes_buffer[0]

ints = c.declare(c.int[100],  [259]*100)
int_to_bytes(address(ints))

and I want to run over the actual ints and pack them (without the gil)

EDIT 2:

I am aware of struct.pack. I wish to make a parallelizable variant with Cython and nogil.

This is pointless:

  1. A Python int can be arbitrarily big. The actual computational work in "packing" it is working out whether it fits in a given size and then copying it to a space of that size. However, you're using an array of C ints. These have a fixed size. There is basically no work to be done in extracting them into an array of bytes. All you have done is write a very inefficient version of memcpy. They are literally already in memory as a contiguous set of bytes, so all you have to do is view them as such:

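     # using Numpy (no Cython)
     import numpy as np
     # np.intc matches a C int (the old np.int alias no longer exists in recent NumPy)
     ints = np.array([1,2,3,4,5,6,7], dtype=np.intc) # some numpy array already initialized
     as_bytes = ints.view(dtype=np.byte) # no data is copied - wonderfully efficient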

    You could make a similar approach work with another array library or with C arrays too:

     # slightly pointless use of pure-Python mode since this won't
     # be valid in Python.
     @cython.cfunc
     @cython.returns(cython.p_char)
     @cython.locals(x = cython.p_int)
     def cast_ptr(x):
         return cython.cast(cython.p_char,x)

  2. You say you want nogil so it can be parallelized. Parallelization works well when there's actual computational work to be done. It doesn't work well when the task is limited by memory access, since the threads tend to end up waiting for each other for access to the memory. This task will not parallelize well.

  3. Memory management is a problem. You're only capable of writing into fixed-size buffers. To allocate variable-sized arrays you have a number of choices: you can use numpy or the Python array module (or similar) to let Python take care of the memory management, or you can use malloc and free to allocate arrays at the C level. Since you claim to need nogil you have to use the C approach. However, you cannot do this from Cython's pure-Python mode, since everything also has to work in Python and there is no Python equivalent of malloc and free. If you insist on trying to make this work then you need to abandon Cython's pure-Python mode and use the standard Cython syntax, since what you are trying to do cannot be made compatible with both.

    Note that currently int_bytes_buffer is a global array. This means that multiple threads will share it - a disaster for your supposed parallelization.


You need to think clearly what your inputs are going to be. If it's a list of Python ints then you cannot make this work with nogil (since you are manipulating Python objects and this requires the GIL). If it's some C-level array (be it Numpy, the array module, or a Cython declared C array) then your data is already in the format you want and you just have to view it as such.
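As a minimal sketch of "just view it as such" using only the standard library: the array module's 'i' typecode stands in here for a C int array (an assumption for illustration, the question uses a Cython-declared array), and memoryview.cast reinterprets the same buffer as bytes without copying.

```python
import struct
from array import array

# 'i' is the native C int typecode; this array stands in for a C int array.
ints = array('i', [259] * 100)

# Zero-copy view of the same buffer as unsigned bytes.
as_bytes = memoryview(ints).cast('B')

# The bytes are identical to what struct.pack produces for the same
# values in native byte order.
packed = struct.pack('%di' % len(ints), *ints)
same = bytes(as_bytes) == packed
```

The only copy happens in bytes(as_bytes), which is there for the comparison; the view itself shares memory with the array.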


Edit: From the comments this is clearly an XY problem (you're asking about fixing this Cython syntax because you want to pack a list of ints), so I've added a fast way of packing a list of Python ints using Cython. This is 7x faster than struct.pack and 5x faster than passing a list to array.array. It's mostly faster because it's specialized to do only one thing.

I've used bytearray as a convenient writeable data store and the Python memoryview class (not quite the same as the Cython memoryview syntax...) as a way to cast the data types. No real effort has been spent optimising it, so you may be able to improve it. Note that the copy into bytes at the end does not measurably change the time, illustrating just how irrelevant copying the memory is to the overall speed.

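cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
def packlist(a):
    # writable buffer sized for one 4-byte C int per element
    out = bytearray(4*len(a))
    # the Python memoryview cast reinterprets the bytes as C ints
    cdef int[::1] outview = memoryview(out).cast('i')
    cdef int i
    for i in range(len(a)):
        outview[i] = a[i]  # converts each Python int to a C int
    return bytes(out)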
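For checking results without compiling anything, here is a plain-Python counterpart of packlist. It is a sketch for verification only, not the fast version: the name packlist_py is hypothetical, and it uses struct.calcsize('i') instead of the hard-coded 4 so the buffer size follows the platform's C int.

```python
import struct

def packlist_py(a):
    """Plain-Python counterpart of packlist, for checking results only."""
    out = bytearray(struct.calcsize('i') * len(a))
    outview = memoryview(out).cast('i')   # view the bytes as native C ints
    for i in range(len(a)):
        outview[i] = a[i]                 # fails if a[i] does not fit in a C int
    return bytes(out)

values = [1, 2, 3, 259]
result = packlist_py(values)
expected = struct.pack('%di' % len(values), *values)
```

Comparing result with expected confirms that the bytearray-plus-cast approach yields the same bytes as struct.pack in native byte order.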

There are a few errors in your code.

  1. The error "Accessing Python global or builtin not allowed without gil" means you need to remove the @nogil decorator. After removing it, that error no longer appears (tested with my code), but there are other errors.

  2. Your function has a few problems. With def int_to_bytes(num): you shouldn't pass num to the function, since num is assigned in the for loop. I changed it to def int_to_bytes(): and the function works, but there is still an error.

    @locals(i = c.int, ints_p = c.int(5), num = c.int)
    @returns(c.int)
    @cfunc
    @compile

    def int_to_bytes():
        ints_p = [1,2,3,4,5]
        i = 0
        for num in ints_p:
            while num >0:
                int_bytes_buffer[i] = num%256
                num//=256
                i+=1

        return int_bytes_buffer[1]

    a = int_to_bytes()
    print(a)
  3. Last, I don't understand why you pass the address to the function, since the function shouldn't take any arguments.

The code works for me:

import cython as c
from cython import nogil, compile, returns, locals, cfunc, pointer, address

int_bytes_buffer = c.declare(c.char[400], [0] * 400)

ints = c.declare(c.int[100],  [259]*100)
# for i in list(*address(ints)):
#   print(i)
@locals(i = c.int, num = c.int)
@returns(c.int)
@cfunc
@compile

def int_to_bytes(values):
    i = 0
    for num in list(*address(values)):
        while num >0:
            int_bytes_buffer[i] = num%256
            num//=256
            i+=1

    return int_bytes_buffer

a = int_to_bytes(ints)
print([i for i in a])
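The same packing loop can also be written with an explicit index instead of "for num in ...". The sketch below is plain Python with a hypothetical name pack_ints; in Cython, a range loop over a C-typed index is the form that should compile to a plain C loop. The fixed width of 4 bytes per int is an assumption that is not in the original code: the original while loop writes a variable number of bytes per value (nothing at all for 0), so the boundaries between ints are lost. It handles non-negative ints only.

```python
def pack_ints(values, width=4):
    """Write each non-negative int as `width` little-endian bytes."""
    n = len(values)
    out = bytearray(width * n)
    for j in range(n):              # index-based loop over the input
        num = values[j]
        for k in range(width):      # fixed width keeps each int at a known offset
            out[j * width + k] = num % 256
            num //= 256
    return out

packed = pack_ints([259, 1, 0])
```

Each value occupies exactly width bytes, so the j-th int always starts at offset j * width in the output.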

Hope it helps.
