
Interfacing Python and Torch7 (Lua) via a shared library

I am trying to pass data (arrays) between Python and Lua, and I want to manipulate the data in Lua using the Torch7 framework. I figured this is best done through C, since both Python and Lua interface with C. This approach is also fast, because only pointers are passed and no data has to be copied.

I implemented two programs: one where Lua is embedded in C, and one where Python passes data to C. Both work when compiled to executable binaries. However, when the C-to-Lua program is built as a shared library instead, things stop working.

The details: I'm using 64-bit Ubuntu 14.04 and 12.04, and LuaJIT 2.0.2 with Lua 5.1 installed in /usr/local/. Dependency libs are in /usr/local/lib and headers are in /usr/local/include. I'm using Python 2.7.

The code for the C-to-Lua program is:

tensor.lua

require 'torch'

function hi_tensor(t)
   print('Hi from lua')
   torch.setdefaulttensortype('torch.FloatTensor')
   print(t)
   return t*2
end

cluaf.h

void multiply (float* array, int m, int n, float *result, int m1, int n1);

cluaf.c

#include <stdio.h>
#include <string.h>
#include "lua.h"
#include "lauxlib.h"
#include "lualib.h"
#include "luaT.h"
#include "TH/TH.h"

void multiply (float* array, int m, int n, float *result, int m1, int n1)
{
    lua_State *L = luaL_newstate();
    luaL_openlibs( L );

    // loading the lua file
    if (luaL_loadfile(L, "tensor.lua") || lua_pcall(L, 0, 0, 0))
    {
        printf("error: %s \n", lua_tostring(L, -1));
    }

    // convert the c array to Torch7 specific structure representing a tensor
    THFloatStorage* storage =  THFloatStorage_newWithData(array, m*n);
    THFloatTensor* tensor = THFloatTensor_newWithStorage2d(storage, 0, m, n, n, 1);
    luaT_newmetatable(L, "torch.FloatTensor", NULL, NULL, NULL, NULL);

    // load the lua function hi_tensor
    lua_getglobal(L, "hi_tensor");
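    if(!lua_isfunction(L,-1))
    {
        // without the function there is nothing to call, so bail out
        printf("error: hi_tensor is not a function \n");
        lua_pop(L,1);
        return;
    }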

    //this pushes data to the stack to be used as a parameter
    //to the hi_tensor function call
    luaT_pushudata(L, (void *)tensor, "torch.FloatTensor");

    // call the lua function hi_tensor
    if (lua_pcall(L, 1, 1, 0) != 0)
    {
        printf("error running function `hi_tensor': %s \n", lua_tostring(L, -1));
    }

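    // get results returned from the lua function hi_tensor
    THFloatTensor* z = luaT_toudata(L, -1, "torch.FloatTensor");
    if (z != NULL)
    {
        // copy into the caller's buffer: assigning to `result` only changes the
        // local pointer (assumes a contiguous result with m1*n1 elements)
        memcpy(result, THFloatTensor_data(z), sizeof(float) * m1 * n1);
    }
    lua_pop(L, 1);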

    return ;
}

Then to test I do:

luajit -b tensor.lua tensor.o

gcc -w -c -Wall -Wl,-E -fpic cluaf.c -lluajit -lluaT -lTH -lm -ldl -L /usr/local/lib

gcc -shared cluaf.o tensor.o -L/usr/local/lib -lluajit -lluaT -lTH -lm -ldl -Wl,-E -o libcluaf.so

gcc -L. -Wall -o test main.c -lcluaf

./test

The output:

Hi from lua
 1.0000  0.2000
 0.2000  5.3000
[torch.FloatTensor of dimension 2x2]

c result 2.000000 
c result 0.400000 
c result 0.400000 
c result 10.60000

So far so good. But when I try to use the shared library in python it breaks.

test.py

from ctypes import byref, cdll, c_int
import ctypes
import numpy as np
import cython

l = cdll.LoadLibrary('absolute_path_to_so/libcluaf.so')

# float32 matches the float* parameters of multiply()
a = np.arange(4, dtype=np.float32).reshape((2,2))
b = np.arange(4, dtype=np.float32).reshape((2,2))

l.multiply.argtypes = [ctypes.POINTER(ctypes.c_float), ctypes.c_int, ctypes.c_int,     ctypes.POINTER(ctypes.c_float), ctypes.c_int, ctypes.c_int]
a_list = []
b_list = []

for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        a_list.append(a[i][j])

for i in range(b.shape[0]):
    for j in range(b.shape[1]):
        b_list.append(b[i][j])

# initialize the ctypes buffers from the lists; without initializers they stay zero-filled
arr_a = (ctypes.c_float * len(a_list))(*a_list)
arr_b = (ctypes.c_float * len(b_list))(*b_list)

l.multiply(arr_a, ctypes.c_int(2), ctypes.c_int(2), arr_b, ctypes.c_int(2), ctypes.c_int(2))
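
As an aside on the "passing only pointers" goal: the test above copies the array element by element into a ctypes buffer. A minimal sketch of handing NumPy's own buffer to C without a copy could look like the following. The helper name as_float_ptr is hypothetical (it is not part of the original code), and the sketch assumes a C-contiguous float32 array.

```python
import ctypes
import numpy as np

def as_float_ptr(arr):
    """Return a float* that points at arr's own buffer (no copy is made).

    as_float_ptr is a hypothetical helper name used for illustration only.
    """
    if arr.dtype != np.float32 or not arr.flags["C_CONTIGUOUS"]:
        raise TypeError("expected a C-contiguous float32 array")
    return arr.ctypes.data_as(ctypes.POINTER(ctypes.c_float))

a = np.arange(4, dtype=np.float32).reshape((2, 2))
ptr = as_float_ptr(a)

# Writes through the pointer land directly in the NumPy buffer,
# which shows that both names refer to the same memory.
ptr[3] = 42.0
```

Such a pointer could be passed to l.multiply in place of arr_a, as long as the NumPy array is kept alive for the duration of the call.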

I run:

python test.py

and the output is:

error: error loading module 'libpaths' from file '/usr/local/lib/lua/5.1/libpaths.so':
    /usr/local/lib/lua/5.1/libpaths.so: undefined symbol: lua_gettop

I searched for this error here and elsewhere on the web, but the answers suggest either (1) adding -Wl,-E to export symbols or (2) adding the dependencies when linking, both of which I did. (1) I have -Wl,-E, but it does not seem to do anything. (2) I have included the dependencies (-L/usr/local/lib -lluajit -lluaT -lTH -lm -ldl).

The Python test fails not when the shared library is loaded, but when require 'torch' is called inside Lua. That is also what distinguishes this case from the others I found.

luajit.so defines the symbol lua_gettop (run nm /usr/local/lib/luajit.so to see that), and lua.h declares LUA_API int (lua_gettop) (lua_State *L);

I guess that when compiling C to a binary everything works because all the symbols declared in lua.h are found, but when using the shared library lua_gettop is not picked up from luajit.so (I don't know why).

www.luajit.org/running.html says: 'On most ELF-based systems (e.g. Linux) you need to explicitly export the global symbols when linking your application, e.g. with: -Wl,-E. require() tries to load embedded bytecode data from exported symbols (in *.exe or lua51.dll on Windows) and from shared libraries in package.cpath.'

package.cpath and package.path are:

./?.so;/usr/local/lib/lua/5.1/?.so;/usr/local/lib/lua/5.1/loadall.so

./?.lua;/usr/local/share/luajit-2.0.2/?.lua;/usr/local/share/lua/5.1/?.lua;/usr/local/share/lua/5.1/?/init.lua

Here is what nm libcluaf.so returns:

00000000002020a0 B __bss_start
00000000002020a0 b completed.6972
                 w __cxa_finalize@@GLIBC_2.2.5
0000000000000a50 t deregister_tm_clones
0000000000000ac0 t __do_global_dtors_aux
0000000000201dd8 t __do_global_dtors_aux_fini_array_entry
0000000000202098 d __dso_handle
0000000000201de8 d _DYNAMIC
00000000002020a0 D _edata
00000000002020a8 B _end
0000000000000d28 T _fini
0000000000000b00 t frame_dummy
0000000000201dd0 t __frame_dummy_init_array_entry
0000000000000ed0 r __FRAME_END__
0000000000202000 d _GLOBAL_OFFSET_TABLE_
                 w __gmon_start__
0000000000000918 T _init
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
0000000000201de0 d __JCR_END__
0000000000201de0 d __JCR_LIST__
                 w _Jv_RegisterClasses
                 U lua_getfield
0000000000000d99 R luaJIT_BC_tensor
                 U luaL_loadfile
                 U luaL_newstate
                 U luaL_openlibs
                 U lua_pcall
                 U lua_settop
                 U luaT_newmetatable
                 U lua_tolstring
                 U luaT_pushudata
                 U luaT_toudata
                 U lua_type
0000000000000b35 T multiply
                 U printf@@GLIBC_2.2.5
0000000000000a80 t register_tm_clones
                 U THFloatStorage_newWithData
                 U THFloatTensor_newWithStorage2d
00000000002020a0 d __TMC_END__

Thanks in advance

On Linux, Lua modules don't link to the Lua library directly but instead expect to find the Lua API functions already loaded. This is usually done by exporting them from the interpreter using the -Wl,-E linker flag. This flag only works for symbols in executables, not shared libraries. For shared libraries there is something similar: the RTLD_GLOBAL flag for the dlopen function. By default, all shared libraries listed on the compiler command line are loaded using RTLD_LOCAL instead, but fortunately Linux reuses already opened library handles. So you can either:

Preload the Lua(JIT) library using RTLD_GLOBAL before it gets loaded automatically (which happens when you load libcluaf.so):

from ctypes import byref, cdll, c_int
import ctypes

lualib = ctypes.CDLL("libluajit-5.1.so", mode=ctypes.RTLD_GLOBAL)
l = cdll.LoadLibrary('absolute_path_to_so/libcluaf.so')
# ...
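
To check whether the preload had the intended effect, one option is to look the symbol up in the process-wide scope. This is a minimal sketch, assuming Linux (or another platform where ctypes.CDLL(None) maps to dlopen(NULL)); the helper name symbol_is_global is hypothetical.

```python
import ctypes

def symbol_is_global(symbol):
    """Return True if `symbol` resolves in the process-wide symbol scope.

    ctypes.CDLL(None) corresponds to dlopen(NULL): lookups on that handle
    search the main program, the libraries loaded at startup, and every
    library that was opened with RTLD_GLOBAL.
    symbol_is_global is a hypothetical helper name used for illustration.
    """
    try:
        getattr(ctypes.CDLL(None), symbol)
        return True
    except AttributeError:
        return False
```

After the RTLD_GLOBAL preload above, symbol_is_global("lua_gettop") would be expected to return True; if it returns False, Lua C modules such as libpaths.so will not be able to resolve the Lua API either.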

Or change the flags of the Lua(JIT) library handle afterwards using the RTLD_NOLOAD flag for dlopen. This flag is not in POSIX though, and you probably have to use C to do so. See e.g. here.
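
On Python 3.3 and later, the os module exposes RTLD_NOLOAD on platforms that define it, so the second option can also be sketched with ctypes alone. This does not apply to the Python 2.7 setup in the question, and the helper name promote_to_global is hypothetical.

```python
import ctypes
import os

def promote_to_global(soname):
    """Re-open an already loaded library with RTLD_GLOBAL added to its flags.

    RTLD_NOLOAD tells dlopen not to load anything new, so the call fails
    (ctypes raises OSError) when `soname` is not loaded yet.
    promote_to_global is a hypothetical helper name used for illustration.
    """
    noload = getattr(os, "RTLD_NOLOAD", None)
    if noload is None:
        raise OSError("RTLD_NOLOAD is not available on this platform")
    return ctypes.CDLL(soname, mode=noload | ctypes.RTLD_GLOBAL)

# Intended use, with the names from the question (not run here):
# l = ctypes.CDLL('absolute_path_to_so/libcluaf.so')  # loads LuaJIT as a local dependency
# promote_to_global("libluajit-5.1.so")               # must happen before l.multiply(...)
```

The promotion has to happen before multiply is called, because the failure occurs when require 'torch' runs inside Lua, not when libcluaf.so is loaded.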

For exchanging data between python/numpy and lua/torch, you could try a library named "lutorpy". It does exactly what you are trying to do: it shares the memory and only passes the pointer, using the "asNumpyArray()" method.

import lutorpy as lua
import numpy as np

## run lua code in python with minimal modification:  replace ":" to "._"
t = torch.DoubleTensor(10,3)
print(t._size()) # the corresponding lua version is t:size()

## convert torch tensor to numpy array
### Note: the underlying objects share the same memory, so the conversion is instant
arr = t.asNumpyArray()
print(arr.shape)

## or, you can convert numpy array to torch tensor
xn = np.random.randn(100)
## convert the numpy array into torch tensor
xt = torch.fromNumpyArray(xn)

lua_gettop is a function defined in the Lua .so, which in your case must be luajit.so. It looks like you link your lib to it, which is good, and then you link main to your lib, so presumably the C compiler finds the Lua functions used by main in luajit. So far so good.

Now, when you load your lib via ctypes in Python, does the luajit lib get loaded automatically? You would expect it to, but you should confirm; maybe you have to tell ctypes to load linked libs. Another possibility is that ctypes or the lib loader does not find luajit, perhaps because it looks in places where luajit is not located. To be sure, you might want to try putting all the libs in the same folder from which you call Python.

If that doesn't help, try a variant of what you tried: don't load your module in python, just load luajit directly using ctypes, and try calling some of its methods.
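
One way to confirm whether the luajit lib was pulled in is to list the shared objects mapped into the Python process. This is a minimal, Linux-only sketch that reads /proc/self/maps; the helper name loaded_libraries is hypothetical.

```python
def loaded_libraries(substring=""):
    """Return sorted paths of mapped shared objects whose path contains `substring`.

    Linux only: every line of /proc/self/maps that describes a file-backed
    mapping carries the file path as its sixth field.
    loaded_libraries is a hypothetical helper name used for illustration.
    """
    paths = set()
    with open("/proc/self/maps") as maps:
        for line in maps:
            fields = line.split(None, 5)
            if len(fields) == 6:
                path = fields[5].strip()
                if ".so" in path and substring in path:
                    paths.add(path)
    return sorted(paths)
```

Calling loaded_libraries("luajit") right after loading libcluaf.so would show whether, and from which directory, the LuaJIT library was loaded. Note that this only tells you the library is mapped; it does not tell you whether its symbols are globally visible.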
