
What is numpy.ctypeslib.as_ctypes exactly doing?

I have this Fortran code:

module example

    use iso_c_binding
    implicit none
    
contains

    subroutine array_by_ref_modifying(array, nbrows, nbcols, coeff) BIND(C, NAME='array_by_ref_modifying')
        !DEC$ ATTRIBUTES DLLEXPORT :: array_by_ref_modifying
        integer, intent(in) :: nbrows, nbcols, coeff
        integer, intent(inout) :: array(nbrows, nbcols)
        integer :: i, j
        do j = 1, nbcols
            do i = 1, nbrows
                array(i, j) = coeff * array(i, j)
            enddo
        enddo
    end subroutine array_by_ref_modifying

end module example

which I compiled into a TestLib.dll that I call from Python as follows:

redist_path = r"C:\Program Files (x86)\Intel\oneAPI\compiler\2021.3.0\windows\redist\intel64_win\compiler"
dll_full_name = r"C:\DLLS\TestLib.dll"

import os
os.add_dll_directory(redist_path)

import ctypes as ct
import numpy as np

fortlib = ct.CDLL(dll_full_name)

nbrows = 2
nbcols = 6
pnbrows = ct.pointer( ct.c_int(nbrows))  
pnbcols = ct.pointer( ct.c_int(nbcols))
myarr = np.array([[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]], dtype=ct.c_int)

print(myarr)

coeff = 2
pcoeff = ct.pointer( ct.c_int(coeff))

pyfunc_array_by_ref_modifying = getattr(fortlib, "array_by_ref_modifying")
pyfunc_array_by_ref_modifying(np.ctypeslib.as_ctypes(myarr), pnbrows, pnbcols, pcoeff)

print(myarr)

The Python code outputs:

[[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]]
 [[ 2  4  6  8 10 12]
 [14 16 18 20 22 24]]

as expected. What I expected less is that, if I replace the "natural" bit of the Python script

nbrows = 2
nbcols = 6

(leading in Fortran to a (1:2,1:6) array) with the "less natural" bit

nbrows = 6
nbcols = 2

(leading in Fortran to a (1:6,1:2) array, thereby showing that a correct array is passed to Fortran), the Python script still outputs

[[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]]
 [[ 2  4  6  8 10 12]
 [14 16 18 20 22 24]]

As far as I understand, in the line

pyfunc_array_by_ref_modifying(np.ctypeslib.as_ctypes(myarr), pnbrows, pnbcols, pcoeff)

np.ctypeslib.as_ctypes(myarr) does some pretreatment/transposition. Is this really the case?

I don't want this pretreatment to be done, as it has a performance overhead that can be significant in real code with large array dimensions.

(I guess that the "correct" way is the "natural" bit, which does not imply any pretreatment.)

As I still don't understand what the problem is, I'll just make an assumption and try to answer "blindly".

I'll start by answering the specific question (the one near the end of the post):
numpy.ctypeslib.as_ctypes converts a NumPy array (or object, or ...) into a CTypes one. But the conversion only happens on the metadata (as it is different for the 2 modules). The array contents (or the pointer, or the memory address where the actual array data is (or begins)) are left unchanged (not copied / modified / altered, ...).

References:

  1. [NumPy.Docs]: C-Types Foreign Function Interface (numpy.ctypeslib)

    1.1. The source code (somewhere at the end): [GitHub]: numpy/numpy - numpy/numpy/ctypeslib.py

  2. [Python.Docs]: ctypes - A foreign function library for Python

So, no transposition is done.
I'm not sure what you mean by "pretreatment" (do all the checks and operations performed by as_ctypes fit there?). Same for "natural bit".

Also note that as_ctypes (which produces the 1st argument of pyfunc_array_by_ref_modifying) is completely unaware of the rest of the arguments (pnbrows and pnbcols in particular).
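The two points above (no copy, and a result that depends only on the array itself) can be checked directly. This is a minimal sketch, assuming NumPy is installed; the variable names are made up for illustration and the array is the one from the question.

```python
import ctypes as ct
import numpy as np

arr = np.array([[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]], dtype=ct.c_int)
c_arr = np.ctypeslib.as_ctypes(arr)

# The ctypes object starts at the very same address as the NumPy buffer.
same_address = ct.addressof(c_arr) == arr.ctypes.data
print(same_address)

# Its shape comes from the NumPy array alone (2 rows of 6 ints here).
print(len(c_arr), len(c_arr[0]))

# Writing through the ctypes object is visible from NumPy: shared memory.
c_arr[1][2] = 99
print(arr[1, 2])
```

Since only a thin wrapper is built around the existing buffer, the cost of the call should not grow with the amount of data in the array.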

Even if it's not directly impacting here (I think it's a matter of "luck"), here's something you might want to check out: [SO]: C function called from Python via ctypes returns incorrect value (@CristiFati's answer).

Going to (what I think is) the real problem (that led to the question):

  • The array is passed to the Fortran subroutine by reference (its buffer start memory address). I don't have an official source to state this (it was an assumption from the beginning), but it's like passing a pointer in C (I guess it has something to do with the C binding in the subroutine declaration)
  • No metadata (rows, columns) is passed (otherwise the next 2 arguments would be useless)

The (2D) array is stored in memory as a 1D one: the 1st row, followed by the 2nd one, the 3rd, ..., and the last one (this is the C, or row-major, layout, which is also NumPy's default).

In order to reach element array[i][j] (i in [0, row_count), j in [0, column_count)), the following formula (pointer logic) is used:
array_start_address + array_element_size_in_bytes * (i * column_count + j).
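The formula can be tried out from Python with nothing but ctypes. The following is only a sketch: element_address and read_element are hypothetical helper names, and the 3 x 6 array of ints is an assumption chosen to match the demos in this answer.

```python
import ctypes as ct

ROWS, COLS = 3, 6

# A ROWS x COLS C-style array of ints, filled with 1..18 row by row.
Array2D = (ct.c_int * COLS) * ROWS
arr = Array2D()
for i in range(ROWS):
    for j in range(COLS):
        arr[i][j] = i * COLS + j + 1


def element_address(start, elem_size, column_count, i, j):
    """Byte address of element [i][j] in a row-major buffer."""
    return start + elem_size * (i * column_count + j)


def read_element(array, i, j):
    """Read element [i][j] straight from memory, bypassing 2D indexing."""
    addr = element_address(ct.addressof(array), ct.sizeof(ct.c_int), COLS, i, j)
    return ct.c_int.from_address(addr).value


print(read_element(arr, 0, 0))
print(read_element(arr, 1, 2))
# The raw-address read agrees with normal indexing for every element.
print(all(read_element(arr, i, j) == arr[i][j]
          for i in range(ROWS) for j in range(COLS)))
```

Note that column_count is the only piece of shape information the formula needs, which is why passing a different value changes which element an (i, j) pair lands on.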

Hoping to clear up some of the confusion, here's a small C demo. I replicated (what I thought was) the Fortran subroutine behavior. I also increased the row count to make things clearer.

main00.c:

#include <stdio.h>
#include <string.h>

#define ROWS 3
#define COLS 6
//#pragma align 4


typedef int Array2D[ROWS][COLS];

/*
void modifyArray(Array2D array, int rows, int cols, int coef) {
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            array[i][j] *= coef;
}
//*/

void modifyPtr(int *pArray, int rows, int cols, int coef) {  // This is the equivalent.
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            (*(pArray + (i * cols + j))) *= coef;
}

void printArray(Array2D array, int rows, int cols, char *head) {
    printf("\n%s:\n", head);
    for (int i = 0; i < rows; ++i) {
        for (int j = 0; j < cols; ++j)
            printf("% 3d  ", array[i][j]);
        printf("\n");
    }
}

void modify(Array2D arr, int rows, int cols, int coef, char *head) {
    printf("\nRows: %d, Cols: %d, Coef: %d", rows, cols, coef);
    Array2D arr_dup;
    memcpy(arr_dup, arr, sizeof(Array2D));
    modifyPtr(&arr_dup[0][0], rows, cols, coef);  // Pass an int * to the first element (avoids an incompatible pointer type).
    printArray(arr_dup, ROWS, COLS, head);
}

int main() {
    Array2D arr = {  // ROWS X COLS
        { 1, 2, 3, 4, 5, 6 },
        { 7, 8, 9, 10, 11, 12 },
        { 13, 14, 15, 16, 17, 18 },
    };
    char *modifiedText = "Modified";

    //printf("Array size: %d %d %d\n", sizeof(arr), sizeof(arr[0]), sizeof(arr[0][0]));

    printArray(arr, ROWS, COLS, "Original");

    modify(arr, 3, 6, 2, modifiedText);
    printArray(arr, ROWS, COLS, "Original");
    modify(arr, 2, 6, 2, modifiedText);
    modify(arr, 1, 6, 2, modifiedText);
    modify(arr, 3, 5, 2, modifiedText);
    modify(arr, 3, 4, 2, modifiedText);
    modify(arr, 5, 2, 2, modifiedText);
    modify(arr, 7, 1, 2, modifiedText);

    printf("\nDone.\n");
    return 0;
}

Output:

[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q068314707]> sopr.bat
### Set shorter prompt to better fit when pasted in StackOverflow (or other) pages ###

[prompt]> "c:\Install\pc032\Microsoft\VisualStudioCommunity\2019\VC\Auxiliary\Build\vcvarsall.bat" x64 > nul

[prompt]> dir /b
code00.py
example.f90
main00.c

[prompt]>
[prompt]> cl /nologo /MD /W0 main00.c /link /NOLOGO /OUT:main00_pc064.exe
main00.c

[prompt]> dir /b
code00.py
example.f90
main00.c
main00.obj
main00_pc064.exe

[prompt]>
[prompt]> main00_pc064.exe

Original:
  1    2    3    4    5    6
  7    8    9   10   11   12
 13   14   15   16   17   18

Rows: 3, Cols: 6, Coef: 2
Modified:
  2    4    6    8   10   12
 14   16   18   20   22   24
 26   28   30   32   34   36

Original:
  1    2    3    4    5    6
  7    8    9   10   11   12
 13   14   15   16   17   18

Rows: 2, Cols: 6, Coef: 2
Modified:
  2    4    6    8   10   12
 14   16   18   20   22   24
 13   14   15   16   17   18

Rows: 1, Cols: 6, Coef: 2
Modified:
  2    4    6    8   10   12
  7    8    9   10   11   12
 13   14   15   16   17   18

Rows: 3, Cols: 5, Coef: 2
Modified:
  2    4    6    8   10   12
 14   16   18   20   22   24
 26   28   30   16   17   18

Rows: 3, Cols: 4, Coef: 2
Modified:
  2    4    6    8   10   12
 14   16   18   20   22   24
 13   14   15   16   17   18

Rows: 5, Cols: 2, Coef: 2
Modified:
  2    4    6    8   10   12
 14   16   18   20   11   12
 13   14   15   16   17   18

Rows: 7, Cols: 1, Coef: 2
Modified:
  2    4    6    8   10   12
 14    8    9   10   11   12
 13   14   15   16   17   18

Done.

Back to Fortran (saved your code locally) + Python (created a new script).

code00.py:

#!/usr/bin/env python

import sys
import ctypes as ct
import numpy as np


IntPtr = ct.POINTER(ct.c_int)

DLL_NAME = "./example.{:s}".format("dll" if sys.platform[:3].lower() == "win" else "so")


def modify(arr, rows, cols, coef, modify_func):
    print("\nRows: {:d}, Cols {:d}, Coef: {:d}".format(rows, cols, coef))
    arr_dup = arr.copy()
    arr_ct = np.ctypeslib.as_ctypes(arr_dup)
    rows_ct = ct.c_int(rows)
    cols_ct = ct.c_int(cols)
    coef_ct = ct.c_int(coef)
    modify_func(arr_ct, ct.byref(rows_ct), ct.byref(cols_ct), ct.byref(coef_ct))
    print("Modified array:\n {:}".format(arr_dup))


def main(*argv):
    dll00 = ct.CDLL(DLL_NAME)
    func = getattr(dll00, "array_by_ref_modifying")
    #func.argtypes = (ct.c_void_p, ct.c_void_p, ct.c_void_p, ct.c_void_p)
    func.argtypes = (ct.c_void_p, IntPtr, IntPtr, IntPtr)
    func.restype = None

    arr = np.array([
        [1, 2, 3, 4, 5, 6],
        [7, 8, 9, 10, 11, 12],
        [13, 14, 15, 16, 17, 18],
    ], dtype=ct.c_int)

    print("Original array:\n {:}".format(arr))

    modify(arr, 3, 6 , 2, func)
    modify(arr, 2, 6 , 2, func)
    modify(arr, 1, 6 , 2, func)
    modify(arr, 3, 5 , 2, func)
    modify(arr, 3, 4 , 2, func)
    modify(arr, 5, 2 , 2, func)
    modify(arr, 7, 1 , 2, func)


if __name__ == "__main__":
    print("Python {:s} {:03d}bit on {:s}\n".format(" ".join(elem.strip() for elem in sys.version.split("\n")),
                                                   64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    rc = main(*sys.argv[1:])
    print("\nDone.")
    sys.exit(rc)

Output:

[prompt]> "f:\Install\pc032\Intel\OneAPI\Version\compiler\2021.3.0\windows\bin\intel64\ifort.exe" /c example.f90
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.3.0 Build 20210609_000000
Copyright (C) 1985-2021 Intel Corporation. All rights reserved.

[prompt]> link /NOLOGO /DLL /OUT:example.dll /LIBPATH:"f:\Install\pc032\Intel\OneAPI\Version\compiler\2021.3.0\windows\compiler\lib\intel64_win" example.obj
   Creating library example.lib and object example.exp

[prompt]> dir /b
code00.py
example.dll
example.exp
example.f90
example.lib
example.mod
example.obj
main00.c
main00.obj
main00_pc064.exe

[prompt]>
[prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.08.07_test0\Scripts\python.exe" code00.py
Python 3.8.7 (tags/v3.8.7:6503f05, Dec 21 2020, 17:59:51) [MSC v.1928 64 bit (AMD64)] 064bit on win32

Original array:
 [[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]
 [13 14 15 16 17 18]]

Rows: 3, Cols 6, Coef: 2
Modified array:
 [[ 2  4  6  8 10 12]
 [14 16 18 20 22 24]
 [26 28 30 32 34 36]]

Rows: 2, Cols 6, Coef: 2
Modified array:
 [[ 2  4  6  8 10 12]
 [14 16 18 20 22 24]
 [13 14 15 16 17 18]]

Rows: 1, Cols 6, Coef: 2
Modified array:
 [[ 2  4  6  8 10 12]
 [ 7  8  9 10 11 12]
 [13 14 15 16 17 18]]

Rows: 3, Cols 5, Coef: 2
Modified array:
 [[ 2  4  6  8 10 12]
 [14 16 18 20 22 24]
 [26 28 30 16 17 18]]

Rows: 3, Cols 4, Coef: 2
Modified array:
 [[ 2  4  6  8 10 12]
 [14 16 18 20 22 24]
 [13 14 15 16 17 18]]

Rows: 5, Cols 2, Coef: 2
Modified array:
 [[ 2  4  6  8 10 12]
 [14 16 18 20 11 12]
 [13 14 15 16 17 18]]

Rows: 7, Cols 1, Coef: 2
Modified array:
 [[ 2  4  6  8 10 12]
 [14  8  9 10 11 12]
 [13 14 15 16 17 18]]

Done.
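The pattern in the output can be reproduced without any DLL. The sketch below models the array buffer as a flat Python list (an assumption made purely for illustration; scale_row_major and scale_col_major are hypothetical names). It scales the buffer once with C-style indexing and once with Fortran-style (column-major) indexing, to show that both touch exactly the first rows * cols elements.

```python
def scale_row_major(buf, rows, cols, coef):
    """C-style traversal: element (i, j) lives at offset i * cols + j."""
    for i in range(rows):
        for j in range(cols):
            buf[i * cols + j] *= coef


def scale_col_major(buf, rows, cols, coef):
    """Fortran-style traversal: element (i, j) lives at offset i + j * rows (0-based)."""
    for j in range(cols):
        for i in range(rows):
            buf[i + j * rows] *= coef


original = list(range(1, 19))  # the 3 x 6 array 1..18, flattened

results = {}
for rows, cols in [(3, 6), (2, 6), (3, 4), (5, 2), (7, 1)]:
    a = original.copy()
    b = original.copy()
    scale_row_major(a, rows, cols, 2)
    scale_col_major(b, rows, cols, 2)
    # Both traversals visit the same offsets, only in a different order.
    assert a == b
    results[(rows, cols)] = a
    print(rows, cols, a)
```

Because every visited element gets the same treatment, the traversal order does not show up in the result; an operation that depended on i or j would behave differently.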

Conclusions:

  • The array is modified in the same way in the 2 examples, confirming my assumption
  • No matter what row and column numbers are passed, the array is modified from the beginning, left to right, top to bottom: exactly the first row_count * column_count elements (as passed) are touched. Fortran indexes the buffer in column-major order, but since every element gets the same treatment here, the traversal order doesn't change the result. This might be a bit confusing (when picturing the 2D representation and, for example, passing a column number that's smaller than the actual one). That's why it also works with a dimension greater than the actual one. The only thing to be careful about is not to go beyond the actual number of elements (row_count * column_count), because that would yield Undefined Behavior (and it might crash)
  • I'm not sure about this one, but mentioning it anyway: there might be some other Undefined Behavior cases, e.g. when each row in the array would be padded by the compiler in order to be properly aligned (similar to #pragma pack). Example: a char array with 7 columns (a row would have 7 bytes) might be 8-byte aligned. Not sure how pointer logic copes with that
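On the padding concern in the last bullet, a quick check with ctypes (which mirrors the C compiler's layout rules) suggests that rows of a plain array are not padded. As far as I know, C guarantees that array elements are contiguous and padding only appears inside structs. This is a sketch using the 7-column char example; Row7 and Grid are hypothetical names.

```python
import ctypes as ct

Row7 = ct.c_char * 7   # one row: 7 chars
Grid = Row7 * 3        # 3 such rows

row_size = ct.sizeof(Row7)
grid_size = ct.sizeof(Grid)

grid = Grid()
# Distance in bytes between the start of row 0 and the start of row 1.
row_stride = ct.addressof(grid[1]) - ct.addressof(grid[0])

print(row_size, grid_size, row_stride)
```

If the rows were padded to 8 bytes, the stride would be 8 and the total size 24; a stride equal to the row size means the flat pointer arithmetic used above stays valid for such arrays.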
