Link C++ program output with Python script
I have a C++ program that uses a very specific method to calculate pairwise distances for a data set (30,000 elements). The output file would be 20 GB and look something like this:
point1, point2, distancex
pointi, pointj, distancexx
.....
I then feed the file to Python and use Python (NumPy) for clustering. Reading the output file in Python takes forever. Is there a way to connect the C++ program directly to my Python code to save the I/O time on the intermediate file? Maybe using SWIG?
I assume you have been saving ASCII. You could modify your C++ code to write binary instead, and read it with numpy.fromfile.
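As a rough sketch of the reading side, the snippet below loads packed binary records with numpy.fromfile. The record layout (two 32-bit point indices followed by a 64-bit distance, little-endian, no padding), the names RECORD, write_sample and read_distances, and the file name are all assumptions made for illustration; the real layout is whatever the C++ program writes. The small Python writer only stands in for the C++ side.

```python
import os
import struct
import tempfile

import numpy as np

# Assumed record layout (not from the original post): two 32-bit point
# indices followed by one 64-bit float distance, little-endian, no padding.
RECORD = np.dtype([("i", "<i4"), ("j", "<i4"), ("dist", "<f8")])


def write_sample(path, rows):
    """Hypothetical stand-in for the C++ writer: emits packed binary records."""
    with open(path, "wb") as f:
        for i, j, d in rows:
            f.write(struct.pack("<iid", i, j, d))


def read_distances(path):
    """Load the whole file as a structured array in a single call."""
    return np.fromfile(path, dtype=RECORD)


rows = [(0, 1, 0.5), (0, 2, 1.25), (1, 2, 2.0)]
path = os.path.join(tempfile.mkdtemp(), "distances.bin")
write_sample(path, rows)

data = read_distances(path)
print(data["i"], data["j"], data["dist"])
```

If the file is too large to hold in memory, numpy.memmap with the same dtype may be worth trying, since it maps the file instead of reading it all at once.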
For a more direct connection, you would wrap your C++ code as a library (remove main() and drive it from Python) using SWIG. This allows you to share the memory of arrays between C++ and Python.
You can either use Python's buffer protocol on the C++ side together with numpy.frombuffer on the Python side, or use the NumPy headers to work directly on NumPy arrays in C++. Here is a small SWIG example project using the second method. (Disclaimer: I wrote it.)
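To illustrate only the Python half of the first method, here is a minimal sketch of numpy.frombuffer wrapping memory it does not own. It assumes the wrapped C++ object exposes the buffer protocol; array.array from the standard library is used as a stand-in for that object, and the name cpp_buffer is hypothetical.

```python
from array import array

import numpy as np

# Stand-in for memory owned by a wrapped C++ object: array.array
# implements the buffer protocol, as a C++ extension type could.
cpp_buffer = array("d", [0.5, 1.25, 2.0, 3.0])

# Wrap the existing memory as a NumPy array without copying it.
view = np.frombuffer(cpp_buffer, dtype=np.float64)

# A write on the owning side is visible through the NumPy view,
# which shows that both objects refer to the same memory.
cpp_buffer[0] = 9.0
shares_memory = bool(view[0] == 9.0)
print(view, shares_memory)
```

Because nothing is copied, the object that owns the memory has to stay alive for as long as the NumPy view is in use.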