
How to use pybind11 to return a DataFrame?

I am writing a Python module using pybind11 and modern C++.

How do I return a DataFrame from C++ to Python?

One way is to return an Apache Arrow table from C++, which can be converted to a pandas DataFrame with one line of Python.

For an example of an existing Python library that uses this:

Other links

Sometimes it's useful to do a quick'n'dirty transfer of a DataFrame from pybind11/C++ to Python for logging purposes. Here the priority is ease of use, not speed.

Construct a string that represents a .csv file in C++, return that, then convert it into a DataFrame on the Python side:

from io import StringIO

import pandas as pd

logCsv = 'A,B\n2.3,4.5\n'  # This string could be generated in pybind11/C++.
LOGDATA = StringIO(logCsv)  # Wrap the string so pandas can read it like a file.
df = pd.read_csv(LOGDATA, sep=",")
df

Output:

     A    B
0  2.3  4.5

Once we have this data in a DataFrame, we can save it in any format including Excel and Parquet. Once the data is in Excel, it becomes easier to debug.

If the cells are tab separated, then the data can be pasted straight from the log into Excel, and it will correctly divide into multiple cells.

from io import StringIO

import pandas as pd

logCsv = 'A\tB\n'
logCsv += '2.3\t4.5\n'  # This string could be generated in pybind11/C++.
LOGDATA = StringIO(logCsv)  # Wrap the string so pandas can read it like a file.
df = pd.read_csv(LOGDATA, sep="\t")
print(df)
# The tab-separated text in logCsv can be pasted straight into Excel.
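If the tab-separated text needs to come from the DataFrame itself rather than from the original log string, `DataFrame.to_csv` can produce it. This is a small sketch under the same assumptions as above; the name `log_tsv` is illustrative, and the string stands in for one returned from C++.

```python
from io import StringIO

import pandas as pd

# Stand-in for a tab-separated string returned from the C++ side.
log_tsv = "A\tB\n2.3\t4.5\n"
df = pd.read_csv(StringIO(log_tsv), sep="\t")

# Serialise the DataFrame back to tab-separated text, leaving out the index column.
# With no path given, to_csv returns the text as a string.
tsv_text = df.to_csv(sep="\t", index=False)
print(tsv_text)
```

Printing `tsv_text` gives tab-delimited rows, whereas `print(df)` pads columns with spaces for display.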
