Save a DataFrame with a column containing a list of timedelta64 objects as a Parquet file

I have a list of lists of pandas Timedelta objects called td_list, like this:

[Timedelta('0 days 01:06:15'), Timedelta('0 days 01:34:46'), Timedelta('0 days 00:00:00'), Timedelta('0 days 00:00:00')]
[Timedelta('0 days 01:51:46'), Timedelta('0 days 01:40:40')]
[Timedelta('0 days 07:07:52'), Timedelta('0 days 07:32:00'), Timedelta('0 days 00:00:00'), Timedelta('0 days 04:54:26')]
[Timedelta('0 days 00:00:00'), Timedelta('0 days 04:28:36'), Timedelta('0 days 10:49:42'), Timedelta('0 days 06:36:23')]

I'm adding it to a DataFrame as a new column:

df["td_col"] = td_list 

and everything is fine; I obtain a column with elements

[0 days 01:06:15, 0 days 01:34:46, 0 days 00:00:00, 0 days 00:00:00]
[0 days 01:51:46, 0 days 01:40:40]

etc...

as expected. But when I save it with

df.to_parquet(path,compression=None, engine="fastparquet")

I get the following error:

Can't infer object conversion type: 0    0   0 days 02:58:20.333333333
dtype: timedelta...
1    0   0 days 02:05:58.727272727
dtype: timedelta...
2    0   0 days 01:45:38.250000
dtype: timedelta64[ns]
3    0   0 days 00:40:15.250000
dtype: timedelta64[ns]
4           0   0 days 01:46:13
dtype: timedelta64[ns]
5    0   0 days 04:53:34.500000
dtype: timedelta64[ns]
6    0   0 days 05:28:40.250000
dtype: timedelta64[ns]
7    0   0 days 02:23:05.500000
dtype: timedelta64[ns]
8    0   0 days 01:50:01.500000
dtype: timedelta64[ns]
9                    0   0 days
dtype: timedelta64[ns]
Name: td_col, dtype: object

Do you know how I can fix this?
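The setup described in the question can be sketched roughly as follows. The sample values are copied from the first two rows shown above, while the `id` column is a hypothetical placeholder for whatever else the DataFrame holds. The sketch only inspects the column; it does not attempt the fastparquet write.

```python
import pandas as pd

# Sample data mirroring the first two rows of td_list in the question.
td_list = [
    [
        pd.Timedelta("0 days 01:06:15"),
        pd.Timedelta("0 days 01:34:46"),
        pd.Timedelta("0 days 00:00:00"),
        pd.Timedelta("0 days 00:00:00"),
    ],
    [pd.Timedelta("0 days 01:51:46"), pd.Timedelta("0 days 01:40:40")],
]

# "id" is an illustrative column, not part of the original question.
df = pd.DataFrame({"id": range(len(td_list))})
df["td_col"] = td_list

print(df["td_col"].dtype)          # dtype of the column as a whole
print(type(df["td_col"].iloc[0]))  # type of a single cell
```

Each cell holds a Python list, so pandas stores the column with the generic object dtype, which is the dtype named on the last line of the error message.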

Even though your values are pandas datetime values, your column dtype is object. You should first convert your date column to a datetime column using:

df["td_col"] = pd.to_datetime(df["td_col"])
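Since each cell in the question holds a list of Timedelta values rather than a single value, a column-wide conversion would likely need one value per row first. A minimal sketch of that idea, assuming it is acceptable to reshape the data to one row per timedelta and swapping in `pd.to_timedelta` for the call above (the `id` column is hypothetical and only serves to link rows back to their original list):

```python
import pandas as pd

df = pd.DataFrame({"id": [0, 1]})  # "id" is an illustrative key column
df["td_col"] = [
    [pd.Timedelta("0 days 01:06:15"), pd.Timedelta("0 days 01:34:46")],
    [pd.Timedelta("0 days 01:51:46")],
]

# One row per list element; "id" repeats for elements of the same list.
flat = df.explode("td_col", ignore_index=True)

# The exploded column is still object dtype, so convert it explicitly.
flat["td_col"] = pd.to_timedelta(flat["td_col"])

print(flat.dtypes)
```

After this step the column has a timedelta dtype instead of object. The original lists can be rebuilt later by grouping on `id`.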

Try using a different engine. I tried engine="pyarrow" for a column containing lists and it worked.
