pandas datetime values get messed up after saving a df to Excel and reading it back into a df

Question

from datetime import datetime, timedelta
import numpy as np
import pandas as pd

jan_21 = [datetime(2021, 1, 1) + timedelta(hours=i) for i in range(5)]


jan_21
    
[datetime.datetime(2021, 1, 1, 0, 0),
 datetime.datetime(2021, 1, 1, 1, 0),
 datetime.datetime(2021, 1, 1, 2, 0),
 datetime.datetime(2021, 1, 1, 3, 0),
 datetime.datetime(2021, 1, 1, 4, 0)]

prices = np.random.randint(1,100,size=(5,))

prices

[46 23 13 26 52]

df = pd.DataFrame({'datetime':jan_21, 'price':prices})

df

             datetime  price
0 2021-01-01 00:00:00     83
1 2021-01-01 01:00:00     60
2 2021-01-01 02:00:00     29
3 2021-01-01 03:00:00     97
4 2021-01-01 04:00:00     67

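So far so good: this is how I expect the DataFrame and its datetime values to be displayed. The problem appears when I save the DataFrame to an Excel file and then read it back into a DataFrame: the datetime values get messed up.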

df.to_excel('price_data.xlsx', index=False)

new_df = pd.read_excel('price_data.xlsx')

new_df

                      datetime  price
0   2021-01-01 00:00:00.000000  83
1   2021-01-01 00:59:59.999999  60
2   2021-01-01 02:00:00.000001  29
3   2021-01-01 03:00:00.000000  97
4   2021-01-01 03:59:59.999999  67

I expected df == new_df to evaluate to True.

Answer 1

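Given the likely cause of the problem (see sophros's answer), one way to work around it, at least superficially, is to convert the cells of df["datetime"] to strings before generating the Excel file, and then convert the strings back to datetimes after creating new_df: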

df["datetime"] = df["datetime"].dt.strftime("%m/%d/%Y, %H:%M:%S")
df.to_excel('price_data.xlsx', index=False)

new_df = pd.read_excel('price_data.xlsx')
new_df["datetime"] = pd.to_datetime(new_df["datetime"], format="%m/%d/%Y, %H:%M:%S")

Answer 2

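The differences in the time parts 00:59:59.999999, 02:00:00.000001 and 03:59:59.999999 are most likely caused by the slightly different binary representations of date/time types in Excel and in Python or pandas.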
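To illustrate the kind of representation difference meant here, the sketch below converts the timestamps from the question to a floating-point day count and back. It assumes Excel's default 1900 date system, in which a date-time is stored as a number of days counted from 1899-12-30; the helper names to_excel_serial and from_excel_serial are made up for this example and are not part of pandas or any Excel library. The drift you actually see after read_excel depends on how the reader library does this conversion, so this plain-Python version may or may not reproduce the exact values from the question.

```python
from datetime import datetime, timedelta

# Assumption: Excel's default 1900 date system, where day 0 is effectively 1899-12-30.
EXCEL_EPOCH = datetime(1899, 12, 30)


def to_excel_serial(dt):
    """Hypothetical helper: datetime -> floating-point number of days since the epoch."""
    return (dt - EXCEL_EPOCH) / timedelta(days=1)


def from_excel_serial(serial):
    """Hypothetical helper: floating-point number of days -> datetime."""
    return EXCEL_EPOCH + timedelta(days=serial)


jan_21 = [datetime(2021, 1, 1) + timedelta(hours=i) for i in range(5)]

drifts = []
for original in jan_21:
    serial = to_excel_serial(original)      # e.g. 44197.0 for midnight
    restored = from_excel_serial(serial)    # back to a datetime
    drifts.append(restored - original)
    print(repr(serial), restored.isoformat(sep=" ", timespec="microseconds"))
```

A whole number of hours is not always an exact binary fraction of a day (1/24 is not), so the stored float is only the closest representable value. Any reader that truncates instead of rounding when converting back can land one microsecond off.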

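Times are usually stored as floating-point numbers, but what differs is the zero point, or epoch (for example, year 1 AD, or 1970 as in Linux; there is a good explanation here). As a result, the conversion may lose some of the least significant part of the date/time, and there is not much you can do about it other than round the values or use an approximate comparison, as with any float.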
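Both suggestions, rounding and approximate comparison, can be sketched as follows. To keep the example independent of an Excel file, new_df is built directly from the values shown in the question rather than read with read_excel. The one-millisecond tolerance is an arbitrary choice for illustration.

```python
import pandas as pd

prices = [83, 60, 29, 97, 67]

# The DataFrame as it was before writing to Excel.
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2021-01-01 00:00:00",
        "2021-01-01 01:00:00",
        "2021-01-01 02:00:00",
        "2021-01-01 03:00:00",
        "2021-01-01 04:00:00",
    ]),
    "price": prices,
})

# The DataFrame as it came back, using the values from the question.
new_df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2021-01-01 00:00:00.000000",
        "2021-01-01 00:59:59.999999",
        "2021-01-01 02:00:00.000001",
        "2021-01-01 03:00:00.000000",
        "2021-01-01 03:59:59.999999",
    ]),
    "price": prices,
})

# A plain comparison fails because of the microsecond drift.
exact_equal = bool((df["datetime"] == new_df["datetime"]).all())

# Option 1: round to the nearest second, then compare exactly.
rounded = new_df["datetime"].dt.round("s")
rounded_equal = bool((df["datetime"] == rounded).all())

# Option 2: compare with a tolerance (assumed here: 1 millisecond).
tolerance = pd.Timedelta(milliseconds=1)
within_tolerance = bool(((new_df["datetime"] - df["datetime"]).abs() <= tolerance).all())

print(exact_equal, rounded_equal, within_tolerance)
```

Rounding is the simpler choice when the data is known to have whole-second (or coarser) resolution. The tolerance check is safer when the original values may carry genuine sub-second precision that rounding would destroy.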

Answer 1
1, accepted, 2021-06-22 10:55:14

Answer 2
0, 2021-06-22 10:44:51
