pandas .loc returns inconsistent types

Question

Suppose I have two DataFrames:

import pandas as pd
df1 = pd.DataFrame(["2020-12-31", "2021-01-01"], columns=["date"], index=["23845940781720275", "23845940781720275"])

and

df2 = pd.DataFrame(["2020-12-31"], columns=["date"], index=["23845940781720275"])

I want a single way to enumerate the items in the "date" column that works in both cases:

the index label spans multiple dates (df1)
the index label has a single date (df2)

When I try the following, I get inconsistent results:

> type(df1.loc["23845940781720275"]["date"])

<class 'pandas.core.series.Series'>

> df1.loc["23845940781720275"]["date"]

23845940781720275    2020-12-31
23845940781720275    2021-01-01
Name: date, dtype: object

> type(df2.loc["23845940781720275"]["date"])

<class 'str'>

> df2.loc["23845940781720275"]["date"]

'2020-12-31'

I found some posts saying that df.loc[x][['column']] always returns a DataFrame, but when I use it I get the same kind of inconsistency:

> type(df1.loc["23845940781720275"][["date"]])

<class 'pandas.core.frame.DataFrame'>

> type(df2.loc["23845940781720275"][["date"]])

<class 'pandas.core.series.Series'>

Using pandas makes my real-world use case easier and more readable, so I would like to stay with it. Is there a workaround?

Answer 1

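I think this is because the label matches only one row in the second DataFrame, so .loc with a scalar label returns that row as a Series instead of a DataFrame.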

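In the second case, type(df2.loc["23845940781720275"][["date"]]), the expression df2.loc["23845940781720275"] is already a Series (the single matching row, indexed by column name). Indexing that Series with the list ["date"] returns a one-element Series, not a DataFrame.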
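To make the difference concrete, here is a minimal sketch that rebuilds the two frames from the question and compares a scalar label with a list label. The variable names (key, scalar_1, and so on) are only illustrative and do not come from the original post.

```python
import pandas as pd

key = "23845940781720275"
df1 = pd.DataFrame({"date": ["2020-12-31", "2021-01-01"]}, index=[key, key])
df2 = pd.DataFrame({"date": ["2020-12-31"]}, index=[key])

# Scalar label: the result type depends on how many rows carry that label.
scalar_1 = df1.loc[key]   # duplicated label -> DataFrame with two rows
scalar_2 = df2.loc[key]   # unique label -> Series holding the single row

# List label: the row selection stays two-dimensional in both cases.
listed_1 = df1.loc[[key]]
listed_2 = df2.loc[[key]]

print(type(scalar_1).__name__, type(scalar_2).__name__)
print(type(listed_1).__name__, type(listed_2).__name__)
```

The first print shows the mismatch from the question, while the second shows the same type for both frames.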

To eliminate the inconsistency, pass the label as a list:

type(df2.loc[["23845940781720275"]][["date"]]) # pandas.core.frame.DataFrame

To get the list of dates for that index label:

df2.loc[["23845940781720275"], "date"].tolist()
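The list-label approach can be wrapped in a small helper so that both frames from the question go through the same code path. This is only a sketch: dates_for is a hypothetical name, not a pandas function, and it assumes the label exists in the index.

```python
import pandas as pd

def dates_for(df, key, column="date"):
    """Return the values of `column` for every row labelled `key` as a list."""
    # A list of labels keeps the row selection two-dimensional; the scalar
    # column name then yields a Series however many rows matched.
    return df.loc[[key], column].tolist()

key = "23845940781720275"
df1 = pd.DataFrame({"date": ["2020-12-31", "2021-01-01"]}, index=[key, key])
df2 = pd.DataFrame({"date": ["2020-12-31"]}, index=[key])

dates_1 = dates_for(df1, key)  # ['2020-12-31', '2021-01-01']
dates_2 = dates_for(df2, key)  # ['2020-12-31']
print(dates_1, dates_2)
```

Note that .loc raises a KeyError when the label is missing from the index, so callers that may look up unknown labels need to handle that case.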

Answer 2

Here is a (dirty) fix for extracting the dates from the DataFrame:

[date_str.strip() for date_str in df1.loc[['23845940781720275']][['date']].to_string().strip().split(f"\n{'23845940781720275'}")[1:]]

Returns: ['2020-12-31', '2021-01-01']

[date_str.strip() for date_str in df2.loc[['23845940781720275']][['date']].to_string().strip().split(f"\n{'23845940781720275'}")[1:]]

Returns: ['2020-12-31']
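For comparison, the same lists can probably be obtained without going through to_string(). The sketch below uses a boolean mask on the index, which is a different technique from the string parsing above. The helper name dates_by_mask is hypothetical.

```python
import pandas as pd

def dates_by_mask(df, key, column="date"):
    """Select rows whose index equals `key` and return `column` as a list."""
    # Comparing the index to the key gives a boolean array, so the selection
    # is a Series however many rows match (including zero).
    mask = df.index == key
    return df.loc[mask, column].tolist()

key = "23845940781720275"
df1 = pd.DataFrame({"date": ["2020-12-31", "2021-01-01"]}, index=[key, key])
df2 = pd.DataFrame({"date": ["2020-12-31"]}, index=[key])

many = dates_by_mask(df1, key)
one = dates_by_mask(df2, key)
none = dates_by_mask(df2, "missing-label")
print(many, one, none)
```

One design difference is that a label absent from the index yields an empty list here rather than an exception.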
