Pandas: merging column entries into a single row
With df as your DataFrame, you could do the following:
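import numpy as np
import pandas as pd

# Keep only the rows that start a transaction (those with a Date).
df_new = df[~df.Date.isna()].reset_index(drop=True)
# Number the rows so the counter increases at every dated row,
# then join the details that share the same number.
df_new["Transaction Details"] = (
    df["Transaction Details"]
    .groupby(np.where(df.Date.isna(), 0, 1).cumsum())
    .apply(lambda col: ", ".join(str(item) for item in col))
    .reset_index(drop=True)
)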
As an illustration, the result df_new for the following DataFrame
df = pd.DataFrame(
    {
        # np.nan rather than np.NaN: the capitalized alias was removed in NumPy 2.0
        "Date": [1, np.nan, np.nan, 2, np.nan, np.nan, np.nan],
        "Transaction Details": ["a", "b", "c", "d", "e", "f", "g"]
    }
)
   Date Transaction Details
0   1.0                   a
1   NaN                   b
2   NaN                   c
3   2.0                   d
4   NaN                   e
5   NaN                   f
6   NaN                   g
is
   Date Transaction Details
0   1.0             a, b, c
1   2.0          d, e, f, g
If df["Transaction Details"] contains only strings, you can replace
.apply(lambda col: ", ".join(str(item) for item in col))
with
.apply(", ".join)
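To see why the grouping works, it can help to look at the intermediate key. The sketch below rebuilds the example frame and prints the group label assigned to each row. The name `group_id` is introduced here only for illustration, and `notna().cumsum()` is used as what should be an equivalent shorthand for the `np.where(...).cumsum()` expression. It also assumes every detail is a string.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Date": [1, np.nan, np.nan, 2, np.nan, np.nan, np.nan],
        "Transaction Details": ["a", "b", "c", "d", "e", "f", "g"],
    }
)

# Each dated row adds 1 to the running sum and each undated row adds 0,
# so a continuation row inherits the label of the dated row above it.
group_id = df["Date"].notna().cumsum()
print(group_id.tolist())  # [1, 1, 1, 2, 2, 2, 2]

# Joining within each label merges a dated row with its continuation rows.
merged = df["Transaction Details"].groupby(group_id).apply(", ".join)
print(merged.tolist())  # ['a, b, c', 'd, e, f, g']
```

Note that this relies on the first row having a Date. Otherwise the leading undated rows form an extra group labelled 0, and the merged values no longer line up with df_new.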
Let's first create some sample data.
df = pd.DataFrame({
    "Date": ["01-Apr", np.nan, np.nan, "02-Apr", np.nan],
    "Details": ["Payment", "Supplier Payment", "1000", "Payment", "SGD 1658.5"]
})
     Date           Details
0  01-Apr           Payment
1     NaN  Supplier Payment
2     NaN              1000
3  02-Apr           Payment
4     NaN        SGD 1658.5
If you want to merge the rows without any separator, you can try this.
df["Date"] = df["Date"].ffill()
df = df.fillna("").groupby("Date", as_index=False).sum()
This produces the following result.
     Date                      Details
0  01-Apr  PaymentSupplier Payment1000
1  02-Apr            PaymentSGD 1658.5
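One caveat with grouping on the forward-filled Date: two separate transactions that carry the same date end up in a single row. The sketch below uses made-up data (the "Refund" rows are hypothetical, not from the question) to show the effect. It also shows one possible workaround that groups on a running block number instead. The names `filled` and `block` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two transactions that share the same date.
df = pd.DataFrame({
    "Date": ["01-Apr", np.nan, "01-Apr", np.nan],
    "Details": ["Payment", "1000", "Refund", "200"],
})

filled = df.assign(Date=df["Date"].ffill())

# Grouping on the date value collapses both transactions into one row.
by_date = filled.groupby("Date")["Details"].agg("".join)
print(by_date.tolist())  # ['Payment1000Refund200']

# Grouping on a block number that increases at every dated row keeps them apart.
block = df["Date"].notna().cumsum().rename("block")
by_block = filled.groupby(block)["Details"].agg("".join)
print(by_block.tolist())  # ['Payment1000', 'Refund200']
```

If dates are known to be unique per transaction, grouping on Date directly is fine.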
If you want a separator between the merged values, things get a bit more complicated.
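sep = ", "
df["Date"] = df["Date"].ffill()
# Append the separator to every value so that summing yields "a, b, c, ".
df["Details"] += sep
df = df.fillna("").groupby("Date", as_index=False).sum()
# Trim the trailing separator left over from the last value in each group.
df["Details"] = df["Details"].str[:-len(sep)]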
This gives the following result.
     Date                          Details
0  01-Apr  Payment, Supplier Payment, 1000
1  02-Apr              Payment, SGD 1658.5
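As a possible alternative to appending and then trimming the separator, the strings in each group can be joined directly. This is a sketch of a different technique from the one above, not part of the original answer. It assumes the Details column holds only strings, and `result` is a name chosen here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Date": ["01-Apr", np.nan, np.nan, "02-Apr", np.nan],
    "Details": ["Payment", "Supplier Payment", "1000", "Payment", "SGD 1658.5"],
})

sep = ", "
df["Date"] = df["Date"].ffill()

# Join the values of each group with the separator in one step,
# then turn the Date index back into a regular column.
result = df.groupby("Date")["Details"].agg(sep.join).reset_index()
print(result)
```

Because `str.join` puts the separator only between items, there is no trailing separator to strip afterwards.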