Pandas: merging column entries into a single row
With df as your dataframe, you could do the following:
import numpy as np
import pandas as pd
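# Keep only the rows that start a record (those with a Date).
df_new = df[~df.Date.isna()].reset_index(drop=True)
# Label each row with a running record number, then join the details per record.
df_new["Transaction Details"] = (
    df["Transaction Details"]
    .groupby(np.where(df.Date.isna(), 0, 1).cumsum())
    .apply(lambda col: ", ".join(str(item) for item in col))
    .reset_index(drop=True)
)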
As an illustration: the result df_new for the following dataframe
df = pd.DataFrame(
    {
        "Date": [1, np.nan, np.nan, 2, np.nan, np.nan, np.nan],
        "Transaction Details": ["a", "b", "c", "d", "e", "f", "g"]
    }
)
   Date  Transaction Details
0   1.0                    a
1   NaN                    b
2   NaN                    c
3   2.0                    d
4   NaN                    e
5   NaN                    f
6   NaN                    g
is
   Date  Transaction Details
0   1.0              a, b, c
1   2.0           d, e, f, g
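To see why the grouping works, it can help to look at the intermediate group key. The following is a minimal sketch on the same sample frame; the names group_key and merged are introduced here for illustration only and are not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Date": [1, np.nan, np.nan, 2, np.nan, np.nan, np.nan],
        "Transaction Details": ["a", "b", "c", "d", "e", "f", "g"],
    }
)

# 1 where a new record starts (Date present), 0 for continuation rows;
# the running sum labels every row with the record it belongs to.
group_key = np.where(df["Date"].isna(), 0, 1).cumsum()
print(group_key.tolist())  # [1, 1, 1, 2, 2, 2, 2]

# Join the details of each record into one string.
merged = (
    df["Transaction Details"]
    .groupby(group_key)
    .apply(lambda col: ", ".join(str(item) for item in col))
    .reset_index(drop=True)
)
print(merged.tolist())  # ['a, b, c', 'd, e, f, g']
```

Note that this relies on the first row having a Date; leading rows without one would all fall into a group labelled 0.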
If df["Transaction Details"] only contains strings, then you can replace
.apply(lambda col: ", ".join(str(item) for item in col))
with
.apply(", ".join)
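The strings-only condition matters because str.join refuses non-string items. A small sketch of both cases, using made-up data; record_ids, joined, mixed and join_failed are hypothetical names chosen for this example:

```python
import numpy as np
import pandas as pd

details = pd.Series(["a", "b", "c", "d"])
record_ids = np.array([1, 1, 2, 2])  # hypothetical group labels

# With strings only, the bound method ", ".join can be applied directly.
joined = details.groupby(record_ids).apply(", ".join).reset_index(drop=True)
print(joined.tolist())  # ['a, b', 'c, d']

# With a non-string item, str.join raises a TypeError,
# which is why the longer version converts each item with str().
mixed = pd.Series(["a", 1000], dtype=object)
try:
    ", ".join(mixed)
    join_failed = False
except TypeError:
    join_failed = True
print(join_failed)  # True
```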
Let's first create some sample data.
df = pd.DataFrame({
    "Date": ["01-Apr", np.nan, np.nan, "02-Apr", np.nan],
    "Details": ["Payment", "Supplier Payment", "1000", "Payment", "SGD 1658.5"]
})
     Date           Details
0  01-Apr           Payment
1     NaN  Supplier Payment
2     NaN              1000
3  02-Apr           Payment
4     NaN        SGD 1658.5
If you want to merge the rows without any separator between them, you can try this.
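# Propagate each date down to its continuation rows.
df["Date"] = df["Date"].ffill()
# Summing string columns concatenates them within each group.
df = df.fillna("").groupby("Date", as_index=False).sum()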
This produces the following result.
     Date                      Details
0  01-Apr  PaymentSupplier Payment1000
1  02-Apr            PaymentSGD 1658.5
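One thing to keep in mind with this approach: groupby sorts the group keys by default, and date strings such as "01-May" sort lexicographically before "02-Apr". The sketch below, on made-up data, shows how passing sort=False keeps the rows in their original order; sorted_out and ordered_out are hypothetical names.

```python
import numpy as np
import pandas as pd

# Hypothetical data where the string order differs from the row order.
df = pd.DataFrame({
    "Date": ["02-Apr", np.nan, "01-May", np.nan],
    "Details": ["A", "B", "C", "D"],
})
df["Date"] = df["Date"].ffill()

# Default: group keys are sorted as strings, so "01-May" comes first.
sorted_out = df.groupby("Date", as_index=False).sum()
print(sorted_out["Date"].tolist())  # ['01-May', '02-Apr']

# sort=False keeps groups in order of first appearance.
ordered_out = df.groupby("Date", as_index=False, sort=False).sum()
print(ordered_out["Date"].tolist())  # ['02-Apr', '01-May']
```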
If you want a separator between the merged values, things get a bit more complicated.
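sep = ", "
df["Date"] = df["Date"].ffill()
# Append the separator to every entry so that summing yields "a, b, ".
df["Details"] += sep
df = df.fillna("").groupby("Date", as_index=False).sum()
# Strip the trailing separator left after the last entry of each group.
df["Details"] = df["Details"].str[:-len(sep)]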
This gives the following result.
     Date                          Details
0  01-Apr  Payment, Supplier Payment, 1000
1  02-Apr              Payment, SGD 1658.5
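As a possible alternative to appending and then stripping the separator, the join could be passed straight to the aggregation. This is a different technique from the one shown above, sketched here under the assumption that Details contains only strings and no missing values; result is a hypothetical name.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Date": ["01-Apr", np.nan, np.nan, "02-Apr", np.nan],
    "Details": ["Payment", "Supplier Payment", "1000", "Payment", "SGD 1658.5"],
})

sep = ", "
# Propagate each date down to its continuation rows.
df["Date"] = df["Date"].ffill()
# Join the details of each date with the separator in one aggregation step.
result = df.groupby("Date", as_index=False, sort=False).agg({"Details": sep.join})
print(result["Details"].tolist())
# ['Payment, Supplier Payment, 1000', 'Payment, SGD 1658.5']
```

Because nothing is appended beforehand, no trailing separator has to be removed afterwards.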