Pandas: convert dummy columns into a new column

Question

I have a dataframe that discretizes customers into different Qs. It looks like this:

    CustomerID_num  Q1  Q2  Q3  Q4  Q5  Country
0   12346           1   0   0   0   0   United Kingdom
2   12347           0   0   0   0   1   Iceland
9   12348           0   1   0   0   0   Finland
13  12349           0   0   0   0   1   Italy
14  12350           0   1   0   0   0   Norway

What I want to do is add a new column Q to the dataframe that shows which segment the customer belongs to, so that it looks like this:

    CustomerID_num  Q1  Q2  Q3  Q4  Q5  Q    Country
0   12346           1   0   0   0   0   1    United Kingdom
2   12347           0   0   0   0   1   5    Iceland
9   12348           0   1   0   0   0   2    Finland
13  12349           0   0   0   0   1   5    Italy
14  12350           0   1   0   0   0   2    Norway

The only way I can think of is a for loop, but that gets messy. Is there another way?

Answer 1

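One option is to drop down to NumPy.

Filter for the Q columns only:

cols = df.filter(like="Q")

Get the positions of the columns that equal 1:

# nonzero() returns (row indices, column indices); only the column indices are needed
_, positions = cols.to_numpy().nonzero()
# positions are 0-based, so add 1 to get the Q number
df.assign(Q=positions + 1)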
    CustomerID_num  Q1  Q2  Q3  Q4  Q5         Country  Q
0            12346   1   0   0   0   0  United Kingdom  1
2            12347   0   0   0   0   1         Iceland  5
9            12348   0   1   0   0   0         Finland  2
13           12349   0   0   0   0   1           Italy  5
14           12350   0   1   0   0   0          Norway  2
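
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same steps. The `assert` guard and the `result` name are my additions, not part of the answer: `nonzero()` yields one entry per nonzero cell, so the approach presumably only lines up with the rows when every row contains exactly one 1.

```python
import pandas as pd

# Sample frame rebuilt from the data shown in the question.
df = pd.DataFrame(
    {
        "CustomerID_num": [12346, 12347, 12348, 12349, 12350],
        "Q1": [1, 0, 0, 0, 0],
        "Q2": [0, 0, 1, 0, 1],
        "Q3": [0, 0, 0, 0, 0],
        "Q4": [0, 0, 0, 0, 0],
        "Q5": [0, 1, 0, 1, 0],
        "Country": ["United Kingdom", "Iceland", "Finland", "Italy", "Norway"],
    },
    index=[0, 2, 9, 13, 14],
)

# Keep only the columns whose name contains "Q".
cols = df.filter(like="Q")

# Added guard (assumption): each row must have exactly one 1,
# otherwise positions would not have one entry per row.
assert (cols.sum(axis=1) == 1).all()

# Column index of the 1 in each row, in row order.
_, positions = cols.to_numpy().nonzero()

# 0-based position -> 1-based Q number.
result = df.assign(Q=positions + 1)
print(result)
```

Note that `filter(like="Q")` matches any column name containing "Q", so once a column named `Q` exists, rerunning the filter on the result would pick it up as well.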

Answer 2

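Here are some other options:

import numpy as np

# select only the columns whose name starts with "Q"
d = df.loc[:, lambda x: x.columns.str.startswith("Q")]

Option 1:

np.where(d)[-1] + 1

Option 2:

np.argmax(d.to_numpy(), axis=1) + 1

Option 3:

d.set_axis(range(d.shape[1]), axis=1).idxmax(axis=1) + 1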
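
A quick sketch to check that the three options agree, using a small made-up frame (the data and the names `opt1`, `opt2`, `opt3`, `zeros` are hypothetical, chosen only for illustration). It also shows where they differ: on a row with no 1 at all, `np.where` returns nothing for that row, while `np.argmax` falls back to position 0 and so would report Q1.

```python
import numpy as np
import pandas as pd

# Hypothetical one-hot frame: each row has exactly one 1.
d = pd.DataFrame({"Q1": [1, 0, 0], "Q2": [0, 0, 1], "Q3": [0, 1, 0]})

# Option 1: column indices of the nonzero cells.
opt1 = np.where(d)[-1] + 1

# Option 2: index of the row-wise maximum.
opt2 = np.argmax(d.to_numpy(), axis=1) + 1

# Option 3: relabel columns 0..n-1, then take the label of the row-wise maximum.
opt3 = d.set_axis(range(d.shape[1]), axis=1).idxmax(axis=1) + 1

print(opt1, opt2, opt3.to_numpy())

# Edge case: a row of all zeros.
zeros = np.zeros((1, 3), dtype=int)
print(np.where(zeros)[-1])            # empty: no entry for that row
print(np.argmax(zeros, axis=1) + 1)   # reports 1 even though nothing is set
```

Options 1 and 2 return NumPy arrays, while option 3 returns a pandas Series that keeps the original index.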

Answer 3

df.loc[df["Q1"] == 1, "Q"] = 1
df.loc[df["Q2"] == 1, "Q"] = 2
df.loc[df["Q3"] == 1, "Q"] = 3
df.loc[df["Q4"] == 1, "Q"] = 4
df.loc[df["Q5"] == 1, "Q"] = 5

This is a possible solution using loc from pandas. Here is the documentation: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.loc.html

Each statement sets the value of the column named "Q" for every row where the condition is true.
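
If the number of Q columns changes, the five statements can be generated instead of typed out. This is only a sketch on a made-up three-column frame; the regex and the final `astype(int)` are my assumptions, not part of the answer. The loop runs once per column, not once per row, so it stays cheap.

```python
import pandas as pd

# Hypothetical one-hot frame for illustration.
df = pd.DataFrame({"Q1": [1, 0, 0], "Q2": [0, 0, 1], "Q3": [0, 1, 0]})

# Column names are collected before the loop, so the new "Q" column is not included.
q_columns = df.filter(regex=r"^Q\d+$").columns

for col in q_columns:
    # "Q3" -> 3; rows where this column is 1 get that number.
    df.loc[df[col] == 1, "Q"] = int(col[1:])

# The first assignment creates "Q" as a float column padded with NaN.
# Casting back to int assumes every row was matched by some column.
df["Q"] = df["Q"].astype(int)
print(df)
```

If some row has no 1 in any Q column, its `Q` stays NaN and the cast raises, which at least makes the gap visible.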

3 answers

Answer 1
2 2022-09-18 23:26:40

Answer 2
0 2023-01-16 23:36:59

Answer 3
-1 2022-09-18 23:10:53
