How to add a column to slices of a DataFrame and apply the changes

Question

I need to perform a data operation on different subsets of a dataframe and add an additional column, while maintaining the order of the data.

For example:

import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [0, 1, 1, 0]})

a = df[df.x > 2]
b = df[df.x <= 2]

a["foo"] = ["a", "b"]
b["foo"] = ["e", "e"]  # one value per row of b

This triggers the usual warning about setting values on a slice (SettingWithCopyWarning). Dataframes a and b are both changed, but df remains untouched.

I am wondering if there is a way to get df to end up like this:

x     y     foo 
1.    0.    "e"
2.    1.    "e"
3.    1.    "a"
4.    0.    "b"

Note that the operations that set foo are quite different and complicated; the above is just an example. The question is not about how to get this particular output, but about how to subset a dataframe, perform operations on each subset, and combine the results with the added column while maintaining the original order.

Thanks in advance.

Answer 1

df.loc[df.x.gt(2), 'foo'] = ['a', 'b'] # Note, this only works because df.x.gt(2) returns 2 rows.
df.loc[df.x.le(2), 'foo'] = 'e'

Output:

   x  y foo
0  1  0   e
1  2  1   e
2  3  1   a
3  4  0   b
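
Since the question says the real per-subset operations are more complicated than assigning constants, here is a minimal sketch of two general patterns. The helper functions foo_for_large and foo_for_small are hypothetical stand-ins for that logic, not part of the original question. Both patterns rely on the subsets keeping df's original index labels, which is what makes it possible to put the rows back in order.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [0, 1, 1, 0]})

# Hypothetical stand-ins for the "different and complicated" per-subset logic.
# Each returns one value per row, indexed like the subset it receives.
def foo_for_large(sub):
    return sub["x"].map({3: "a", 4: "b"})

def foo_for_small(sub):
    return pd.Series("e", index=sub.index)

mask = df["x"] > 2

# Pattern 1: write each subset's result back through a boolean mask.
# The Series on the right-hand side is aligned to the rows by index.
out1 = df.copy()
out1.loc[mask, "foo"] = foo_for_large(out1[mask])
out1.loc[~mask, "foo"] = foo_for_small(out1[~mask])

# Pattern 2: work on explicit copies (no slice warning), then concatenate
# and sort by the original index to restore the row order.
a = df[mask].copy()
b = df[~mask].copy()
a["foo"] = foo_for_large(a)
b["foo"] = foo_for_small(b)
out2 = pd.concat([a, b]).sort_index()

print(out1)
print(out2)
```

Pattern 1 keeps everything in one dataframe. Pattern 2 may be easier to read when each subset needs several intermediate columns, at the cost of the extra copies.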
