Replacing values in a Pandas DataFrame
I have a dataframe (named df) as follows:
s01 s03 s06 s07 s08
0 1 1 1 1 1
1 1 1 1 1 1
2 0 1 1 0 1
3 0 0 1 1 0
4 0 0 0 1 1
I would like to replace every 1 with the index value of its row.
The final result should look like this:
s01 s03 s06 s07 s08
0 0 0 0 0 0
1 1 1 1 1 1
2 0 2 2 0 2
3 0 0 3 3 0
4 0 0 0 4 4
This is just a sample. The real dataframe has thousands of rows and thousands of columns.
The priority is efficient code that modifies the data as quickly as possible.
I have thought of 3 possible ways to solve this:
1. Using 2 'for' loops and an 'if' statement, looping either over the pandas object directly or over the data converted to a 2D NumPy array.
2. Using some kind of pandas built-in filtering function on the pandas dataframe.
3. Converting the dataframe into a 2D NumPy array and using some kind of NumPy built-in function to modify the data.
Which is the most time-efficient way?
Is there some other, more efficient way that I haven't thought of?
Thank you
You can do this with mask:
df.mask(df.eq(1), df.index)
Output:
s01 s03 s06 s07 s08
0 0 0 0 0 0
1 1 1 1 1 1
2 0 2 2 0 2
3 0 0 3 3 0
4 0 0 0 4 4
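For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the sample frame from the question and applies mask. The name `result` is just an illustrative choice. The sketch passes the index as a Series with `axis=0` so that the alignment along rows is explicit; that is an assumption on my part, since whether a bare `df.index` is accepted as the replacement may depend on your pandas version.

```python
import pandas as pd

# Rebuild the sample frame from the question (default integer index 0..4).
df = pd.DataFrame(
    {
        "s01": [1, 1, 0, 0, 0],
        "s03": [1, 1, 1, 0, 0],
        "s06": [1, 1, 1, 1, 0],
        "s07": [1, 1, 0, 1, 1],
        "s08": [1, 1, 1, 0, 1],
    }
)

# Where a cell equals 1, take the replacement from the index instead.
# Passing the index as a Series with axis=0 aligns it along the rows
# (assumption: chosen here to make the alignment explicit).
result = df.mask(df.eq(1), df.index.to_series(), axis=0)

print(result)
```

mask returns a new frame and leaves `df` unchanged, so assign the result back if you want to overwrite the original.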
If your index is numeric, as in this sample, and the frame contains only 0s and 1s, you can also multiply each row by its index value:
df.mul(df.index, axis=0)
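Since the question also asks about going through NumPy, here is a rough sketch that puts the multiplication next to a plain NumPy version using `np.where` with broadcasting. The names `by_mul`, `by_numpy`, `values` and `row_labels` are illustrative, not from the question, and the NumPy variant is my own assumption about what "a NumPy built-in function" could look like here.

```python
import numpy as np
import pandas as pd

# Sample frame from the question (default integer index 0..4).
df = pd.DataFrame(
    {
        "s01": [1, 1, 0, 0, 0],
        "s03": [1, 1, 1, 0, 0],
        "s06": [1, 1, 1, 1, 0],
        "s07": [1, 1, 0, 1, 1],
        "s08": [1, 1, 1, 0, 1],
    }
)

# Multiplication: each row is scaled by its index value, so 1 becomes the
# index and 0 stays 0. Only valid when the frame holds nothing but 0s and 1s.
by_mul = df.mul(df.index, axis=0)

# NumPy variant: work on the underlying 2D array.
values = df.to_numpy()
# Column vector of row labels, shape (n_rows, 1), broadcast across columns.
row_labels = df.index.to_numpy()[:, None]
by_numpy = pd.DataFrame(
    np.where(values == 1, row_labels, values),
    index=df.index,
    columns=df.columns,
)

print(by_mul)
print(by_numpy)
```

Which of these is fastest will depend on the size and dtypes of your real data, so timing them on a representative frame (for example with `timeit`) is the reliable way to decide.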