Conditional dataframe manipulation by row

Say I have a df like this:

[image: example dataframe]

and I want a df like this:

[image: desired dataframe]

How would I do this in Python or R? This would be so easy in Excel with a simple IF statement, for example: c5 =IF(c2 = "X", "ccc", c4).

I thought this would be simple in R too, but I tried df <- df %>% mutate(c4 = ifelse(c2 = 'X', paste(c3, c3, c3), c4)), and it fills all the other values with NAs:

[image: resulting dataframe with NA values]

Why is this happening and how would I fix it?

Ideally though, I'd like to do this in Python. I've tried dfply's mutate and ifelse similarly to the above, and pandas' loc, but neither has worked.

This feels like it should be really simple - is there something obvious that I'm missing?

We may need strrep in R:

library(dplyr)
df %>%
   mutate(c4 = ifelse(c2 %in% "X", strrep(c3, nchar(c4)), c4))

-output

  id c2 c3  c4
1  1     a aaa
2  2     b bbb
3  3  X  c ccc

data

df <- structure(list(id = 1:3, c2 = c("", "", "X"), c3 = c("a", "b", 
"c"), c4 = c("aaa", "bbb", "zzz")), class = "data.frame", row.names = c(NA, 
-3L))
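
For readers who want the same logic in pandas, here is a rough sketch of the R answer above. It assumes the data shown in the `structure(...)` call, and uses `isin` and `mask` as stand-ins for `%in%` and `ifelse`; those choices are mine, not the answerer's. As in R, where `NA %in% "X"` is FALSE, `isin` returns False for missing values.

```python
import pandas as pd

# Same data as the R `structure(...)` call above.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "c2": ["", "", "X"],
    "c3": ["a", "b", "c"],
    "c4": ["aaa", "bbb", "zzz"],
})

# Counterpart of `c2 %in% "X"`: missing values count as "no match".
m = df["c2"].isin(["X"])

# Counterpart of `strrep(c3, nchar(c4))`: repeat each c3 value
# as many times as the corresponding c4 string is long.
repeated = df["c3"].str.repeat(df["c4"].str.len().tolist())

# Replace c4 only where the condition holds; keep it elsewhere.
df["c4"] = df["c4"].mask(m, repeated)

print(df)
```

Unlike a fixed repeat count of 3, this follows the length of the existing c4 value, which is what the R call does.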

In pandas:

df.c4.where(df.c2.ne("X"), other=df.c3 * 3)

This reads as

"for the c4 column: where the c2 values are not equal (ne) to "X", keep them as is; otherwise, put the 3-times repeated c3 values".

Example run:

In [182]: df
Out[182]:
   id c2 c3   c4
0   1     a  aaa
1   2     b  bbb
2   3  X  c  zzz

In [183]: df.c4 = df.c4.where(df.c2.ne("X"), other=df.c3 * 3)

In [184]: df
Out[184]:
   id c2 c3   c4
0   1     a  aaa
1   2     b  bbb
2   3  X  c  ccc
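
The run above can be reproduced as a standalone script. This is a minimal sketch that assumes the sample data from the R answer; it uses bracket access (`df["c4"]`) instead of attribute access, which is a stylistic choice on my part and does not change the behavior.

```python
import pandas as pd

# Sample data, assumed to match the dataframe shown in the run above.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "c2": ["", "", "X"],
    "c3": ["a", "b", "c"],
    "c4": ["aaa", "bbb", "zzz"],
})

# Keep c4 where c2 is not "X"; elsewhere take c3 repeated 3 times.
# Multiplying a string column by an integer repeats each string.
df["c4"] = df["c4"].where(df["c2"].ne("X"), other=df["c3"] * 3)

print(df)
```

Note that `where` keeps values where the condition is True, so the condition is written as "not equal to X" rather than "equal to X".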

I think you can just do this in pandas:

m = df['c2'] == 'X'
df.loc[m, 'c4'] = df.loc[m, 'c3'].str.repeat(3)

Look for rows whose 'c2' is 'X', take the 'c3' column for those rows, repeat it 3 times, and modify the 'c4' column in place with .loc.
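
A self-contained sketch of the .loc approach, again assuming the sample data from the R answer. For comparison it also shows `numpy.where`, which is not part of this answer but is probably the closest pandas/NumPy analogue of Excel's IF(condition, value_if_true, value_if_false).

```python
import numpy as np
import pandas as pd

# Sample data, assumed to match the question's dataframe.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "c2": ["", "", "X"],
    "c3": ["a", "b", "c"],
    "c4": ["aaa", "bbb", "zzz"],
})

# Boolean mask: True for rows whose c2 is "X".
m = df["c2"] == "X"

# Approach from the answer: assign only into the masked rows.
by_loc = df.copy()
by_loc.loc[m, "c4"] = by_loc.loc[m, "c3"].str.repeat(3)

# Alternative (not from the answer): build the whole column at once,
# choosing per row between the repeated c3 and the existing c4.
by_where = df.copy()
by_where["c4"] = np.where(m, df["c3"].str.repeat(3), df["c4"])

print(by_loc)
print(by_where)
```

Both variants leave the rows where c2 is not "X" unchanged; the .loc version modifies the column in place, while `np.where` rebuilds it.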
