简体   繁体   English

基于多个其他列有条件地替换数据帧列中的值 - R.

[英]Conditional replacement of values in data frame column based off multiple other columns - R

My data frame looks like this 我的数据框看起来像这样

> tornado_frame
         tornado_names Level      value
1     node per cluster   low  -34.72222
2          TB per node   low  -52.08333
3  expense per cluster   low -104.16667
4             Total TB   low  -62.50000
5  revenue per cluster   low  -52.08333
6     node per cluster  high   20.83333
7          TB per node  high   41.66667
8  expense per cluster  high   52.08333
9             Total TB  high  145.83333
10 revenue per cluster  high  156.25000

I want the table to transform into this 我希望桌子变成这个

> tornado_frame
         tornado_names Level      value
1     node per cluster   low   34.72222
2          TB per node   low   52.08333
3  expense per cluster   low  104.16667
4             Total TB   low  -62.50000
5  revenue per cluster   low  -52.08333
6     node per cluster  high  -20.83333
7          TB per node  high  -41.66667
8  expense per cluster  high  -52.08333
9             Total TB  high  145.83333
10 revenue per cluster  high  156.25000

Where the negative sign in "value" changes if its absolute value is greater than that of the "high" Level column and of the same tornado_name column. 如果“值”的负号在绝对值大于“高”级别列和相同tornado_name列的绝对值时发生变化。

I tried a few nested if's but that got messy for me. 我尝试了几个嵌套if,但这对我来说很麻烦。 Any help would be appreciated! 任何帮助,将不胜感激!

Here is my data: 这是我的数据:

> dput(tornado_frame)
structure(list(tornado_names = structure(c(2L, 4L, 1L, 5L, 3L, 
2L, 4L, 1L, 5L, 3L), .Label = c("expense per cluster", "node per cluster", 
"revenue per cluster", "TB per node", "Total TB"), class = "factor"), 
    Level = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L
    ), .Label = c("high", "low"), class = "factor"), value = c(34.72222, 
    52.08333, 104.16667, -62.5, -52.08333, -20.83333, -41.66667, 
    -52.08333, 145.83333, 156.25)), .Names = c("tornado_names", 
"Level", "value"), class = "data.frame", row.names = c(NA, -10L
))

Here's a possible data.table solution 这是一个可能的data.table解决方案

library(data.table)
setDT(df)[, value := if(diff(abs(value)) < 0) value * -1,
                                            by = tornado_names]
df
#           tornado_names Level     value
#  1:    node per cluster   low  34.72222
#  2:         TB per node   low  52.08333
#  3: expense per cluster   low 104.16667
#  4:            Total TB   low -62.50000
#  5: revenue per cluster   low -52.08333
#  6:    node per cluster  high -20.83333
#  7:         TB per node  high -41.66667
#  8: expense per cluster  high -52.08333
#  9:            Total TB  high 145.83333
# 10: revenue per cluster  high 156.25000

This will check your condition per tornado_names and only change the sign for the values within the groups where the condition is satisfied. 这将检查每个tornado_names的条件,并仅更改满足条件的组内的值的符号。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM