
Change a level of a factor after another level

I want to change the order of the levels of a factor so that a specific level comes right after another level, but I'm struggling to find an efficient way to do it.

Let's assume that we want to reorder the levels of the following factor so that "20" comes right after "10". I tried this and successfully got the expected result:

library(tidyverse)

sample_factor <- factor(1:30)

trial_factor1 <- sample_factor %>% fct_relevel("20", after=which(levels(.)=="10"))
levels(trial_factor1)
#>  [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "20" "11" "12" "13" "14"
#> [16] "15" "16" "17" "18" "19" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30"

However, if the order of the initial factor is reversed, it doesn't work:

trial_factor2 <- sample_factor %>% fct_rev() %>% fct_relevel("20", after=which(levels(.)=="10"))
levels(trial_factor2)
#>  [1] "30" "29" "28" "27" "26" "25" "24" "23" "22" "21" "19" "18" "17" "16" "15"
#> [16] "14" "13" "12" "11" "10" "9"  "20" "8"  "7"  "6"  "5"  "4"  "3"  "2"  "1"

This is probably because, in this case, "20" is initially positioned before "10".

In addition, if I also try to change the order so that "30" comes right after "20" (expected factor levels: ..., 10, 20, 30, ...), the result gets worse:

trial_factor3 <- sample_factor %>% fct_rev() %>% fct_relevel("20", after=which(levels(.)=="10")) %>%
  fct_relevel("30", after=which(levels(.)=="20"))
levels(trial_factor3)
#>  [1] "29" "28" "27" "26" "25" "24" "23" "22" "21" "19" "18" "17" "16" "15" "14"
#> [16] "13" "12" "11" "10" "9"  "20" "8"  "30" "7"  "6"  "5"  "4"  "3"  "2"  "1"

Created on 2022-08-18 by the reprex package (v2.0.1)

In my real situation, I want to change the order of the levels multiple times (more than 5 times), and I don't know the initial order of the factor levels for certain, so I find it really difficult to change the order flexibly.

Thank you in advance for your help!

The issue is that the after argument specifies the position at which to place the level in the final output, whereas with after = which(levels(.) == "10") you are computing after from the current position of the target. If the level you move currently sits before the target, removing it shifts the target one position earlier, so after needs to be reduced by 1. If the level is moved from somewhere after the target, no adjustment is needed. For your application, I think you need to test which of these situations you have. Here's a small helper function that tests this and returns the appropriately releveled factor.

Note: If you're going to be moving more than one level at a time, you will have to make the function a bit more complex so that it counts how many of the moved levels come from before the destination and adjusts accordingly.

library(tidyverse)

f <- factor(1:5)

# works because nothing is removed from before the "after" position
f %>% fct_relevel("5", after = which(levels(.) == "3")) %>% levels()
#> [1] "1" "2" "3" "5" "4"

# fails because you are removing one element from before the "after" position
# so the new location should be shifted by 1
f %>% fct_relevel("3", after = which(levels(.) == "4")) %>% levels()
#> [1] "1" "2" "4" "5" "3"

# this function tests whether the moved level comes from before or after the destination
fct_relevel_after <- function(fct, lev, after){
  l <- levels(fct)
  a <- match(lev, l)
  b <- match(after, l)
  if(a < b) {
    return(fct_relevel(fct, lev, after = b-1))
  } else {
    return(fct_relevel(fct, lev, after = b))
  }
}

# both work as desired
f %>% fct_relevel_after("5", "3") %>% levels()
#> [1] "1" "2" "3" "5" "4"
f %>% fct_relevel_after("3", "4") %>% levels()
#> [1] "1" "2" "4" "3" "5"

Created on 2022-08-17 by the reprex package (v2.0.1)
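The note above about moving several levels at once can be sketched as follows. This is only a minimal sketch: fct_relevel_after_multi is a hypothetical name, and it assumes that every level in lev exists in the factor and that the destination level is not itself one of the levels being moved.

```r
library(forcats)

# Hypothetical helper: move all levels in `lev` so they sit right after `after`.
# Assumes `after` is not one of the moved levels.
fct_relevel_after_multi <- function(fct, lev, after) {
  l <- levels(fct)
  stopifnot(all(lev %in% l), after %in% l, !(after %in% lev))
  b <- match(after, l)
  # every moved level that currently sits before the destination
  # shifts the destination one position earlier once it is removed
  shift <- sum(match(lev, l) < b)
  fct_relevel(fct, lev, after = b - shift)
}

f <- factor(1:5)
levels(fct_relevel_after_multi(f, c("1", "5"), "3"))
#> [1] "2" "3" "1" "5" "4"
```

The moved levels keep the order in which they are passed in lev, so c("1", "5") ends up as "3", "1", "5".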

It seems that fct_relevel is designed to work positionally: make this level the nth level (where n = after + 1, which is a somewhat strange design/naming decision). You, however, want to work based only on the level names (labels).
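One way to picture the positional behaviour is the rough base-R model below. It is an illustration only (relevel_model is a hypothetical name, not forcats code), but it reproduces the results seen in this thread: the moved level is dropped first, and after then counts positions among the remaining levels.

```r
# Rough base-R model of the reordering step (an assumption for illustration,
# not the forcats source): drop the moved level, then insert it at `after`.
relevel_model <- function(lvls, level, after) {
  append(setdiff(lvls, level), level, after = after)
}

lvls <- as.character(1:5)

# "5" sits after position 3, so removing it does not shift "3"
relevel_model(lvls, "5", after = 3)
#> [1] "1" "2" "3" "5" "4"

# "3" sits before position 4, so removing it shifts "4" down to position 3
# and the insertion lands one slot too late
relevel_model(lvls, "3", after = 4)
#> [1] "1" "2" "4" "5" "3"
```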

We can write our own version that works from labels: it translates the after label to a position and accounts for whether the moved level currently sits before or after that label. (Also, why use 30 levels in a sample when 5 will do nicely?)

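library(tidyverse)

fct_relevel_label = function(.f, level, after_label) {
  lev = levels(.f)
  move = which(lev == level)          # current position of the level to move
  target = which(lev == after_label)  # current position of the destination label
  # if the moved level sits at or before the target, removing it shifts the target by 1
  after = if(move <= target) {target - 1} else {target}
  fct_relevel(.f, level, after = after)
}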

factor(1:5) %>% fct_relevel_label("2", after_label = "4") %>% levels
# [1] "1" "3" "4" "2" "5"

factor(1:5) %>% fct_rev() %>% fct_relevel_label("2", after_label = "4") %>% levels
# [1] "5" "4" "2" "3" "1"
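For the situation in the question, where several moves are applied one after another to a factor whose initial order is not known, the same idea can be applied in a loop. The sketch below restates the helper under the hypothetical name move_after so that it runs on its own; the moves table is likewise just an illustrative way to list the (level, destination) pairs.

```r
library(forcats)

# Hypothetical helper (same idea as fct_relevel_label above, restated so the
# block runs on its own): place `level` right after `after_label`.
move_after <- function(.f, level, after_label) {
  lev <- levels(.f)
  move <- match(level, lev)
  target <- match(after_label, lev)
  fct_relevel(.f, level, after = if (move < target) target - 1 else target)
}

# Each row is one move, applied in order: "20" after "10", then "30" after "20".
moves <- data.frame(
  level = c("20", "30"),
  after_label = c("10", "20"),
  stringsAsFactors = FALSE
)

f <- fct_rev(factor(1:30))
for (i in seq_len(nrow(moves))) {
  f <- move_after(f, moves$level[i], moves$after_label[i])
}
levels(f)[19:22]
#> [1] "10" "20" "30" "9"
```

Because positions are looked up again before every move, the result does not depend on where the levels started.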
