How to replace values in columns based on a row value and the column name?

Question

I previously posted a question on GIS StackExchange about subsetting columns based on row values: here.

In short, I would like to set a data value to NA if its column name (e.g. 100) is less than that row's value of s_mean (e.g. 101).

It worked for specific applications, but it no longer works, and I get the following error:

Error: Can't subset columns that don't exist.
x Locations 304, 303, 302, 301, 300, etc. don't exist.
i There are only 197 columns.
Run `rlang::last_error()` to see where the error occurred.

Here is the data:

# A tibble: 2,937 x 197
      ID   doy FireID  Year    sE     NAME    L1NAME   ID_2   area s_count s_mean s_median s_stdev  s_min   doydiff ID_E5    32    33    34    35
   <dbl> <dbl>  <dbl> <dbl> <dbl> <chr>     <chr>     <dbl>  <dbl>   <dbl>  <dbl>    <dbl>   <dbl> <dbl>     <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl>
 1  2246   173  30048  2015     0 A         T         30048 3.86e6       0    100        0       0     0       73      56  267.  265.  264.  265.
 2  2275   174  30076  2015     0 A         T         30076 2.15e6       0    100        0       0     0       74     533  266.  266.  263.  264.
 3   704   294  28542  2015  1381 A         T         28542 6.44e5       0    100        0       0     0       194    562  277.  277.  278.  279.
 4   711   110  28549  2015     0 NA        NA        28549 2.15e5       0    101        0       0     0       9      569  262.  264.  260.  262.
 5   690   161  28528  2015   232 A         T         28528 4.29e5       0    101        0       0     0       60     580  280.  279.  280.  279.
 6   692   331  28530  2015     0 M         M         28530 2.15e5       0    101        0       0     0       130    582  280.  279.  281.  280.
 7   667    47  28506  2015   232 M         M         28506 2.79e6       0     10        0       0     0       37     589  280.  282.  281.  280.
 8   672   188  28511  2015     0 NA        NA        28511 2.79e6       0    101        0       0     0       87     594  254.  261.  259.  254.
 9   657   171  28496  2015   578 NA        NA        28496 8.59e5       0    101        0       0     0       170    611  256.  263.  260.  254.
10   635   301  28474  2015  1084 M         M         28474 1.50e6       0    101        0       0     0       200    621  282.  282.  282.  281.

The data columns continue up to the column named 212; the remaining columns are not shown here.

Here is the script:

polydata = read_csv("path/E15.csv")
polydata$s_mean <- round(polydata$s_mean)
polydata <- polydata[order(polydata$s_mean),]

# slice each row, and put each slice in a list
df_sub = lapply(1:nrow(polydata),
                function(x){
                  polydata[x,c(1,10,polydata$s_mean[x]:187+10)] # + 10 because of the offset: doy_columns start at 11
                })

Why do I get an error saying that I am selecting columns that do not exist when I specify 187+10 as the upper bound of the subset?

What should be changed?

Eventually I want the outcome to look like this (compare the column names to s_mean to understand the desired output):

ID    s_mean    32    33    34    35    36    ...    212
1     30        267   278   270   269   267   ...    298
2     100       NA    NA    NA    NA    NA    ...    298
3     35        NA    NA    NA    242   246   ...    298

Answer 1

We can use across from dplyr and refer to the name of the column being processed with cur_column. Inside across, an ifelse replaces the value with NA whenever the column name, converted to a number, is less than s_mean. I created a toy dataset to illustrate the solution; it can be found at the end of this post.

library(dplyr)

pdat1 %>% 
  mutate(across(`32`:`35`, 
                ~ifelse(s_mean > as.numeric(cur_column()), NA, .)))

#>      ID s_mean  32  33  34  35
#> 1  2246     30 267 265 264 265
#> 2  2275    100  NA  NA  NA  NA
#> 3   704    100  NA  NA  NA  NA
#> 4   711     34  NA  NA 260 262
#> 5   690    101  NA  NA  NA  NA
#> 6   692    101  NA  NA  NA  NA
#> 7   667     10 280 282 281 280
#> 8   672    101  NA  NA  NA  NA
#> 9   657    101  NA  NA  NA  NA
#> 10  635    101  NA  NA  NA  NA
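The real data has day columns running from 32 to 212, so the same idea can be applied without naming the first and last column. The following is a minimal sketch, not part of the original answer: it assumes that every column whose name consists only of digits is a day column, selects those with the tidyselect helper matches, and uses NA_real_ so the columns stay numeric. The data frame toy is a hypothetical miniature made up for this illustration.

```r
library(dplyr)

# Hypothetical miniature of the data: ID, s_mean, and four day columns
toy <- data.frame(
  ID = c(1L, 2L, 3L),
  s_mean = c(30, 100, 34),
  `32` = c(267, 266, 262),
  `33` = c(265, 266, 264),
  `34` = c(264, 263, 260),
  `35` = c(265, 264, 262),
  check.names = FALSE  # keep the numeric column names as they are
)

# Assumption: every all-digit column name is a day column
masked <- toy %>%
  mutate(across(matches("^[0-9]+$"),
                ~ ifelse(s_mean > as.numeric(cur_column()), NA_real_, .x)))

print(masked)
```

Because the columns are matched by name, the result does not depend on where the day columns sit in the table.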

Toy dataset:

pdat1 <- structure(list(ID = c(2246L, 2275L, 704L, 711L, 690L, 692L, 667L, 672L, 
                               657L, 635L), 
                        s_mean = c(30L, 100L, 100L, 34L, 101L, 101L, 10L, 101L, 
                                   101L, 101L), 
                        `32` = c(267, 266, 277, 262, 280, 280, 280, 254, 256, 282), 
                        `33` = c(265, 266, 277, 264, 279, 279, 282, 261, 263, 282), 
                        `34` = c(264, 263, 278, 260, 280, 281, 281, 259, 260, 282), 
                        `35` = c(265, 264, 279, 262, 279, 280, 280, 254, 254, 281)), 
                   class = "data.frame", 
                   row.names = c("1", "2", "3", "4","5", "6", "7", "8", "9", "10"))

#>      ID s_mean  32  33  34  35
#> 1  2246     30 267 265 264 265
#> 2  2275    100 266 266 263 264
#> 3   704    100 277 277 278 279
#> 4   711     34 262 264 260 262
#> 5   690    101 280 279 280 279
#> 6   692    101 280 279 281 280
#> 7   667     10 280 282 281 280
#> 8   672    101 254 261 259 254
#> 9   657    101 256 263 260 254
#> 10  635    101 282 282 282 281
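As for why the original positional subset fails, one likely cause can be sketched as follows. In R the : operator binds more tightly than +, so polydata$s_mean[x]:187+10 is evaluated as (s_mean:187) + 10. When s_mean is greater than 187, the sequence counts down, and adding 10 produces positions beyond the 197 available columns. The value 294 below is an assumption inferred from the locations listed in the error message; it does not appear in the data shown.

```r
# `:` is evaluated before `+`, so this is (101:187) + 10
s_mean <- 101
idx <- s_mean:187 + 10
range(idx)  # 111 197

# Assumed value above 187: the sequence now runs downwards
s_big <- 294
idx_big <- s_big:187 + 10
head(idx_big, 5)  # 304 303 302 301 300, as listed in the error message
any(idx_big > 197)  # TRUE: some positions exceed the 197 columns
```

Selecting the columns by name, as in the across solution, avoids this kind of position arithmetic altogether.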
