How to assign 1s and 0s to columns if the variable in a row matches or does not match in R

I'm an absolute beginner in coding and R, and this is my third week doing it for a project (for biologists: I'm trying to find the sum of risk alleles for a PRS), but I need help with this part.

df
  x y z
1 t c a
2 a t a
3 g g t

So when the code is applied:

  x y z
1 t 0 0
2 a 0 1
3 g 1 0

I'm trying to make it so that, in each row, the value in y or z changes to 1 if it matches x and to 0 if it does not.
I started with:
```
for(i in 1:ncol(df)){
  df[, i]<-df[df$x == df[,i], df[ ,i]<- 1]
}
```
But got all NA values 
In reality, I have 100 columns I have to compare with x in the data frame. Any help is appreciated

A tidyverse approach

library(dplyr)

df <-
  tibble(
    x = c("t","a","g"),
    y = c("c","t","g"),
    z = c("a","a","t")
  )

df %>% 
  mutate(
    across(
      .cols = c(y,z),
      .fns = ~if_else(. == x,1,0) 
    )
  )

# A tibble: 3 x 3
  x         y     z
  <chr> <dbl> <dbl>
1 t         0     0
2 a         0     1
3 g         1     0
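
With 100 columns, listing each one inside c() gets tedious. A minimal sketch, assuming x is the only column that should be left alone, selects everything except x with -x. The name result is just illustrative.

```r
library(dplyr)

df <- tibble(
  x = c("t", "a", "g"),
  y = c("c", "t", "g"),
  z = c("a", "a", "t")
)

# -x selects every column except x, so extra columns are picked up automatically.
# .x is the column currently being processed; x is the reference column.
result <- df %>%
  mutate(across(.cols = -x, .fns = ~ if_else(.x == x, 1, 0)))

result
```

Because the selection is negative, the same call should keep working as more columns are added to the data frame.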

An alternative way to do this is by using ifelse() in base R.

df$y <- ifelse(df$y == df$x, 1, 0)
df$z <- ifelse(df$z == df$x, 1, 0)
df
#  x y z
#1 t 0 0
#2 a 0 1
#3 g 1 0
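
For comparison, the for loop from the question can be repaired along the same lines. This is only a sketch, not the original poster's code: it loops over column names rather than positions, skips x itself, and assigns the result of ifelse() to each column in turn.

```r
df <- data.frame(
  x = c("t", "a", "g"),
  y = c("c", "t", "g"),
  z = c("a", "a", "t"),
  stringsAsFactors = FALSE  # keep the letters as character, not factor
)

# Visit every column except the reference column x.
for (col in setdiff(names(df), "x")) {
  # 1 where the value equals x in the same row, 0 otherwise.
  df[[col]] <- ifelse(df[[col]] == df$x, 1, 0)
}

df
```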

Edit to extend this step to all columns efficiently

For example:

df1
#  x y z w
#1 t c a t
#2 a t a a
#3 g g t m

To edit many columns efficiently, a better approach is to write a function and apply it to all targeted columns in the data frame. Here is a simple function to do the work:

# Returns 1 where the column matches df1$x in the same row, 0 otherwise.
edit_col <- function(any_col) ifelse(any_col == df1$x, 1, 0)

This function takes a single column, compares its elements with the elements of df1$x, and returns 1 where they match and 0 where they do not. To apply it to all targeted columns, you can use apply(). Because x is not a targeted column in your case, you need to exclude it by indexing with [, -1], since it is the first column in df1.

# Here number 2 indicates columns. Use number 1 for rows.

df1[, -1] <- apply(df1[,-1], 2, edit_col)
df1
#  x y z w
#1 t 0 0 1
#2 a 0 1 1
#3 g 1 0 0
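
One thing to be aware of: apply() first converts its input to a matrix. That is fine here because every column is character, but a sketch with lapply(), which works column by column on the data frame itself, avoids the conversion. The names df2 and flag_match are made up for this illustration.

```r
df2 <- data.frame(
  x = c("t", "a", "g"),
  y = c("c", "t", "g"),
  z = c("a", "a", "t"),
  w = c("t", "a", "m"),
  stringsAsFactors = FALSE  # keep the letters as character, not factor
)

# Hypothetical helper: 1L where the column agrees with the reference, else 0L.
flag_match <- function(col, ref) as.integer(col == ref)

# df2[-1] is the list of all columns except the first one (x).
df2[-1] <- lapply(df2[-1], flag_match, ref = df2$x)
df2
```

Passing the reference column as an argument also means the helper does not depend on a particular global data frame.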

Of course, you can also define a function that edits the data frame, so you don't need to call apply() manually.

Here is an example of such a function:

edit_df <- function(any_df){
    # Return 1 where a column's element equals x in the same row, else 0.
    edit_col <- function(any_col) ifelse(any_col == any_df$x, 1, 0)

    # Create a vector containing all names of the targeted columns.
    target_col_names <- setdiff(colnames(any_df), "x")

    # drop = FALSE keeps a data frame even when only one column is targeted.
    any_df[, target_col_names] <- apply(any_df[, target_col_names, drop = FALSE], 2, edit_col)
    return(any_df)
}

Then use the function (it returns a modified copy, so assign the result if you want to keep it):

edit_df(df1)
#  x y z w
#1 t 0 0 1
#2 a 0 1 1
#3 g 1 0 0
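
The question mentions that the end goal is a sum of risk alleles. Assuming that means counting, for each row, how many columns matched x (an assumption, since the question does not spell this out), rowSums() on the recoded columns would give it. The data frame scored and the column n_match are illustrative names.

```r
# Recoded example data, with the same values as the output of edit_df(df1).
scored <- data.frame(
  x = c("t", "a", "g"),
  y = c(0, 0, 1),
  z = c(0, 1, 0),
  w = c(1, 1, 0),
  stringsAsFactors = FALSE
)

# Every column except the reference column x holds a 0/1 flag.
target_cols <- setdiff(colnames(scored), "x")

# Per-row count of matches; unname() drops the row names rowSums() attaches.
scored$n_match <- unname(rowSums(scored[, target_cols, drop = FALSE]))
scored$n_match
```

If the count is needed per column instead of per row, colSums() on the same columns would be the analogous call.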
