Create new columns based on existing columns in R

Question

This is a sample of my data frame. It comes from a survey where the original question was: "Where are you located? Mark all that apply."

Code   Option1   Option2   Option3   Option4
101        A        C         NA        NA
102        B        D         NA        NA
103        A        B         D         NA
104        D        NA        NA        NA
105        A        B         C         D
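
For reproducibility, the sample above could be built roughly as follows. The name df1 matches what most of the answers below use; storing the options as character columns (rather than factors) is an assumption, since the original post does not say.

```r
# Sketch of the sample data; column types are assumed, not taken from the post
df1 <- data.frame(
  Code    = 101:105,
  Option1 = c("A", "B", "A", "D", "A"),
  Option2 = c("C", "D", "B", NA,  "B"),
  Option3 = c(NA,  NA,  "D", NA,  "C"),
  Option4 = c(NA,  NA,  NA,  NA,  "D"),
  stringsAsFactors = FALSE
)
```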

I would like to transform this data so that each location becomes its own column, holding 1 if the respondent marked that location and 0 otherwise:

Code   A   B   C   D
101    1   0   1   0
102    0   1   0   1
103    1   1   0   1
104    0   0   0   1
105    1   1   1   1

I tried using ifelse statements, but I kept getting errors. Any suggestions? Thanks!

Answer 1

Using tidyverse:

library(dplyr)
library(tidyr)
df1 %>%
    pivot_longer(cols = -Code, values_drop_na = TRUE) %>% 
    mutate(n = 1) %>% 
    select(-name) %>% 
    pivot_wider(names_from = value, values_from = n, values_fill = list(n = 0)) %>%
    select(Code, LETTERS[1:4])
#   Code A B C D
#1  101 1 0 1 0
#2  102 0 1 0 1
#3  103 1 1 0 1
#4  104 0 0 0 1
#5  105 1 1 1 1

Or using mtabulate:

library(qdapTools)
cbind(df1[1], +(mtabulate(as.data.frame(t(df1[-1]))) > 0))

Or using melt/dcast:

library(data.table)
dcast(melt(setDT(df1), id.var = 'Code', na.rm = TRUE), Code ~ value, length)

Answer 2

I've done this while converting True/False survey responses to binary 1/0 using gsub:

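# Named to_one/to_zero rather than t/f so that base::t() is not masked
to_one  <- function(x) gsub("A", "1", x)
to_zero <- function(x) gsub("B", "0", x)

# Apply to the first four columns; the results are character, not numeric
df[1:4] <- lapply(df[1:4], to_one)
df[1:4] <- lapply(df[1:4], to_zero)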

I'm sure there's a better way to do this, but this worked for me.
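
One possibly simpler route for the True/False case is to compare each column against the "true" label and coerce the logical result to integer. This is only a sketch: the `responses` data frame, its column names, and the labels "True" and "False" are hypothetical stand-ins for whatever the survey export actually contains.

```r
# Hypothetical True/False survey responses stored as text
responses <- data.frame(
  q1 = c("True", "False", "True"),
  q2 = c("False", "False", NA),
  stringsAsFactors = FALSE
)

# x == "True" yields TRUE/FALSE (NA stays NA); as.integer() maps that to 1/0
responses[] <- lapply(responses, function(x) as.integer(x == "True"))
```

Unlike the gsub version, this returns integer columns directly, and any value other than "True" (apart from NA) becomes 0.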

Answer 3

You can try:

tab <- table(cbind(df[1], unlist(df[-1])))
cbind(Code = row.names(tab), as.data.frame.matrix(tab), row.names = NULL)

  Code A B C D
1  101 1 0 1 0
2  102 0 1 0 1
3  103 1 1 0 1
4  104 0 0 0 1
5  105 1 1 1 1

Answer 4

Assuming 'df1' is your table, this approach takes a few more lines but is easy to understand:

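library(tidyverse)
library(reshape2)

df1 %>% 
  gather(key, value, -Code) %>%   # stack the Option columns, keeping Code as the id
  dcast(Code ~ value, fun.aggregate = length, value.var = "value") %>%
  select(-'NA')                   # drop the column that counts missing options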

Your result is:

  Code A B C D
1  101 1 0 1 0
2  102 0 1 0 1
3  103 1 1 0 1
4  104 0 0 0 1
5  105 1 1 1 1
