
Add column to dataset

I have a dataset with 2 columns: the name of a director and a specific award he or she received.

Here is my data:

df <- structure(list(Name = c("Mark", "Joseph", "Lucas"), Achievement = c("Cyber Award", 
"Biology Award", "Co-author of 'New are of technology safety'"
)), class = "data.frame", row.names = c(NA, -3L))
    Name                                 Achievement
1   Mark                                 Cyber Award
2 Joseph                               Biology Award
3  Lucas Co-author of 'New are of technology safety'

Now I want to add a third column indicating whether the achievement relates to any of the strings in this vector:

my_vector <- c("cyber", "Cyber", "technology", "Technology", "computer", "Computer")

(So three terms, each in capitalized and lowercase form.)

Desired output:

    Name                                 Achievement Cyber Achievement
1   Mark                                 Cyber Award                 1
2 Joseph                               Biology Award                 0
3  Lucas Co-author of 'New are of technology safety'                 1

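I don't know where to start and hope someone can help me.

First, create a pattern using paste with the collapse argument.

Then use str_detect to check whether any of these pattern strings occurs in the column string (Achievement).

If so, 1; otherwise 0: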

library(dplyr)
library(stringr)

pattern <- paste(c("cyber", "Cyber", "technology", "Technology", "computer", "Computer"), collapse = "|")


df %>% 
  mutate(`Cyber Achievement` = ifelse(str_detect(Achievement, pattern), 1, 0))

Or using grepl in base R:

df$`Cyber Achievement` <- ifelse(grepl(pattern, df$Achievement), 1, 0)
    Name                                 Achievement Cyber Achievement
1   Mark                                 Cyber Award                 1
2 Joseph                               Biology Award                 0
3  Lucas Co-author of 'New are of technology safety'                 1
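
Because the vector lists each term twice only to cover capitalization, a case-insensitive match may be simpler. The following is a minimal sketch in base R, assuming the same data as in the question; terms and pattern_ci are names introduced here for illustration. Note that it also matches other capitalizations, such as "CYBER".

```r
# Same data as in the question
df <- structure(list(Name = c("Mark", "Joseph", "Lucas"),
  Achievement = c("Cyber Award", "Biology Award",
                  "Co-author of 'New are of technology safety'")),
  class = "data.frame", row.names = c(NA, -3L))

# One entry per term; capitalization is handled by ignore.case below
terms <- c("cyber", "technology", "computer")
pattern_ci <- paste(terms, collapse = "|")  # "cyber|technology|computer"

# grepl() returns TRUE/FALSE; as.integer() converts that to 1/0
df$`Cyber Achievement` <- as.integer(
  grepl(pattern_ci, df$Achievement, ignore.case = TRUE)
)
df
```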

Data:

structure(list(Name = c("Mark", "Joseph", "Lucas"), Achievement = c("Cyber Award", 
"Biology Award", "Co-author of 'New are of technology safety'"
)), class = "data.frame", row.names = c(NA, -3L))

Another option:

library(dplyr)
library(stringr)
condition <- c("Cyber", "cyber", "Technology", "technology", "Computer", "computer")
df %>% 
  rowwise() %>% 
  mutate(`Cyber Achievement` = sum(str_detect(Achievement, condition)))

Output:

# A tibble: 3 × 3
# Rowwise: 
  Name   Achievement                                 `Cyber Achievement`
  <chr>  <chr>                                                     <int>
1 Mark   Cyber Award                                                   1
2 Joseph Biology Award                                                 0
3 Lucas  Co-author of 'New are of technology safety'                   1
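
Note that sum() counts how many terms match, so a row containing two of the terms would get 2 rather than 1. If a strict 0/1 flag is wanted, one possible variant wraps the test in any(). This is a sketch in base R and not part of the original answer; has_term is a hypothetical helper name.

```r
# Same data as in the question
df <- structure(list(Name = c("Mark", "Joseph", "Lucas"),
  Achievement = c("Cyber Award", "Biology Award",
                  "Co-author of 'New are of technology safety'")),
  class = "data.frame", row.names = c(NA, -3L))

condition <- c("Cyber", "cyber", "Technology", "technology", "Computer", "computer")

# Hypothetical helper: TRUE if at least one term occurs literally in x
has_term <- function(x, terms) {
  any(vapply(terms, function(t) grepl(t, x, fixed = TRUE), logical(1)))
}

# Apply the helper to each achievement and convert TRUE/FALSE to 1/0
df$`Cyber Achievement` <- as.integer(
  vapply(df$Achievement, has_term, logical(1), terms = condition)
)
df
```

With this variant, a string such as "Cyber technology prize" is flagged 1, whereas the sum() version would return 2.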

