Assigning values to a new column in a dataframe in R

I have a data frame of building and room records; a sample of it is given as dput() output further down.

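I need a new column, "CAGE_LOCATION" (I have been using add_column). The new values need to be based on conditions/rules on Bldg and Bldg-Room, as follows:

  • Bldg 5A = 1
  • Bldg 62 = 2
  • Bldg 62-A = 4
  • Bldg ARL = 5
  • Bldg-Room 53C-106 = 6
  • Bldg-Room 5A-147 = 7
  • Bldg-Room 5A-157 = 7

All other buildings = 3.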
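The rules above can be sketched in base R as two named vectors used as lookups. This is only an illustration: the small data frame is made up, the names bldg_codes, room_codes, by_bldg and by_room are hypothetical, and it assumes that a room rule overrides a building rule when both apply, which the question does not state explicitly.

```r
# Hypothetical miniature of the data; column names follow the question.
df <- data.frame(
  Bldg = c("Bldg 5A", "Bldg 5A", "Bldg 62", "Bldg 62-A",
           "Bldg ARL", "Bldg 53C", "Bldg 53C"),
  `Bldg-Room` = c("5A-142", "5A-157", "62-110", NA,
                  NA, "53C-106", "53C-102"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# The rules as named vectors: names are the keys, values are the codes.
bldg_codes <- c("Bldg 5A" = 1, "Bldg 62" = 2, "Bldg 62-A" = 4, "Bldg ARL" = 5)
room_codes <- c("53C-106" = 6, "5A-147" = 7, "5A-157" = 7)

# Indexing a named vector by a character vector yields NA for
# keys that are unknown or missing.
by_bldg <- unname(bldg_codes[df$Bldg])
by_room <- unname(room_codes[df[["Bldg-Room"]]])

# Assumption: room rule first, then building rule, then the default 3.
df$CAGE_LOCATION <- ifelse(!is.na(by_room), by_room,
                           ifelse(!is.na(by_bldg), by_bldg, 3))

print(df)
```

Keeping the rules in two separate vectors makes it easy to add a building or room later without touching the assignment logic.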

The desired result is the same data frame with CAGE_LOCATION filled in according to these rules.

I tried merging with a Location dataframe and using ifelse statements, but only part of the list gets the correct values.

Output from dput():

structure(list(Bldg = c("Bldg 53A", "Bldg 53A", "Bldg 53A", "Bldg 53A", 
"Bldg 53C", "Bldg 53C", "Bldg 53C", "Bldg 53C", "Bldg 53C", "Bldg 53C", 
"Bldg 53C", "Bldg 53C", "Bldg 53C", "Bldg 53C", "Bldg 53C", "Bldg 53C", 
"Bldg 53C", "Bldg 53C", "Bldg 53C", "Bldg 53C", "Bldg 53C", "Bldg 53C", 
"Bldg 53C", "Bldg 5A", "Bldg 5A", "Bldg 5A", "Bldg 5A", "Bldg 5A", 
"Bldg 5A", "Bldg 5A", "Bldg 5A", "Bldg 5A", "Bldg 5A", "Bldg 5A", 
"Bldg 5A", "Bldg 5A", "Bldg 5A", "Bldg 5A", "Bldg 62", "Bldg 62-A", 
"Bldg 62-A", "Bldg 62-A", "Bldg 62-A", "Bldg 62-A", "Bldg 62-A"
), `Bldg-Room` = c("53A-113", NA, "53A-114", NA, "53C-102", NA, 
"53C-104", "53C-109", NA, NA, "53C-110 MS", NA, NA, NA, "53C-121", 
NA, "53C-122", NA, "53C-123", NA, "53C-131", NA, NA, "5A-142", 
NA, NA, NA, "5A-143", NA, NA, NA, "5A-146", "5A-148", NA, NA, 
"5A-157", "5A-181", "5A-183", "62-110", "62A-176", NA, "62A-178", 
NA, "62A-179 MS", NA), CAGE_LOCATION_TYPE = c(3, 3, 3, 3, 3, 
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 4, 4, 4, 4, 4, 4)), class = "data.frame", row.names = c(NA, 
-45L))

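You need to create a lookup table and then join it to each of the two matching columns in your data. You can then replace any missing values with 3.

First, let's create a reproducible dataframe in the same format as your data. I use "OTHER LOCATION" to mark the locations that we do not expect to match in a given column.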

library(conflicted)
library(tidyverse)

sample_data <-
  tibble(
    Bldg = c(
      "Bldg 5A",
      "Bldg 5A",
      "Bldg 5A",
      "Bldg ARL",
      "OTHER LOCATION",
      "OTHER LOCATION"
    ),
    Bldg_Room = c(
      "OTHER_LOCATION",
      "OTHER_LOCATION",
      "OTHER_LOCATION",
      "OTHER_LOCATION",
      "Bldg-Room 53C-106",
      "OTHER LOCATION"
    )
  )

Now create the lookup table.

building_lookup <-
  tibble(
    key = c(
      "Bldg 5A",
      "Bldg 62",
      "Bldg 62-A",
      "Bldg ARL",
      "Bldg-Room 53C-106",
      "Bldg-Room 5A-147",
      "Bldg-Room 5A-157"
    ),
    value = c(1, 2, 4, 5, 6, 7, 7)
  )

Finally, join your data to the lookup table twice (once per matching column) and use replace_na() to replace any missing values of CAGE_LOCATION with 3.

sample_data %>%
  left_join(building_lookup, by = c("Bldg" = "key")) %>%
  left_join(building_lookup, by = c("Bldg_Room" = "key")) %>% 
  mutate(CAGE_LOCATION = ifelse(is.na(value.x), value.y, value.x)) %>% 
  select(-starts_with("value")) %>% 
  mutate(CAGE_LOCATION = replace_na(CAGE_LOCATION, 3))
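Two details of the pipeline above are worth noting. The ifelse() call prefers the building value, so a row in Bldg 5A with room 5A-157 would receive 1 rather than 7. Also, the lookup keys carry a "Bldg-Room " prefix that the dput() data in the question does not have. A possible variant, sketched here under the assumption that room rules should win, uses two separate lookup tables and coalesce(). The names bldg_lookup, room_lookup, bldg_code and room_code are hypothetical, and the small data frame is made up for illustration.

```r
library(dplyr)

# Hypothetical miniature of the question's data, using its column names.
df <- data.frame(
  Bldg = c("Bldg 5A", "Bldg 5A", "Bldg 62-A", "Bldg 53C", "Bldg 53C"),
  `Bldg-Room` = c("5A-142", "5A-157", NA, "53C-106", "53C-102"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# One lookup table per matching column, keyed by that column's own name.
bldg_lookup <- data.frame(
  Bldg = c("Bldg 5A", "Bldg 62", "Bldg 62-A", "Bldg ARL"),
  bldg_code = c(1, 2, 4, 5),
  stringsAsFactors = FALSE
)
room_lookup <- data.frame(
  `Bldg-Room` = c("53C-106", "5A-147", "5A-157"),
  room_code = c(6, 7, 7),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

result <- df %>%
  left_join(bldg_lookup, by = "Bldg") %>%
  left_join(room_lookup, by = "Bldg-Room") %>%
  # coalesce() takes the first non-missing value per row:
  # room code (assumed priority), then building code, then the default 3.
  mutate(CAGE_LOCATION = coalesce(room_code, bldg_code, 3)) %>%
  select(-bldg_code, -room_code)

print(result)
```

Giving the code columns distinct names avoids the value.x / value.y suffixes, and swapping the first two arguments of coalesce() restores building priority if that is what is wanted.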

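The following solution is based on the tidyverse and uses case_when(), a general vectorised if statement. It builds two helper variables with case_when(), X (from Bldg) and Y (from Bldg-Room), and creates CAGE_LOCATION by taking Y where it is available and X otherwise.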

library(dplyr)

df %>%
  mutate(X = case_when(Bldg == 'Bldg 5A' ~ 1,
                       Bldg == 'Bldg 62' ~ 2,
                       Bldg == 'Bldg 62-A' ~ 4,
                       Bldg == 'Bldg ARL' ~ 5,
                       TRUE ~ 3),
         Y = case_when(`Bldg-Room` == '53C-106' ~ 6,
                       `Bldg-Room` == '5A-147' ~ 7,
                       `Bldg-Room` == '5A-157' ~ 7,
                       TRUE ~ NA_real_),
         CAGE_LOCATION = if_else(is.na(Y), X, Y)) %>%
  select(-X, -Y)

#         Bldg  Bldg-Room CAGE_LOCATION_TYPE CAGE_LOCATION
# 1   Bldg 53A    53A-113                  3             3
# 2   Bldg 53A       <NA>                  3             3
# 3   Bldg 53A    53A-114                  3             3
# 4   Bldg 53A       <NA>                  3             3
# 5   Bldg 53C    53C-102                  3             3
# 6   Bldg 53C       <NA>                  3             3
# 7   Bldg 53C    53C-104                  3             3
# 8   Bldg 53C    53C-109                  3             3
# 9   Bldg 53C       <NA>                  3             3
# 10  Bldg 53C       <NA>                  3             3
# 11  Bldg 53C 53C-110 MS                  3             3
# 12  Bldg 53C       <NA>                  3             3
# 13  Bldg 53C       <NA>                  3             3
# 14  Bldg 53C       <NA>                  3             3
# 15  Bldg 53C    53C-121                  3             3
# 16  Bldg 53C       <NA>                  3             3
# 17  Bldg 53C    53C-122                  3             3
# 18  Bldg 53C       <NA>                  3             3
# 19  Bldg 53C    53C-123                  3             3
# 20  Bldg 53C       <NA>                  3             3
# 21  Bldg 53C    53C-131                  3             3
# 22  Bldg 53C       <NA>                  3             3
# 23  Bldg 53C       <NA>                  3             3
# 24   Bldg 5A     5A-142                  1             1
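Because case_when() returns the value of the first condition that is TRUE for each row, the same prioritisation can also be written as a single call with the room rules listed first. This is a sketch on a made-up data frame, not the answer's own code; it relies on case_when() treating an NA condition (from a missing Bldg-Room) as not matched.

```r
library(dplyr)

# Hypothetical miniature of the question's data, using its column names.
df <- data.frame(
  Bldg = c("Bldg 5A", "Bldg 5A", "Bldg ARL", "Bldg 53C", "Bldg 53A"),
  `Bldg-Room` = c("5A-147", NA, NA, "53C-106", "53A-113"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

result <- df %>%
  mutate(CAGE_LOCATION = case_when(
    # Room rules come first so they override the building rules.
    `Bldg-Room` == "53C-106" ~ 6,
    `Bldg-Room` %in% c("5A-147", "5A-157") ~ 7,
    # Building rules apply only when no room rule matched.
    Bldg == "Bldg 5A" ~ 1,
    Bldg == "Bldg 62" ~ 2,
    Bldg == "Bldg 62-A" ~ 4,
    Bldg == "Bldg ARL" ~ 5,
    # Everything else gets the default.
    TRUE ~ 3
  ))

print(result)
```

This avoids the temporary X and Y columns, at the cost of making the priority depend on the order of the conditions.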

