Create numerically encoded dummy variables efficiently in R?

How can we convert data of the form

df <- structure(list(customer_number = c(3, 3, 1, 1, 3), 
                     item = c("milkshake","burger", "apple", "burger", "water")
                       ), 
                row.names = c(NA, -5L), class = "data.frame")


#   customer_number      item
# 1               3 milkshake
# 2               3    burger
# 3               1     apple
# 4               1    burger
# 5               3     water

into numerically encoded dummy variables, like this?


data.frame(customer_number=c(1,3),
           item_milkshake=c(0,1),
           item_burger=c(1,1),
           item_apple=c(1,0),
           item_water=c(0,1))

#   customer_number item_milkshake item_burger item_apple item_water
# 1               1              0           1          1          0
# 2               3              1           1          0          1

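We can add a helper column with the value 1 and then pivot the data to wide format, filling missing customer/item combinations with 0.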

library(dplyr)

df %>%
  mutate(n = 1) %>%
  arrange(customer_number) %>%
  tidyr::pivot_wider(names_from = item, values_from = n,
                     values_fill = list(n = 0), names_prefix = "item_")

# A tibble: 2 x 5
#  customer_number item_apple item_burger item_milkshake item_water
#            <dbl>      <dbl>       <dbl>          <dbl>      <dbl>
#1               1          1           1              0          0
#2               3          0           1              1          1
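
The pipeline above relies on each customer/item pair appearing only once; with repeated pairs, pivot_wider() warns that the values are not uniquely identified and returns list-columns. A minimal sketch of one way to guard against that, assuming a hypothetical data frame df_dup (not part of the question) in which customer 3 buys a burger twice:

```r
library(dplyr)
library(tidyr)

# Hypothetical data: same as df, plus a repeated (3, "burger") row
df_dup <- data.frame(
  customer_number = c(3, 3, 1, 1, 3, 3),
  item = c("milkshake", "burger", "apple", "burger", "water", "burger")
)

wide <- df_dup %>%
  distinct(customer_number, item) %>%  # keep each customer/item pair once
  mutate(n = 1) %>%
  arrange(customer_number) %>%
  pivot_wider(names_from = item, values_from = n,
              values_fill = list(n = 0), names_prefix = "item_")

wide
```

Dropping duplicates first keeps the result a plain 0/1 indicator. If purchase counts were wanted instead, the duplicates would need to be summed rather than removed.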

If you prefer base R functions, here is a simple solution using table():

# Create the dataset
df <- structure(list(customer_number = c(3, 3, 1, 1, 3),
                     item = c("milkshake", "burger", "apple", "burger", "water")),
                row.names = c(NA, -5L), class = "data.frame")

res <- as.matrix(table(df$customer_number,df$item))
res[res > 0 ] <- 1 #dummy variable
res

    apple burger milkshake water
  1     1      1         0     0
  3     0      1         1     1

You can add customer_number to the matrix as a separate column:

res <- cbind(customer_number = as.numeric(rownames(res)), res)
res

  customer_number apple burger milkshake water
1               1     1      1         0     0
3               3     0      1         1     1
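
The result above is still a matrix with unprefixed column names. A minimal base R sketch of how it could be turned into a data frame with the item_ prefix requested in the question; the names tab, flags and out are illustrative only:

```r
df <- data.frame(customer_number = c(3, 3, 1, 1, 3),
                 item = c("milkshake", "burger", "apple", "burger", "water"))

# Cross-tabulate customers against items (counts per pair)
tab <- table(df$customer_number, df$item)

# unclass() drops the "table" class; (x > 0) * 1 turns counts into 0/1 flags
flags <- (unclass(tab) > 0) * 1

# Build a data frame with customer_number as the first column
out <- data.frame(customer_number = as.numeric(rownames(flags)),
                  flags, check.names = FALSE)
names(out)[-1] <- paste0("item_", colnames(flags))
rownames(out) <- NULL  # replace the "1", "3" row labels with 1..n

out
```

Because table() sorts its levels, the item columns come out in alphabetical order rather than in the order shown in the question; reorder them with out[, c(...)] if the exact layout matters.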
