
Why does melt return an NA column in R?

I have the following data frame df in R:

structure(list(disease = structure(c(1L, 1L), .Label = "Barcelona", class = "factor"), 
    `<18` = structure(list(0.193103448275862, 
        0.0445344129554656), .Names = c(NA_character_, NA_character_
    )), `19-25` = structure(list(0.0413793103448276, 
        0.345748987854251), .Names = c(NA_character_, NA_character_
    )), `26-64` = structure(list(0.448275862068966, 0.167611336032389), .Names = c(NA_character_, 
    NA_character_)), `46-64` = structure(list(0.0344827586206897, 
        0.00647773279352227), .Names = c(NA_character_, NA_character_
    )), `>65` = structure(list(0.282758620689655, 
        0.435627530364373), .Names = c(NA_character_, NA_character_
    )), type = structure(1:2, .Label = c("Clinical Trial", "Real-World"
    ), class = "factor")), class = "data.frame", row.names = c(NA, 
-2L))

I want to rearrange the data frame with melt so that I get one value per city, flat type, and age group. However, I get an extra column in the output:

melt(df)
        city  type variable      value          NA
1  Barcelona  flat      <18 0.19310345 0.044534413
2  Barcelona house      <18 0.19310345 0.044534413
3  Barcelona  flat  19 - 25 0.04137931 0.345748988
4  Barcelona house  19 - 25 0.04137931 0.345748988
5  Barcelona  flat  26 - 45 0.44827586 0.167611336
6  Barcelona house  26 - 45 0.44827586 0.167611336
7  Barcelona  flat  46 - 64 0.03448276 0.006477733
8  Barcelona house  46 - 64 0.03448276 0.006477733
9  Barcelona  flat     > 65 0.28275862 0.435627530
10 Barcelona house     > 65 0.28275862 0.435627530

Is there a way to avoid the NA column and get a single value per row in the value column?

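The problem is that your measure columns are of class list, not numeric. If we convert them to numeric, melt works fine. (I show one way to do this, but it would probably be better to go back earlier in your workflow and prevent the columns from being created as lists in the first place. That is definitely what you should do if my code works on your sample data but has problems on larger data. In that case, tidyr::unnest might help.)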

sapply(df, class)
#  disease      <18    19-25    26-64    46-64      >65     type 
# "factor"   "list"   "list"   "list"   "list"   "list" "factor" 

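# Flag the columns that are stored as lists
list_cols = sapply(df, is.list)

# Flatten each list column into an atomic (numeric) vector
df[list_cols] = lapply(df[list_cols], unlist)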

reshape2::melt(df, id.vars = c("disease", "type"))
#      disease           type variable       value
# 1  Barcelona Clinical Trial      <18 0.193103448
# 2  Barcelona     Real-World      <18 0.044534413
# 3  Barcelona Clinical Trial    19-25 0.041379310
# 4  Barcelona     Real-World    19-25 0.345748988
# 5  Barcelona Clinical Trial    26-64 0.448275862
# 6  Barcelona     Real-World    26-64 0.167611336
# 7  Barcelona Clinical Trial    46-64 0.034482759
# 8  Barcelona     Real-World    46-64 0.006477733
# 9  Barcelona Clinical Trial      >65 0.282758621
# 10 Barcelona     Real-World      >65 0.435627530
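
The unlist step above lines up with the rows only when every cell of a list column holds exactly one value. As a minimal sketch of how one might guard against that on larger data, the following uses a hypothetical toy data frame (toy, with columns id and score, which are not part of the question's data) and base R only:

```r
# Hypothetical toy data frame with one list column, mirroring the shape of df
toy <- data.frame(id = c("a", "b"))
toy$score <- list(0.25, 0.75)   # list column: each cell holds one number

# unlist() is a safe row-for-row conversion only if every cell has length 1
stopifnot(all(lengths(toy$score) == 1))

# vapply() converts and enforces "exactly one numeric per cell" at the same time
toy$score <- vapply(toy$score, as.numeric, numeric(1))
class(toy$score)
# "numeric"

# Counter-example: a ragged list yields more values than there are rows,
# so it could not be assigned back to the data frame row-for-row
ragged <- list(0.1, c(0.2, 0.3))
n_flat <- length(unlist(ragged))
n_flat
# 3
```

If the length check fails on real data, that is a sign the list columns need to be fixed where they are created, or expanded into extra rows, rather than flattened in place.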
