
Reshape a long data frame to wide for multiple columns

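There are already many questions about the reshape2 library, but I am interested in a piece of code that reshapes all the columns I need in one go. I have a phone usage dataset that records each user's daily phone usage, such as frequency and duration, broken down by app category. The first 10 rows of the input: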

structure(list(X = 1:10, user_id = c(10161L, 10161L, 10161L, 
10161L, 10161L, 10161L, 10161L, 10161L, 10161L, 10161L), date = c("2019-02-21", 
"2019-02-21", "2019-02-21", "2019-02-21", "2019-02-22", "2019-02-22", 
"2019-02-22", "2019-02-22", "2019-02-22", "2019-02-23"), categories = c("communication", 
"games & entertainment", "lifestyle", "utility & tools", "communication", 
"games & entertainment", "lifestyle", "social network", "utility & tools", 
"communication"), frequency = c(30L, 13L, 3L, 15L, 99L, 19L, 
8L, 2L, 73L, 57L), cat_duration = c(1663.83800005913, 1855.2380001545, 
38.9109997749329, 1016.48200011253, 7044.4249997139, 8498.35199904442, 
71.5590000152588, 741.676999807358, 2657.03099822998, 5145.73099899292
), dur_pro = c(0.363722652841753, 0.40556357472605, 0.00850612383078189, 
0.222207648601416, 0.370504849244312, 0.446974824255909, 0.00376367929444972, 
0.0390088509726144, 0.139747796232715, 0.314487675459045), freq_pro = c(0.491803278688525, 
0.213114754098361, 0.0491803278688525, 0.245901639344262, 0.492537313432836, 
0.0945273631840796, 0.0398009950248756, 0.00995024875621891, 
0.36318407960199, 0.463414634146341), monetary = c(55.4612666686376, 
142.7106153965, 12.970333258311, 67.7654666741689, 71.1558080779182, 
447.281684160233, 8.94487500190735, 370.838499903679, 36.39768490726, 
90.2759824384723), recency = c(6504.5680000782, 5023.14100003242, 
26999.1610000134, 32.4110000133514, 518.858000040054, 209.592000007629, 
30790.6349999905, 4608.17300009727, 14603.4340000153, 68.6960000991821
)), row.names = c(NA, 10L), class = "data.frame")

With the reshape2 library I can convert it to the format I need, but dcast accepts only one variable at a time as the value.var argument, for example frequency:

library(reshape2)
dcast(phone_usage, user_id + date ~ categories, value.var = 'frequency')

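Is there a way to cast all six features (frequency, cat_duration, dur_pro, freq_pro, monetary, recency) at the same time? I believe there is a simpler way than casting them one by one and combining the results.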

Thanks in advance for your contributions!

PS: I am aware of the missing-value problem that arises when the data frame is converted to wide format, but let's ignore it for now.

You can use data.table, whose dcast accepts multiple value.var columns:

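library(data.table)

# setDT() converts phone_usage to a data.table by reference.
# Columns 5 to 10 are frequency, cat_duration, dur_pro, freq_pro, monetary, recency.
dcast(setDT(phone_usage), user_id + date ~ categories, value.var = names(phone_usage)[5:10])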
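
To see what the result looks like, here is a minimal sketch on a tiny made-up sample. The values and the reduced set of two value columns are assumptions for illustration only and are not taken from the question's dataset; the value columns are named explicitly rather than selected by position.

library(data.table)

# Illustrative sample with the same layout as the question's data
# (hypothetical values, only two of the six value columns).
phone_usage_small <- data.frame(
  user_id      = c(1L, 1L, 1L, 2L),
  date         = c("2019-02-21", "2019-02-21", "2019-02-22", "2019-02-21"),
  categories   = c("communication", "lifestyle", "communication", "lifestyle"),
  frequency    = c(30L, 3L, 99L, 8L),
  cat_duration = c(1663.8, 38.9, 7044.4, 71.6)
)

# Name the value columns explicitly so the call does not depend on column order.
value_cols <- c("frequency", "cat_duration")

# One row per user_id/date pair; one column per value column and category.
wide <- dcast(
  as.data.table(phone_usage_small),
  user_id + date ~ categories,
  value.var = value_cols
)

print(wide)

Each new column is named after the value column and the category joined by an underscore, for example frequency_communication. Combinations that do not occur in the long data, such as user 2 with communication here, come out as NA, which is the missing-value issue mentioned in the question.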
