R: create new data.frame with time series from another data.frame
I have a data.frame with the following structure:
> str(prv)
'data.frame': 13184 obs. of 7 variables:
$ date : Factor w/ 103 levels "2020-01-01",..: 1 1 1 1 1 1 1 1 1 1 ...
$ code : int 13 13 13 13 13 17 17 17 21 21 ...
$ region : Factor w/ 21 levels "loc1","loc2",..: 1 1 1 1 1 2 2 2 12 12 ...
$ codprv : int 69 66 68 67 979 77 76 980 21 981 ...
$ denprv : Factor w/ 108 levels "city1","city2",..: 25 44 70 93 42 55 75 42 16 42 ...
$ shortprv : Factor w/ 107 levels "","C1","C2","C3",..: 24 7 65 92 1 58 74 1 20 1 ...
$ sum : int 0 0 0 0 0 0 0 0 0 0 ...
and the data.frame looks like this:
date code region codprv denprv shortprv sum
2020-01-01 13 loc1 69 city1 C1 0
2020-01-01 13 loc1 66 city2 C2 0
2020-01-01 14 loc2 70 city3 C3 0
...
2020-01-02 13 loc1 68 city1 C3 0
2020-01-02 13 loc1 66 city2 C2 5
2020-01-02 14 loc2 70 city3 C3 1
...
2020-01-03 13 loc1 68 city1 C3 15
2020-01-03 13 loc1 66 city2 C2 7
2020-01-03 14 loc2 70 city3 C3 5
...
and so on...
I need to obtain:
date city1 city2 city3 ... cityN
2020-01-01 0 0 0 ... n1
2020-01-02 0 5 1 ... n2
2020-01-03 15 7 5 ... n3
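I only recently learned R, and so far I have used it for statistical analysis, not for time series analysis.
Doing this by hand is not hard, but I would like to know the proper way to do the transformation (and learn how to (re)use it on my own).
Sorry for my English.
Thank you for your attention.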
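You need pivot_wider from tidyr:
# Example data in long format: one row per date and city
df <- data.frame(date = rep(seq(as.Date("2020/1/1"), by = "day", length.out = 4), each = 3),
                 denprv = rep(c("city1", "city2", "city3"), 4),
                 sum = 1:12)
library(tidyr)
# One column per value of denprv, filled with the matching sum
pivot_wider(df, names_from = denprv, values_from = sum)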
# A tibble: 4 x 4
date city1 city2 city3
<date> <int> <int> <int>
1 2020-01-01 1 2 3
2 2020-01-02 4 5 6
3 2020-01-03 7 8 9
4 2020-01-04 10 11 12
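Applied to a data frame shaped like the one in the question, the same call might look like the sketch below. The `prv` built here is a small hypothetical stand-in that borrows a few column names from the str() output, not the real data. It assumes that each date and city pair occurs only once, and that the extra columns (code, region and so on) are not wanted in the result.

```r
library(tidyr)

# Hypothetical stand-in for `prv`, reusing column names from the question
prv <- data.frame(
  date   = factor(rep(c("2020-01-01", "2020-01-02", "2020-01-03"), each = 3)),
  code   = rep(c(13L, 13L, 14L), 3),
  region = factor(rep(c("loc1", "loc1", "loc2"), 3)),
  denprv = factor(rep(c("city1", "city2", "city3"), 3)),
  sum    = c(0L, 0L, 0L, 0L, 5L, 1L, 15L, 7L, 5L)
)

# Keep only the columns that define the wide table. Any other column left in
# would be treated as an identifier, so one date could end up on several rows.
long <- prv[, c("date", "denprv", "sum")]

# date is stored as a factor in the question; convert it to a real Date
long$date <- as.Date(as.character(long$date))

wide <- pivot_wider(long, names_from = denprv, values_from = sum)
wide
```

Dropping the unused columns first is one way to keep a single row per date; passing id_cols = date to pivot_wider is another.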
Your data is in long format and you want it in wide format. Have a look at the material on tidy data.
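The long to wide reshaping can also be sketched in base R, without tidyr. This is an alternative technique, not the approach given in the answer: xtabs cross-tabulates the values of sum by date and city. The `long` data frame below is a hypothetical stand-in built from the sample rows in the question.

```r
# Hypothetical long-format data, modelled on the sample rows in the question
long <- data.frame(
  date   = rep(c("2020-01-01", "2020-01-02", "2020-01-03"), each = 3),
  denprv = rep(c("city1", "city2", "city3"), 3),
  sum    = c(0L, 0L, 0L, 0L, 5L, 1L, 15L, 7L, 5L)
)

# Cross-tabulate: rows are dates, columns are cities, cells hold the total of sum
tab <- xtabs(sum ~ date + denprv, data = long)

# Turn the contingency table into an ordinary data.frame;
# the dates end up as row names rather than as a column
wide <- as.data.frame.matrix(tab)
wide
```

Note the difference in behaviour: xtabs adds up duplicated date and city pairs and fills missing combinations with 0, whereas pivot_wider leaves missing combinations as NA.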