Reshaping a data column into rows using tidyr and/or reshape in R

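I want to move the last column (in the current dataset) into additional rows, with the values that are not available left missing (as in the desired dataset). I have looked through the reshape, reshape2 and tidyr packages. The gather() function in tidyr should be the solution, but so far I have not managed to make it work. Any suggestions? Thanks in advance.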

Current dataset:

  date       citycode cityname       cases deaths cases_03_04
1 01-04-2020        1 city1            197      3         241
2 01-04-2020        2 city2             26      0          32

Desired dataset:

  date       citycode cityname       cases deaths
1 01-04-2020        1 city1            197      3
2 01-04-2020        2 city2             26      0
3 03-04-2020        1 city1            241     NA
4 03-04-2020        2 city2             32     NA

A data.table solution.

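library(data.table)

# Recreate the example data
dt <- fread('date citycode cityname cases deaths cases_03_04
01-04-2020 1 city1 197 3 241
01-04-2020 2 city2 26 0 32')

# Spread `cases` into one column per date; all other columns act as identifiers
wdt <- dcast(dt, ... ~ date, value.var = "cases")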

wdt
#>    citycode cityname deaths cases_03_04 01-04-2020
#> 1:        1    city1      3         241        197
#> 2:        2    city2      0          32         26
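# Rename the extra column so that it is named after its date as well
setnames(wdt, old = "cases_03_04", new = "03-04-2020")

# Gather the two date columns back into (date, cases) pairs
ldt <- melt(wdt, measure.vars = patterns("2020$"), variable.name = "date", value.name = "cases")

# No death counts are available for the second date
ldt[date == "03-04-2020", deaths := NA][]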
#>    citycode cityname deaths       date cases
#> 1:        1    city1     NA 03-04-2020   241
#> 2:        2    city2     NA 03-04-2020    32
#> 3:        1    city1      3 01-04-2020   197
#> 4:        2    city2      0 01-04-2020    26
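
As a possible alternative to the dcast/melt round trip above, the second date can be built as its own table and stacked under the first. This is only a sketch under assumptions: the names `first_day`, `second_day` and `result` are made up for illustration, and the date string "03-04-2020" is read off the column name `cases_03_04` rather than taken from the data.

```r
library(data.table)

# Recreate the example data from the question
dt <- fread('date citycode cityname cases deaths cases_03_04
01-04-2020 1 city1 197 3 241
01-04-2020 2 city2 26 0 32')

# Rows for the original date: every column except the extra cases column
first_day <- dt[, .(date, citycode, cityname, cases, deaths)]

# Rows for the second date: the extra column becomes `cases`.
# The date value is an assumption derived from the column name.
second_day <- dt[, .(date = "03-04-2020", citycode, cityname, cases = cases_03_04)]

# fill = TRUE matches columns by name and pads the missing `deaths` with NA
result <- rbind(first_day, second_day, fill = TRUE)
result
```

Unlike the melt-based version, this keeps the rows in the order of the desired dataset (first date, then second date) and leaves the columns in their original order.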

A tidyr solution.

library(dplyr)
library(tidyr)
library(data.table)  # only needed for fread()

# Recreate the example data
dt <- fread('date citycode cityname cases deaths cases_03_04
01-04-2020 1 city1 197 3 241
01-04-2020 2 city2 26 0 32')

df <- as_tibble(dt)

# Spread `cases` into one column named after the date
wdf <- pivot_wider(df, names_from = date, values_from = cases)

# Rename the extra column so that it is named after its date as well
wdf <- rename(wdf, `03-04-2020` = cases_03_04)

wdf
#> # A tibble: 2 x 5
#>   citycode cityname deaths `03-04-2020` `01-04-2020`
#>      <int> <chr>     <int>        <int>        <int>
#> 1        1 city1         3          241          197
#> 2        2 city2         0           32           26

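# Gather the two date columns back into (date, cases) pairs
ldf <- pivot_longer(wdf, cols = c("03-04-2020", "01-04-2020"), names_to = "date", values_to = "cases")

# No death counts are available for the second date
ldf %>%
  mutate(deaths = ifelse(date == "03-04-2020", NA, deaths))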
#> # A tibble: 4 x 5
#>   citycode cityname deaths date       cases
#>      <int> <chr>     <int> <chr>      <int>
#> 1        1 city1        NA 03-04-2020   241
#> 2        1 city1         3 01-04-2020   197
#> 3        2 city2        NA 03-04-2020    32
#> 4        2 city2         0 01-04-2020    26

Created on 2020-04-17 by the reprex package (v0.3.0)
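
If the widen-then-lengthen round trip feels indirect, a shorter route that may also work is to build the rows for the second date separately and append them with dplyr's bind_rows(). The sketch below is not part of the original answer: `second_day` and `result` are illustrative names, the example data is typed in by hand instead of read with fread(), and the date "03-04-2020" is assumed from the column name `cases_03_04`.

```r
library(dplyr)

# Example data from the question, entered by hand
df <- tibble(
  date        = c("01-04-2020", "01-04-2020"),
  citycode    = c(1L, 2L),
  cityname    = c("city1", "city2"),
  cases       = c(197L, 26L),
  deaths      = c(3L, 0L),
  cases_03_04 = c(241L, 32L)
)

# Rows for the second date: the extra column becomes `cases`.
# The date value is an assumption derived from the column name.
second_day <- df %>%
  transmute(date = "03-04-2020", citycode, cityname, cases = cases_03_04)

# bind_rows() matches columns by name; `deaths` is absent from
# second_day, so it is filled with NA for the appended rows
result <- df %>%
  select(-cases_03_04) %>%
  bind_rows(second_day)

result
```

Because bind_rows() fills columns that are missing from one input with NA, no separate step is needed to blank out `deaths` for the second date.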
