Error widening a data set in R

Question

I have a long data set broken down by geographical location and year, with about 5 variables of interest (see the structure below). Every time I try to convert it to wide form, I am told that there is duplication, so it can't be done.

df
Yr    Geo     Obs1  Obs2  
2001  Dist1    1     3     
2002  Dist1    2     5   
2003  Dist1    4     2    
2004  Dist1    2     1   
2001  Dist2    1     3     
2002  Dist2   .9     5     
2003  Dist2    6     8     
2004  Dist2    2    .2

I want to convert it into something like this:

yr    dist1obs1  dist1obs2  dist2obs1 dist2obs2
2001
2002
2003
2004

Answer 1

Are you looking for something like this?

> reshape(df, v.names= c("Obs1", "Obs2"), idvar="Yr", timevar ="Geo", direction="wide")
    Yr Obs1.Dist1 Obs2.Dist1 Obs1.Dist2 Obs2.Dist2
1 2001          1          3        1.0        3.0
2 2002          2          5        0.9        5.0
3 2003          4          2        6.0        8.0
4 2004          2          1        2.0        0.2
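
The question mentions a complaint about duplication, yet the sample data shown has exactly one row per Yr/Geo pair, so the real data presumably contains repeats. A minimal base R sketch for checking this before reshaping follows. The names `key`, `dup_rows`, and `df_unique` are made up for illustration, and keeping only the first row per pair is just one assumed policy; averaging or fixing the source data may be more appropriate.

```r
# Rebuild the example data from the question.
df <- data.frame(
  Yr   = rep(2001:2004, times = 2),
  Geo  = rep(c("Dist1", "Dist2"), each = 4),
  Obs1 = c(1, 2, 4, 2, 1, 0.9, 6, 2),
  Obs2 = c(3, 5, 2, 1, 3, 5, 8, 0.2)
)

# Flag every row whose Yr/Geo pair occurs more than once.
key <- df[, c("Yr", "Geo")]
dup_rows <- duplicated(key) | duplicated(key, fromLast = TRUE)

# Inspect the offending rows (none for the example data).
df[dup_rows, ]

# Assumed policy: keep the first row for each Yr/Geo pair.
df_unique <- df[!duplicated(key), ]

# Reshape to wide form, one row per year.
wide <- reshape(df_unique, v.names = c("Obs1", "Obs2"),
                idvar = "Yr", timevar = "Geo", direction = "wide")
```

If `df[dup_rows, ]` returns rows for the real data, those are the Yr/Geo pairs that need to be resolved before a one-row-per-year layout is well defined.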

Answer 2

Here is a solution using tidyr. Because spread works with a single key-value pair, you first need to gather the Obs columns and unite Geo with the resulting key, so that there is one key-value pair to spread. I also set the column names to lower case, as shown in the requested output.

library(tidyverse)
tbl <- read_table2(
  "Yr    Geo     Obs1  Obs2
  2001  Dist1    1     3
  2002  Dist1    2     5
  2003  Dist1    4     2
  2004  Dist1    2     1
  2001  Dist2    1     3
  2002  Dist2   .9     5
  2003  Dist2    6     8
  2004  Dist2    2    .2"
)

tbl %>%
  gather("obsnum", "obs", Obs1, Obs2) %>%
  unite(colname, Geo, obsnum, sep = "") %>%
  spread(colname, obs) %>%
  `colnames<-`(str_to_lower(colnames(.)))
#> # A tibble: 4 x 5
#>      yr dist1obs1 dist1obs2 dist2obs1 dist2obs2
#>   <int>     <dbl>     <dbl>     <dbl>     <dbl>
#> 1  2001        1.        3.     1.00      3.00 
#> 2  2002        2.        5.     0.900     5.00 
#> 3  2003        4.        2.     6.00      8.00 
#> 4  2004        2.        1.     2.00      0.200

Created on 2018-04-19 by the reprex package (v0.2.0).
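
In later tidyr releases, gather and spread are superseded by pivot_longer and pivot_wider. The same reshaping can be sketched in a single pivot_wider call, assuming tidyr 1.2.0 or newer (needed for the `names_vary` argument). The variable name `wide` is chosen here for illustration only.

```r
library(tidyr)  # assumes tidyr >= 1.2.0 for names_vary

# Same example data as in the question.
tbl <- data.frame(
  Yr   = rep(2001:2004, times = 2),
  Geo  = rep(c("Dist1", "Dist2"), each = 4),
  Obs1 = c(1, 2, 4, 2, 1, 0.9, 6, 2),
  Obs2 = c(3, 5, 2, 1, 3, 5, 8, 0.2)
)

wide <- pivot_wider(
  tbl,
  id_cols     = Yr,
  names_from  = Geo,
  values_from = c(Obs1, Obs2),
  names_glue  = "{Geo}{.value}",  # build names such as Dist1Obs1
  names_vary  = "slowest"         # group the columns by district
)

# Lower-case the column names to match the requested output.
names(wide) <- tolower(names(wide))
```

When a Yr/Geo pair appears more than once, pivot_wider warns and returns list-columns instead of failing outright; its `values_fn` argument (for example `values_fn = mean`) is one way to collapse such repeats, if averaging is acceptable for the data.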

Answer 1
1 2018-04-19 16:24:25

Answer 2
0 2018-04-19 20:13:21
