Error spreading data set in R
I have a long data set that is broken down by geographical location and year, with about 5 variables of interest (see structure below). Every time I try to convert it to wide form, I get told that there's duplication, so it can't be done.
df
Yr Geo Obs1 Obs2
2001 Dist1 1 3
2002 Dist1 2 5
2003 Dist1 4 2
2004 Dist1 2 1
2001 Dist2 1 3
2002 Dist2 .9 5
2003 Dist2 6 8
2004 Dist2 2 .2
I want to convert it into something like this:
yr dist1obs1 dist1obs2 dist2obs1 dist2obs2
2001
2002
2003
2004
Looking for something like this...?
> reshape(df, v.names= c("Obs1", "Obs2"), idvar="Yr", timevar ="Geo", direction="wide")
Yr Obs1.Dist1 Obs2.Dist1 Obs1.Dist2 Obs2.Dist2
1 2001 1 3 1.0 3.0
2 2002 2 5 0.9 5.0
3 2003 4 2 6.0 8.0
4 2004 2 1 2.0 0.2
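If the exact column names from the question matter, the reshape() result can be renamed afterwards. The following is only a sketch: it rebuilds df from the values posted in the question, and the regular expression assumes the columns follow the ObsN.DistN pattern shown in the output above.

```r
# Rebuild the example data from the question (stringsAsFactors = FALSE keeps Geo as character)
df <- data.frame(
  Yr   = rep(2001:2004, times = 2),
  Geo  = rep(c("Dist1", "Dist2"), each = 4),
  Obs1 = c(1, 2, 4, 2, 1, 0.9, 6, 2),
  Obs2 = c(3, 5, 2, 1, 3, 5, 8, 0.2),
  stringsAsFactors = FALSE
)

# Same call as above: one row per Yr, one column per Obs/Geo combination
wide <- reshape(df, v.names = c("Obs1", "Obs2"), idvar = "Yr",
                timevar = "Geo", direction = "wide")

# Turn "Obs1.Dist1" into "dist1obs1": swap the two parts, drop the dot, lower-case
names(wide) <- tolower(sub("^(Obs[0-9]+)\\.(Dist[0-9]+)$", "\\2\\1", names(wide)))
```

Columns that do not match the pattern (here only Yr) are left alone by sub() and are simply lower-cased.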
Here is a solution using tidyr. Because spread works with a single key-value pair, you first need to gather the Obs columns and unite Geo with the resulting key, so that you have one key-value pair to work with. I also set the column names to lower case, as shown in the requested output.
library(tidyverse)
tbl <- read_table2(
"Yr Geo Obs1 Obs2
2001 Dist1 1 3
2002 Dist1 2 5
2003 Dist1 4 2
2004 Dist1 2 1
2001 Dist2 1 3
2002 Dist2 .9 5
2003 Dist2 6 8
2004 Dist2 2 .2"
)
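tbl %>%
  # stack Obs1 and Obs2 into one key column (obsnum) and one value column (obs)
  gather("obsnum", "obs", Obs1, Obs2) %>%
  # combine Geo and obsnum into a single key, e.g. "Dist1Obs1"
  unite(colname, Geo, obsnum, sep = "") %>%
  # spread that single key-value pair into wide form
  spread(colname, obs) %>%
  # lower-case all column names to match the requested output
  `colnames<-`(str_to_lower(colnames(.)))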
#> # A tibble: 4 x 5
#> yr dist1obs1 dist1obs2 dist2obs1 dist2obs2
#> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 2001 1. 3. 1.00 3.00
#> 2 2002 2. 5. 0.900 5.00
#> 3 2003 4. 2. 6.00 8.00
#> 4 2004 2. 1. 2.00 0.200
Created on 2018-04-19 by the reprex package (v0.2.0).
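For readers on a newer tidyr, the gather/unite/spread steps above can probably be collapsed into one pivot_wider() call. This is a sketch that is not part of the original answer: it assumes tidyr 1.1.0 or later (for the names_glue argument) and rebuilds the data with a plain data.frame() rather than read_table2().

```r
library(tidyr)

# Example data from the question
df <- data.frame(
  Yr   = rep(2001:2004, times = 2),
  Geo  = rep(c("Dist1", "Dist2"), each = 4),
  Obs1 = c(1, 2, 4, 2, 1, 0.9, 6, 2),
  Obs2 = c(3, 5, 2, 1, 3, 5, 8, 0.2),
  stringsAsFactors = FALSE
)

# pivot_wider() accepts several value columns at once;
# names_glue builds each new name from Geo and the value column name (.value)
wide <- pivot_wider(
  df,
  names_from  = Geo,
  values_from = c(Obs1, Obs2),
  names_glue  = "{Geo}{.value}"
)

# Lower-case the names to match the requested output
names(wide) <- tolower(names(wide))
```

The result has one row per year and the same five columns as the spread() version, although the column order may differ.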