How to spread a single column based on multiple columns in R?
Each unique combination of year, site, quadrant, and species has two "Val" values in the dataset. I want to spread these values into two columns, "Val1" and "Val2". I tried the regular spread function, but it doesn't seem to be the right fit. Any suggestions?
Year Site Quadrant Species Val
2019 1 1 A 20
2019 1 1 A 30
2019 1 1 B 20
2019 1 1 B 25
2019 1 2 A 20
2019 1 2 A 10
2019 1 2 B 11
2019 1 2 B 22
Desired Output
Year Site Quadrant Species Val1 Val2
2019 1 1 A 20 30
2019 1 1 B 20 25
2019 1 2 A 20 10
2019 1 2 B 11 22
You can group_by the columns, mutate to make the new column headers, and then spread (or pivot_wider):
library(dplyr)
library(tidyr)  # spread() and pivot_wider() come from tidyr
mydata %>%
group_by(Year, Site, Quadrant, Species) %>%
mutate(Var = paste0("Val", row_number())) %>%
spread(Var, Val) %>%
ungroup()
Result:
# A tibble: 4 x 6
Year Site Quadrant Species Val1 Val2
<int> <int> <int> <chr> <int> <int>
1 2019 1 1 A 20 30
2 2019 1 1 B 20 25
3 2019 1 2 A 20 10
4 2019 1 2 B 11 22
Data:
mydata <- read.table(text = "Year Site Quadrant Species Val
2019 1 1 A 20
2019 1 1 A 30
2019 1 1 B 20
2019 1 1 B 25
2019 1 2 A 20
2019 1 2 A 10
2019 1 2 B 11
2019 1 2 B 22", header = TRUE)
You can also do this with lead. Note that this relies on the two rows of each group being adjacent and on every group having exactly two rows:
library(tidyverse)
df %>%
mutate(id = row_number(),
Val2 = lead(Val)) %>%
filter(id %% 2 == 1) %>%
select(-id, Val1 = Val)
Output:
Year Site Quadrant Species Val1 Val2
<dbl> <dbl> <dbl> <chr> <dbl> <dbl>
1 2019 1 1 A 20 30
2 2019 1 1 B 20 25
3 2019 1 2 A 20 10
4 2019 1 2 B 11 22
Data:
df <- tribble(
~Year, ~Site, ~Quadrant, ~Species, ~Val,
2019, 1, 1, "A", 20,
2019, 1, 1, "A", 30,
2019, 1, 1, "B", 20,
2019, 1, 1, "B", 25,
2019, 1, 2, "A", 20,
2019, 1, 2, "A", 10,
2019, 1, 2, "B", 11,
2019, 1, 2, "B", 22)
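The lead approach above depends on odd/even row positions across the whole table. As a hedged sketch of a variant that picks the values per group instead, the following uses dplyr's nth() inside summarise(). It assumes each group has at most two rows (a group with a single row would get NA in Val2); the name wide is only illustrative, and the data frame is rebuilt with base R so the block runs on its own.

```r
library(dplyr)

# Same values as the question's data, built with base R
df <- data.frame(
  Year     = 2019,
  Site     = 1,
  Quadrant = c(1, 1, 1, 1, 2, 2, 2, 2),
  Species  = c("A", "A", "B", "B", "A", "A", "B", "B"),
  Val      = c(20, 30, 20, 25, 20, 10, 11, 22)
)

# Take the first and second value within each group rather than
# relying on row parity over the whole table
wide <- df %>%
  group_by(Year, Site, Quadrant, Species) %>%
  summarise(Val1 = nth(Val, 1),
            Val2 = nth(Val, 2),
            .groups = "drop")

print(wide)
```

Because the values are taken within each group, this does not break if the groups are not laid out in strict pairs of adjacent rows, although the order of Val1 and Val2 still follows the row order inside each group.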
Using data.table::dcast and rowid:
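library(data.table)
# dtt is not defined in this answer; it is assumed to be the question's data
# as a data.table, e.g. dtt <- as.data.table(mydata)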
dcast(dtt,
Year + Site + Quadrant + Species ~ rowid(Year, Site, Quadrant, Species),
value.var = 'Val')
# Year Site Quadrant Species 1 2
# 1: 2019 1 1 A 20 30
# 2: 2019 1 1 B 20 25
# 3: 2019 1 2 A 20 10
# 4: 2019 1 2 B 11 22
A similar operation can be done the tidyverse way, if you prefer:
dtt %>%
group_by(Year, Site, Quadrant, Species) %>%
mutate(grp = row_number()) %>%
pivot_wider(names_from = grp, values_from = Val, names_prefix = 'Val') %>%
ungroup()
# A tibble: 4 x 6
# Year Site Quadrant Species Val1 Val2
# <int> <int> <int> <chr> <int> <int>
# 1 2019 1 1 A 20 30
# 2 2019 1 1 B 20 25
# 3 2019 1 2 A 20 10
# 4 2019 1 2 B 11 22
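The dcast call above names the new columns 1 and 2 rather than Val1 and Val2. A minimal sketch of one way to match the desired output is to rename them afterwards with data.table's setnames(). The definition of dtt here is an assumption, since the answer does not show how it was created, and wide is just an illustrative name.

```r
library(data.table)

# Assumed construction of dtt from the question's data
dtt <- data.table(
  Year     = 2019L,
  Site     = 1L,
  Quadrant = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  Species  = c("A", "A", "B", "B", "A", "A", "B", "B"),
  Val      = c(20L, 30L, 20L, 25L, 20L, 10L, 11L, 22L)
)

# Cast to wide format; rowid() numbers the rows within each group
wide <- dcast(dtt,
              Year + Site + Quadrant + Species ~ rowid(Year, Site, Quadrant, Species),
              value.var = "Val")

# dcast names the new columns "1" and "2"; rename them by reference
setnames(wide, c("1", "2"), c("Val1", "Val2"))

print(wide)
```

setnames() modifies the table in place, so no reassignment is needed after the rename.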