How do I merge data in R based on row values and create new variable with it?
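I have a dataset in which each row represents the sales of a particular product in a particular geographic area (see Figure 1). I would like to take the prices of the other products and add them as additional columns, so that they can serve as additional variables in a regression (see Figure 2). How can I do this?
My current data:
Desired output: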
Use gather(), unite(), and spread() from the tidyr package (tibble() and the %>% pipe are available once dplyr is loaded):
library(dplyr)  # provides tibble() and the %>% pipe
library(tidyr)  # provides gather(), unite(), spread()

# Sample data: one row per week, city, and product
df <- tibble(
  Time      = rep(c("Week1", "Week2", "Week3"), 4),
  Geography = c(rep("Dallas", 6), rep("Houston", 6)),
  Product   = c(rep("Apple", 3), rep("Orange", 3), rep("Apple", 3), rep("Orange", 3)),
  Volume    = c(1403, 3514, 3388, 2284, 3091, 3558, 3199, 2521, 3381, 2127, 2383, 2469),
  Price     = c(4.01, 4.11, 4.10, 2.63, 2.98, 2.25, 3.67, 3.80, 3.29, 5.30, 5.02, 5.57))
> df
# A tibble: 12 x 5
   Time  Geography Product Volume Price
   <chr> <chr>     <chr>    <dbl> <dbl>
 1 Week1 Dallas    Apple     1403  4.01
 2 Week2 Dallas    Apple     3514  4.11
 3 Week3 Dallas    Apple     3388  4.1
 4 Week1 Dallas    Orange    2284  2.63
 5 Week2 Dallas    Orange    3091  2.98
 6 Week3 Dallas    Orange    3558  2.25
 7 Week1 Houston   Apple     3199  3.67
 8 Week2 Houston   Apple     2521  3.8
 9 Week3 Houston   Apple     3381  3.29
10 Week1 Houston   Orange    2127  5.3
11 Week2 Houston   Orange    2383  5.02
12 Week3 Houston   Orange    2469  5.57
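# Stack Volume and Price into key/value pairs, prefix each key with the
# product name, then spread the combined keys back out into columns.
# The names Measure, Value, and temp are arbitrary intermediate column names.
df <- df %>%
  gather(Measure, Value, -(Time:Product)) %>%  # Measure is "Volume" or "Price"
  unite(temp, Product, Measure) %>%            # e.g. "Apple_Price"
  spread(temp, Value)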
> df
# A tibble: 6 x 6
  Time  Geography Apple_Price Apple_Volume Orange_Price Orange_Volume
  <chr> <chr>           <dbl>        <dbl>        <dbl>         <dbl>
1 Week1 Dallas           4.01         1403         2.63          2284
2 Week1 Houston          3.67         3199         5.3           2127
3 Week2 Dallas           4.11         3514         2.98          3091
4 Week2 Houston          3.8          2521         5.02          2383
5 Week3 Dallas           4.1          3388         2.25          3558
6 Week3 Houston          3.29         3381         5.57          2469
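The same reshape can also be sketched with pivot_wider(), which newer tidyr releases (1.0.0 and later) recommend in place of gather() and spread(). This is an alternative, not part of the answer above. The object names sales, wide, and fit are illustrative, and the lm() call at the end is only one possible way to use the widened columns in a regression, not a recommended model.

```r
library(dplyr)
library(tidyr)  # pivot_wider() needs tidyr >= 1.0.0

# Same sample data as above, rebuilt under a different (illustrative) name
sales <- tibble(
  Time      = rep(c("Week1", "Week2", "Week3"), 4),
  Geography = c(rep("Dallas", 6), rep("Houston", 6)),
  Product   = c(rep("Apple", 3), rep("Orange", 3), rep("Apple", 3), rep("Orange", 3)),
  Volume    = c(1403, 3514, 3388, 2284, 3091, 3558, 3199, 2521, 3381, 2127, 2383, 2469),
  Price     = c(4.01, 4.11, 4.10, 2.63, 2.98, 2.25, 3.67, 3.80, 3.29, 5.30, 5.02, 5.57))

# One row per Time/Geography; one column per Product/measure combination.
# names_glue builds names such as "Apple_Price" to match the output above.
wide <- sales %>%
  pivot_wider(
    names_from  = Product,
    values_from = c(Volume, Price),
    names_glue  = "{Product}_{.value}")

# Illustrative only: the other product's price as an extra regressor
fit <- lm(Apple_Volume ~ Apple_Price + Orange_Price, data = wide)
```

The row and column order of wide may differ from the spread() result, but it contains the same six rows and the same four product columns.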
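P.S. Next time, please paste a sample of the data into the question as text (not as an image). It helps others reproduce the problem and solve it faster.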