简体   繁体   中英

Having trouble using tidyr pivot_wider to spread data

I have a dataset comparing 15 hybrids, each with 5 separate measurements. I am trying to spread the data into a wider dataset using pivot_wider for a regression analysis, since spread() would not work (probably because of the repeated observations).

The dataset I am working with is below:

data <- structure(list(hybrid = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 
4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 
7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 
8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 
10, 10, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 
11, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 12, 12, 12, 
12, 12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 13, 13, 13, 
13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 
14, 14, 14, 14, 14, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 
15, 15, 15), measurement = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 
4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 
1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 
3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 
5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 
2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 
4, 5, 5, 5, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 1, 1, 1, 2, 2, 2, 3, 
3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 
5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 
2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 
4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5), value = c(245, 
889, 450, 45, 515, 318, 956, 434, 29, 740, 156, 516, 767, 292, 
753, 573, 636, 611, 777, 557, 408, 95, 482, 227, 495, 360, 55, 
76, 393, 37, 667, 802, 724, 900, 885, 191, 79, 143, 531, 398, 
324, 129, 172, 467, 25, 101, 476, 629, 915, 122, 498, 649, 354, 
527, 920, 788, 565, 552, 586, 127, 461, 307, 77, 552, 198, 240, 
816, 144, 136, 781, 593, 421, 233, 264, 812, 407, 492, 932, 940, 
139, 764, 200, 352, 754, 271, 506, 381, 973, 678, 848, 432, 358, 
218, 736, 287, 411, 220, 264, 531, 669, 666, 727, 841, 792, 79, 
460, 159, 426, 90, 395, 793, 507, 262, 814, 157, 641, 230, 870, 
304, 591, 636, 277, 534, 783, 562, 938, 889, 68, 557, 892, 809, 
157, 71, 54, 256, 246, 301, 823, 622, 953, 6, 66, 556, 902, 207, 
832, 248, 540, 192, 65, 381, 712, 15, 323, 1, 193, 146, 637, 
488, 158, 289, 839, 229, 237, 273, 978, 560, 969, 898, 204, 335, 
930, 444, 968, 920, 398, 303, 318, 975, 182, 630, 4, 624, 271, 
272, 438, 661, 728, 32, 106, 473, 465, 498, 33, 189, 918, 704, 
605, 867, 240, 833, 497, 514, 241, 860, 228, 643, 791, 4, 898, 
574, 225, 339, 365, 387, 548, 88, 604, 283)), class = "data.frame", row.names = c(NA, 
-219L))

I'm new to the pivot_wider function, so when I run my code, I get an error:

data%>%
  pivot_wider(cols = -hybrid, names_to = c("1","2","3","4","5"))

Error in pivot_wider(., cols = -hybrid, names_to = c("1", "2", "3", "4",  : 
  unused arguments (cols = -hybrid, names_to = c("1", "2", "3", "4", "5"))

How can I spread this data so that I have 5 columns? Hybrid, 1, 2, 3, 4, 5 (with the values under the columns entitled 1:5).

My guess is that you are you looking for this:

library(tidyr)
pivot_wider(data, id_cols = hybrid, names_from = measurement, values_from = "value", values_fn = sum)

# # A tibble: 15 x 6
#    hybrid   `1`   `2`   `3`   `4`   `5`
#     <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#  1      1  1584   878  1419  1412  1812
#  2      2  1820  1742   804   910   506
#  3      3  2193  1976   753   851   664
#  4      4  1206  1535  1530  2273  1265
#  5      5   845   990  1096  1795  1309
#  6      6  1831  1843  1306  1158  2499
#  7      7  1008  1434  1015  2062  1712
#  8      8  1045  1278  1583  1028  1765
#  9      9   913  1317  1500   957  1449
# 10     10  1037   556  1746  1025  1665
# 11     11  1620   638  1050   340  1283
# 12     12  1357  1488  2427  1469  2332
# 13     13  1019  1787   899  1371   866
# 14     14  1436  1140  2176  1570  1615
# 15     15  1662  1476   929  1023   887

Using dcast from data.table

library(data.table)
dcast(setDT(data), hybrid ~ measurement, sum)
#    hybrid    1    2    3    4    5
# 1:      1 1584  878 1419 1412 1812
# 2:      2 1820 1742  804  910  506
# 3:      3 2193 1976  753  851  664
# 4:      4 1206 1535 1530 2273 1265
# 5:      5  845  990 1096 1795 1309
# 6:      6 1831 1843 1306 1158 2499
# 7:      7 1008 1434 1015 2062 1712
# 8:      8 1045 1278 1583 1028 1765
# 9:      9  913 1317 1500  957 1449
#10:     10 1037  556 1746 1025 1665
#11:     11 1620  638 1050  340 1283
#12:     12 1357 1488 2427 1469 2332
#13:     13 1019 1787  899 1371  866
#14:     14 1436 1140 2176 1570 1615
#15:     15 1662 1476  929 1023  887

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM