
Create a vector in a matrix using data from a dataframe conditioned to variable values in the matrix in R

This is a bit hard to explain in writing, but I'll do my best (this is also my first post here). I have a dataframe "X" like this:

+---+----+----+-----+
|   | A  | B  | C   |
| 1 | 50 | 40 | 30  |
| 2 | 60 | 80 | 40  |
| 3 | 70 | 30 | 20  |
| 4 | 10 | 40 | 100 |
| 5 | 35 | 50 | 20  |
| 6 | 20 | 50 | 30  |
+---+----+----+-----+

And a matrix "Y" like this:

+---+---+---+---+---+---+
| A | C | C | B | A | A |
| 1 | 5 | 5 | 4 | 3 | 6 |
+---+---+---+---+---+---+

(Assume the letters are numbers; I use letters just to explain more clearly.) Now I want R to create a new row in matrix "Y" by extracting data from dataframe "X", depending on the values in the first and second rows of the matrix. For example, for the third column of "Y", the value extracted from the dataframe would be 20: the first row holds "C", the second row holds "5", and in the dataframe the value where column "C" and row "5" intersect is 20. So, basically, I need R to take each column's pair of values from the first and second rows of the matrix, find the cell in the dataframe whose column and row match that pair, and store that cell's value in a third row of matrix "Y". Using the example tables, the result should look like this:

+----+----+----+----+----+----+
| A  | C  | C  | B  | A  | A  |
| 1  | 5  | 5  | 4  | 3  | 6  |
| 50 | 20 | 20 | 40 | 70 | 20 |
+----+----+----+----+----+----+

I hope that was clear enough. I think the function to do this is "subset", but I don't really know how to get the desired result. Thanks for any help.

EDIT: The following is the data for dataframe "X":

structure(list(X = structure(c(52L, 1L, 2L, 3L, 4L, 25L, 26L, 
38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L, 
51L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 
18L, 19L, 20L, 21L, 22L, 23L, 24L, 27L, 28L, 29L, 30L, 31L, 32L, 
33L, 34L, 35L, 36L, 37L), .Label = c("0", "0.5", "1", "1.5", 
"10", "10.5", "11", "11.5", "12", "12.5", "13", "13.5", "14", 
"14.5", "15", "15.5", "16", "16.5", "17", "17.5", "18", "18.5", 
"19", "19.5", "2", "2.5", "20", "20.5", "21", "21.5", "22", "22.5", 
"23", "23.5", "24", "24.5", "25", "3", "3.5", "4", "4.5", "5", 
"5.5", "6", "6.5", "7", "7.5", "8", "8.5", "9", "9.5", "v"
), class = "factor"), AD = c(0.9, 0, 0, 0, 0, 0, 0, 
1, 15, 50, 94, 147, 209, 280, 361, 455, 564, 689, 830, 978, 1130, 
1281, 1431, 1579, 1728, 1872, 2011, 2144, 2263, 2353, 2418, 2462, 
2489, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500), X.1 = c(0.925, 
0, 0, 0, 0, 0, 0, 1, 16, 52, 97, 151, 215, 288, 372, 469, 581, 
710, 854, 1006, 1161, 1315, 1467, 1619, 1770, 1915, 2055, 2189, 
2300, 2381, 2439, 2477, 2496, 2500, 2500, 2500, 2500, 2500, 2500, 
2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
2500, 2500), X.2 = c(0.95, 0, 0, 0, 0, 0, 0, 1, 17, 54, 100, 
156, 222, 297, 383, 483, 599, 731, 879, 1034, 1192, 1348, 1503, 
1657, 1810, 1956, 2096, 2230, 2331, 2405, 2455, 2488, 2500, 2500, 
2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
2500, 2500, 2500, 2500, 2500, 2500, 2500), X.3 = c(0.975, 0, 
0, 0, 0, 0, 0, 1, 18, 56, 104, 161, 228, 305, 394, 497, 616, 
752, 903, 1061, 1222, 1381, 1539, 1696, 1849, 1996, 2135, 2260, 
2354, 2421, 2465, 2491, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
2500, 2500), X.4 = c(1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 19L, 58L, 
107L, 165L, 234L, 314L, 405L, 511L, 634L, 773L, 929L, 1097L, 
1274L, 1459L, 1649L, 1840L, 2030L, 2199L, 2327L, 2415L, 2470L, 
2495L, 2500L, 2500L, 2500L, 2500L, 2500L, 2500L, 2500L, 2500L, 
2500L, 2500L, 2500L, 2500L, 2500L, 2500L, 2500L, 2500L, 2500L, 
2500L, 2500L, 2500L, 2500L, 2500L), X.5 = c(1.025, 0, 0, 0, 0, 
0, 0, 1, 20, 60, 110, 170, 241, 322, 416, 525, 651, 795, 953, 
1125, 1307, 1497, 1692, 1889, 2084, 2245, 2362, 2440, 2485, 2499, 
2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500
), X.6 = c(1.05, 0, 0, 0, 0, 0, 0, 1, 21, 62, 113, 175, 247, 
331, 427, 539, 668, 816, 978, 1154, 1340, 1535, 1735, 1937, 2132, 
2284, 2391, 2459, 2494, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
2500, 2500, 2500, 2500, 2500), X.7 = c(1.075, 0, 0, 0, 0, 0, 
0, 1, 22, 64, 116, 179, 254, 339, 438, 553, 685, 836, 1002, 1182, 
1373, 1572, 1778, 1986, 2175, 2316, 2414, 2473, 2497, 2500, 2500, 
2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500), 
    X.8 = c(1.1, 0, 0, 0, 0, 0, 0, 1, 24, 66, 120, 184, 260, 
    348, 449, 566, 702, 856, 1026, 1211, 1406, 1610, 1821, 2035, 
    2217, 2349, 2437, 2486, 2500, 2500, 2500, 2500, 2500, 2500, 
    2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
    2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500), X.9 = c(1.125, 
    0, 0, 0, 0, 0, 0, 1, 25, 68, 123, 189, 267, 356, 460, 580, 
    719, 877, 1051, 1239, 1439, 1648, 1864, 2080, 2254, 2377, 
    2455, 2495, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
    2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
    2500, 2500, 2500, 2500, 2500, 2500), X.10 = c(1.15, 0, 0, 
    0, 0, 0, 0, 1, 26, 70, 126, 193, 273, 365, 471, 594, 736, 
    897, 1075, 1267, 1472, 1686, 1908, 2119, 2284, 2397, 2467, 
    2496, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
    2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
    2500, 2500, 2500, 2500, 2500), X.11 = c(1.175, 0, 0, 0, 0, 
    0, 0, 1, 27, 72, 129, 198, 279, 373, 482, 608, 753, 917, 
    1099, 1295, 1505, 1724, 1952, 2158, 2313, 2418, 2478, 2498, 
    2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
    2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
    2500, 2500, 2500, 2500), X.12 = c(1.2, 0, 0, 0, 0, 0, 0, 
    1, 28, 74, 132, 203, 286, 382, 493, 622, 770, 937, 1123, 
    1324, 1537, 1761, 1995, 2197, 2343, 2439, 2489, 2500, 2500, 
    2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
    2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
    2500, 2500, 2500), X.13 = c(1.225, 0, 0, 0, 0, 0, 0, 1, 29, 
    76, 136, 207, 292, 390, 504, 635, 787, 958, 1147, 1352, 1570, 
    1799, 2036, 2231, 2368, 2454, 2496, 2500, 2500, 2500, 2500, 
    2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
    2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
    2500), X.14 = c(1.25, 0, 0, 0, 0, 0, 0, 1, 30, 78, 139, 212, 
    298, 399, 515, 649, 803, 978, 1171, 1380, 1603, 1838, 2071, 
    2257, 2386, 2464, 2497, 2500, 2500, 2500, 2500, 2500, 2500, 
    2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
    2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500), X.15 = c(1.275, 
    0, 0, 0, 0, 0, 0, 1, 31, 80, 142, 217, 305, 407, 525, 662, 
    820, 998, 1195, 1408, 1636, 1877, 2107, 2284, 2404, 2473, 
    2498, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
    2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
    2500, 2500, 2500, 2500, 2500, 2500), X.16 = c(1.3, 0, 0, 
    0, 0, 0, 0, 1, 32, 82, 145, 221, 311, 415, 536, 676, 836, 
    1018, 1219, 1437, 1668, 1915, 2142, 2310, 2423, 2483, 2499, 
    2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
    2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 2500, 
    2500, 2500, 2500, 2500, 2500)), .Names = c("X", "AD", 
"X.1", "X.2", "X.3", "X.4", "X.5", "X.6", "X.7", "X.8", "X.9", 
"X.10", "X.11", "X.12", "X.13", "X.14", "X.15", "X.16"), class = "data.frame", row.names = c(NA, 
-52L))

I cannot paste all the data from matrix "Y" because it is too big: it is a 3x1644 matrix. Here are the first few columns of "Y", though:

structure(list(V1 = structure(1:3, .Label = c("", "AD", "WS"), class = "factor"), 
    V2 = structure(c(3L, 1L, 2L), .Label = c("1.2", "3.5", "V1"
    ), class = "factor"), V3 = structure(c(3L, 1L, 2L), .Label = c("1.2", 
    "4", "V2"), class = "factor"), V4 = structure(c(3L, 1L, 2L
    ), .Label = c("1.2", "3.5", "V3"), class = "factor")), .Names = c("V1", 
"V2", "V3", "V4"), class = "matrix", row.names = c(NA, -3L
))

Please note that the matrix turned into a dataframe when I extracted the first columns to post them here, but it is still a matrix in my data.

Try something like this:

# For each column z of Y: the row of X is Y[1, z], the column of X is the name of column z
rbind(Y, sapply(seq_along(Y), 
                function(z) 
                  X[Y[1, z], names(Y)[z]]))
#    A  C  C  B  A  A
# 1  1  5  5  4  3  6
# 2 50 20 20 40 70 20

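Here, I'm basically subsetting X with [ to pick out the values you're looking for: the number stored in each column of Y selects the row of X, and that column's name selects the column of X. Note that this treats Y as a one-row dataframe whose column names are the letters, as in the sample data below.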
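If the keys are kept as two plain vectors instead of a dataframe, the same lookup can probably be done without sapply by indexing with a two-column matrix of (row, column) positions. This is only a sketch of an alternative, not the code from the answer above; the names col_keys, row_keys, idx and looked_up are made up for illustration.

```r
# Small example data from the question
X <- data.frame(A = c(50, 60, 70, 10, 35, 20),
                B = c(40, 80, 30, 40, 50, 50),
                C = c(30, 40, 20, 100, 20, 30))
col_keys <- c("A", "C", "C", "B", "A", "A")  # first row of Y (hypothetical name)
row_keys <- c(1, 5, 5, 4, 3, 6)              # second row of Y (hypothetical name)

# Each row of idx is one (row, column) position in X;
# match() turns the column names into column numbers
idx <- cbind(row_keys, match(col_keys, names(X)))

# Indexing a matrix with a two-column matrix returns one cell per row of idx
looked_up <- as.matrix(X)[idx]
looked_up
```

With 1644 columns in the real Y, a single vectorized lookup like this avoids calling a function once per column, although for data of this size either approach should be fast enough.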


For the convenience of others, here are X and Y:

X <- structure(list(A = c(50, 60, 70, 10, 35, 20), 
                    B = c(40, 80, 30, 40, 50, 50), 
                    C = c(30, 40, 20, 100, 20, 30)), 
               .Names = c("A", "B", "C"), 
               row.names = c(NA, -6L), 
               class = "data.frame")
Y <- structure(list(A = 1, C = 5, C = 5, B = 4, A = 3, A = 6), 
               .Names = c("A", "C", "C", "B", "A", "A"), 
               row.names = c(NA, -1L), class = "data.frame")
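The real data in the question's EDIT is laid out differently from the small example: the keys appear to be values rather than positions. The sketch below assumes (this is my reading of the dput output, not something the question states) that the first row of X holds the column keys matched by the "AD" row of Y, and that the first column of X holds the row keys matched by the "WS" row of Y. The table here is a four-row excerpt of the posted X (columns X.11 and X.12 only), and the names lookup, ad, ws, values and result are made up for illustration.

```r
# Excerpt of the posted X: first row = column keys, first column = row keys
lookup <- data.frame(X    = c("v", "3", "3.5", "4"),
                     X.11 = c(1.175, 1, 27, 72),
                     X.12 = c(1.2, 1, 28, 74),
                     stringsAsFactors = FALSE)

# Keys from the first three data columns of the posted Y
ad <- c(1.2, 1.2, 1.2)   # "AD" row, assumed to select the column
ws <- c(3.5, 4, 3.5)     # "WS" row, assumed to select the row

col_keys <- unlist(lookup[1, -1])      # 1.175, 1.2
row_keys <- as.numeric(lookup$X[-1])   # 3, 3.5, 4
values   <- as.matrix(lookup[-1, -1])  # table body without the key row and key column

# Translate key values into positions, then pick one cell per (row, column) pair
idx <- cbind(match(ws, row_keys), match(ad, col_keys))
result <- values[idx]
result
```

Because match() compares numbers exactly, this only works if the keys in Y are written the same way as in X; if the real Y stores them as text or factors, they would need to be converted with as.numeric(as.character(...)) first.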
