R output matrix index with values in dataframe
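I am trying to find the "matrix index" of the values in the Position column of a data frame. The "matrix" I want to refer to is either a 3 x 3 or a 4 x 4 matrix, depending on the length of the Position column (1:9 for 3 x 3, 1:16 for 4 x 4). Different groups in col1 will have different Position lengths.
Here is a dummy data frame to demonstrate my problem.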
df <- structure(list(col1 = c("group1", "group1", "group1", "group1",
"group1", "group1", "group1", "group1", "group1", "group2", "group2",
"group2", "group2", "group2", "group2", "group2", "group2", "group2",
"group2", "group2", "group2", "group2", "group2", "group2", "group2",
"group3", "group3", "group3", "group3", "group3", "group3", "group3",
"group3", "group3", "group3", "group3", "group3", "group3"),
col2 = c("A", "Q", NA, "A", "K", "L", "O", "R", "J", "S",
"C", "S", "H", "O", "T", "Z", "D", "Y", "J", "V", "Z", "P",
"L", "X", "D", "K", "M", "X", "E", "P", "U", "Z", "Z", "L",
"W", "X", "F", "K"), Position = c(1L, 2L, 3L, 4L, 5L, 6L,
7L, 8L, 9L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L,
12L, 13L, 14L, 15L, 16L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -38L))
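From this data frame, I want to get a new column Position_ij that gives the i-th row and j-th column of each Position, as if the positions were laid out in a matrix.
For example, Position in "group1" has length 9, so it should refer to a 3 x 3 matrix, and Position_ij should be 1 = "[1, 1]", 2 = "[1, 2]", 3 = "[1, 3]", 4 = "[2, 1]" ..., 9 = "[3, 3]".
For "group2", Position has length 16, so it should refer to a 4 x 4 matrix, and Position_ij should be 1 = "[1, 1]", ..., 4 = "[1, 4]", 5 = "[2, 1]" ..., 16 = "[4, 4]".
For "group3", Position has length 13, which is greater than 9, so it should also refer to a 4 x 4 matrix.
My current approach uses %/% and %% to get the quotient and remainder of Position divided by the side length of the matrix. However, whenever Position is a multiple of the side length, the remainder is 0 where I want 3 or 4, and the row index is one too high.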
library(dplyr)
df %>% group_by(col1) %>%
  mutate(Position_ij = if (n() == 9) {
    paste0("[", (Position %/% 3) + 1, ", ", Position %% 3, "]")
  } else {
    paste0("[", (Position %/% 4) + 1, ", ", Position %% 4, "]")
  })
# A tibble: 38 × 4
# Groups: col1 [3]
col1 col2 Position Position_ij
<chr> <chr> <int> <chr>
1 group1 A 1 [1, 1]
2 group1 Q 2 [1, 2]
3 group1 NA 3 [2, 0] # this should be [1, 3]
4 group1 A 4 [2, 1]
5 group1 K 5 [2, 2]
6 group1 L 6 [3, 0] # this should be [2, 3]
7 group1 O 7 [3, 1]
8 group1 R 8 [3, 2]
9 group1 J 9 [4, 0] # this should be [3, 3]
10 group2 S 1 [1, 1]
# … with 28 more rows
Desired output:
col1 col2 Position Position_ij
<chr> <chr> <int> <chr>
1 group1 A 1 [1, 1]
2 group1 Q 2 [1, 2]
3 group1 NA 3 [1, 3]
4 group1 A 4 [2, 1]
5 group1 K 5 [2, 2]
6 group1 L 6 [2, 3]
7 group1 O 7 [3, 1]
8 group1 R 8 [3, 2]
9 group1 J 9 [3, 3]
10 group2 S 1 [1, 1]
11 group2 C 2 [1, 2]
12 group2 S 3 [1, 3]
13 group2 H 4 [1, 4]
14 group2 O 5 [2, 1]
15 group2 T 6 [2, 2]
16 group2 Z 7 [2, 3]
17 group2 D 8 [2, 4]
18 group2 Y 9 [3, 1]
19 group2 J 10 [3, 2]
20 group2 V 11 [3, 3]
21 group2 Z 12 [3, 4]
22 group2 P 13 [4, 1]
23 group2 L 14 [4, 2]
24 group2 X 15 [4, 3]
25 group2 D 16 [4, 4]
26 group3 K 1 [1, 1]
27 group3 M 2 [1, 2]
28 group3 X 3 [1, 3]
29 group3 E 4 [1, 4]
30 group3 P 5 [2, 1]
31 group3 U 6 [2, 2]
32 group3 Z 7 [2, 3]
33 group3 Z 8 [2, 4]
34 group3 L 9 [3, 1]
35 group3 W 10 [3, 2]
36 group3 X 11 [3, 3]
37 group3 F 12 [3, 4]
38 group3 K 13 [4, 1]
FYI, my actual reference matrices would be 9 x 9 or 10 x 10.
Subtract 1 from 'Position' before applying %/% and %%, then add 1 to each result.
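To see why the shift helps, here is a minimal sketch on a bare vector, assuming positions 1 to 9 filled row by row into a 3 x 3 matrix. The names pos, k, naive_row, naive_col, row_i and col_j are only illustrative and do not appear in the question.

```r
# Positions 1..9 laid out row by row in a 3 x 3 matrix
pos <- 1:9
k <- 3L  # side length of the matrix (assumed known here)

# Naive version: goes wrong at every multiple of k
naive_row <- pos %/% k + 1   # 3 %/% 3 + 1 is 2, but position 3 is in row 1
naive_col <- pos %% k        # 3 %% 3 is 0, but position 3 is in column 3

# Zero-based version: shift to 0..8, divide, then shift back to 1-based
row_i <- (pos - 1) %/% k + 1
col_j <- (pos - 1) %% k + 1

# Side-by-side comparison of the two approaches
print(data.frame(pos, naive_row, naive_col, row_i, col_j))
```

Applied per group with dplyr, the same arithmetic looks like this: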
library(dplyr)
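out <- df %>%
  group_by(col1) %>%
  # shift Position to zero-based, take quotient (row) and remainder (column),
  # then shift both back to one-based
  mutate(Position_ij = if (n() == 9) {
    paste0("[", ((Position - 1) %/% 3) + 1, ", ", (Position - 1) %% 3 + 1, "]")
  } else {
    paste0("[", ((Position - 1) %/% 4) + 1, ", ", (Position - 1) %% 4 + 1, "]")
  }) %>%
  ungroup()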
-output
> as.data.frame(out)
col1 col2 Position Position_ij
1 group1 A 1 [1, 1]
2 group1 Q 2 [1, 2]
3 group1 <NA> 3 [1, 3]
4 group1 A 4 [2, 1]
5 group1 K 5 [2, 2]
6 group1 L 6 [2, 3]
7 group1 O 7 [3, 1]
8 group1 R 8 [3, 2]
9 group1 J 9 [3, 3]
10 group2 S 1 [1, 1]
11 group2 C 2 [1, 2]
12 group2 S 3 [1, 3]
13 group2 H 4 [1, 4]
14 group2 O 5 [2, 1]
15 group2 T 6 [2, 2]
16 group2 Z 7 [2, 3]
17 group2 D 8 [2, 4]
18 group2 Y 9 [3, 1]
19 group2 J 10 [3, 2]
20 group2 V 11 [3, 3]
21 group2 Z 12 [3, 4]
22 group2 P 13 [4, 1]
23 group2 L 14 [4, 2]
24 group2 X 15 [4, 3]
25 group2 D 16 [4, 4]
26 group3 K 1 [1, 1]
27 group3 M 2 [1, 2]
28 group3 X 3 [1, 3]
29 group3 E 4 [1, 4]
30 group3 P 5 [2, 1]
31 group3 U 6 [2, 2]
32 group3 Z 7 [2, 3]
33 group3 Z 8 [2, 4]
34 group3 L 9 [3, 1]
35 group3 W 10 [3, 2]
36 group3 X 11 [3, 3]
37 group3 F 12 [3, 4]
38 group3 K 13 [4, 1]
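Since the real matrices are said to be 9 x 9 or 10 x 10, the hard-coded 3 and 4 could be replaced by a side length computed from the group size. The following is only a sketch under two assumptions that go beyond the question: positions in a group run from 1 to n without gaps, and the matrix is the smallest square that can hold n values. The helper name position_ij is hypothetical.

```r
# Hypothetical helper: convert row-major positions into "[i, j]" labels for a
# k x k matrix, where k is the smallest side length with k * k >= n.
position_ij <- function(pos, n = length(pos)) {
  pos <- as.integer(pos)
  k <- as.integer(ceiling(sqrt(n)))
  sprintf("[%d, %d]", (pos - 1L) %/% k + 1L, (pos - 1L) %% k + 1L)
}

position_ij(1:9)    # 3 x 3 layout
position_ij(1:13)   # 13 > 9, so a 4 x 4 layout
position_ij(1:81)   # 9 x 9 layout
```

Inside the grouped mutate() above, this would be called as Position_ij = position_ij(Position), which removes the if/else on n().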
Or use gl/rowid
library(data.table)
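out2 <- df %>%
  group_by(col1) %>%
  # c(4, 3)[1 + !n() %% 3] picks 3 when the group size is a multiple of 3, else 4;
  # gl() then labels consecutive blocks of that size (the row index), and
  # rowid() counts within each block (the column index)
  mutate(Position_ij = sprintf('[%d, %d]',
    as.integer(gl(n(), c(4, 3)[1 + !n() %% 3], n())),
    rowid(as.integer(gl(n(), c(4, 3)[1 + !n() %% 3], n()))))) %>%
  ungroup()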
-testing
> identical(out2, out)
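As a further alternative that is not part of the answer above, base R's arrayInd() can do the index conversion. It assumes column-major order, so in this sketch the two result columns are swapped to match the row-major layout in the question. The side length k is assumed to be already chosen for the group, and the names pos, k, ij and labels_ij are illustrative.

```r
# Base R sketch: arrayInd() turns linear indices into (row, col) pairs
pos <- 1:13
k <- 4L  # side length, assumed already chosen for this group

# Column-major result: ij[, 1] is the row, ij[, 2] is the column
ij <- arrayInd(pos, .dim = c(k, k))

# Swap the two columns to get row-major "[i, j]" labels
labels_ij <- sprintf("[%d, %d]", ij[, 2], ij[, 1])
print(labels_ij)
```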