
R output matrix index with values in dataframe

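I am trying to find the "matrix index" of the values in the Position column of a data frame. The "matrix" I want to reference is either 3 x 3 or 4 x 4, depending on the length of the Position column (1:9 for a 3 x 3, 1:16 for a 4 x 4). Different groups in col1 will have different lengths of Position.

Here is a dummy data frame to demonstrate my problem.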

df <- structure(list(col1 = c("group1", "group1", "group1", "group1", 
"group1", "group1", "group1", "group1", "group1", "group2", "group2", 
"group2", "group2", "group2", "group2", "group2", "group2", "group2", 
"group2", "group2", "group2", "group2", "group2", "group2", "group2", 
"group3", "group3", "group3", "group3", "group3", "group3", "group3", 
"group3", "group3", "group3", "group3", "group3", "group3"), 
    col2 = c("A", "Q", NA, "A", "K", "L", "O", "R", "J", "S", 
    "C", "S", "H", "O", "T", "Z", "D", "Y", "J", "V", "Z", "P", 
    "L", "X", "D", "K", "M", "X", "E", "P", "U", "Z", "Z", "L", 
    "W", "X", "F", "K"), Position = c(1L, 2L, 3L, 4L, 5L, 6L, 
    7L, 8L, 9L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 
    12L, 13L, 14L, 15L, 16L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 
    9L, 10L, 11L, 12L, 13L)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -38L))

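Rules

From this data frame, I would like to get a new column Position_ij that gives the [i, j] index each Position would have if it were placed in the matrix.

For example, "group1" has a Position length of 9, so it should reference a 3 x 3 matrix, and Position_ij should be 1 = "[1, 1]", 2 = "[1, 2]", 3 = "[1, 3]", 4 = "[2, 1]" ..., 9 = "[3, 3]".

For "group2", the Position length is 16, so it should reference a 4 x 4 matrix, and Position_ij should be 1 = "[1, 1]", ..., 4 = "[1, 4]", 5 = "[2, 1]" ..., 16 = "[4, 4]".

For "group3", the Position length is 13, which is greater than 9, so it should also reference a 4 x 4 matrix.

Current attempt (failed)

My current approach uses %/% and %% to get the quotient and remainder of Position divided by the matrix side length. However, when Position is a multiple of the side length, the remainder is 0, whereas I want 3 or 4 (and the row index ends up one too high).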

library(dplyr)

df %>% group_by(col1) %>% 
  mutate(Position_ij = if (n() == 9) {
      paste0("[", (Position %/% 3) + 1, ", ", Position %% 3, "]")
    } else {
      paste0("[", (Position %/% 4) + 1, ", ", Position %% 4, "]")
      })
# A tibble: 38 × 4
# Groups:   col1 [3]
   col1   col2  Position Position_ij
   <chr>  <chr>    <int> <chr>      
 1 group1 A            1 [1, 1]     
 2 group1 Q            2 [1, 2]     
 3 group1 NA           3 [2, 0]     # this should be [1, 3]
 4 group1 A            4 [2, 1]     
 5 group1 K            5 [2, 2]     
 6 group1 L            6 [3, 0]     # this should be [2, 3]
 7 group1 O            7 [3, 1]     
 8 group1 R            8 [3, 2]     
 9 group1 J            9 [4, 0]     # this should be [3, 3]
10 group2 S            1 [1, 1]     
# … with 28 more rows
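
The zeros appear because Position is 1-based, while %/% and %% behave as if counting started at 0. The following base R check is only a small illustration of that effect for the 3-column case (the names pos, rows_attempt and cols_attempt are made up for this sketch):

```r
# Positions 1..9 of a 3 x 3 matrix filled row by row
pos <- 1:9

# Same arithmetic as the attempt above: every multiple of 3
# rolls over to the next row and lands in "column 0"
rows_attempt <- pos %/% 3 + 1
cols_attempt <- pos %% 3

rows_attempt  # 1 1 2 2 2 3 3 3 4
cols_attempt  # 1 2 0 1 2 0 1 2 0
```

Positions 3, 6 and 9 are exactly the rows flagged in the output above.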

Desired output

   col1   col2  Position Position_ij
   <chr>  <chr>    <int> <chr>      
 1 group1 A            1 [1, 1]     
 2 group1 Q            2 [1, 2]     
 3 group1 NA           3 [1, 3]     
 4 group1 A            4 [2, 1]     
 5 group1 K            5 [2, 2]     
 6 group1 L            6 [2, 3]     
 7 group1 O            7 [3, 1]     
 8 group1 R            8 [3, 2]     
 9 group1 J            9 [3, 3]     
10 group2 S            1 [1, 1]     
11 group2 C            2 [1, 2]     
12 group2 S            3 [1, 3]     
13 group2 H            4 [1, 4]     
14 group2 O            5 [2, 1]     
15 group2 T            6 [2, 2]     
16 group2 Z            7 [2, 3]     
17 group2 D            8 [2, 4]     
18 group2 Y            9 [3, 1]     
19 group2 J           10 [3, 2]     
20 group2 V           11 [3, 3]     
21 group2 Z           12 [3, 4]     
22 group2 P           13 [4, 1]     
23 group2 L           14 [4, 2]     
24 group2 X           15 [4, 3]     
25 group2 D           16 [4, 4]     
26 group3 K            1 [1, 1]     
27 group3 M            2 [1, 2]     
28 group3 X            3 [1, 3]     
29 group3 E            4 [1, 4]     
30 group3 P            5 [2, 1]     
31 group3 U            6 [2, 2]     
32 group3 Z            7 [2, 3]     
33 group3 Z            8 [2, 4]     
34 group3 L            9 [3, 1]     
35 group3 W           10 [3, 2]     
36 group3 X           11 [3, 3]     
37 group3 F           12 [3, 4]     
38 group3 K           13 [4, 1]        

FYI, my actual reference matrix should be 9 x 9 or 10 x 10.

Subtract 1 from 'Position' before applying %/% and %%, then add 1 to each result

library(dplyr)
out <- df %>% 
  group_by(col1) %>% 
  mutate(Position_ij = if (n() == 9) {
      paste0("[", ((Position-1) %/% 3) + 1, ", ", (Position-1) %% 3 + 1, "]")
    } else {
      paste0("[", ((Position-1) %/% 4) + 1, ", ", (Position-1) %% 4 + 1, "]")
      }) %>%
  ungroup

-output

> as.data.frame(out)
     col1 col2 Position Position_ij
1  group1    A        1      [1, 1]
2  group1    Q        2      [1, 2]
3  group1 <NA>        3      [1, 3]
4  group1    A        4      [2, 1]
5  group1    K        5      [2, 2]
6  group1    L        6      [2, 3]
7  group1    O        7      [3, 1]
8  group1    R        8      [3, 2]
9  group1    J        9      [3, 3]
10 group2    S        1      [1, 1]
11 group2    C        2      [1, 2]
12 group2    S        3      [1, 3]
13 group2    H        4      [1, 4]
14 group2    O        5      [2, 1]
15 group2    T        6      [2, 2]
16 group2    Z        7      [2, 3]
17 group2    D        8      [2, 4]
18 group2    Y        9      [3, 1]
19 group2    J       10      [3, 2]
20 group2    V       11      [3, 3]
21 group2    Z       12      [3, 4]
22 group2    P       13      [4, 1]
23 group2    L       14      [4, 2]
24 group2    X       15      [4, 3]
25 group2    D       16      [4, 4]
26 group3    K        1      [1, 1]
27 group3    M        2      [1, 2]
28 group3    X        3      [1, 3]
29 group3    E        4      [1, 4]
30 group3    P        5      [2, 1]
31 group3    U        6      [2, 2]
32 group3    Z        7      [2, 3]
33 group3    Z        8      [2, 4]
34 group3    L        9      [3, 1]
35 group3    W       10      [3, 2]
36 group3    X       11      [3, 3]
37 group3    F       12      [3, 4]
38 group3    K       13      [4, 1]
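
Since the question mentions that the real matrix is 9 x 9 or 10 x 10, the hard-coded 3 and 4 could be replaced by a side length. The sketch below is one possible generalisation, not part of the original answer: position_ij is a hypothetical helper, and its default side assumes the matrix is the smallest square that fits the group, i.e. ceiling(sqrt(n)).

```r
# Hypothetical helper (not from the original answer): converts 1-based,
# row-major positions into "[i, j]" labels for a square matrix.
# Assumption: by default the side is the smallest square that holds
# all positions in the group.
position_ij <- function(position, side = ceiling(sqrt(length(position)))) {
  zero_based <- position - 1L        # shift to 0-based before dividing
  i <- zero_based %/% side + 1L      # row index
  j <- zero_based %% side + 1L       # column index
  sprintf("[%d, %d]", as.integer(i), as.integer(j))
}

position_ij(1:9)[9]    # "[3, 3]"
position_ij(1:13)[13]  # "[4, 1]"
```

Inside the pipeline this would be used as mutate(Position_ij = position_ij(Position)) after group_by(col1). If the default rule does not match the real data, the side can be passed explicitly, for example side = 10.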

Or using gl/rowid

library(data.table)  # provides rowid(); dplyr is assumed to be loaded already
out2 <- df %>%
   group_by(col1) %>% 
   mutate(Position_ij = sprintf('[%d, %d]', 
      as.integer(gl(n(), c(4, 3)[1 + !n()%%3], n())),
      rowid(as.integer(gl(n(), c(4, 3)[1 + !n()%%3], n()))))) %>% 
   ungroup

-testing

> identical(out2, out)
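
As a further cross-check that needs neither dplyr nor data.table, base R's arrayInd() can produce the same labels. arrayInd() works in column-major order, so the two index columns are swapped to get the row-by-row layout used in the question. This is only a sketch: side, pos, idx and labels are illustrative names, and the 4 x 4 size is taken from the group3 example.

```r
side <- 4L   # matrix side for a 13-element group, as in group3
pos  <- 1:13

# arrayInd() returns (row, col) for column-major storage;
# swapping the columns gives the row-major (i, j) wanted here
idx <- arrayInd(pos, .dim = c(side, side))[, 2:1, drop = FALSE]
labels <- sprintf("[%d, %d]", idx[, 1], idx[, 2])

labels[4]   # "[1, 4]"
labels[13]  # "[4, 1]"
```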
