How to reshape my data frame in R (transpose, select, remove rows)?

Here is the data frame (ttt) I have:

 .id     dn     mavg    up      pctB
AA.1    18.8    21.1    23.4    0.8 
AA.2    18.7    21.1    23.5    0.8 
AA.3    18.7    21.2    23.7    0.8 
AAN.1   23.1    24.6    26.1    0.5 
AAN.2   23.1    24.6    26.0    0.4 
AAN.3   23.1    24.5    26.0    0.5 
AAP.1   145.5   179.2   212.9   0.3 
AAP.2   144.2   177.4   210.7   0.3 
AAP.3   143.4   175.6   207.7   0.3 

The shape I want is the following:

    pctB.1  pctB.2  pctB.3
AA  0.8     0.8     0.8 
AAN 0.5     0.4     0.5 
AAP 0.3     0.3     0.3 

The only column I need is pctB. I tried writing:

ttt <- ttt %>% select(1,5)
ttt <- do.call(cbind, split(ttt, ttt$`.id`))
ttt <- t(ttt)

This gives a result that I don't want. What should I do?

<error/rlang_error>
`n()` must only be used inside dplyr verbs.
Backtrace:
  1. plyr::mutate(., .id = sub("\\..*", "", .id))
  1. dplyr::group_by(., .id)
  8. plyr::mutate(., col = paste0("pctB.", row_number()))
  9. [ base::eval(...) ] with 1 more call
 12. dplyr::row_number()
 13. dplyr::n()
 14. dplyr:::peek_mask("n()")
 15. dplyr:::context_peek("mask", fun)
 16. context_peek_bare(name) %||% abort(glue("`{fun}` must only be used inside {location}."))

You can strip the numeric suffix from .id, create a column of unique names within each group, and pivot the data to wide format, keeping only the columns of interest.

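library(dplyr)
library(tibble)

ttt %>%
  # strip everything from the first dot onward: "AA.1" -> "AA"
  mutate(.id = sub('\\..*', '', .id)) %>%
  group_by(.id) %>%
  # number the rows within each group: pctB.1, pctB.2, ...
  mutate(col = paste0('pctB.', row_number())) %>%
  # drop the columns that are not needed (dn, mavg, up)
  select(-(dn:up)) %>%
  # one row per .id, one column per value of col
  tidyr::pivot_wider(names_from = col, values_from = pctB)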

#  .id   pctB.1 pctB.2 pctB.3
#  <chr>  <dbl>  <dbl>  <dbl>
#1 AA       0.8    0.8    0.8
#2 AAN      0.5    0.4    0.5
#3 AAP      0.3    0.3    0.3
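
The backtrace quoted earlier in the thread shows plyr::mutate being called instead of dplyr::mutate, which typically happens when plyr is attached after dplyr and masks its verbs. A minimal sketch of one way around that, assuming dplyr and tidyr are installed, is to qualify each verb with its namespace. The object name wide is only illustrative, and the ungroup() step is an addition that is not in the answer above.

```r
library(dplyr)  # attached here only for the %>% pipe

# Same data as in the "data" section of this answer
ttt <- data.frame(
  .id  = c("AA.1", "AA.2", "AA.3", "AAN.1", "AAN.2", "AAN.3",
           "AAP.1", "AAP.2", "AAP.3"),
  dn   = c(18.8, 18.7, 18.7, 23.1, 23.1, 23.1, 145.5, 144.2, 143.4),
  mavg = c(21.1, 21.1, 21.2, 24.6, 24.6, 24.5, 179.2, 177.4, 175.6),
  up   = c(23.4, 23.5, 23.7, 26.1, 26, 26, 212.9, 210.7, 207.7),
  pctB = c(0.8, 0.8, 0.8, 0.5, 0.4, 0.5, 0.3, 0.3, 0.3)
)

# Every verb is namespace-qualified, so a masking package such as plyr
# cannot intercept the call.
wide <- ttt %>%
  dplyr::mutate(.id = sub("\\..*", "", .id)) %>%
  dplyr::group_by(.id) %>%
  dplyr::mutate(col = paste0("pctB.", dplyr::row_number())) %>%
  dplyr::ungroup() %>%
  dplyr::select(.id, col, pctB) %>%
  tidyr::pivot_wider(names_from = col, values_from = pctB)

print(wide)
```

Attaching dplyr after plyr, or not attaching plyr at all, should avoid the masking as well.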

Data

ttt <- structure(list(.id = c("AA.1", "AA.2", "AA.3", "AAN.1", "AAN.2", 
"AAN.3", "AAP.1", "AAP.2", "AAP.3"), dn = c(18.8, 18.7, 18.7, 
23.1, 23.1, 23.1, 145.5, 144.2, 143.4), mavg = c(21.1, 21.1, 
21.2, 24.6, 24.6, 24.5, 179.2, 177.4, 175.6), up = c(23.4, 23.5, 
23.7, 26.1, 26, 26, 212.9, 210.7, 207.7), pctB = c(0.8, 0.8, 
0.8, 0.5, 0.4, 0.5, 0.3, 0.3, 0.3)),class = "data.frame", row.names = c(NA, -9L))

Split .id into two separate columns (e.g. "AA.1" -> "AA", "1"), then pivot the data using these two columns.

library(tidyverse)

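ttt %>%
  # .id.1 is the part before the dot ("AA");
  # .id.2 is "pctB." plus the part after the dot ("pctB.1")
  mutate(.id.1 = str_split(.id, "\\.") %>% map(~ .[[1]]) %>% unlist,
         .id.2 = str_split(.id, "\\.") %>% map(~ paste0("pctB.", .[[2]])) %>% unlist) %>%
  # one row per .id.1, one column per .id.2, filled with pctB
  pivot_wider(id_cols = .id.1,
              names_from = .id.2,
              values_from = pctB) %>%
  # move .id.1 into the row names, as in the desired output
  column_to_rownames(".id.1")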
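
For comparison, the same split-then-pivot idea can be sketched in base R without any packages. This is not the method used in the answers above. It assumes that every .id has the form name.index and that each name/index pair occurs exactly once. The names group, index, wide_mat and wide are only illustrative.

```r
# Only the two columns that matter for the reshape
ttt <- data.frame(
  .id  = c("AA.1", "AA.2", "AA.3", "AAN.1", "AAN.2", "AAN.3",
           "AAP.1", "AAP.2", "AAP.3"),
  pctB = c(0.8, 0.8, 0.8, 0.5, 0.4, 0.5, 0.3, 0.3, 0.3)
)

# Split "AA.1" into the group name ("AA") and the index ("1")
group <- sub("\\..*$", "", ttt$.id)
index <- sub("^.*\\.", "", ttt$.id)

# Cross-tabulate pctB by group (rows) and index (columns).
# Each cell holds a single value, so taking the first element is enough.
wide_mat <- tapply(ttt$pctB,
                   list(group, paste0("pctB.", index)),
                   function(x) x[1])

# Convert the matrix to a data frame; the group names become row names
wide <- as.data.frame(wide_mat)
print(wide)
```

The result has the group names as row names, which matches the layout asked for in the question.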
