Problems using tidyr separate on "|"
I have an object, lncRNA_lengths, that looks like this:
> lncRNA_lengths
# A tibble: 1,071 x 3
tx_name Length Type
<chr> <int> <chr>
1 align_id:155048|asmbl_67 205 lncRNA
2 align_id:155049|asmbl_68 228 lncRNA
3 align_id:155143|asmbl_162 524 lncRNA
4 align_id:155148|asmbl_167 344 lncRNA
5 align_id:155226|asmbl_245 386 lncRNA
6 align_id:155265|asmbl_284 825 lncRNA
7 align_id:155270|asmbl_289 292 lncRNA
8 align_id:155331|asmbl_350 216 lncRNA
9 align_id:155332|asmbl_351 1152 lncRNA
10 align_id:155344|asmbl_363 243 lncRNA
# ... with 1,061 more rows
I want to split the tx_name column on the "|" character. I tried this:
lncRNA_lengths %>%
separate(tx_name, c("ID", "asmbl", sep = "\\|"))
But I get the following output:
# A tibble: 1,071 x 5
ID asmbl `\\|` Length Type
<chr> <chr> <chr> <int> <chr>
1 align id 155048 205 lncRNA
2 align id 155049 228 lncRNA
3 align id 155143 524 lncRNA
4 align id 155148 344 lncRNA
5 align id 155226 386 lncRNA
6 align id 155265 825 lncRNA
7 align id 155270 292 lncRNA
8 align id 155331 216 lncRNA
9 align id 155332 1152 lncRNA
10 align id 155344 243 lncRNA
# ... with 1,061 more rows
Warning message:
Expected 3 pieces. Additional pieces discarded in 1071 rows [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, ...].
Three columns were created instead of two, and I don't understand the warning message...
This should do it. First, make some fake data:
library(tidyr)
df <- data.frame(tx_name = "align_id:155048|asmbl_67", length = 205, type = "lncRNA")
Then separate it and create the columns:
df <- separate(df, col = tx_name, sep = "\\|", into = c("ID", "asmbl"))
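You basically didn't close the vector in the right place. The closing parenthesis of c("ID", "asmbl") came after sep = "\\|", so sep became a third element of into instead of an argument to separate(). separate() then fell back to its default separator, which splits on any run of non-alphanumeric characters ("_", ":" and "|" alike). Each value broke into five pieces while only three column names were given, hence the "Expected 3 pieces" warning.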
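To make the difference concrete, here is a minimal sketch comparing the two calls on a single row. The one-row data frame and the names into_wrong, pieces and fixed are made up for illustration; the default separator pattern "[^[:alnum:]]+" is the one documented for tidyr's separate().

```r
library(tidyr)

# Hypothetical one-row stand-in for lncRNA_lengths
df <- data.frame(
  tx_name = "align_id:155048|asmbl_67",
  Length  = 205L,
  Type    = "lncRNA"
)

# With sep inside c(), `into` is a three-element vector,
# so separate() expects three pieces and never sees a sep argument
into_wrong <- c("ID", "asmbl", sep = "\\|")
length(into_wrong)  # 3

# What the default separator does to one value:
# it splits on "_", ":" and "|", giving five pieces
pieces <- strsplit(df$tx_name, "[^[:alnum:]]+")[[1]]
pieces  # "align" "id" "155048" "asmbl" "67"

# With sep outside c(), only the literal "|" is used as the separator
fixed <- separate(df, tx_name, into = c("ID", "asmbl"), sep = "\\|")
fixed
```

In the fixed call, ID should hold "align_id:155048" and asmbl should hold "asmbl_67", with Length and Type left unchanged.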