
separate() function, library(tidyverse)

I have been using the separate() function from library(tidyverse) to split values into different columns:

45 (10, 89) 
34

and with the code:

dd %>% separate(a, c("x","y","z"), extra="drop") 

I got what I wanted:

45 10 89
34

But now my variable has a different format and it is not working:

45% (10,89)
34%

Why does it not work when the values contain the symbol '%'?

Edited: OK, I know why it is not working: it is because of the decimal point in my data:

4.5% (10/89)
3.4%

6.7%

7.8% (89/98)

How do you deal with decimals with the separate function? Thank you very much!

I'm inferring that when you say "is not working", it's because the percent sign is being removed:

separate(tibble(a=c("45 (10, 89)","34")), a, c('x','y','z'), extra="drop")
# Warning: Too few values at 1 locations: 2
# # A tibble: 2 × 3
#       x     y     z
# * <chr> <chr> <chr>
# 1    45    10    89
# 2    34  <NA>  <NA>
separate(tibble(a=c("45% (10, 89)","34%")), a, c('x','y','z'), extra="drop")
# Warning: Too few values at 1 locations: 2
# # A tibble: 2 × 3
#       x     y     z
# * <chr> <chr> <chr>
# 1    45    10    89
# 2    34        <NA>

From ?separate:

 separate(data, col, into, sep = "[^[:alnum:]]+", remove = TRUE, convert = FALSE, extra = "warn", fill = "warn", ...) ... 

Since you are not overriding the default of sep, it splits on anything that is not a letter or a number. FYI, [^[:alnum:]]+ is analogous to [^A-Za-z0-9]+, which matches "1 or more characters that are not in the character ranges A-Z, a-z, or 0-9". The percent sign falls into that set, so it is treated as part of the separator and dropped.
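To see what that default pattern treats as a separator, here is a small sketch using base R's strsplit() as a stand-in. This is only an approximation: separate() does its own splitting internally, and strsplit() silently drops a trailing empty piece. The names default_sep, samples and pieces are made up for the illustration.

```r
# Default `sep` for separate(), copied from the usage shown in ?separate
default_sep <- "[^[:alnum:]]+"

samples <- c("45 (10, 89)", "45% (10, 89)", "34%")

# Split each string on runs of non-alphanumeric characters.
# Note: strsplit() drops a trailing empty piece, so this only
# approximates what separate() does with the same pattern.
pieces <- strsplit(samples, default_sep)
names(pieces) <- samples
print(pieces)
```

In both "45 (10, 89)" and "45% (10, 89)" the pieces come out as "45", "10", "89": the "%" is swallowed together with the space and the parenthesis as one separator run.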

Simply provide a more detailed sep, one that does not count "%" as a separator, and you'll get what you want.

separate(tibble(a=c("45% (10, 89)","34%")), a, c('x','y','z'), sep="[^[:alnum:]%]+", extra="drop")
# Warning: Too few values at 1 locations: 2
# # A tibble: 2 × 3
#       x     y     z
# * <chr> <chr> <chr>
# 1   45%    10    89
# 2   34%  <NA>  <NA>

Edit: using data shaped like your most recent sample (slash-separated counts and an empty row, with the decimal points left out):

separate(tibble(a=c("45% (10/89)","34%","","67%","78% (89/98)")), a, c('x','y','z'), sep="[^[:alnum:]%]+", extra="drop")
# Warning: Too few values at 3 locations: 2, 3, 4
# # A tibble: 5 × 3
#       x     y     z
# * <chr> <chr> <chr>
# 1   45%    10    89
# 2   34%  <NA>  <NA>
# 3        <NA>  <NA>
# 4   67%  <NA>  <NA>
# 5   78%    89    98
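The sample in the question's edit also contains decimal points, and a pattern of [^[:alnum:]%]+ would still split at each ".". A minimal sketch of one way around that, assuming it is acceptable to add "." to the characters kept inside a piece. The names sep_keep_decimal, pieces and broken are hypothetical, the pattern is checked with base R's strsplit() as an approximation, and tidyr::separate() is only called if tidyr is installed:

```r
# Hypothetical pattern: same idea as above, but '.' is also kept in a piece
sep_keep_decimal <- "[^[:alnum:]%.]+"

samples <- c("4.5% (10/89)", "3.4%", "7.8% (89/98)")

# Approximate check with base R: the decimal values stay in one piece
pieces <- strsplit(samples, sep_keep_decimal)
print(pieces)

# For comparison, the pattern without '.' breaks "4.5%" apart
broken <- strsplit("4.5% (10/89)", "[^[:alnum:]%]+")[[1]]
print(broken)

# The same pattern passed to separate(), if tidyr is available
if (requireNamespace("tidyr", quietly = TRUE)) {
  df <- data.frame(a = samples, stringsAsFactors = FALSE)
  out <- tidyr::separate(df, a, c("x", "y", "z"),
                         sep = sep_keep_decimal,
                         extra = "drop", fill = "right")
  print(out)
}
```

Here fill = "right" pads rows that have no counts with NA without a warning. The resulting columns are still character, so the "%" would need to be stripped before converting x to a number.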
