separate() function, library(tidyverse)

Question

I have been using the separate() function from library(tidyverse) to split values into different columns:

45 (10, 89) 
34

With this code:

dd %>% separate(a, c("x","y","z"), extra="drop")

I got what I wanted:

45 10 89
34

But now my variable has a different format and it is not working:

45% (10,89)
34%

Why does it not work when the values contain the '%' symbol?

Edited: OK, I know why it is not working. It is because of the decimal point in my data:

4.5% (10/89)
3.4%

6.7%

7.8% (89/98)

How do you deal with decimals with the separate() function? Thank you very much!

Answer 1

I'm inferring that when you say it "is not working", you mean that the percent sign is being removed:

library(tidyverse)  # provides separate() and tibble()
separate(tibble(a=c("45 (10, 89)","34")), a, c('x','y','z'), extra="drop")
# Warning: Too few values at 1 locations: 2
# # A tibble: 2 × 3
#       x     y     z
# * <chr> <chr> <chr>
# 1    45    10    89
# 2    34  <NA>  <NA>
separate(tibble(a=c("45% (10, 89)","34%")), a, c('x','y','z'), extra="drop")
# Warning: Too few values at 1 locations: 2
# # A tibble: 2 × 3
#       x     y     z
# * <chr> <chr> <chr>
# 1    45    10    89
# 2    34        <NA>

From ?separate:

 separate(data, col, into, sep = "[^[:alnum:]]+", remove = TRUE, convert = FALSE, extra = "warn", fill = "warn", ...) ...

Since you are not overriding the default of sep, it splits on anything that is not a letter or a number. FYI, [^[:alnum:]]+ is analogous to [^A-Za-z0-9]+, which matches "1 or more characters that are not in the character ranges A-Z, a-z, or 0-9". The percent sign falls into that set, so it is treated as part of the separator and removed.

Simply provide a more specific sep that excludes the percent sign from the separator, and you'll get what you want.

separate(tibble(a=c("45% (10, 89)","34%")), a, c('x','y','z'), sep="[^[:alnum:]%]+", extra="drop")
# Warning: Too few values at 1 locations: 2
# # A tibble: 2 × 3
#       x     y     z
# * <chr> <chr> <chr>
# 1   45%    10    89
# 2   34%  <NA>  <NA>
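
To see why the two sep values behave differently, it can help to list the pieces of the string that each pattern treats as a separator. This is a small base R sketch that does not call separate() at all; the variable names are only illustrative.

```r
x <- "45% (10, 89)"

# Runs of characters matched by the default separator (anything non-alphanumeric).
default_seps <- regmatches(x, gregexpr("[^[:alnum:]]+", x))[[1]]

# Same idea, but with "%" excluded from the separator class.
custom_seps <- regmatches(x, gregexpr("[^[:alnum:]%]+", x))[[1]]

print(default_seps)  # "% ("  ", "  ")"
print(custom_seps)   # " ("   ", "  ")"
```

With the default pattern the "%" is swallowed into the first separator, which is why it disappears from the x column.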

Edit: using your most recent sample data (shown here without the decimal points):

separate(tibble(a=c("45% (10/89)","34%","","67%","78% (89/98)")), a, c('x','y','z'), sep="[^[:alnum:]%]+", extra="drop")
# Warning: Too few values at 3 locations: 2, 3, 4
# # A tibble: 5 × 3
#       x     y     z
# * <chr> <chr> <chr>
# 1   45%    10    89
# 2   34%  <NA>  <NA>
# 3        <NA>  <NA>
# 4   67%  <NA>  <NA>
# 5   78%    89    98
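
The question's edited data contains decimal points (4.5%, 3.4%, ...), and with sep="[^[:alnum:]%]+" the "." would still count as a separator, so 4.5% would be split into 4 and 5%. One possible fix, sketched here under the assumption that tidyr and tibble are installed, is to exclude the dot from the separator class as well. The fill="right" argument is optional; it is one of the documented values of fill and is used here only to silence the "too few values" warning.

```r
library(tidyr)
library(tibble)

# Sample data as given in the question's edit, including the empty row.
dd <- tibble(a = c("4.5% (10/89)", "3.4%", "", "6.7%", "7.8% (89/98)"))

# Keep letters, digits, "%" and "." together; split on everything else.
out <- separate(dd, a, c("x", "y", "z"),
                sep = "[^[:alnum:]%.]+",
                extra = "drop",   # discard the empty piece after ")"
                fill = "right")   # pad short rows with NA without warning
print(out)
```

Inside a bracket expression the dot is a literal character, so it needs no escaping. The resulting columns are still character; converting x to a number would require stripping the "%" first.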

Answer 1
4 2017-10-26 17:14:36