Issue with piping stringr str_detect into str_extract - extract is only pulling text from 1st row: argument is not an atomic vector; coercing
I'm trying to create a new column which contains just certain numeric data from an expression.
Here's my data: https://pastebin.com/hYg3zqYz
I just need the numbers that come after Bipolar in column 12.
Here's what works:
p <- df %>%
  select(where(~ any(stringr::str_detect(.x, "Bipolar")))) # returns the correct column
When I then try to make a new column that pulls just the text, it only ever returns the first row; I'm not sure what I'm doing wrong.
p %>%
  mutate(group = "sr_bipol",
         sr_bipol = as.numeric(stringr::str_extract(., "[0-9].[0-9]+"))) %>%
  select(group, sr_bipol)
# A tibble: 20 × 2
group sr_bipol
<chr> <dbl>
1 sr_bipol 7.83
2 sr_bipol 7.83
3 sr_bipol 7.83
4 sr_bipol 7.83
5 sr_bipol 7.83
.....................
I also get the warning:
argument is not an atomic vector; coercing
The `.` refers to the whole dataset (`str_extract` needs a vector as input, not a data.frame). According to `?str_extract`:
string - Input vector. Either a character vector, or something coercible to one.
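So passing the column itself (a character vector) rather than the data frame already avoids the coercion; a minimal sketch, assuming the column keeps its default name `...12` as in the data above:

```r
library(dplyr)
library(stringr)

# Reference the column, not the whole data frame, inside str_extract()
p %>%
  mutate(group = "sr_bipol",
         sr_bipol = as.numeric(str_extract(`...12`, "[0-9]\\.[0-9]+"))) %>%
  select(group, sr_bipol)

# Equivalently, pull the column out as a plain character vector first
v <- pull(p, `...12`)
str_extract(v, "[0-9]\\.[0-9]+")
```

Either form gives `str_extract()` one string per row, so it returns one match per row instead of a single recycled value.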
We may need to apply `str_extract` on column 12. As the column name for column 12 includes the `...` prefix, which is an unusual (non-syntactic) name, use backticks to access the column values:
library(dplyr)
library(stringr)
df %>%
  transmute(group = 'sr_bipol',
            sr_bipol = as.numeric(str_extract(`...12`, "(?<=Bipolar\\s)[0-9]\\.[0-9]+")))
-output
# A tibble: 20 × 2
group sr_bipol
<chr> <dbl>
1 sr_bipol 7.83
2 sr_bipol 2.34
3 sr_bipol 1.97
4 sr_bipol 1.94
5 sr_bipol 2.85
6 sr_bipol 2.92
7 sr_bipol 3.05
8 sr_bipol 2.80
9 sr_bipol 3.43
10 sr_bipol 2.11
11 sr_bipol 2.80
12 sr_bipol 1.81
13 sr_bipol 1.84
14 sr_bipol 3.87
15 sr_bipol 1.68
16 sr_bipol 2.21
17 sr_bipol 2.97
18 sr_bipol 3.09
19 sr_bipol 2.84
20 sr_bipol 3.48
The `p` data is a single-column tibble/data.frame. When we use `.`, it selects the data.frame as such, i.e.
> str(p)
tibble [20 × 1] (S3: tbl_df/tbl/data.frame)
$ ...12: chr [1:20] "Bipolar 7.827 / Unipolar 16.911 / LAT -9.0" "Bipolar 2.34 / Unipolar 9.09 / LAT -10.0" "Bipolar 1.974 / Unipolar 9.219 / LAT -11.0" "Bipolar 1.938 / Unipolar 10.572 / LAT -9.0" ...
> str_extract(p, "[0-9].[0-9]+")
[1] "7.827"
Warning message:
In stri_extract_first_regex(string, pattern, opts_regex = opts(pattern)) :
argument is not an atomic vector; coercing
It extracts the value from the first instance, and this got recycled to create the whole column of 7.8.
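The coercion-plus-recycling behaviour can be reproduced on a toy data frame (a sketch; the strings below are taken from the pastebin data):

```r
library(stringr)

p <- data.frame(`...12` = c("Bipolar 7.827 / Unipolar 16.911 / LAT -9.0",
                            "Bipolar 2.34 / Unipolar 9.09 / LAT -10.0"),
                check.names = FALSE)

# str_extract() coerces the data.frame to a character vector with one
# element per *column*, not per row, so only one match comes back
# (with the "argument is not an atomic vector; coercing" warning);
# mutate() then recycles that single value down the whole column.
str_extract(p, "[0-9]\\.[0-9]+")

# Passing the column itself gives one match per row instead:
str_extract(p[["...12"]], "[0-9]\\.[0-9]+")
```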
If there is more than one column having the 'Bipolar', we may loop `across` those columns (modify the `transmute` to `mutate` if we want to keep all the other columns from the original data):
df %>%
  transmute(across(where(~ any(stringr::str_detect(.x, "Bipolar"))),
                   ~ as.numeric(str_extract(.x, "(?<=Bipolar\\s)[0-9]\\.[0-9]+")),
                   .names = "sr_bipol{str_remove(.col, '[.]+')}"))
# A tibble: 20 × 1
sr_bipol12
<dbl>
1 7.83
2 2.34
3 1.97
4 1.94
5 2.85
6 2.92
7 3.05
8 2.80
9 3.43
10 2.11
11 2.80
12 1.81
13 1.84
14 3.87
15 1.68
16 2.21
17 2.97
18 3.09
19 2.84
20 3.48
Here is an alternative approach:
library(tidyverse)
df %>%
  select(...12) %>%
  separate(...12, into = "group", sep = "\\/") %>%
  mutate(sr_bipol = parse_number(group),
         group = str_extract(group, '[A-Za-z]+'))
group sr_bipol
<chr> <dbl>
1 Bipolar 7.83
2 Bipolar 2.34
3 Bipolar 1.97
4 Bipolar 1.94
5 Bipolar 2.85
6 Bipolar 2.92
7 Bipolar 3.05
8 Bipolar 2.80
9 Bipolar 3.43
10 Bipolar 2.11
11 Bipolar 2.80
12 Bipolar 1.81
13 Bipolar 1.84
14 Bipolar 3.87
15 Bipolar 1.68
16 Bipolar 2.21
17 Bipolar 2.97
18 Bipolar 3.09
19 Bipolar 2.84
20 Bipolar 3.48
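For completeness, `tidyr::extract()` can split the label and the number in a single step; a sketch assuming the same `...12` column:

```r
library(tidyverse)

df %>%
  select(`...12`) %>%
  extract(`...12`, into = c("group", "sr_bipol"),
          regex = "([A-Za-z]+)\\s([0-9.]+)",  # first word + first number
          convert = TRUE)                     # coerce sr_bipol to numeric
```

The first capture group picks up "Bipolar" and the second the number that follows it; `convert = TRUE` saves the explicit `as.numeric()` call.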