Extracting character values from strings of different lengths in a vector in R

Question

I have a vector that looks like the following:

**name**
a1
a2
a3
b1_z
b2_3z
b32z

I would like the output to include only the letters in each of these strings, with no numbers or symbols, like this:

**name**
a
a
a
bz
bz
bz

I have tried using the following code:

library(stringi)
df$name <- stri_extract_all_regex(df$name, "[a-z]+")

I get this result:

**name**
a
a
a
c("b", "z")
c("b", "z")
c("b", "z")

How do I combine the values that come back as two separate strings into a single string? In particular, how do I do this when some of the values in the vector already contain only one string? I am also open to other ways of extracting characters from strings that get around this issue.

Answer 1

Try gsub, as shown here:

df$name <- gsub("[^[:alpha:]]","",df$name)

where every non-alphabetic character is replaced by the empty string "".

This gives:

> df
  name
1    a
2    a
3    a
4   bz
5   bz
6   bz

Data

> dput(df)
structure(list(name = c("a1", "a2", "a3", "b1_z", "b2_3z", "b32z"
)), class = "data.frame", row.names = c(NA, -6L))
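For completeness, the list result from the question can also be collapsed directly instead of switching to gsub. The following is a minimal sketch, not part of the original answer: it assumes base R's regmatches() and gregexpr() as stand-ins for stri_extract_all_regex(), and the names x, pieces and combined are made up for illustration.

```r
# Sample vector from the question
x <- c("a1", "a2", "a3", "b1_z", "b2_3z", "b32z")

# Extract every run of lowercase letters. The result is a list with one
# character vector per input string (length 1 for "a1", length 2 for "b1_z").
pieces <- regmatches(x, gregexpr("[a-z]+", x))

# Paste the pieces of each list element into a single string.
# vapply() guarantees exactly one string per element.
combined <- vapply(pieces, paste, character(1), collapse = "")
combined
#[1] "a"  "a"  "a"  "bz" "bz" "bz"
```

Elements that already hold a single string pass through unchanged, because pasting one piece with collapse = "" returns that piece.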

Answer 2

You can do it like this using the gsub function:

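# Example input values
vals <- c('a1', 'a2', 'b1_z', 'b2_3z')
df <- data.frame(vals)

# Drop everything that is not a letter and store the result in a new column
df$name <- gsub("[^[:alpha:]]", "", df$vals)
print(df)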

The output will look like this:

   vals name
1    a1    a
2    a2    a
3  b1_z   bz
4 b2_3z   bz

Answer 3

An option with str_remove_all:

library(stringr)
str_remove_all(df$name, "[0-9_]+")
#[1] "a"  "a"  "a"  "bz" "bz" "bz"

Data

df <- structure(list(name = c("a1", "a2", "a3", "b1_z", "b2_3z", "b32z"
)), class = "data.frame", row.names = c(NA, -6L))
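Note that the pattern "[0-9_]+" removes only digits and underscores, while "[^[:alpha:]]" removes everything that is not a letter. The two agree on the data above but can differ on other input. Here is a small sketch of the difference, using base R's gsub() in place of str_remove_all() so that it runs without stringr; the vector x is made up for illustration.

```r
# Hypothetical input containing symbols other than digits and underscores
x <- c("b2_3z", "c-4.d", "E5f")

# Removes digits and underscores only; "-" and "." survive
digits_removed <- gsub("[0-9_]+", "", x)
digits_removed
#[1] "bz"   "c-.d" "Ef"

# Removes every non-letter character
letters_only <- gsub("[^[:alpha:]]", "", x)
letters_only
#[1] "bz" "cd" "Ef"
```

If the strings may contain punctuation or whitespace, the negated class is probably the safer choice.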

3 answers

Answer 1
3 (accepted) 2020-09-12 20:15:49

Answer 2
2 2020-09-12 20:17:19

Answer 3
2 2020-09-12 20:28:43
