Extracting information between special characters in a column in R

Question

I'm sorry, because I feel like versions of this question have been asked many times, but I simply cannot find code from other examples that works in this case. I have a column where all the information I want is stored between two sets of "%%", and I want to extract the information between those two sets of "%%" and put it into a new column, in this case called df$empty.

This is a long column, but in all cases I just want the information between the two sets of "%%". Is there a way to code this across the whole column?

To be specific, in this example I want a new column that will look like "information", "wanted".


empty <- c('NA', 'NA')
information <- c('notimportant%%information%%morenotimportant', 'ignorethis%%wanted%%notthiseither')

df <- data.frame(information, empty)

Answer 1

In this case you can do:

df$empty <- sapply(strsplit(df$information, '%%'), '[', 2)

#                                   information       empty
# 1 notimportant%%information%%morenotimportant information
# 2           ignorethis%%wanted%%notthiseither      wanted

That is, split the text by '%%' and take the second element of each resulting vector.
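To make the intermediate step visible, here is a minimal sketch of the same split-and-pick idea using the sample data from the question. The use of vapply() and fixed = TRUE is my own variation, not something the question requires: vapply() checks that exactly one string comes back per row, and fixed = TRUE treats '%%' as a literal string rather than a regular expression.

```r
# Sample data from the question
information <- c('notimportant%%information%%morenotimportant',
                 'ignorethis%%wanted%%notthiseither')
df <- data.frame(information, stringsAsFactors = FALSE)

# strsplit() returns a list with one character vector per row, e.g.
# parts[[1]] is c("notimportant", "information", "morenotimportant")
parts <- strsplit(df$information, '%%', fixed = TRUE)

# Take the second piece of each row; p[2] is NA when a row has
# fewer than two pieces (for example, no '%%' at all)
df$empty <- vapply(parts, function(p) p[2], character(1))
```

A row that contains no '%%' would end up as NA in df$empty rather than raising an error, which is probably what you want for a long column.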

Or you can get the same result using sub():

df$empty <- sub('.*%%(.+)%%.*', '\\1', df$information)
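One thing to be aware of with the sub() approach: when a string does not match the pattern, sub() returns it unchanged instead of NA. If the column might contain rows without the two '%%' markers, a small wrapper can turn those into NA. The sketch below assumes that is the behavior you want; extract_between is a hypothetical helper name, and the third element of x is an invented row added only to show the non-matching case.

```r
# Return the text between the '%%' markers, or NA when the pattern is absent.
# extract_between is a hypothetical helper, not part of base R.
extract_between <- function(x, pattern = '.*%%(.+)%%.*') {
  # grepl() tests for a match; sub() keeps only the captured group
  ifelse(grepl(pattern, x), sub(pattern, '\\1', x), NA_character_)
}

x <- c('notimportant%%information%%morenotimportant',
       'ignorethis%%wanted%%notthiseither',
       'nodelimiterhere')  # invented row with no '%%'

result <- extract_between(x)
```

With the plain sub() call, the third element would come back as 'nodelimiterhere'; with the wrapper it comes back as NA.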

Answer 1
1 accepted 2022-11-16 17:08:35
