R: gregexpr across multiple columns, returning a single vector

Question

I have multiple columns which contain strings of data.

(data$product, data$price, data$Overview1, data$Overview2, data$Overview3, data$Overview4)

I would like to create a new vector which only contains the strings that begin with "Material:".

Set the pattern for the regular expression:

    matpattern <- "((?<=Material: ).*|(?<=Materials: ).*)"

Get the strings that start with the material label:

    mat <- gregexpr(matpattern, data$Overview1, perl=TRUE)

Create a vector to store the strings:

     data$material1 <- regmatches(data$Overview1, mat, invert = FALSE)
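As a quick check of what the pattern and `regmatches` return, here is a small sketch on made-up strings (the `overview` vector is hypothetical, not from the original data). With `gregexpr`, `regmatches` returns a list with one character vector per input element, so assigning it to a data frame column produces a list column rather than a plain character column.

```r
# Hypothetical sample strings standing in for data$Overview1
overview <- c("Material: Cotton", "Size: Large", "Materials: Wool, Silk")

matpattern <- "((?<=Material: ).*|(?<=Materials: ).*)"

# gregexpr finds the match positions; regmatches extracts the matched text
mat <- gregexpr(matpattern, overview, perl = TRUE)
res <- regmatches(overview, mat)

# res is a list of length 3; elements without a match are character(0)
str(res)
```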

Repeat for Overview2:

    mat <- gregexpr(matpattern, data$Overview2, perl=TRUE)

    data$material2 <- regmatches(data$Overview2, mat, invert = FALSE)

The statement

    z <- cbind(data$material1, data$material2)

gives a matrix when I want a list.

Is there a way to get lapply and gregexpr to work across multiple columns and then place the new strings in a single column?

I have looked at the questions below, to no avail. Thanks for your help.

Convert R vector to string vector of 1 element

Regular Expressions in R - compare one column to another

Using regexp to select rows in R dataframe

Answer 1

OK. This is a complete hack, but I would like the final output to be a vector rather than a list (which rules out apply and lapply?).

This gets the location and length of the required string across the 4 columns:

    m1 <- gregexpr(matpattern, data[ ,c("Overview1")], perl=TRUE)
    m2 <- gregexpr(matpattern, data[ ,c("Overview2")], perl=TRUE)
    m3 <- gregexpr(matpattern, data[ ,c("Overview3")], perl=TRUE)
    m4 <- gregexpr(matpattern, data[ ,c("Overview4")], perl=TRUE)

This operation extracts the matched text, giving one list per column (each element holds the matches for one row):

    mat1 <- regmatches(data[ ,c("Overview1")], m1, invert = FALSE)
    mat2 <- regmatches(data[ ,c("Overview2")], m2, invert = FALSE)
    mat3 <- regmatches(data[ ,c("Overview3")], m3, invert = FALSE)
    mat4 <- regmatches(data[ ,c("Overview4")], m4, invert = FALSE)
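The repeated calls follow the same pattern for every column, so they could be folded into a single `lapply` over the column names. The sketch below assumes a small made-up data frame (`data` and its values are hypothetical) and only two of the Overview columns; it is one possible arrangement, not the approach used in this answer.

```r
# Hypothetical data frame using the column names from the question
data <- data.frame(
  price     = c(10, 20, 30),
  Overview1 = c("Material: Cotton", "Colour: Red", "Size: M"),
  Overview2 = c("Size: L", "Materials: Wool", "Colour: Blue"),
  stringsAsFactors = FALSE
)

matpattern <- "((?<=Material: ).*|(?<=Materials: ).*)"
cols <- c("Overview1", "Overview2")

# One list per column; each list holds one character vector per row
mats <- lapply(cols, function(col) {
  m <- gregexpr(matpattern, data[[col]], perl = TRUE)
  regmatches(data[[col]], m)
})
names(mats) <- cols
```

`mats$Overview1` then plays the role of `mat1`, `mats$Overview2` the role of `mat2`, and so on.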

Then I paste them all together into one big vector (future operations will ignore 'character(0)'):

    data$Material <- paste(mat1, mat2, mat3, mat4)
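Because `paste` coerces each list element to text, rows without a match end up containing the literal string "character(0)". A possible alternative, sketched here on hypothetical sample data, gathers the matches row by row and collapses them into a plain character vector, with `NA` where nothing matched. The `"; "` separator is an arbitrary choice for illustration.

```r
# Hypothetical data frame using the column names from the question
data <- data.frame(
  Overview1 = c("Material: Cotton", "Colour: Red", "Size: M"),
  Overview2 = c("Size: L", "Materials: Wool", "Colour: Blue"),
  stringsAsFactors = FALSE
)
matpattern <- "((?<=Material: ).*|(?<=Materials: ).*)"
cols <- c("Overview1", "Overview2")

# For each row, collect the matches from every column into one string
material <- vapply(seq_len(nrow(data)), function(i) {
  row_text <- unlist(data[i, cols], use.names = FALSE)
  hits <- unlist(regmatches(row_text, gregexpr(matpattern, row_text, perl = TRUE)))
  if (length(hits) == 0) NA_character_ else paste(hits, collapse = "; ")
}, character(1))

data$Material <- material
```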

I can then use this vector to calculate the mean of data$price based on the occurrence of certain text strings in data$Material.
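That last step might look roughly like the following sketch. The prices, the materials, and the search term `"Cotton"` are all made up for illustration; `grepl` flags the rows whose material text mentions the term, and `tapply` gives the mean price per group.

```r
# Hypothetical prices and materials
data <- data.frame(
  price    = c(10, 20, 30, 40),
  Material = c("Cotton", "Wool", "Cotton; Silk", NA),
  stringsAsFactors = FALSE
)

# TRUE where the material text mentions the term (NA rows give FALSE)
has_cotton <- grepl("Cotton", data$Material, fixed = TRUE)

# Mean price of the rows that mention the term
mean_cotton <- mean(data$price[has_cotton])

# Mean price for rows with and without the term
by_group <- tapply(data$price, has_cotton, mean)
```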

Answer 1
0 2013-10-24 08:12:01