Remove a prefix from strings in an R data frame

Question

I have a data frame, wkt_small, with the following data:

id             GEOMETRY                                                                                      
  <chr>          <chr>                                                                                         
1 PTK01        LINESTRING( 1.142 85.892 1.400, 0.991 85.892 1.400)
2 PTK02        LINESTRING( 2.142 85.892 1.400, 0.991 85.892 1.400)
...

What I need is for it to look like this:

id             GEOMETRY                                                                                      
  <chr>          <chr>                                                                                         
1 PTK01        ( 1.142 85.892 1.400, 0.991 85.892 1.400)
2 PTK02        ( 2.142 85.892 1.400, 0.991 85.892 1.400)
...

I have tried the following:

wkt_small[, 2] <- gsub('^\\w+', '', wkt_small[, 2])

This, however, gives me the following value for GEOMETRY in all rows:

("LINESTRING( 1.142 85.892 1.400, 0.991 85.892 1.400, 0.991 85.301 1.4)","LINESTRING( 1.142 85.892 1.400, 0.991 85.892 1.400, 0.991 85.301 1.4)"...

This concatenates the first row's value, still including the string I want removed, across all entries in the data frame.

Answer 1

Use [[…]] or $… to select a single column, not [, …]:

wkt_small$GEOMETRY <- sub('^\\w+', '', wkt_small$GEOMETRY)

Actually, with a plain data.frame your code would have worked as well; but with a tibble, [ indexing always returns a tibble, not a column vector. The tibble semantics are equivalent to using [, …, drop = FALSE] with a regular data.frame.
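To make the difference concrete, here is a minimal sketch, assuming the tibble package is installed. The two-row sample is a hypothetical stand-in for wkt_small. It also suggests why the question's output starts with `("LINESTRING(`: gsub() coerces its input with as.character(), which deparses a one-column tibble into a single string beginning with `c(`, and the pattern then strips only that leading `c`.

```r
library(tibble)

# Hypothetical two-row sample in the shape of wkt_small
df <- data.frame(
  id = c("PTK01", "PTK02"),
  GEOMETRY = c("LINESTRING( 1.142 85.892 1.400, 0.991 85.892 1.400)",
               "LINESTRING( 2.142 85.892 1.400, 0.991 85.892 1.400)"),
  stringsAsFactors = FALSE
)
tbl <- as_tibble(df)

# A data.frame drops a single selected column to a plain vector ...
is.character(df[, 2])
# ... while a tibble keeps it as a one-column tibble
is_tibble(tbl[, 2])

# gsub() calls as.character() on its input; for a one-column tibble that
# deparses the whole column into one string that starts with "c("
deparsed <- as.character(tbl[, 2])
length(deparsed)

# '^\\w+' then removes only the leading "c", leaving every LINESTRING in place
stripped <- gsub("^\\w+", "", deparsed)

# Extracting the column as a vector gives the intended result
fixed <- sub("^\\w+", "", tbl[["GEOMETRY"]])
fixed
```

Because the anchored pattern can match at most once per string, sub() and gsub() behave the same here; the fix is in how the column is selected.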

Answer 2

Update: We could use str_remove (which is better in this case):

library(dplyr)    # provides %>% and mutate()
library(stringr)
wkt_small %>% 
    mutate(GEOMETRY = str_remove(GEOMETRY, '^\\w+'))

We could also use str_replace from the stringr package with the regular expression "^[A-Z]*":

library(dplyr)
library(stringr)
wkt_small %>% 
    mutate(GEOMETRY = str_replace(GEOMETRY, "^[A-Z]*", ""))

Output:

  id    GEOMETRY                                 
  <chr> <chr>                                    
1 PTK01 ( 1.142 85.892 1.400, 0.991 85.892 1.400)
2 PTK02 ( 2.142 85.892 1.400, 0.991 85.892 1.400)
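
The two patterns used in this answer give the same result on the data in the question, but they are not interchangeable in general. The sketch below, assuming the stringr package is installed, compares them. The first value comes from the question; the mixed-case and digit-containing prefixes are hypothetical inputs added only for illustration.

```r
library(stringr)

# First value is from the question; the other two are hypothetical inputs
x <- c("LINESTRING( 1.142 85.892 1.400, 0.991 85.892 1.400)",
       "LineString( 1.142 85.892 1.400)",
       "LINESTRING2D( 1.142 85.892)")

# '^\\w+' removes the whole leading run of letters, digits and underscores
by_word <- str_remove(x, "^\\w+")

# '^[A-Z]*' stops at the first character that is not an uppercase letter
by_upper <- str_replace(x, "^[A-Z]*", "")

by_word
by_upper
```

If every prefix is known to be uppercase letters only, either pattern works; `^\\w+` is the more forgiving choice when that is not guaranteed.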

Answer 1: score 4, accepted, 2021-07-30 11:27:37
Answer 2: score 1, 2021-07-30 11:34:44
