Removing prefix from strings in R data frame
I have a data frame, wkt_small, with the following data:
id GEOMETRY
<chr> <chr>
1 PTK01 LINESTRING( 1.142 85.892 1.400, 0.991 85.892 1.400)
2 PTK02 LINESTRING( 2.142 85.892 1.400, 0.991 85.892 1.400)
...
What I need is for it to look like this:
id GEOMETRY
<chr> <chr>
1 PTK01 ( 1.142 85.892 1.400, 0.991 85.892 1.400)
2 PTK02 ( 2.142 85.892 1.400, 0.991 85.892 1.400)
...
I have tried the following:
wkt_small[, 2] <- gsub('^\\w+', '', wkt_small[, 2])
This, however, gives me the following value for GEOMETRY for all rows:
("LINESTRING( 1.142 85.892 1.400, 0.991 85.892 1.400, 0.991 85.301 1.4)","LINESTRING( 1.142 85.892 1.400, 0.991 85.892 1.400, 0.991 85.301 1.4)"...
It concatenates the first row value with the string I want removed, for all entries in the data frame.
Use [[…]] or $… to select a single column, not [, …]:
wkt_small$GEOMETRY <- sub('^\\w+', '', wkt_small$GEOMETRY)
… actually, with a plain data.frame your code would have worked as well. With a tibble, however, [ indexing always returns a tibble, not a column vector. The tibble semantics are equivalent to using [, …, drop = FALSE] with a regular data.frame.
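To make the difference concrete, here is a minimal base-R sketch. It assumes a hypothetical two-row stand-in for wkt_small built as a plain data.frame (the name df is only for illustration), and uses drop = FALSE to imitate what a tibble does by default, so no extra packages are needed.

```r
# Hypothetical two-row stand-in for wkt_small, built as a plain data.frame
df <- data.frame(
  id = c("PTK01", "PTK02"),
  GEOMETRY = c("LINESTRING( 1.142 85.892 1.400, 0.991 85.892 1.400)",
               "LINESTRING( 2.142 85.892 1.400, 0.991 85.892 1.400)"),
  stringsAsFactors = FALSE
)

# Default data.frame behaviour: a single column is dropped to a character vector
is.character(df[, 2])                         # TRUE

# drop = FALSE keeps the data frame wrapper, which is what a tibble always does
is.data.frame(df[, 2, drop = FALSE])          # TRUE

# [[ and $ return the column vector itself
identical(df[["GEOMETRY"]], df$GEOMETRY)      # TRUE

# gsub() coerces non-character input with as.character(); a one-column data
# frame becomes a single deparsed string instead of one string per row
length(as.character(df[, 2, drop = FALSE]))   # 1
```

That single deparsed string starts with c("LINESTRING…, which would explain the output in the question: the pattern '^\\w+' only strips the leading c, and the one resulting string is then recycled into every row.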
Update: We could use str_remove (which is better in this case):
library(dplyr)    # provides mutate()
library(stringr)  # provides str_remove()
wkt_small %>%
  mutate(GEOMETRY = str_remove(GEOMETRY, '^\\w+'))
We could use str_replace from the stringr package with the regular expression "^[A-Z]*":
library(dplyr)
library(stringr)
df %>%
  mutate(GEOMETRY = str_replace(GEOMETRY, "^[A-Z]*", ""))
Output:
id GEOMETRY
<chr> <chr>
1 PTK01 ( 1.142 85.892 1.400, 0.991 85.892 1.400)
2 PTK02 ( 2.142 85.892 1.400, 0.991 85.892 1.400)
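The two patterns used in the answers, '^\\w+' and "^[A-Z]*", give the same result on this data but are not interchangeable in general. The sketch below uses base R's sub() so it runs without extra packages; the second input string is a hypothetical label invented purely to show where the patterns diverge.

```r
# The second element is a hypothetical label, invented only to show the difference
x <- c("LINESTRING( 1.142 85.892 1.400, 0.991 85.892 1.400)",
       "MULTI_LINE2(0 0, 1 1)")

# \w matches letters, digits and underscore, so the whole label is removed
by_word <- sub("^\\w+", "", x)

# [A-Z] matches uppercase letters only, so removal stops at the underscore
by_upper <- sub("^[A-Z]*", "", x)

by_word[2]                          # "(0 0, 1 1)"
by_upper[2]                         # "_LINE2(0 0, 1 1)"
identical(by_word[1], by_upper[1])  # TRUE
```

For prefixes made only of uppercase letters, as in the question, either pattern should do; if a prefix may contain digits or underscores, '^\\w+' is likely the safer choice.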