regex commas not between two numbers
I am looking for a regex for gsub to remove all the unwanted commas:
Data:
,,,,,,,12345
12345,1345,1354
123,,,,,,
12345,
,12354
Desired result:
12345
12345,1345,1354
123
12345
12354
This is the progress I have made so far:
(,(?!\\d+))
You seem to want to remove all leading and trailing commas.
You can do that with:
gsub("^,+|,+$", "", x)
See the regex demo.
The regex contains two alternatives: ^,+ matches one or more commas at the start of the string, and ,+$ matches one or more commas at the end. gsub replaces every such match with an empty string.
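To see what each branch of the alternation contributes, the two anchored patterns can be applied one at a time. This is only a minimal sketch in base R; the variable names (no_leading, no_trailing, both) are made up for illustration.

```r
x <- c(",,,,,,,12345", "12345,1345,1354", "123,,,,,,", "12345,", ",12354")

# Left branch only: strip a run of commas at the start of each string.
no_leading <- sub("^,+", "", x)

# Right branch only: strip a run of commas at the end of each string.
no_trailing <- sub(",+$", "", x)

# Applying both in sequence gives the same result as the single
# gsub() call with the "^,+|,+$" alternation.
both <- sub(",+$", "", sub("^,+", "", x))
print(both)
```

Each anchored pattern can match at most once per string, so sub is enough when they are applied separately. The combined pattern needs gsub because a single string may have commas at both ends.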
See the R demo:
x <- c(",,,,,,,12345","12345,1345,1354","123,,,,,,","12345,",",12354")
gsub("^,+|,+$", "", x)
## [1] "12345" "12345,1345,1354" "123" "12345"
## [5] "12354"
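For comparison, here is a sketch of what the lookahead attempt from the question does on the same data. It assumes the pattern is passed to gsub with perl = TRUE, since base R's default regex engine does not support lookaheads. The names attempt and fixed are illustrative only.

```r
x <- c(",,,,,,,12345", "12345,1345,1354", "123,,,,,,", "12345,", ",12354")

# The pattern from the question: remove every comma that is NOT
# followed by a digit. perl = TRUE is needed for the (?!...) lookahead.
attempt <- gsub("(,(?!\\d+))", "", x, perl = TRUE)

# The anchored alternation removes leading and trailing runs of commas.
fixed <- gsub("^,+|,+$", "", x)

# Side-by-side view of input, lookahead result, and anchored result.
print(data.frame(x, attempt, fixed))
```

The lookahead version handles trailing commas, but in a leading run the last comma is followed by a digit, so it survives. The anchored version does not depend on what follows the commas.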
You can also use str_extract from stringr. Thanks to greedy matching, you don't have to specify how many times a digit occurs; the longest match is chosen automatically:
library(dplyr)
library(stringr)
df %>%
mutate(V1 = str_extract(V1, "\\d.+\\d"))
Or, if you prefer base R:
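# regexpr() gives one match per row as a character vector, not a list column (assumes every row matches)
df$V1 = regmatches(df$V1, regexpr("\\d.+\\d", df$V1))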
Result:
V1
1 12345
2 12345,1345,1354
3 123
4 12345
5 12354
Data:
df = read.table(text = ",,,,,,,12345
12345,1345,1354
123,,,,,,
12345,
,12354")
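One caveat worth checking: the pattern \\d.+\\d needs a digit, at least one more character, and a final digit, so it will not match values with only one or two digits. The sketch below uses base R only. The helper extract_first is hypothetical (it imitates str_extract by returning NA when nothing matches), the test vector y is made up, and \\d(.*\\d)? is one possible alternative pattern, not part of the original answer.

```r
# Hypothetical helper: first match per element, NA when there is none.
extract_first <- function(pattern, x) {
  m <- regexpr(pattern, x, perl = TRUE)
  out <- rep(NA_character_, length(x))
  out[m != -1] <- regmatches(x, m)
  out
}

# Made-up inputs, including one- and two-digit values.
y <- c("7,", ",42", ",,123,,", "1,2")

# Original pattern: needs at least three characters between the ends.
strict <- extract_first("\\d.+\\d", y)

# Alternative: a digit, optionally followed by anything up to a last digit.
relaxed <- extract_first("\\d(.*\\d)?", y)

print(data.frame(y, strict, relaxed))
```

On the data from the question both patterns give the same result, because every value there has at least three digits.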