Regex for commas that are not between two numbers

Question

I am looking for a regex to use with gsub that removes all the unwanted commas:

Data:

,,,,,,,12345
12345,1345,1354
123,,,,,,
12345,
,12354

Desired result:

12345
12345,1345,1354
123
12345
12354

This is the progress I have made so far:

(,(?!\\d+))

Answer 1

You seem to want to remove all leading and trailing commas.

You can do that with

gsub("^,+|,+$", "", x)

See the regex demo.

The regex contains two alternatives: ^,+ matches one or more commas at the start of the string, and ,+$ matches one or more commas at the end. gsub replaces each match with an empty string, so commas between numbers are left untouched.

See the R demo:

x <- c(",,,,,,,12345","12345,1345,1354","123,,,,,,","12345,",",12354")
gsub("^,+|,+$", "", x)
## [1] "12345"           "12345,1345,1354" "123"             "12345"          
## [5] "12354"
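For comparison, here is a small sketch of why the lookahead attempt from the question falls short. That pattern only removes commas that are not followed by a digit, so a leading comma sitting directly before a number survives. The names `attempt` and `trimmed` are made up for this illustration, and `perl = TRUE` is used because base R needs the PCRE engine for lookaheads.

```r
x <- c(",,,,,,,12345", "12345,1345,1354", "123,,,,,,", "12345,", ",12354")

# Pattern from the question: a comma that is NOT followed by a digit.
# Lookaheads require perl = TRUE in base R.
attempt <- gsub(",(?!\\d+)", "", x, perl = TRUE)

# Anchored alternatives: runs of commas at the start or at the end.
trimmed <- gsub("^,+|,+$", "", x)

print(attempt)
print(trimmed)
```

In `attempt`, the first and last elements keep one leading comma, because that comma is followed by a digit. The anchored version removes it.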

Answer 2

You can also use str_extract from stringr. The pattern \\d.+\\d matches from the first digit to the last digit. Thanks to greedy matching, you don't have to specify how many times a digit occurs; the longest match is chosen automatically:

library(dplyr)
library(stringr)

df %>%
  mutate(V1 = str_extract(V1, "\\d.+\\d"))

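or, if you prefer base R:

# regexpr() (rather than gregexpr()) returns a plain character vector instead of a list;
# this assumes every row contains a match
df$V1 = regmatches(df$V1, regexpr("\\d.+\\d", df$V1))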

Result:

               V1
1           12345
2 12345,1345,1354
3             123
4           12345
5           12354

Data:

df = read.table(text = ",,,,,,,12345
                12345,1345,1354
                123,,,,,,
                12345,
                ,12354")
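One caveat worth checking on your own data: \\d.+\\d needs at least three characters (a digit, at least one other character, and another digit), so one- or two-digit values would not match. The sketch below uses a made-up vector `y` and a hypothetical variant pattern, \\d(.*\\d)?, which also accepts a single digit. `perl = TRUE` is used here only to pin down the regex engine.

```r
# Hypothetical inputs, including one- and two-digit values
y <- c(",7,", ",12,", ",,123,,", "1,2")

# Original pattern: match position is -1 where there is no match
m_orig <- regexpr("\\d.+\\d", y, perl = TRUE)
matched_orig <- m_orig > 0

# Variant: a digit, optionally followed by anything that ends in a digit
m_var <- regexpr("\\d(.*\\d)?", y, perl = TRUE)
extracted <- regmatches(y, m_var)

print(matched_orig)
print(extracted)
```

With the question's sample data every value has at least three digits, so both patterns give the same result there.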

Solution 1
3, accepted, 2017-10-30 19:14:59

Solution 2
2, 2017-10-30 19:07:19
