Remove digits from a string using regex

Question

I am trying to remove all digits from a string that are not attached to a word. Examples:

 "python 3" => "python"
 "python3" => "python3"
 "1something" => "1something"
 "2" => ""
 "434" => ""
 "python 35" => "python"
 "1 " => ""
 " 232" => ""

So far I have been using the following regular expression:

((?<=[ ])[0-9]+(?=[ ])|(?<=[ ])[0-9]+|^[0-9]$)

It handles some of the examples above correctly, but not all of them. Could anyone help, with some explanation?

Answer 1

Why not just use word boundaries?

\b\d+\b

Here is an example:

>>> import re
>>> words = ['python 3', 'python3', '1something', '2', '434', 'python 35', '1 ', ' 232']
>>> for word in words:
...     print("'{}' => '{}'".format(word, re.sub(r'\b\d+\b', '', word)))
...
'python 3' => 'python '
'python3' => 'python3'
'1something' => '1something'
'2' => ''
'434' => ''
'python 35' => 'python '
'1 ' => ' '
' 232' => ' '

Note that this will not remove the spaces before and after the number. I would advise using strip(), but if not, you can probably use \b\d+\b\s* (for the space after) or something similar.
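As a minimal sketch of that suggestion, the pattern with the trailing \s* can be combined with strip(). The names STANDALONE_NUMBER and remove_standalone_numbers are made up for this illustration and are not part of the original answer:

```python
import re

# Hypothetical name: a digit-only word plus any whitespace that follows it.
STANDALONE_NUMBER = re.compile(r'\b\d+\b\s*')

def remove_standalone_numbers(text):
    """Remove digit-only words, then trim whitespace left at either end."""
    return STANDALONE_NUMBER.sub('', text).strip()

words = ['python 3', 'python3', '1something', '2', '434', 'python 35', '1 ', ' 232']
for word in words:
    print("'{}' => '{}'".format(word, remove_standalone_numbers(word)))
```

One caveat: \b also treats punctuation as a boundary, so an input such as "3.5" would be reduced to ".". If that matters for your data, a whitespace-based check may be a better fit.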

Answer 2

You could just split the string into words and drop any words that consist only of digits, which is a lot easier to read:

new = " ".join([w for w in s.split() if not w.isdigit()])

It also seems to be faster:

In [27]: p = re.compile(r'\b\d+\b')

In [28]: s = " ".join(['python 3', 'python3', '1something', '2', '434', 'python 35', '1 ', ' 232'])

In [29]: timeit " ".join([w for w in s.split() if not w.isdigit()])

100000 loops, best of 3: 1.54 µs per loop

In [30]: timeit p.sub('', s)

100000 loops, best of 3: 3.34 µs per loop

It also removes the surrounding spaces, as in your expected output:

In [39]:  re.sub(r'\b\d+\b', '', " 2")
Out[39]: ' '

In [40]:  " ".join([w for w in " 2".split() if not w.isdigit()])
Out[40]: ''

In [41]:  re.sub(r'\b\d+\b', '', s)
Out[41]: 'python  python3 1something   python     '

In [42]:  " ".join([w for w in s.split() if not w.isdigit()])
Out[42]: 'python python3 1something python'

So the two approaches behave quite differently.
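Whitespace is not the only difference. Here is a rough sketch of another one: the split approach only drops words made up entirely of digits, while the word-boundary regex also removes digits that sit next to punctuation. The function names and the sample string are made up for illustration:

```python
import re

def drop_digit_words(text):
    """Split approach: keep only the words that are not all digits."""
    return " ".join(w for w in text.split() if not w.isdigit())

def drop_digit_runs(text):
    """Regex approach: delete digit runs delimited by word boundaries."""
    return re.sub(r'\b\d+\b', '', text)

sample = 'version 3, build 42'
print(repr(drop_digit_words(sample)))  # 'version 3, build'
print(repr(drop_digit_runs(sample)))   # 'version , build '
```

Here "3," survives the split approach because "3,".isdigit() is False, whereas the regex strips the 3 and leaves the comma behind. Which behavior is right depends on what you count as "attached to a word".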

Answer 3

This regex, (\s|^)\d+(\s|$), could work, as shown below in JavaScript:

 var value = "1 3@bar @foo2 * 112";
 // Remove digit runs delimited by whitespace or the string boundaries.
 var result = value.replace(/(\s|^)\d+(\s|$)/g, "");
 console.log(result);

It works in 3 parts:

First, (\s|^) matches a whitespace character or the beginning of the string: \s matches whitespace, | means "or", and ^ matches the beginning of the string.
Next, \d+ matches one or more digits: \d matches a digit, and + repeats it 1 to N times, matching as many as possible.
Finally, (\s|$) matches a whitespace character or the end of the string: \s matches whitespace, | means "or", and $ matches the end of the string.

You can replace $ with end of line or \n if you have several lines, or just add it next to it like this: (\s|$|\n). Hope this is what you're looking for.
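For comparison with the Python answers, here is a hedged port of this pattern to Python's re module. Because the pattern consumes the whitespace on both sides, a number in the middle of a string takes both neighboring spaces with it, and consecutive numbers can be skipped. The lookaround variant below is an assumed alternative, not part of the original answer, and all names are made up for illustration:

```python
import re

# The pattern from this answer: consumes whitespace on both sides of the number.
CONSUMING = re.compile(r'(\s|^)\d+(\s|$)')

# Assumed alternative: lookarounds test for whitespace without consuming it.
LOOKAROUND = re.compile(r'(?<!\S)\d+(?!\S)')

def remove_consuming(text):
    return CONSUMING.sub('', text)

def remove_lookaround(text):
    # Remove standalone numbers, then collapse the leftover whitespace.
    return ' '.join(LOOKAROUND.sub('', text).split())

for sample in ["1 3@bar @foo2 * 112", "python 3 rocks", "1 2 3"]:
    print(repr(sample), '=>', repr(remove_consuming(sample)), '|', repr(remove_lookaround(sample)))
```

With the consuming pattern, "python 3 rocks" becomes "pythonrocks" and "1 2 3" becomes "2", since the space matched after one number is no longer available to start the next match. The lookaround version avoids both effects, at the cost of a separate whitespace cleanup step.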
