How to find 6 digits in a string in Python?
An example string is "CPLR_DUK10_772989_2". I want to pick out "772989" specifically. I imagine re.findall is a good way to go about it, but I don't have a good grasp of regular expressions, so I'm stumped on this one.
Here is a sample of code that I thought would work, until I looked at the full list of strings and saw that it definitely doesn't. I suppose I'm looking for something more robust!
for ad in Ads:
    num = ''.join(re.findall(numbers, ad)[1:7])
    ID.append(num)
ID = pd.Series(ID)
Other sample strings: "Teb1_110765", "PAN1_111572_5".
The regex you are looking for is:
p = re.findall(r'_(\d{6})', ad)
This matches six digits preceded by an underscore and gives you a list of all matches (should there be more than one). Note that it does not check what follows the six digits, so it would also pick up the first six digits of a longer run.
Demo:
>>> import re
>>> stringy = 'CPLR_DUK10_772989_2'
>>> re.findall(r'_(\d{6})', stringy)
['772989']
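If a longer run of digits (say `_1234567`) should not count as an ID, a negative lookahead can rule it out. The following is a minimal sketch, assuming the ID is always exactly six digits directly after an underscore; the helper name `find_ids` and the fourth sample string are made up for illustration.

```python
import re

# Six digits after an underscore, NOT followed by another digit.
# (?!\d) is a negative lookahead: it checks the next character without consuming it.
SIX_DIGITS = re.compile(r'_(\d{6})(?!\d)')

def find_ids(text):
    """Return every exactly-six-digit block that follows an underscore (hypothetical helper)."""
    return SIX_DIGITS.findall(text)

# The first three samples come from the question; the last one is an invented edge case.
samples = ["CPLR_DUK10_772989_2", "Teb1_110765", "PAN1_111572_5", "X_1234567"]
for s in samples:
    print(s, find_ids(s))
```

With the plain pattern `_(\d{6})`, the last sample would yield `['123456']`; with the lookahead it yields an empty list.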
This should append every block of exactly six digits that follows an underscore:
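import re
import pandas as pd

ID = []
# Ads is the list of strings from the question
for ad in Ads:
    blocks = re.split('_', ad)
    # skip the first block, which is never preceded by an underscore
    for block in blocks[1:]:
        if len(block) == 6 and block.isdigit():
            ID.append(block)
ID = pd.Series(ID)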
You can use a list comprehension:
>>> s="CPLR_DUK10_772989_2"
>>> [x for x in s.split('_') if len(x)==6 and x.isdigit()]
['772989']
If your strings are really long and you only need the first match, you could use itertools like so:
>>> from itertools import dropwhile
>>> next(dropwhile(lambda x: not(len(x)==6 and x.isdigit()), s.split('_')))
'772989'
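One thing to watch with `next(dropwhile(...))` is that it raises `StopIteration` when no block qualifies. A minimal sketch of an alternative, assuming a missing ID should produce a fallback value instead of an exception; the helper name `first_id` and the string "no_id_here" are hypothetical.

```python
def first_id(text, default=None):
    """Return the first underscore-separated block of exactly six digits, or default."""
    blocks = text.split('_')
    # next() with a second argument returns that value instead of raising StopIteration
    return next((b for b in blocks if len(b) == 6 and b.isdigit()), default)

print(first_id("CPLR_DUK10_772989_2"))  # '772989'
print(first_id("no_id_here"))           # None
```

The generator expression stops at the first qualifying block, so it is as lazy over the blocks as the dropwhile version.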