How to find 6 digits in a string in Python?

An example string is "CPLR_DUK10_772989_2". I want to pick out "772989" specifically. I imagine re.findall is a good way to go about it, but I don't have a good grasp of regular expressions, so I'm stumped on this one.

Here is a sample of code that I thought would work, until I looked at the full list of strings and saw that it definitely doesn't. I suppose I'm looking for something more robust!

for ad in Ads:
    num = ''.join(re.findall(numbers,ad)[1:7])
    ID.append(num)
ID = pd.Series(ID)

Other sample strings: "Teb1_110765", "PAN1_111572_5".

The regex you are looking for is

p = re.findall(r'_(\d{6})', ad)

This matches a six-digit number preceded by an underscore and gives you a list of all matches (should there be more than one).

Demo:

>>> import re
>>> stringy =  'CPLR_DUK10_772989_2'
>>> re.findall(r'_(\d{6})', stringy)
['772989']
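One caveat: `_(\d{6})` also matches the first six digits of a longer run, so `'X_1234567'` would yield `'123456'`. If the IDs must be exactly six digits long, a negative lookahead is one way to tighten the pattern. The sketch below assumes that requirement; the names `SIX_DIGITS` and `six_digit_ids` are made up for illustration and are not part of the original answer.

```python
import re

# Hypothetical names; the pattern is the answer's regex plus a negative
# lookahead (?!\d) that rejects matches followed by a seventh digit.
SIX_DIGITS = re.compile(r'_(\d{6})(?!\d)')

def six_digit_ids(s):
    """Return every underscore-prefixed run of exactly six digits in s."""
    return SIX_DIGITS.findall(s)

# Sample strings taken from the question
samples = ["CPLR_DUK10_772989_2", "Teb1_110765", "PAN1_111572_5"]
ids = [six_digit_ids(s) for s in samples]
print(ids)  # [['772989'], ['110765'], ['111572']]
```

With this pattern `six_digit_ids('X_1234567')` returns an empty list, whereas the original pattern would return `['123456']`.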

This should append every six-digit block that follows an underscore:

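import re
import pandas as pd

# Sample inputs taken from the question
Ads = ["CPLR_DUK10_772989_2", "Teb1_110765", "PAN1_111572_5"]
ID = []
for ad in Ads:
    blocks = re.split('_', ad)
    for block in blocks[1:]:  # skip the first block: no underscore precedes it
        if len(block) == 6 and block.isdigit():
            ID.append(block)
ID = pd.Series(ID)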

You can use a list comprehension:

>>> s="CPLR_DUK10_772989_2"
>>> [x for x in s.split('_') if len(x)==6 and x.isdigit()]
['772989']

If your strings are really long and you are only looking for one number, you could use itertools like so:

>>> from itertools import dropwhile
>>> next(dropwhile(lambda x: not(len(x)==6 and x.isdigit()), s.split('_')))
'772989'
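Note that `next(dropwhile(...))` raises `StopIteration` when no block qualifies. If some strings may contain no six-digit block, a generator expression with a default value is one way to handle that. This is only a sketch under that assumption; the function name `first_six_digit_block` is hypothetical.

```python
from typing import Optional

def first_six_digit_block(s: str) -> Optional[str]:
    """Return the first '_'-separated block that is exactly six digits, or None."""
    return next(
        (block for block in s.split('_') if len(block) == 6 and block.isdigit()),
        None,  # default returned when no block qualifies
    )

found = first_six_digit_block("CPLR_DUK10_772989_2")
missing = first_six_digit_block("no_digits_here")
print(found, missing)  # 772989 None
```

As in the `dropwhile` version, `s.split('_')` still builds the full list of blocks up front; only the filtering stops at the first match.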
