
int doesn't work in def within regular expression in Python

I need to write a function that takes a count and a string and returns a list of all the words in the string that are count word characters long or longer.

My function is:

import re

def find_words(count, a_str):
    count = int(count)
    return re.findall(r'\w{},'.format(int(count)), a_str)

But it doesn't work; it returns an empty list:

Example:

find_words(4, "dog, cat, baby, balloon, me")

Should return:

['baby', 'balloon']

The regex isn't correct. The {} is interpreted as a placeholder for format, but you want it to be the regex's {}, which specifies the number of repeats. To get literal braces out of format you have to double them, so you need to use r'\w{{{},}}' here. Observe the difference:

>>> r'\w{},'.format(4)
'\\w4,'

>>> r'\w{{{},}}'.format(4)
'\\w{4,}'

And then it works correctly:

import re
def find_words(count, a_str):
    count = int(count)
    return re.findall(r'\w{{{},}}'.format(count), a_str)

>>> find_words(4, "dog, cat, baby, balloon, me") 
['baby', 'balloon']
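
The same brace-doubling rule applies to f-strings, which some people find easier to read. This is only a sketch, assuming Python 3.6 or later; the name find_words_f is made up for illustration and is not part of the original answer.

```python
import re

def find_words_f(count, a_str):
    # Hypothetical f-string variant of find_words (name is illustrative only).
    count = int(count)
    # "{{" and "}}" produce literal braces, "{count}" is substituted,
    # so for count=4 the pattern becomes \w{4,}
    pattern = rf'\w{{{count},}}'
    return re.findall(pattern, a_str)

print(find_words_f(4, "dog, cat, baby, balloon, me"))
```

Keeping the int(count) call means a count passed as a string such as "4" still ends up as a valid quantifier.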

Why RegExp?

>>> string = "dog, cat, baby, balloon, me"
>>> [word for word in string.split(', ') if len(word) >= 4]
['baby', 'balloon']

So the function could look something like this:

>>> def find_words(count, a_str):
...     return [word for word in a_str.split(', ') if len(word) >= count]
...
>>> find_words(4, 'dog, cat, baby, balloon, me')
['baby', 'balloon']
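
One caveat with the split approach: split(', ') assumes every word is separated by exactly a comma followed by one space. If the input might have irregular spacing, a variant along these lines may be more forgiving. It is a sketch only; the name find_words_split is hypothetical, and unlike \w it counts every character in a piece, not just word characters.

```python
def find_words_split(count, a_str):
    # Hypothetical variant: split on the comma alone, then trim
    # whatever whitespace surrounds each piece.
    words = (piece.strip() for piece in a_str.split(','))
    # Keep only the pieces that are at least `count` characters long.
    return [word for word in words if len(word) >= count]

print(find_words_split(4, "dog,cat ,  baby, balloon,me"))
```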

You can try this:

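import re

def find_words(count, a_str):
    # Build the {count,} quantifier by concatenation, so no brace escaping is needed.
    # Search the string that was passed in rather than a hard-coded one.
    return re.findall(r"\w{" + str(count) + ",}", a_str)

print(find_words(4, "dog, cat, baby, balloon, me"))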

Output:

['baby', 'balloon']
