Python: capture certain numbers after certain words

I am trying to capture the values that follow "book id:" in this sentence (they are similar to ISBNs, except that these IDs also contain letters). I looked at the ISBN example on Stack Overflow and tried different combinations of regexes, but I am not able to capture them all into a single list. What am I missing here?

import re

sentence = "List of book ids that are important to read book id: A83827-121-1-23-1341-2315ad3  book id: N32-12-1-23-1341-2342  and  book id: A334121A313412342"
isbn = re.compile(r"(?:[0-9]{3}-)?[0-9]{1,5}-[0-9]{1,7}-[0-9]{1,6}-[0-9][A-Z]", re.IGNORECASE)

matches = []
for line in sentence:
    matches.extend(isbn.findall(line))
    print(line)

I am trying to get a final output like this:

['A83827-121-1-23-1341-2315ad3','N32-12-1-23-1341-2342','A334121A313412342']

Be aware that your for loop iterates over each character of the string, not over lines or words, so the pattern is only ever tested against single characters. You don't need the loop at all.
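To see why the loop collects nothing, here is a minimal sketch using a shortened sample string (the names `sample` and `chars` are only for illustration). Iterating over a str yields one-character strings, and the original pattern needs several digits and hyphens in a row, so it can never match a single character:

```python
import re

# Shortened sample string, for illustration only
sample = "book id: N32-12-1-23-1341-2342"

# Iterating over a str yields its characters one at a time
chars = [ch for ch in sample]
print(chars[:4])  # ['b', 'o', 'o', 'k']

# The pattern from the question, applied character by character
isbn = re.compile(
    r"(?:[0-9]{3}-)?[0-9]{1,5}-[0-9]{1,7}-[0-9]{1,6}-[0-9][A-Z]",
    re.IGNORECASE,
)
matches = []
for ch in sample:
    matches.extend(isbn.findall(ch))
print(matches)  # []
```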

Here is a regex you could use instead, applied to the whole sentence at once:

isbn = re.compile(r"book id: ([\w-]+)")
print(isbn.findall(sentence))

Output:

['A83827-121-1-23-1341-2315ad3', 'N32-12-1-23-1341-2342', 'A334121A313412342']

Explanation:

  • [\w-]+ matches any non-empty sequence of word characters ( \w : letters, digits and underscore) and hyphens.
  • The parentheses denote a capture group. When the pattern contains exactly one group, findall returns only what that group matched, so the results do not include "book id: ".
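The effect of the capture group can be checked with a self-contained sketch. The variable names `with_group` and `without_group` are only illustrative; the sentence is the one from the question, split across lines for readability:

```python
import re

sentence = (
    "List of book ids that are important to read "
    "book id: A83827-121-1-23-1341-2315ad3  "
    "book id: N32-12-1-23-1341-2342  and  "
    "book id: A334121A313412342"
)

# With a capture group, findall returns only the captured part
with_group = re.findall(r"book id: ([\w-]+)", sentence)
print(with_group)
# ['A83827-121-1-23-1341-2315ad3', 'N32-12-1-23-1341-2342', 'A334121A313412342']

# Without a group, findall returns the whole match, prefix included
without_group = re.findall(r"book id: [\w-]+", sentence)
print(without_group[0])
# book id: A83827-121-1-23-1341-2315ad3
```

Note that "book ids" near the start of the sentence is not picked up, because the pattern requires the literal text "book id: " with the colon and space.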
