Regex match until any characters are found except for the word 'and'

I would like to find a regex that matches the part of the string after Item Number(s), continuing until any other text is found, except that the word 'and' should not stop the match.

s = 'this part 123 should be ignored Item Number(s)92349252 and 30239429434, 124029354,345340332,  234325923 hallo 2121124'

It works if I specifically add hallo to the pattern:

re.match(r'.*?Item Number\(s\)(.*?)hallo.*$', s).group(1)
'92349252 and 30239429434, 124029354,345340332,  234325923 '

However, I want it to stop at any text (not just hallo), except when that text is the word and.

You don't need a regex; just use:

a,b,c = s.partition("and")
print(c)

c is the part of the string after the first and.
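To make the behaviour of str.partition concrete, here is a small sketch run on the sample string from the question. The variable names before, sep and after are only illustrative. Note that on this input the part after and still contains the trailing hallo 2121124 and no longer contains the first number 92349252, so further filtering would probably be needed.

```python
s = 'this part 123 should be ignored Item Number(s)92349252 and 30239429434, 124029354,345340332,  234325923 hallo 2121124'

# str.partition splits at the FIRST occurrence of the separator and
# always returns a 3-tuple: (text before, separator, text after).
before, sep, after = s.partition("and")

print(repr(before))  # everything up to the first "and"
print(repr(sep))     # the separator itself
print(repr(after))   # everything after the first "and"
```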

We can combine string splitting with re.findall. First, split the input on the text Item Number(s) and keep the second element of the resulting list; this is all the text to the right of Item Number(s). Then use re.split to split on a space that is not followed by the word and, a digit, whitespace, or a comma, and keep the first piece. Finally, use re.findall to capture all numbers from that piece.

import re
s = 'this part 123 should be ignored Item Number(s)92349252 and 30239429434, 124029354,345340332,  234325923 hallo 2121124'
nums = re.findall(r'\b\d+\b', re.split(r' (?!\band\b|[\d\s,])', s.split('Item Number(s)')[1])[0])
print(nums)

['92349252', '30239429434', '124029354', '345340332', '234325923']
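The one-liner above can be hard to follow, so here is the same logic unrolled into three steps. The intermediate names tail and head are my own; the regular expressions are unchanged.

```python
import re

s = 'this part 123 should be ignored Item Number(s)92349252 and 30239429434, 124029354,345340332,  234325923 hallo 2121124'

# Step 1: keep only the text to the right of the literal marker.
tail = s.split('Item Number(s)')[1]

# Step 2: cut at the first space that is NOT followed by the word "and",
# a digit, whitespace, or a comma, and keep what comes before the cut.
head = re.split(r' (?!\band\b|[\d\s,])', tail)[0]

# Step 3: pull every run of digits out of the remaining text.
nums = re.findall(r'\b\d+\b', head)
print(nums)
```

This sketch assumes the marker Item Number(s) is present; if it is missing, the [1] index in step 1 raises an IndexError.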

I stated the question incorrectly. The correct question is: find the string containing numbers after Item Number(s), up to the first word that is not and.

Rephrasing: find the string after Item Number(s) that consists of one or more numbers (runs of digits), where consecutive numbers are separated either by zero or more non-word characters, or by one or more repetitions of the word 'and' (each preceded by at least one non-word character) followed by zero or more non-word characters.

import re
s = '123 ignore Item Number(s)92349252 and,,;^and,and;;;30239429434, 124029354,345340332,  and and 234325923 hallo 2121124'
pattern = r'.*?Item Number\(s\)(((\W*?|(\W+?and)+\W*?)\d+)+)'
m = re.match(pattern, s).group(1)
numbers = re.findall(r'\d+', m)
print(numbers)

The output is:

['92349252', '30239429434', '124029354', '345340332', '234325923']
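As a possible variation on the snippet above, the same pattern can be wrapped in a function that returns an empty list when nothing matches, since calling .group(1) on a failed match raises an AttributeError. This is only a sketch: the function name extract_item_numbers is hypothetical, re.search replaces the leading .*? with re.match, and the inner groups are made non-capturing.

```python
import re

# Same idea as the pattern above: after the marker, one or more numbers,
# each preceded either by non-word characters or by one or more
# "<non-word chars>and" repetitions plus optional non-word characters.
ITEM_NUMBERS = re.compile(
    r'Item Number\(s\)((?:(?:\W*?|(?:\W+?and)+\W*?)\d+)+)'
)

def extract_item_numbers(text):
    """Return the numbers listed after 'Item Number(s)', or [] if none."""
    match = ITEM_NUMBERS.search(text)
    if match is None:
        return []
    return re.findall(r'\d+', match.group(1))

s = '123 ignore Item Number(s)92349252 and,,;^and,and;;;30239429434, 124029354,345340332,  and and 234325923 hallo 2121124'
print(extract_item_numbers(s))
print(extract_item_numbers('no marker in this string'))
```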


