
Regex to capture numbers up to 2 digits and comma if followed by another word and number

I need a regular expression that matches and returns 2 numbers from a string when these conditions are met:

  1. Only numbers with at most 2 digits and no greater than 29 (optionally with one decimal place, so up to 2 digits plus 1 decimal digit).

  2. The two numbers must be separated by one of the characters y or +, and the second number must be followed by the word 'houses'.

And then capture both numbers.

For the string below:

8 y 13 houses, 13 y 8 houses, 13 y 13 houses, 8 y 8 houses, 120 y 8 houses, 8 y 120 houses, 13,5 y 8 houses, 13,5 y 120 houses

I would get

8 and 13 / 13 and 8 / 13 and 13 / 8 and 8 / 13,5 and 8

I was trying with this:

\b([0-9][0-9]?)\s[y|\+]\s([0-9]{1,2})\shouses\b

but I can't manage to capture the ',' and the decimal digit as well.

If you want to match the optional decimal value, use an optional group:

re.compile(r"\b([1-2]?\d(?:,\d)?)\s[y+]\s([1-2]?\d(?:,\d)?)\shouses\b")

where (?:,\d)? matches a comma followed by a digit, if present. Note that I limit the digit matching to values between 0 and 29 by matching an optional 1 or 2 first, followed by a single digit 0-9.

Demo:

>>> import re
>>> demo = '8 y 13 houses, 13 y 8 houses, 13 y 13 houses, 8 y 8 houses, 120 y 8 houses, 8 y 120 houses, 13,5 y 8 houses, 13,5 y 120 houses'
>>> pattern = re.compile(r"\b([1-2]?\d(?:,\d)?)\s[y+]\s([1-2]?\d(?:,\d)?)\shouses\b")
>>> pattern.findall(demo)
[('8', '13'), ('13', '8'), ('13', '13'), ('8', '8'), ('13,5', '8')]
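One edge case the demo string does not cover: `\b` also counts the position right after a comma as a word boundary, so an over-range value with a decimal part, such as `120,5`, can still contribute its trailing digit to a match. The sketch below reproduces that and tries a possible tightening. The `strict` pattern and its extra lookbehind are my own assumption, not part of the answer above.

```python
import re

# The pattern from the answer above, unchanged.
original = re.compile(r"\b([1-2]?\d(?:,\d)?)\s[y+]\s([1-2]?\d(?:,\d)?)\shouses\b")

# Hypothetical stricter variant (an assumption, not from the answer):
# the lookbehind (?<!\d,) refuses to start a match on the decimal digit
# of a longer number such as "120,5".
strict = re.compile(
    r"\b(?<!\d,)([1-2]?\d(?:,\d)?)\s[y+]\s([1-2]?\d(?:,\d)?)\shouses\b"
)

edge_case = "120,5 y 8 houses"
print(original.findall(edge_case))  # [('5', '8')]
print(strict.findall(edge_case))    # []

demo = ('8 y 13 houses, 13 y 8 houses, 13 y 13 houses, 8 y 8 houses, '
        '120 y 8 houses, 8 y 120 houses, 13,5 y 8 houses, 13,5 y 120 houses')
print(strict.findall(demo))
```

On the demo string both patterns return the same five pairs; they differ only on inputs like the edge case.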

Here's a try:

#! /usr/bin/env python

import re

# Named "text" rather than "str" to avoid shadowing the built-in type.
text = '8 y 13 houses, 13 y 8 houses, 13 y 13 houses, 8 y 8 houses, 120 y 8 houses, 8 y 120 houses, 13,5 y 8 houses, 13,5 y 120 houses'

regex = r'''
\b (
    [012]?     # number may go up to 29, so could have a leading 0, 1, or 2
    [0-9]      # but there must be at least one digit 0-9 here
    (,[0-9])?  # and the digits might be followed by a comma and one decimal digit
)
\s* [y+] \s*   # must be a 'y' or '+' in between
(
    [012]?     # followed by another 0-29
    [0-9]
    (,[0-9])?  # and an optional comma plus one decimal digit
)
\s* houses \b  # followed by the word "houses"
'''

for match in re.finditer(regex, text, re.VERBOSE):
    # Groups 2 and 4 are the optional decimal parts, so the numbers are in 1 and 3.
    print("found: %s and %s" % (match.group(1), match.group(3)))

Demonstration:

$ python pyregex.py 
found: 8 and 13
found: 13 and 8
found: 13 and 13
found: 8 and 8
found: 13,5 and 8

When that regex matches a string in your input, the first number will be in match.group(1) and the second number will be in match.group(3). Groups 2 and 4 hold the optional decimal parts, because each (,[0-9])? is itself a capturing group.
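If the group(1)/group(3) numbering feels error-prone, one option is to make the decimal parts non-capturing and give the two numbers names. The following is only a sketch of that idea: the group names `first` and `second`, the shortened sample string, and the conversion to float are my own illustrative assumptions, not part of the answer.

```python
import re

# Shortened sample input (an assumption for illustration).
text = "8 y 13 houses, 13,5 y 8 houses, 120 y 8 houses"

pattern = re.compile(r"""
    \b (?P<first>  [012]? [0-9] (?: ,[0-9] )? )   # 0-29 with optional ",d"
    \s* [y+] \s*                                  # 'y' or '+' in between
    (?P<second> [012]? [0-9] (?: ,[0-9] )? )      # second number, same shape
    \s* houses \b                                 # followed by "houses"
""", re.VERBOSE)

# Only the two named groups capture, so no index skipping is needed.
pairs = [(m.group("first"), m.group("second")) for m in pattern.finditer(text)]
print(pairs)  # [('8', '13'), ('13,5', '8')]

# Comma decimals are not valid float literals in Python; swap in a dot first.
values = [(float(a.replace(",", ".")), float(b.replace(",", "."))) for a, b in pairs]
print(values)  # [(8.0, 13.0), (13.5, 8.0)]
```

With non-capturing groups, pattern.findall(text) would also return plain two-element tuples, as in the first answer.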
