
How to find specific text with its corresponding value in a text input using Regex and Python?

I have some text input and I want to extract a few pieces of information from it. I am using regular expressions for this, and it works for every field except two: rent and transfer.

The input text is as below:

my_str = """19 Aug standing order rent Apolo Housing Assoc. 500.00 50.00
20 Aug transfer from John wick saving a/c 200.00 130.90"""

Now I want to extract rent as 'rent 500.00' and transfer as 'transfer 200.00', but only the keywords 'rent' and 'transfer' are being extracted.

Below is my Python code for this:

import re
find_rent = re.search(r"(rent)+([0-9,.]*)", my_str)
found = find_rent.group()
print(found)

With the above code, only 'rent' is extracted, not 'rent 500.00'. I am using similar code for transfer as well.

Please tell me what I am doing wrong here.

You can use

\b(transfer|rent)\D+(\d+(?:[,.]\d+)*)

See the regex demo. Details:

  • \b - a word boundary
  • (transfer|rent) - Group 1: either the word transfer or the word rent
  • \D+ - one or more non-digit characters
  • (\d+(?:[,.]\d+)*) - Group 2: one or more digits, followed by zero or more occurrences of a comma or period and one or more digits
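To see why the pattern in the question returned only the keyword, it may help to run both patterns side by side. The sketch below reuses the sample text from the question; the variable names original and fixed are only illustrative. In the question's pattern, [0-9,.]* must start immediately after 'rent', but the next character is a space, so it matches zero characters.

```python
import re

my_str = ("19 Aug standing order rent Apolo Housing Assoc. 500.00 50.00\n"
          "20 Aug transfer from John wick saving a/c 200.00 130.90")

# Pattern from the question: "rent" is followed by a space, so
# [0-9,.]* matches an empty string and only the keyword is returned.
original = re.search(r"(rent)+([0-9,.]*)", my_str)
print(repr(original.group()))   # whole match
print(repr(original.group(2)))  # the amount group is empty

# With \D+ the match first consumes the non-digit text between the
# keyword and the amount, so Group 2 can capture the number.
fixed = re.search(r"\b(rent)\D+(\d+(?:[,.]\d+)*)", my_str)
print(fixed.group(1), fixed.group(2))
```

Note that \D also matches newlines, so if a keyword line contained no number at all, the match would run on to the first number on a later line.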

See the Python demo:

import re
s = '19 Aug standing order rent Apolo Housing Assoc. 500.00 50.00\n20 Aug transfer from John wick saving a/c 200.00 130.90'
rx = r'\b(transfer|rent)\D+(\d+(?:[,.]\d+)*)'
for m in re.finditer(rx, s):
    print(f'{m.group(1)} {m.group(2)}')

Output:

rent 500.00
transfer 200.00
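If the amounts are needed as numbers rather than printed strings, one possible follow-up is to collect the matches into a dictionary. This is only a sketch: the name amounts is illustrative, and it assumes that any comma in an amount is a thousands separator, which the sample input does not confirm.

```python
import re

s = ('19 Aug standing order rent Apolo Housing Assoc. 500.00 50.00\n'
     '20 Aug transfer from John wick saving a/c 200.00 130.90')
rx = r'\b(transfer|rent)\D+(\d+(?:[,.]\d+)*)'

# Map each keyword to the first amount that follows it.
# Assumption: commas are thousands separators, so they are dropped
# before converting to float.
amounts = {m.group(1): float(m.group(2).replace(',', ''))
           for m in re.finditer(rx, s)}
print(amounts)
```

If the same keyword appears more than once, later matches overwrite earlier ones in this version.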

For a single term search, you can use

import re
s = '19 Aug standing order rent Apolo Housing Assoc. 500.00 50.00\n20 Aug transfer from John wick saving a/c 200.00 130.90'
w = 'rent'
rx = fr'\b{w}\D+(\d+(?:[,.]\d+)*)'
m = re.search(rx, s)
if m:
    print(f'{w} {m.group(1)}')

See this Python demo.
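When the search term comes from a variable, it is usually safer to pass it through re.escape so that characters such as '.' or '/' are treated literally. The helper below is a hypothetical wrapper around the single-term pattern above; the name find_amount is not part of the original answer.

```python
import re

def find_amount(word, text):
    """Return the first amount that follows `word` in `text`, or None."""
    # re.escape keeps any regex metacharacters in `word` literal.
    rx = rf'\b{re.escape(word)}\D+(\d+(?:[,.]\d+)*)'
    m = re.search(rx, text)
    return m.group(1) if m else None

s = ('19 Aug standing order rent Apolo Housing Assoc. 500.00 50.00\n'
     '20 Aug transfer from John wick saving a/c 200.00 130.90')

print(find_amount('rent', s))
print(find_amount('transfer', s))
print(find_amount('salary', s))  # keyword not present in the text
```

The leading \b only behaves as expected when the term starts with a word character, which holds for 'rent' and 'transfer'.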
