Parsing different numbers with RegEx
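Is it possible to parse all of the "weights" in the two emails below?
I need a regular expression robust enough to capture only the "weights" from these two emails, and from hundreds more like them. The RegEx I am using now searches for commas and grabs the digits on either side of them. That works perfectly for weights in the thousands, but it fails to capture weights below one thousand, such as the 954 lbs and 800 lbs values below.
I thought I might try identifying "lbs" and capturing the number in front of it, but in some cases a price precedes "lbs".
Any help would be greatly appreciated. Thank you.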
1) Subject: FW: NEFS 11 fish for lease
From: Claire Fitz-Gerald
Date: 11/15/2013 3:02 PM
NEFS 11 has the following fish for lease:
-GOM Cod up to 5,000 lbs (live wt) @ 1.40 lbs
-American Plaice 2,000 lbs .60 lbs or best offer
2) From: Claire Fitz-Gerald
Date: 9/5/2014 9:52 AM
Subject: NEFS 5 Available Fish
All,
NEFS 5 has the following fish available for lease/trade:
GB EAST cod: 954 lbs @ $0.83
GB EAST cod: 1,046 lbs to trade for 1,830 lbs GB WEST cod
GB blackback: 30,000 lbs @ $0.07
GOM blackback: 800 lbs @ $0.03
white hake: 6,322 lbs @ $0.13
pollock: 22,000 lbs @ $0.015
redfish: 14,000 lbs @ $0.015
GB yt: 1,873 lbs @ $1.13
GB yt: 5,127 lbs to trade for 10,254 lbs SNE yt
My relevant code:
with open(file_path, 'r') as f:
    pattern = re.compile(r'\d+,\d+ ')
    email = f.read()
    weights = pattern.findall(email)
    data_frame['Weights'].append(weights)
    if weights:
        print("Weight:", ''.join(weights))
Printed output for email #2 (note that the amounts under 1,000 are missing):
Weight: 1,046 1,830 30,000 6,322 22,000 14,000 1,873 5,127 10,254
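There are a couple of ways to do this; one is
\d[\d,]{2,} lbs
This requires a digit, followed by at least two more digits or commas, then a space and the literal text lbs. Because the number must be at least three characters long, the two-digit prices in front of "lbs" in the first email (1.40 lbs, .60 lbs) are not matched, but a weight below 100 lbs would not be matched either. See the demo on regex101.com.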
In Python:
import re

email = """
2) From: Claire Fitz-Gerald
Date: 9/5/2014 9:52 AM
Subject: NEFS 5 Available Fish
All,
NEFS 5 has the following fish available for lease/trade:
GB EAST cod: 954 lbs @ $0.83
GB EAST cod: 1,046 lbs to trade for 1,830 lbs GB WEST cod
GB blackback: 30,000 lbs @ $0.07
GOM blackback: 800 lbs @ $0.03
white hake: 6,322 lbs @ $0.13
pollock: 22,000 lbs @ $0.015
redfish: 14,000 lbs @ $0.015
GB yt: 1,873 lbs @ $1.13
GB yt: 5,127 lbs to trade for 10,254 lbs SNE yt
"""

# Capture the number (digits and commas) directly in front of " lbs".
# Note the single backslashes: inside a raw string, \\d would match a literal backslash.
rx = re.compile(r'(\d[\d,]{2,}) lbs')
weights = rx.findall(email)
print(weights)
See it working on ideone.com.
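The pattern above depends on a three-character minimum to skip prices such as "1.40 lbs". As a possible alternative (a sketch only, not the approach given in the answer), a negative lookbehind can reject any number that continues a decimal or a longer number, which also allows weights below 100 lbs to be captured. The names `WEIGHT_RX` and `extract_weights` and the shortened `sample` text are made up for this illustration, and the pattern assumes prices are always written with a decimal point.

```python
import re

# Sample lines copied from the two emails in the question (shortened).
sample = """\
-GOM Cod up to 5,000 lbs (live wt) @ 1.40 lbs
-American Plaice 2,000 lbs .60 lbs or best offer
GB EAST cod: 954 lbs @ $0.83
GB EAST cod: 1,046 lbs to trade for 1,830 lbs GB WEST cod
GOM blackback: 800 lbs @ $0.03
"""

# (?<![\d.,])            the number must not start right after a digit, dot, or comma,
#                        so the "40" in "1.40 lbs" and the "60" in ".60 lbs" are rejected
# \d{1,3}(?:,\d{3})+     a comma-grouped number such as 1,046 or 30,000
# |\d+                   or a plain run of digits such as 954
# \s*lbs\b               optional whitespace, then the literal unit
WEIGHT_RX = re.compile(r'(?<![\d.,])(\d{1,3}(?:,\d{3})+|\d+)\s*lbs\b')


def extract_weights(text):
    """Return every weight found in text as an integer number of pounds."""
    return [int(number.replace(',', '')) for number in WEIGHT_RX.findall(text)]


print(extract_weights(sample))
# [5000, 2000, 954, 1046, 1830, 800]
```

Converting the matches to integers makes the values easier to store in a data frame column. One trade-off is that a price written without a decimal point (for example "@ 2 lbs") would still be picked up as a weight, so the results are worth spot-checking against a few real emails.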