
Parsing different numbers with RegEx

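Is it possible to parse all of the "weights" in the two emails below?

I need a regular expression robust enough to capture only the "weights" in these two emails, and in hundreds more like them. The RegEx I am using now searches for commas and grabs the digits on either side, which works perfectly for weights in the thousands but misses weights under a thousand, such as the 954 lbs and 800 lbs values below.

I thought I might try to identify "lbs" and capture the number in front of it, but in some cases it is the price that precedes "lbs".

Any help would be greatly appreciated, thank you.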

1) Subject: FW: NEFS 11 fish for lease
   From: Claire Fitz-Gerald 
   Date: 11/15/2013 3:02 PM

   NEFS 11 has the following fish for lease:

   -GOM Cod up to 5,000 lbs (live wt) @ 1.40 lbs
   -American Plaice 2,000 lbs      .60 lbs or best offer



2) From: Claire Fitz-Gerald 
   Date: 9/5/2014 9:52 AM
   Subject: NEFS 5 Available Fish

   All,
   NEFS 5 has the following fish available for lease/trade:

     GB EAST cod: 954 lbs @ $0.83
     GB EAST cod: 1,046 lbs to trade for 1,830 lbs GB WEST cod
     GB blackback: 30,000 lbs @ $0.07
     GOM blackback: 800 lbs @ $0.03
     white hake: 6,322 lbs @ $0.13
     pollock: 22,000 lbs @ $0.015
     redfish: 14,000 lbs @ $0.015
     GB yt: 1,873 lbs @ $1.13
     GB yt: 5,127 lbs to trade for 10,254 lbs SNE yt

My relevant code:

with open(file_path, 'r') as f:
    pattern = re.compile(r'\d+,\d+ ')
    email = f.read()
    weights = pattern.findall(email)
    data_frame['Weights'].append(weights)
    if weights:
        print("Weight:", ''.join(weights))

Printed output for email #2 (note that the amounts under 1,000 are missing):

Weight: 1,046 1,830 30,000 6,322 22,000 14,000 1,873 5,127 10,254 

There are a couple of ways; one would be:

\d[\d,]{2,} lbs

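This requires one digit, followed by at least two more digits or commas, then a space and the literal lbs. It therefore matches 954 lbs and 1,046 lbs, but not a number shorter than three characters such as 50 lbs. See the demo on regex101.com.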


The full Python example:

import re

email = """
2) From: Claire Fitz-Gerald
   Date: 9/5/2014 9:52 AM
   Subject: NEFS 5 Available Fish

   All,
   NEFS 5 has the following fish available for lease/trade:

     GB EAST cod: 954 lbs @ $0.83
     GB EAST cod: 1,046 lbs to trade for 1,830 lbs GB WEST cod
     GB blackback: 30,000 lbs @ $0.07
     GOM blackback: 800 lbs @ $0.03
     white hake: 6,322 lbs @ $0.13
     pollock: 22,000 lbs @ $0.015
     redfish: 14,000 lbs @ $0.015
     GB yt: 1,873 lbs @ $1.13
     GB yt: 5,127 lbs to trade for 10,254 lbs SNE yt
"""

# The capture group keeps only the number; " lbs" is matched but not returned.
rx = re.compile(r'(\d[\d,]{2,}) lbs')
weights = rx.findall(email)
print(weights)

See it working on ideone.com.
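
The answer above stops after its first approach. As a rough sketch of another option (an assumption, not the answerer's own second method), the pattern below also accepts one- and two-digit weights and uses a lookbehind to skip the decimal prices in email #1 such as "1.40 lbs" and ".60 lbs". The names WEIGHT_RX, extract_weights and sample are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical pattern, not taken from the original answer:
# either a comma-grouped number (1,046) or a plain run of digits (954),
# followed by " lbs". The lookbehind rejects a match that starts right
# after a digit, '.', ',' or '$', which skips "1.40 lbs" and ".60 lbs".
WEIGHT_RX = re.compile(r'(?<![\d.,$])(\d{1,3}(?:,\d{3})+|\d+) lbs')


def extract_weights(text):
    """Return every weight found in text as an int, in order of appearance."""
    return [int(w.replace(',', '')) for w in WEIGHT_RX.findall(text)]


# A few lines copied from the two emails in the question.
sample = """\
-GOM Cod up to 5,000 lbs (live wt) @ 1.40 lbs
-American Plaice 2,000 lbs      .60 lbs or best offer
GB EAST cod: 954 lbs @ $0.83
GOM blackback: 800 lbs @ $0.03
GB yt: 5,127 lbs to trade for 10,254 lbs SNE yt
"""

print(extract_weights(sample))  # prints [5000, 2000, 954, 800, 5127, 10254]
```

Converting to int makes the values usable for arithmetic right away. One caveat: a whole-number price written directly before "lbs" (for example "@ 2 lbs") would still be picked up, so the remaining emails may need an extra rule.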

