Using RegEx to find groups of numbers and keep only the last member of each group

Question

I have a csv file formatted as follows (only the relevant rows are shown):

Global equity - 45%/45.1%
Private Investments - 25%/21%
Hedge Funds - 17.5%/18.1%
Bonds & cash - 12.5%/15.3%

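I wrote a regex to find each occurrence of the numbers (i.e. 45%/45.1%, etc.), and I am trying to write it so that it keeps only the number after the slash. This is what I wrote: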

import csv
import re

with open('sheet.csv', 'r', newline='') as f:
    rdr = csv.DictReader(f,delimiter=',')
    row1 = next(rdr)
    assets = str(row1['Asset Allocation '])
    finnum = re.sub(r'(\/[0-9]+.)','#This is where I want to replace with just the numbers after the slash',assets)
    print(finnum)

Desired output:

Global equity - 45.1%
Private Investments - 21%
etc...

Is this even possible if I don't know the index of the numbers I want?

Answer 1

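You can try using this regex ('[\d.]+%/') to remove the unwanted data. The character class [\d.] also covers a first value with a decimal point, such as 17.5%/18.1%, which a plain \d+ would only partly remove.

import re

string = 'Global equity - 45%/45.1%'
re.sub(r'[\d.]+%/', '', string) # 'Global equity - 45.1%'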
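
As a minimal sketch of the same idea applied to all four sample rows from the question: the helper name keep_after_slash and the compiled pattern name are hypothetical, not part of the answer. Note that a bare \d+%/ would leave "17." behind on a value such as 17.5%/18.1%, so the sketch allows an optional decimal part before the slash.

```python
import re

# Hypothetical pattern name: a number with an optional decimal part,
# followed by "%/" (the value that should be dropped).
VALUE_BEFORE_SLASH = re.compile(r'\d+(?:\.\d+)?%/')

def keep_after_slash(line):
    """Remove the value before the slash, keeping the one after it."""
    return VALUE_BEFORE_SLASH.sub('', line)

rows = [
    'Global equity - 45%/45.1%',
    'Private Investments - 25%/21%',
    'Hedge Funds - 17.5%/18.1%',
    'Bonds & cash - 12.5%/15.3%',
]
cleaned = [keep_after_slash(row) for row in rows]
for line in cleaned:
    print(line)
```

Compiling the pattern once is only a convenience here; calling re.sub with the same pattern string should behave the same way.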

Answer 2

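If you are looking for that pattern specifically, you can use a replacement function that concatenates the groups you want to keep:

import re

# [\d.]+ (rather than \d+) so that values with a decimal point, like 17.5%, match
replace = lambda s: s.group(1) + ' ' + s.group(3)
re.sub(r'(.*) ([\d.]+%/)([\d.]+%)', replace, 'Hedge Funds - 17.5%/18.1%') # 'Hedge Funds - 18.1%'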

Alternatively, you can simply remove the part you don't need:

val = 'Hedge Funds - 17.5%/18.1%'
re.sub(r'[\d.]+%/', '', val) # 'Hedge Funds - 18.1%'

Or, if you don't want to use a regex:

val = 'Hedge Funds - 17.5%/18.1%'
replaced = val[0:val.find(' - ')] + ' - ' + val[val.find('%/') + 2:]
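
A possible variant of the non-regex approach, sketched with str.partition and str.rpartition instead of find and slicing. The function name keep_last_value is hypothetical, and the sketch assumes the label and the values are always separated by ' - ' as in the sample rows.

```python
def keep_last_value(line):
    """Keep the label and only the value after the last slash."""
    # Split once at the first ' - ' into label and values.
    label, sep, values = line.partition(' - ')
    if not sep:
        # No ' - ' separator found: leave the line unchanged.
        return line
    # rpartition('/')[2] is the text after the last '/', or the whole
    # string when there is no '/' at all.
    return label + sep + values.rpartition('/')[2]

val = 'Hedge Funds - 17.5%/18.1%'
replaced = keep_last_value(val)
print(replaced)
```

Because rpartition splits at the last slash, this would also keep only the final member if a row ever held more than two values.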

Answer 3

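If you don't want to replace anything and need to use these values in other parts of your code, you can:

import re

# group(1) is the label before the dash, group(2) the value after the slash
cleanup = re.compile(r"(^.+?)-\s.+?\/(.+?)$",re.MULTILINE)
file_name = 'sheet.csv'  # the file from the question
with open(file_name, 'r') as f:
    text = f.read()
for match in cleanup.finditer(text):
    print(match.group(1), match.group(2))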
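
If the extracted values are needed as numbers, one way to sketch it is to collect them into a dictionary. The pattern and the names ROW and allocations below are assumptions for illustration; the text is built in memory from the sample rows in the question rather than read from a file.

```python
import re

# Assumed row shape: "<label> - <number>%/<number>%", one row per line.
ROW = re.compile(r'^(.+?)\s+-\s+[\d.]+%/([\d.]+)%$', re.MULTILINE)

text = '\n'.join([
    'Global equity - 45%/45.1%',
    'Private Investments - 25%/21%',
    'Hedge Funds - 17.5%/18.1%',
    'Bonds & cash - 12.5%/15.3%',
])

# Map each label to the value after the slash, converted to float.
allocations = {m.group(1): float(m.group(2)) for m in ROW.finditer(text)}
print(allocations)
```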

Answer 4

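You can also group what comes before the first number and what comes after the /.

import re

s = 'Hedge Funds - 17.5%/18.1%'
# \g<1> is the label up to the dash, \g<2> the value after the slash
print(re.sub(r'(.*-) .*/(.*)', r'\g<1> \g<2>', s))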

Output:

Hedge Funds - 18.1%
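
To tie this back to the question's code, here is a rough sketch that runs the same substitution on a cell read with csv.DictReader. It assumes the 'Asset Allocation ' cell holds all four rows as one multi-line string, which the question does not confirm; the 'Fund' column and its value are made up, and the CSV is built in memory instead of being read from sheet.csv.

```python
import csv
import io
import re

# Hypothetical sample data. 'Asset Allocation ' (with the trailing space)
# is the column name used in the question; everything else is assumed.
sample = (
    'Fund,Asset Allocation \n'
    '"Example","Global equity - 45%/45.1%\n'
    'Private Investments - 25%/21%\n'
    'Hedge Funds - 17.5%/18.1%\n'
    'Bonds & cash - 12.5%/15.3%"\n'
)

rdr = csv.DictReader(io.StringIO(sample, newline=''))
row1 = next(rdr)
assets = row1['Asset Allocation ']

# '.' does not match a newline, so each line of the cell is rewritten
# on its own without needing re.MULTILINE.
finnum = re.sub(r'(.*-) .*/(.*)', r'\g<1> \g<2>', assets)
print(finnum)
```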

4 answers

Answer 1: 2, 2016-01-25 18:44:51
Answer 2: 2, 2016-01-25 18:47:44
Answer 3: 2, 2016-01-25 19:36:21
Answer 4: 1, accepted, 2016-01-25 18:51:52
