Extracting numerical data from a text file in Python

Question

Say I have a text file containing data/strings like this:

Dataset #1: X/Y= 5, Z=7 has been calculated
Dataset #2: X/Y= 6, Z=8 has been calculated
Dataset #10: X/Y =7, Z=9 has been calculated

I want the output in a CSV file to be:

X/Y, X/Y, X/Y

which should show:

5, 6, 7

This is my current approach. I am using string.find, but I feel it makes this problem hard to solve:

data = open('TestData.txt').read()
#index of string
counter = 1

if (data.find('X/Y=')==1):      
#extracts segment out of string
    line = data[r+6:r+14]
    r = data.find('X/Y=')
    counter += 1 
    print line
else: 
    r = data.find('X/Y')
    line = data[r+6:r+14]
    for x in range(0,counter):
    print line


print counter

Error: for some reason I only ever get the value 5. When I set up the #loop, I get an infinite stream of 5s.

Answer 1

If you want the numbers, and every line of the txt file is formatted like the first two lines, i.e. X/Y= 6 rather than X/Y =7:

import re
result = []
with open("TestData.txt") as f:
    for line in f:
        # Lookbehind: the digits must be immediately preceded by "Y=" and one whitespace character (\s).
        s = re.search(r'(?<=Y=\s)\d+', line)
        if s:  # re.search returns None when there is no match, so only append real matches
            result.append(s.group())
print(result)
['5', '6', '7']

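To match the pattern in your comment, you should escape the periods as shown below. Otherwise you will also match strings such as 1.2+3, because an unescaped "." has a special meaning: it matches any character.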

So re.search(r'(?<=Counting Numbers =\s)\d\.\d\.\d', s).group() returns only 1.2.3
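To see the difference the escaping makes, here is a small sketch. The two sample strings are made up for illustration, since the exact text from the comment is not shown here.

```python
import re

# Hypothetical sample strings, chosen only to illustrate the point.
dotted = "Counting Numbers = 1.2.3"
mixed = "Counting Numbers = 1.2+3"

unescaped = r'(?<=Counting Numbers =\s)\d.\d.\d'   # "." matches any character
escaped = r'(?<=Counting Numbers =\s)\d\.\d\.\d'   # "\." matches a literal period only

print(re.search(unescaped, mixed).group())  # 1.2+3 (probably not what you want)
print(re.search(escaped, mixed))            # None
print(re.search(escaped, dotted).group())   # 1.2.3
```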

To be more explicit, you can use the full X/Y=\s prefix in the pattern: s = re.search(r'(?<=X/Y=\s)\d+', line).

Running it on the original lines together with the lines from your comment and your update returns:

['5', '6', '7', '5', '5']

(?<=Y=\s) is called a positive lookbehind assertion.

(?<=...)

Matches if the current position in the string is preceded by a match for ... that ends at the current position.

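The re documentation has plenty of good examples. The text matched inside the lookbehind parentheses is not returned as part of the match.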
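A short sketch of what "not returned" means in practice, compared with an ordinary capturing group. It also shows why the lookbehind needs the exact X/Y= spacing: Python's re module only accepts fixed-width lookbehinds, so something like \s* is not allowed inside one.

```python
import re

line = "Dataset #1: X/Y= 5, Z=7 has been calculated"

# Lookbehind: "X/Y= " must precede the digits but is not part of the match.
lookbehind = re.search(r'(?<=X/Y=\s)\d+', line)
print(lookbehind.group())   # 5

# Capturing group: the full match includes the prefix, group(1) is just the digits.
captured = re.search(r'X/Y=\s(\d+)', line)
print(captured.group(0))    # X/Y= 5
print(captured.group(1))    # 5

# A variable-width lookbehind is rejected when the pattern is compiled.
try:
    re.compile(r'(?<=X/Y\s*=\s*)\d+')
    fixed_width_only = False
except re.error:
    fixed_width_only = True
print(fixed_width_only)     # True
```

If the spacing around "=" varies from line to line, the capturing-group form is the easier one to adapt.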

Answer 2

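Since each entity appears to sit on a single line, I suggest reading the file line by line with readline in a loop, then using a regex to parse the component you are looking for out of each line.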
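A minimal sketch of that readline loop. io.StringIO stands in for the real file here so the snippet runs on its own, and the pattern is just one possible choice for lines of the form X/Y= 5; neither is taken from the question.

```python
import io
import re

# Stand-in for open("TestData.txt"), so no file is needed to try this out.
f = io.StringIO(
    "Dataset #1: X/Y= 5, Z=7 has been calculated\n"
    "Dataset #2: X/Y= 6, Z=8 has been calculated\n"
)

values = []
line = f.readline()
while line:  # readline() returns '' at end of file, which ends the loop
    m = re.search(r'X/Y=\s(\d+)', line)
    if m:
        values.append(m.group(1))
    line = f.readline()

print(values)  # ['5', '6']
```

Iterating with "for line in f:" does the same thing with less bookkeeping.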

Edit, in reply to the OP's comment:

In that case, a single regex pattern can capture the number for the format you specified: X/Y\s*=\s*(.+),
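Here is a rough sketch of how that could be put together with the CSV output asked for in the question. It assumes the values are plain integers, so it captures (\d+) instead of (.+), and it writes to an in-memory buffer; the sample text is copied from the question, and the buffer would be replaced by a real output file in practice.

```python
import csv
import io
import re

sample = (
    "Dataset #1: X/Y= 5, Z=7 has been calculated\n"
    "Dataset #2: X/Y= 6, Z=8 has been calculated\n"
    "Dataset #10: X/Y =7, Z=9 has been calculated\n"
)

# \s* on both sides of "=" accepts "X/Y= 5" as well as "X/Y =7".
pattern = re.compile(r'X/Y\s*=\s*(\d+)')
values = pattern.findall(sample)
print(values)  # ['5', '6', '7']

# Stand-in for a real file opened with open(..., "w", newline="").
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["X/Y"] * len(values))  # header row: one X/Y column per value
writer.writerow(values)                 # data row: 5,6,7
print(out.getvalue())
```

Unlike the lookbehind version, this handles the third line too, because the whitespace around "=" is optional.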


Solution 1
2 Accepted 2014-05-29 00:13:21

Solution 2
1 2014-05-28 23:59:03
