Apply a function to a specific expression on every line of a file

Question

I am currently reading the contents of a file and writing every line that meets certain conditions to a new file. See the code below.

# Read every line of the source file
with open('Redshift_twb_1.txt', 'r') as fpath:
    lines = fpath.readlines()

# Keep only the lines that contain '<column caption' ...
temp_out_lines = [line for line in lines if '<column caption' in line]
# ... and drop the ones that contain 'param-domain-type'
out_lines = [line for line in temp_out_lines if 'param-domain-type' not in line]

# Lowercase every remaining line in full
lower_lines = [line.lower() for line in out_lines]

# Join the lines into a single string
# (lines from readlines() keep their trailing newline)
output = '\n'.join(lower_lines)

# Write the result to the new file
with open('Redshift_1_new.txt', 'w') as fpath_write:
    fpath_write.write(output)

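My goal is to be able to read a line and lowercase one specific parameter in it before the line is written to the new file.

At the moment, the process takes a line, checks that it contains <column caption, and then checks that it does not contain param-domain-type. If both checks pass, the line is added to the new txt file.

An example line looks like this: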

<column caption='Section' datatype='string' name='[SECTION]' role='dimension' type='nominal'>

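The goal is to check each line before it is added to the new txt file and, for every instance of name='[****]', lowercase the value inside the []. Currently these values are uppercase.

Note: only the value of the name= parameter inside [] may be lowercased. Other parameters on the line must keep their existing capitalization.

Thanks!

Edit: Another option would be a temporary find-and-replace that finds every instance of name='[ABC]' and replaces it with name='[abc]'. However, I still can't work out how to do that myself.

Edit 2: While implementing the regex, I also used a for loop to go through every line taken from the txt file. See the code below.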

import re

for x in range(len(out_lines)):
    print(out_lines[x])
    test = str(out_lines[x])
    out_lines[x] = re.sub(r"(name='([.*?])')", lambda m: m.group(1).lower(), test)
    print(out_lines[x])

However, when I do this, the line printed after the substitution is still the same as the one printed before it:

<column caption='Location' datatype='string' name='[MANAGEMENT_LOCATION]' role='dimension' type='nominal' />

<column caption='Location' datatype='string' name='[MANAGEMENT_LOCATION]' role='dimension' type='nominal' />

Answer 1

You can use Python's re module to replace the necessary substring.

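import re
# your_text is a placeholder for the line to process.
# The brackets are escaped (\[ and \]) so that they match literal [ and ].
result = re.sub(r"(name='(\[.*?\])')", lambda m: m.group(1).lower(), your_text)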
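To make the difference from the pattern in the question concrete, here is a minimal sketch that applies the substitution to the sample line and combines it with the two filters from the question. The names NAME_ATTR, lowercase_name_attr and filter_and_fix are only illustrative and do not come from the original code. The pattern is also simplified to use the whole match (group 0) instead of capture groups, which should behave the same way for this input.

```python
import re

# Matches name='[...]'. The brackets are escaped so they are literal characters.
NAME_ATTR = re.compile(r"name='\[.*?\]'")


def lowercase_name_attr(line):
    """Lowercase only the name='[...]' attribute(s) found in a line."""
    return NAME_ATTR.sub(lambda m: m.group(0).lower(), line)


def filter_and_fix(lines):
    """Keep '<column caption' lines without 'param-domain-type', then fix name=."""
    return [
        lowercase_name_attr(line)
        for line in lines
        if '<column caption' in line and 'param-domain-type' not in line
    ]


sample = ("<column caption='Section' datatype='string' "
          "name='[SECTION]' role='dimension' type='nominal'>")

# caption='Section' keeps its capital S; only name='[SECTION]' is lowercased.
print(lowercase_name_attr(sample))

# In the question's pattern, the unescaped [.*?] is a character class that
# matches a single '.', '*' or '?', so nothing matches and the line is unchanged.
unescaped = re.sub(r"(name='([.*?])')", lambda m: m.group(1).lower(), sample)
print(unescaped == sample)
```

With the file from the question, one way to use this would be to pass the list returned by readlines() to filter_and_fix and write ''.join(result) to the new file, since each line already ends with its own newline.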
