Apply a function to a specific expression in each line of a file

Question

I am currently reading the contents of a file and writing every line that meets certain conditions to a new file. See the code below:

from string import punctuation

fpath = open('Redshift_twb_1.txt', 'r')
lines = fpath.readlines()

fpath_write = open('Redshift_1_new.txt', 'w+')

# Keep only the lines that contain '<column caption'
temp_out_lines = [line for line in lines if '<column caption' in line]
# ...and of those, drop the lines that contain 'param-domain-type'
out_lines = [line for line in temp_out_lines if 'param-domain-type' not in line]

# Map .lower() over every element of out_lines (this lowercases each whole line)
lower_lines = map(lambda x: x.lower(), out_lines)

# Join the lines into a single string
output = '\n'.join(lower_lines)

# write it
fpath_write.write(output)

fpath.close()
fpath_write.close()

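My goal is to read each line and change the case of one specific parameter (uppercase or lowercase) before the line is written to the new file.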

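Currently, the process takes a line, checks that it contains <column caption, and then checks that it does not contain param-domain-type. If both checks pass, the line is added to the new txt file.

A sample line looks like this: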

<column caption='Section' datatype='string' name='[SECTION]' role='dimension' type='nominal'>

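The goal is to check each line before it is added to the new txt file and, for every instance of name='[****]', convert the value inside the [] to lowercase. At the moment these values are uppercase.

Note: only the value inside the [] of the name= parameter may be lowercased. Other parameters on the line must keep their existing case.

Thanks!

Edit: Another option would be an ad hoc find and replace that finds every instance of name='[ABC]' and replaces it with name='[abc]'. However, I still don't know how to do that myself.

Edit2: While implementing the regex, I also used a for loop to iterate over every entry read from the txt file. See the code below.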

for x in range(len(out_lines)):
    print(out_lines[x])
    test = str(out_lines[x])
    out_lines[x] = re.sub(r"(name='([.*?])')", lambda m: m.group(1).lower(), test)
    print(out_lines[x])

However, when I do this, I still get the same output before and after the substitution:

<column caption='Location' datatype='string' name='[MANAGEMENT_LOCATION]' role='dimension' type='nominal' />

<column caption='Location' datatype='string' name='[MANAGEMENT_LOCATION]' role='dimension' type='nominal' />

Answer 1

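You can use Python's re module to replace the relevant substring. The square brackets have to be escaped: an unescaped [.*?] is a character class that matches a single '.', '*' or '?' character, which is why the pattern in the question never matched and the line came out unchanged.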

import re
re.sub(r"(name='(\[.*?\])')", lambda m: m.group(1).lower(), <YOUR TEXT>)
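The snippet above can be combined with the filtering from the question. What follows is a minimal sketch, not the answerer's own code: the function names `lowercase_name_value`, `select_and_fix` and `rewrite_file` are made up for illustration, and it assumes each name='[...]' attribute sits on a single line. It reuses the pattern from the answer as-is and deliberately drops the blanket `.lower()` from the question, since that would also lowercase the attributes that must keep their case.

```python
import re

# Pattern taken from the answer: group 1 is the whole name='[...]' attribute.
NAME_ATTR = re.compile(r"(name='(\[.*?\])')")


def lowercase_name_value(line):
    """Lowercase every name='[...]' attribute in line, leaving the rest untouched."""
    return NAME_ATTR.sub(lambda m: m.group(1).lower(), line)


def select_and_fix(lines):
    """Keep '<column caption' lines without 'param-domain-type', then fix name=."""
    return [
        lowercase_name_value(line)
        for line in lines
        if "<column caption" in line and "param-domain-type" not in line
    ]


def rewrite_file(src_path, dst_path):
    """Read src_path, filter and fix its lines, and write them to dst_path."""
    with open(src_path) as src:
        fixed = select_and_fix(src)
    with open(dst_path, "w") as dst:
        # Lines still carry their own newline, so no extra '\n' is joined in.
        dst.writelines(fixed)


sample = ("<column caption='Section' datatype='string' "
          "name='[SECTION]' role='dimension' type='nominal'>")
print(lowercase_name_value(sample))
# <column caption='Section' datatype='string' name='[section]' role='dimension' type='nominal'>
```

With the file names from the question, the call would be `rewrite_file('Redshift_twb_1.txt', 'Redshift_1_new.txt')`. Because the `with` blocks close both files, the explicit `close()` calls are no longer needed.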


Solution 1
1 Accepted 2020-01-31 18:56:17
