Using Python, how do I insert a string into select lines of a text file, where the inserted string depends on the content of the line and a known mapping?
Background
I have a text file (a DAT file) that I want to import into a program formatted as is, except for some short additional strings inserted into select lines. The file is far too large to make these minor changes manually.
An arbitrary select line has the following defining property: it starts with select_string_ followed by a unique string $_ that can be detected using regex. For each select line, the exact string I want to insert depends on which member of a known set of strings (string_A, string_B, string_C in the example below) appears at the end of the line, and on a known mapping.
(The non-select lines contain arbitrary strings; they don't appear in any simple order. Incidentally, in all select lines the unique string $_ is followed by _blah_, which is also regex detectable.)
So we have, starting at line 1, something like the following:
select_string_$__blah_string_A
non_select_arbitrary_string
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$__blah_string_A
non_select_arbitrary_string
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$__blah_string_B
select_string_$__blah_string_B
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$__blah_string_C
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$__blah_string_C
For a given select line, the text I want to insert belongs after the $_, and I want the specific string inserted to reflect the following simple (extensionally defined) bijective function f:
f = {(string_A, f(string_A)), (string_B, f(string_B)), (string_C, f(string_C))}
The following dictionary captures this mapping:
{'string_A': 'f(string_A)', 'string_B': 'f(string_B)', 'string_C': 'f(string_C)'}
So, take string_A as an arbitrary example: all the select lines that end in string_A should have f(string_A) inserted after the $_. Thus, I want all the select lines ending in string_A to look as follows:
select_string_$_f(string_A)_blah_string_A
Generalizing from this arbitrary example, my question is as follows:
Question
Using Python 3, how do I generate the following text?
select_string_$_f(string_A)_blah_string_A
non_select_arbitrary_string
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$_f(string_A)_blah_string_A
non_select_arbitrary_string
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$_f(string_B)_blah_string_B
select_string_$_f(string_B)_blah_string_B
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$_f(string_C)_blah_string_C
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$_f(string_C)_blah_string_C
More generally: using Python, how do I insert a string into select lines of a text file, where the inserted string depends on the content of the line and a known mapping?
Since $_ is an apparent indicator in all the lines you wish to change, we can check for the presence of $_, and then check for the presence of string_A, string_B or string_C.
string_a = 'string_A'
string_b = 'string_B'
string_c = 'string_C'
testcases = ['select_string_$__blah_string_A', 'select_string_$__blah_string_B', 'select_string_$__blah_string_C', 'non_select_arbitrary_string']
result = []
for test in testcases:
    # Lines without the '$_' marker are not select lines; keep them unchanged.
    if '$_' not in test:
        result.append(test)
        continue
    # Split around the marker, then rejoin with the marker plus the inserted text.
    # The inserted text here is '(string_X)'; substitute the mapped value as needed.
    check = test.split('$_')
    if string_a in check[1]:
        result.append(f'$_({string_a})'.join(check))
    elif string_b in check[1]:
        result.append(f'$_({string_b})'.join(check))
    elif string_c in check[1]:
        result.append(f'$_({string_c})'.join(check))
    else:
        # Marker present but no known string: keep the line instead of dropping it.
        result.append(test)
print(result)
# ['select_string_$_(string_A)_blah_string_A', 'select_string_$_(string_B)_blah_string_B', 'select_string_$_(string_C)_blah_string_C', 'non_select_arbitrary_string']
From here you can write your result back to the file.
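As a rough sketch of that write-back step, the approach above could be driven by the dictionary from the question and applied line by line, so the whole file never has to sit in memory. The names `mapping`, `transform_line` and `rewrite_file` are hypothetical, and the dictionary values are placeholders for whatever f should actually return.

```python
from pathlib import Path

# Hypothetical mapping: the values stand in for the real f(...) results.
mapping = {
    "string_A": "f(string_A)",
    "string_B": "f(string_B)",
    "string_C": "f(string_C)",
}

MARKER = "$_"


def transform_line(line: str) -> str:
    """Insert the mapped string after the first '$_' if the line ends with a known key."""
    head, sep, tail = line.partition(MARKER)
    if not sep:
        return line  # no marker: not a select line
    for key, value in mapping.items():
        if tail.endswith(key):
            return head + MARKER + value + tail
    return line  # marker present but no known ending: leave unchanged


def rewrite_file(src: Path, dst: Path) -> None:
    """Stream src to dst, transforming one line at a time."""
    with src.open() as fin, dst.open("w") as fout:
        for raw in fin:
            body = raw.rstrip("\n")
            newline = raw[len(body):]  # preserve the original line ending
            fout.write(transform_line(body) + newline)
```

Calling `rewrite_file(Path("input.txt"), Path("output.txt"))` would then produce the modified copy; the file names are only examples. Writing to a separate output file keeps the original intact if something goes wrong partway through.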
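import re

# Use context managers so both files are closed even if an error occurs.
with open("input.txt", "r") as fin, open("output.txt", "w") as fout:
    for line in fin:
        # Select lines get f(<matched string>) inserted after "$_"; other lines pass through unchanged.
        line = re.sub(r'^(select_string_\$_)(.*?(string_A|string_B|string_C))$', r'\1f(\3)\2', line)
        fout.write(line)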
Given your example, this produces:
select_string_$_f(string_A)_blah_string_A
non_select_arbitrary_string
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$_f(string_A)_blah_string_A
non_select_arbitrary_string
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$_f(string_B)_blah_string_B
select_string_$_f(string_B)_blah_string_B
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$_f(string_C)_blah_string_C
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$_f(string_C)_blah_string_C
Regex explanation:
^                                  # beginning of line
(select_string_\$_)                # group 1, literally "select_string_$_"
(                                  # start of group 2
  .*?                              # 0 or more of any character, non-greedy
  (string_A|string_B|string_C)     # group 3, one of string_A, string_B or string_C
)                                  # end of group 2
$                                  # end of line
Replacement:
\1 # content of group 1
f(\3) # f(, content of group 3, )
\2 # content of group 2
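The replacement above inserts the literal text f(...). If the inserted string should instead come from the known mapping, one option is to pass a function to `re.sub` and look the matched key up in a dictionary. This is a minimal sketch under that assumption; `mapping`, `insert_mapped` and `transform_line` are hypothetical names, and the dictionary values are placeholders.

```python
import re

# Hypothetical mapping: replace the values with the real f(...) results.
mapping = {
    "string_A": "f(string_A)",
    "string_B": "f(string_B)",
    "string_C": "f(string_C)",
}

# Build the alternation from the dictionary keys, escaping any regex metacharacters.
keys = "|".join(re.escape(k) for k in mapping)
pattern = re.compile(rf"^(select_string_\$_)(.*?({keys}))$")


def insert_mapped(match: re.Match) -> str:
    """Rebuild a select line with the mapped string placed right after '$_'."""
    prefix, rest, key = match.groups()
    return prefix + mapping[key] + rest


def transform_line(line: str) -> str:
    # Lines that do not match the pattern are returned unchanged.
    return pattern.sub(insert_mapped, line)
```

Because `$` also matches just before a trailing newline, `transform_line` can be applied directly to lines read from a file, and the newline is kept. Deriving the pattern from the dictionary means a new entry only has to be added in one place.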