
Using Python, how do I insert a string into select lines of a text file, where the inserted string depends on the content of the line and a known mapping?

Background

I have a text file (it's a DAT file) that I want to import into a program formatted as is, albeit with some minor additional strings inserted into select lines. The file is far too large to make the minor changes manually.

An arbitrary select line has the following defining properties:

  • it starts with select_string_ followed by a unique string $_ that can be detected using regex.
  • it ends with a member of the following set of strings: {'string_A', 'string_B', 'string_C'}

For each select line the exact string I want to insert depends on which one of these string members appears at the end of the line and a known mapping.

(The non-select lines contain arbitrary strings; they don't appear according to some simple order. Incidentally, for all select lines the above unique string $_ is followed by _blah_ which is regex detectable)

So we have, starting at line 1, something like the following:

select_string_$__blah_string_A
non_select_arbitrary_string
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$__blah_string_A
non_select_arbitrary_string
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$__blah_string_B
select_string_$__blah_string_B
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$__blah_string_C
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$__blah_string_C

For a given select line the text I want to insert belongs after the $_ , and I want the specific string to be inserted to reflect the following simple (extensionally defined) bijective function f :

f = {(string_A, f(string_A)), (string_B, f(string_B)), (string_C, f(string_C))}

The following dictionary captures this mapping:

{'string_A' : 'f(string_A)', 'string_B' : 'f(string_B)', 'string_C' : 'f(string_C)'}

So, take string_A as an arbitrary example: all the select lines that end in string_A should have f(string_A) inserted after the $_ . Thus, I want all the select lines ending in string_A to look as follows:

select_string_$_f(string_A)_blah_string_A

Generalizing from this arbitrary example my question is as follows:

Question

Using python 3, how do I generate the following text?

select_string_$_f(string_A)_blah_string_A
non_select_arbitrary_string
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$_f(string_A)_blah_string_A
non_select_arbitrary_string
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$_f(string_B)_blah_string_B
select_string_$_f(string_B)_blah_string_B
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$_f(string_C)_blah_string_C
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$_f(string_C)_blah_string_C

More generally: using Python, how do I insert a string into select lines of a text file, where the inserted string depends on the content of the line and a known mapping?

Since $_ is an apparent indicator in all the lines you wish to change, we can check for the presence of $_ , and then check which of string_A, string_B or string_C follows it.

# Mapping from line ending to the text to insert (placeholder values from the question)
mapping = {'string_A': 'f(string_A)', 'string_B': 'f(string_B)', 'string_C': 'f(string_C)'}

testcases = ['select_string_$__blah_string_A', 'select_string_$__blah_string_B', 'select_string_$__blah_string_C', 'non_select_arbitrary_string']

result = []

for test in testcases:
    # Lines without the marker pass through unchanged
    if '$_' not in test:
        result.append(test)
        continue

    # Split only at the first '$_' so the rest of the line stays intact
    head, tail = test.split('$_', 1)

    for ending, insert in mapping.items():
        if tail.endswith(ending):
            result.append(f'{head}$_{insert}{tail}')
            break
    else:
        # Marker present but no known ending: keep the line as it is
        result.append(test)

print(result)

#['select_string_$_f(string_A)_blah_string_A', 'select_string_$_f(string_B)_blah_string_B', 'select_string_$_f(string_C)_blah_string_C', 'non_select_arbitrary_string']

From here you can write your result back to the file.
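One way to do that write-back, as a minimal sketch: stream the input file line by line and write each (possibly modified) line to a new file, so the whole file never has to sit in memory. The names transform_line and rewrite_file are made up for this illustration, and the mapping values are placeholders for the real f(...) strings.

```python
def transform_line(line, mapping, marker='$_'):
    """Insert mapping[ending] right after the first marker, if the line qualifies."""
    # Hypothetical helper: a select line contains the marker and ends with a mapping key.
    if marker not in line:
        return line
    head, tail = line.split(marker, 1)
    for ending, insert in mapping.items():
        if tail.endswith(ending):
            return f'{head}{marker}{insert}{tail}'
    return line  # marker present, but no known ending: leave untouched


def rewrite_file(src_path, dst_path, mapping):
    """Copy src_path to dst_path, transforming select lines along the way."""
    with open(src_path, 'r') as fin, open(dst_path, 'w') as fout:
        for line in fin:
            # Strip the newline before testing the ending, then restore it.
            body = line.rstrip('\n')
            newline = line[len(body):]
            fout.write(transform_line(body, mapping) + newline)
```

With a mapping dictionary like the one in the question, a call such as rewrite_file('input.txt', 'output.txt', mapping) would produce the modified copy; both file names are assumptions. Writing to a separate output file keeps the original intact in case the result needs checking.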

import re

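# Open both files in one with-statement so they are closed automatically
with open("input.txt", "r") as fin, open("output.txt", "w") as fout:
    for line in fin:
        # Insert f(<ending>) right after "select_string_$_" on select lines only
        line = re.sub(r'^(select_string_\$_)(.*?(string_A|string_B|string_C))$', r'\1f(\3)\2', line)
        fout.write(line)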

Given your example, this produces:

select_string_$_f(string_A)_blah_string_A
non_select_arbitrary_string
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$_f(string_A)_blah_string_A
non_select_arbitrary_string
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$_f(string_B)_blah_string_B
select_string_$_f(string_B)_blah_string_B
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$_f(string_C)_blah_string_C
non_select_arbitrary_string
non_select_arbitrary_string
select_string_$_f(string_C)_blah_string_C

Regex explanation:

^                                   # beginning of line
  (select_string_\$_)               # group 1, literally "select_string_$_"
  (                                 # group 2
    .*?                             # 0 or more of any character, as few as possible
    (string_A|string_B|string_C)    # group 3, one of string_A, string_B or string_C
  )                                 # end group 2
$                                   # end of line

Replacement:

\1              # content of group 1
f(\3)           # f(, content of group 3, )  
\2              # content of group 2
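The replacement above hard-codes the literal text f(...). If the inserted text has to come from the dictionary in the question, re.sub also accepts a function as the replacement. A minimal sketch, assuming the dictionary values below are placeholders for the real strings; insert_mapped is a name made up here:

```python
import re

# Placeholder mapping; substitute the real inserted strings.
mapping = {'string_A': 'f(string_A)', 'string_B': 'f(string_B)', 'string_C': 'f(string_C)'}

# Build the alternation from the keys, escaping any regex metacharacters.
endings = '|'.join(re.escape(key) for key in mapping)
pattern = re.compile(rf'^(select_string_\$_)(.*?({endings}))$')


def insert_mapped(line):
    """Return the line with mapping[ending] inserted after 'select_string_$_'."""
    # The callable receives the match object and returns the replacement text:
    # group 1, then the mapped string for group 3, then group 2.
    return pattern.sub(lambda m: m.group(1) + mapping[m.group(3)] + m.group(2), line)
```

Lines that do not match the pattern come back unchanged, so insert_mapped can be applied to every line of the file. Building the pattern from the dictionary keys means a new ending only has to be added in one place.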
