
How to add wildcards to re.sub and remove all characters before the _

I have the following code:

    infile = botslib.opendata(ta_from.filename,'r')
    tofile = botslib.opendata(str(ta_to.idta),'wb')
    start = infile.readline()
    import textwrap
    import re
    lines= "\r\n".join(textwrap.wrap(start, 640))
    for line in lines:
        re.sub('^\...[_]*', '',line)
        tofile.write(line.split('_')[-1])
    infile.close()
    tofile.close()

The input is:

Ichg_UNBUNOA3 14 2090100000015 14 1304221445000001

The output now is IchgUNBUNOA3 14 2090100000015 14 1304221445000001

but I expect it to be UNBUNOA3 14 2090100000015 14 1304221445000001

The Ichg prefix can also be grp1 or grp12.

What am I doing wrong?

Try this:

import re
print(re.sub(r'^[^_]*_', '', 'Ichg_UNBUNOA3 14 2090100000015 14 1304221445000001'))

Explanation:

^ beginning of the line
[^_]* any character that is not an underscore, zero or more times
_ a literal underscore
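
To check the pattern against the other prefixes mentioned in the question (grp1, grp12), here is a minimal sketch. Only the first sample comes from the question; the other two strings and the helper name strip_prefix are made up for illustration.

```python
import re

# Matches from the start of the string up to and including the first underscore.
PREFIX = re.compile(r'^[^_]*_')

def strip_prefix(text):
    """Return text with everything up to the first underscore removed."""
    return PREFIX.sub('', text)

samples = [
    'Ichg_UNBUNOA3 14 2090100000015 14 1304221445000001',  # from the question
    'grp1_UNH 1',     # hypothetical sample
    'grp12_BGM 220',  # hypothetical sample
]
for sample in samples:
    print(strip_prefix(sample))
```

A string without any underscore is returned unchanged, because the pattern requires a literal _ in order to match.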

You must choose between split (the best way, in my opinion) and a regex (you have written both!):

for line in lines:
    tofile.write(line.split('_')[-1])

or

for line in lines:
    tofile.write(re.sub('^[^_]*_', '', line))
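
The two variants agree on the sample input, but they are not identical in general: split('_')[-1] keeps what follows the last underscore, while the regex removes everything up to the first one. A small sketch of the difference, using a made-up line with two underscores (the question does not say whether such lines occur):

```python
import re

line = 'grp1_SEG_A 14'  # hypothetical input containing two underscores

# Keeps only the text after the LAST underscore.
after_last = line.split('_')[-1]

# Removes the text up to and including the FIRST underscore.
after_first = re.sub(r'^[^_]*_', '', line)

# Splitting at most once gives the same result as the regex.
after_first_split = line.split('_', 1)[-1]

print(after_last)
print(after_first)
print(after_first_split)
```

If the data can contain underscores after the prefix, the regex or split('_', 1) is probably the safer choice.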

The main problem is that your lines variable isn't actually a list of lines - it's a single string containing the wrapped lines joined together. As a result, you're looping over the string one character at a time rather than processing one line at a time.
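
A short sketch that reproduces this with the input from the question, using plain strings in place of the botslib file handles (variable names such as result_joined are only illustrative):

```python
import textwrap

start = 'Ichg_UNBUNOA3 14 2090100000015 14 1304221445000001'

joined = "\r\n".join(textwrap.wrap(start, 640))  # a single str
wrapped = textwrap.wrap(start, 640)              # a list of str

# Iterating over a str yields one character at a time, so the split only
# ever sees single characters and merely drops the '_' itself.
result_joined = ''.join(ch.split('_')[-1] for ch in joined)

# Iterating over the list yields whole lines, so the prefix is removed.
result_list = [line.split('_')[-1] for line in wrapped]

print(result_joined)
print(result_list)
```

The first result is the unexpected output reported in the question; the second is the expected one.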

You need to remove the "\r\n".join call around the textwrap.wrap call; lines will then be a list, as intended.

As for the regex: besides being wrong, that code serves no purpose, since you never actually assign the result of the re.sub call to anything. But it's not needed anyway, since the split call below achieves the same thing.

In short, your code should just look more like this:

infile = botslib.opendata(ta_from.filename,'r')
tofile = botslib.opendata(str(ta_to.idta),'wb')
start = infile.readline()
import textwrap
lines= textwrap.wrap(start, 640)
for line in lines:
    tofile.write(line.split('_')[-1])
infile.close()
tofile.close()
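
Note that this version writes the wrapped lines back to back, with no separator between them. If the "\r\n" in the original code was meant to end up in the output (an assumption, since the question does not say), one option is to join after stripping the prefixes. A sketch with a hypothetical helper, strip_prefixes, that works on plain strings instead of the botslib file objects:

```python
import textwrap

def strip_prefixes(record, width=640, sep="\r\n"):
    """Wrap record to width, drop each line's prefix, and rejoin with sep.

    The prefix is everything up to the last underscore on each line.
    """
    lines = textwrap.wrap(record, width)
    return sep.join(line.split('_')[-1] for line in lines)

print(strip_prefixes('Ichg_UNBUNOA3 14 2090100000015 14 1304221445000001'))
```

The returned string could then be passed to tofile.write in a single call.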
