简体   繁体   中英

Converting a sed regular expression to python code

I can understand the following sed regular expression.

 sed 's/.*\(SNAP=[^|]*\) |.*/\1/' | sort | uniq -c > $log.snaps

I have the task of converting this bash line to Python code. What is best way to do this? Simply invoke os.system(cmd) with above as cmd ? Or use the Python re module ? Any pseudo code is most welcome. :)

You asked for the best way, I'm just giving you a simple one. You could surely optimize it. But still, it is worth testing with your constraints, since invoking a shell takes some time.
It should be worth noting that pipes in shell might be a great way to have faster code, since sed can start to work whithout waiting for cat to finish. sort will also be able to begin its work but obviously will only output when sed is done working. So it is a great way to use your CPU during your IOs and should be considered as a low effort/good performance solution.
I've tried with a simple example, but you will get the idea :

In test :

love
lol
loki
loki
ki
loutre
poutre

Simple bash command, looking like yours :

cat test | sed 's/lo\(.*\)$/\1/' | sort | uniq

Outputs :

ki
l
poutre
utre
ve

Now let's try to do the same in python :

#!/usr/bin/python

import re

s = """love
lol
loki
loki
ki
loutre
poutre"""

arr = s.split('\n')                                             # sed iterates on each line
arr = map((lambda line: re.sub(r'lo(.*)$', r'\1', line)), arr)  # sed
arr = set(arr)                                                  # uniq
arr = sorted(list(arr))                                         # sort

print '\n'.join(arr)                                            # output it

This could also be written in a ugly line of code :

print '\n'.join(sorted(list(set(map((lambda line: re.sub(r'lo(.*)$', r'\1', line)), s.split('\n'))))))

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM