I'm parsing this line:
0386 ; Greek # L& GREEK CAPITAL LETTER ALPHA WITH TONOS
Basically, I need:
point = 0386
script = Greek
And I'm doing it like this:
point = line.split(";")[0].replace(" ","")
script = line.split("#")[0].split(";")[1].replace(" ","")
I'm not convinced this is the most Pythonic way to do it. Is there a more elegant way? Maybe a regex one-liner?
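If you want a regex one-liner:
point, script = re.search(r"^\s*([0-9A-Fa-f]+)\s*;\s*(\S+)", s).groups()
where s is your string, and of course you need to import re. The pattern is a raw string so the backslashes reach the regex engine unchanged, and it matches hexadecimal digits because code points such as 1F00 are not purely decimal.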
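One thing to keep in mind with a regex one-liner is that re.search returns None when the line does not match, so calling .groups() directly raises AttributeError. Below is a minimal sketch that wraps the match in a function and raises a clearer error instead. The names LINE_RE and parse_with_regex are made up for illustration, and the pattern assumes the first field is a single hexadecimal code point and the second is a word-like script name.

```python
import re

# Hypothetical names for illustration. Assumes "<hex code point> ; <script> # comment".
LINE_RE = re.compile(r"^\s*([0-9A-Fa-f]+)\s*;\s*(\w+)")

def parse_with_regex(line):
    """Return (point, script) as strings, or raise ValueError if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognised line: {line!r}")
    return match.groups()

point, script = parse_with_regex("0386 ; Greek # L& GREEK CAPITAL LETTER ALPHA WITH TONOS")
print(point, script)
```

Compiling the pattern once is worthwhile if you are looping over many lines of the same file.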
>>> code, desc = line[:line.rfind('#')].split(';')
>>> code.strip()
'0386'
>>> desc.strip()
'Greek'
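A caveat on the rfind version: if a line has no '#', rfind returns -1 and the slice silently drops the last character. The short comparison below shows this on a made-up input without a trailing comment (the variable names are only for illustration); splitting on '#' instead leaves such a line intact.

```python
line = "0386 ; Greek"  # illustrative input with no trailing comment

# rfind returns -1 when '#' is absent, so line[:-1] cuts off the final character.
code, truncated = line[:line.rfind('#')].split(';')
print(truncated.strip())

# split('#')[0] returns the whole string when '#' is absent.
code, intact = line.split('#')[0].split(';')
print(intact.strip())
```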
Using map with the unbound method str.strip:
>>> line = '0386 ; Greek # L& GREEK CAPITAL LETTER ALPHA WITH TONOS'
>>> point, script = map(str.strip, line.split('#')[0].split(';'))
>>> point
'0386'
>>> script
'Greek'
Using list comprehension:
>>> point, script = [word.strip() for word in line.split('#')[0].split(';')]
>>> point
'0386'
>>> script
'Greek'
This is how I would've done it:
>>> s = "0386 ; Greek # L& GREEK CAPITAL LETTER ALPHA WITH TONOS"
>>> point = s.split(';')[0].strip()
>>> point
'0386'
>>> script = s.split(';')[1].split('#')[0].strip()
>>> script
'Greek'
Note that you can reuse s.split(';'), so saving it to a variable would be a good idea:
>>> var = s.split(';')
>>> point = var[0].strip() # strip() removes leading and trailing whitespace
>>> point
'0386'
>>> script = var[1].split('#')[0].strip()
>>> script
'Greek'
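If you prefer to avoid indexing into split results, str.partition is another option. This is only a sketch, and parse_line is a hypothetical helper name. partition always returns three parts, so a line without '#' is handled without a special case.

```python
def parse_line(line):
    """Split '<point> ; <script> # comment' into (point, script) strings."""
    # Everything before the first '#' is data; the comment is discarded.
    data, _, _ = line.partition("#")
    # Split the data once on the first ';'.
    point, _, script = data.partition(";")
    return point.strip(), script.strip()

point, script = parse_line("0386 ; Greek # L& GREEK CAPITAL LETTER ALPHA WITH TONOS")
print(point, script)
```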