I have a dictionary variable "d" whose keys are integers and whose values are lists of strings:
368501900 ['GH131.hmm ', 'CBM1.hmm ']
368499531 ['AA8.hmm ']
368500556 ['AA7.hmm ']
368500559 ['GT2.hmm ']
368507728 ['GH16.hmm ']
368496466 ['AA2.hmm ']
368504803 ['GT21.hmm ']
368503093 ['GT1.hmm ', 'GT4.hmm ']
The code is like this:
d = dict()
for key in d:
    dictValue = d[key]
    dictMerged = list(sorted(set(dictValue), key=dictValue.index))
    print(key, dictMerged)
However, I want to strip everything from the first digit onward in each string, so that I get a result like this:
368501900 ['GH', 'CBM']
368499531 ['AA']
368500556 ['AA']
368500559 ['GT']
368507728 ['GH']
368496466 ['AA']
368504803 ['GT']
368503093 ['GT']
I think the code should be inserted between dictValue and dictMerged, but I cannot work out the logic. Any ideas, please?
Import re at the beginning:
import re
Now add this line between dictValue and dictMerged:
new_dict_value = [re.sub(r'\d.*', '', x) for x in dictValue]
Then use new_dict_value in place of dictValue on the dictMerged line (both inside set() and in key=...index).
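Putting those pieces together, the loop from the question might look like the sketch below. It assumes d is already populated (only three of the question's entries are copied here), and the result dictionary is a hypothetical addition that just collects the cleaned lists so they can be inspected afterwards.

```python
import re

# Sample data copied from the question (a subset of the entries).
d = {
    368501900: ['GH131.hmm ', 'CBM1.hmm '],
    368499531: ['AA8.hmm '],
    368503093: ['GT1.hmm ', 'GT4.hmm '],
}

result = {}  # hypothetical helper, not part of the original code
for key in d:
    dictValue = d[key]
    # Drop everything from the first digit to the end of each string.
    new_dict_value = [re.sub(r'\d.*', '', x) for x in dictValue]
    # Deduplicate while keeping first-seen order, as in the question.
    dictMerged = sorted(set(new_dict_value), key=new_dict_value.index)
    result[key] = dictMerged
    print(key, dictMerged)
```

Because the digits are stripped before the set() call, 'GT1.hmm ' and 'GT4.hmm ' both become 'GT' and collapse into a single entry.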
String objects have a nice .isdigit() method. Here are some non-re solutions for cleaning your data.
Plain old loop:
values = ['GT1.hmm ', 'GT4.hmm ']
clean_values = []
for item in values:
    clean_item = []
    for c in item:
        if c.isdigit():
            break
        clean_item.append(c)
    clean_values.append("".join(clean_item))
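List comprehension using a StopIteration exception to act as a break inside of a generator expression. (Note: calling this stop() function directly in a list comprehension doesn't work; it requires a generator expression, normally denoted by (), but inside of a .join() call the parentheses are optional.) Be aware that this trick only works on older interpreters: since Python 3.7 (PEP 479), a StopIteration raised inside a generator is converted to a RuntimeError.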
def stop():
    raise StopIteration
values = ['GT1.hmm ', 'GT4.hmm ']
clean_values = ["".join(c if not c.isdigit() else stop() for c in item) for item in values]
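On Python 3.7 and later, a StopIteration raised inside a generator surfaces as a RuntimeError (PEP 479), so the stop() trick no longer works there. One possible replacement that keeps the generator style is a small generator function that simply returns at the first digit. This is only a sketch, and the name chars_until_digit is made up for illustration.

```python
def chars_until_digit(text):
    """Yield the characters of text up to, but not including, the first digit."""
    for c in text:
        if c.isdigit():
            return  # ends the generator cleanly instead of raising by hand
        yield c

values = ['GT1.hmm ', 'GT4.hmm ']
clean_values = ["".join(chars_until_digit(item)) for item in values]
print(clean_values)  # ['GT', 'GT']
```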
List comprehension using itertools.takewhile:
from itertools import takewhile
values = ['GT1.hmm ', 'GT4.hmm ']
clean_values = ["".join(takewhile(lambda c: not c.isdigit(),item)) for item in values]
Examples derived from:
http://tech.pro/tutorial/1554/four-tricks-for-comprehensions-in-python#breaking_the_loop