
Python - how to remove specific words from list?

I have a defaultdict of lists like:

 defaultdict(<class 'list'>, {'Web': ['site: www.domain.com'], 'Phone': ['(111) 222-333', '(222) 333-444'], 'VAT': ['987654321'], 'Fax': ['(444) 555-666', '(777) 888-999'], 'E-mail': ['adress: mail@domain.com', 'address: mail2@domain.com'], 'ID': ['number:1234567890']})

I want to remove label prefixes such as site:, adress:, address: and number: from the values.

Output should be:

 defaultdict(<class 'list'>, {'Web': ['www.domain.com'], 'Phone': ['(111) 222-333', '(222) 333-444'], 'VAT': ['987654321'], 'Fax': ['(444) 555-666', '(777) 888-999'], 'E-mail': ['mail@domain.com', 'mail2@domain.com'], 'ID': ['1234567890']})

I know that I can remove a word from the items of one specific list like this:

for em in d["E-mail"]:
    print(em.replace("address: ","",1))

but I'm looking for something that would clean every list in the dictionary.

You just want the substring after the colon, so split each string once on ":" and take the last piece. That gives the part after the colon, or the whole string unchanged if it contains no colon:

for k, v in d.items():
    d[k] = [s.split(":", 1)[-1].lstrip() for s in v]



{'E-mail': ['mail@domain.com', 'mail2@domain.com'], 'Phone': ['(111) 222-333', '(222) 333-444'], 'ID': ['1234567890'], 'Web': ['www.domain.com'], 'VAT': ['987654321'], 'Fax': ['(444) 555-666', '(777) 888-999']}

Using [-1] as the index means we get the second of the two pieces when the string was split, or the only piece when there was no colon to split on. We also need to lstrip the result to remove any whitespace left after the colon.
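To make the two cases concrete, here is a small sketch of what the split does to values with and without a colon. The helper name strip_label is only illustrative and is not part of the answer above.

```python
def strip_label(s):
    # Split once on the first colon; [-1] is the part after it,
    # or the whole string when there is no colon at all.
    return s.split(":", 1)[-1].lstrip()

with_label = strip_label("site: www.domain.com")   # colon followed by a space
no_space = strip_label("number:1234567890")        # colon with no space
without_label = strip_label("(111) 222-333")       # no colon, returned as-is

print(with_label, no_space, without_label)
```

One thing to keep in mind: this treats any first colon as a label separator, so a value that legitimately contains a colon (for example "http://www.domain.com") would be cut down to "//www.domain.com".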

You could also apply the same logic as you add the data to your defaultdict to avoid having to iterate over and mutate the dict values after they have already been assigned.
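A minimal sketch of that idea, assuming the data arrives as (key, value) pairs. The raw_pairs list is a made-up stand-in, since the question does not show how the defaultdict is filled.

```python
from collections import defaultdict

# Hypothetical raw input: (key, value) pairs as they might come from a parser.
raw_pairs = [
    ("Web", "site: www.domain.com"),
    ("Phone", "(111) 222-333"),
    ("E-mail", "adress: mail@domain.com"),
    ("E-mail", "address: mail2@domain.com"),
    ("ID", "number:1234567890"),
]

d = defaultdict(list)
for key, value in raw_pairs:
    # Clean each value once, at insertion time, so no second pass is needed.
    d[key].append(value.split(":", 1)[-1].lstrip())

print(dict(d))
```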

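Try something like this:

for key in d:
    # keep only the part after the first colon, when there is one
    d[key] = [s.split(":", 1)[1].strip() if ":" in s else s for s in d[key]]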

Similar to Padraic Cunningham's answer, but with a regex:

In [39]: import re

In [40]: s = re.compile(r'[a-zA-Z]+:\s?')

In [41]: d={'Web': ['site: www.domain.com'], 'Phone': ['(111) 222-333', '(222) 333-444'], 'VAT': ['987654321'], 'Fax': ['(444) 555-666', '(777) 888-999'], 'E-mail': ['adress: mail@domain.com', 'address: mail2@domain.com'], 'ID': ['number:1234567890']}

In [42]: def clean(dict_):
   ....:     for k, v in dict_.items():
   ....:         # build a real list; on Python 3, map() would leave a lazy iterator here
   ....:         dict_[k] = [s.sub('', x) for x in v]

In [43]: clean(d)

In [44]: d
Out[44]:
{'E-mail': ['mail@domain.com', 'mail2@domain.com'],
 'Fax': ['(444) 555-666', '(777) 888-999'],
 'ID': ['1234567890'],
 'Phone': ['(111) 222-333', '(222) 333-444'],
 'VAT': ['987654321'],
 'Web': ['www.domain.com']}
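If only specific words should be removed, as the question title asks, one possible variation is to anchor the pattern at the start of the string and restrict it to a known set of labels. This is a sketch under that assumption. The labels tuple and the clean_known function are illustrative names, not part of the answers above.

```python
import re

# Assumed set of labels to strip; adjust to the prefixes that occur in your data.
labels = ("site", "adress", "address", "number")

# Match one of the labels only at the very start, followed by a colon
# and optional whitespace.
label_re = re.compile(r"^(?:%s):\s*" % "|".join(map(re.escape, labels)))

def clean_known(dict_):
    # Return a new dict instead of mutating the input.
    return {k: [label_re.sub("", x) for x in v] for k, v in dict_.items()}

d = {
    "Web": ["site: www.domain.com"],
    "Phone": ["(111) 222-333", "(222) 333-444"],
    "E-mail": ["adress: mail@domain.com", "address: mail2@domain.com"],
    "ID": ["number:1234567890"],
}
cleaned = clean_known(d)
print(cleaned)
```

Because the pattern only matches the listed labels, a value such as "http://www.domain.com" is left untouched, whereas the generic split or the unanchored regex would strip the "http:" part.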

