
Remove a specific phrase from a list of lists

I have stored data in a list of lists (I couldn't use a dict because I need to have duplicate keys). The list is like:

data = [[1, "name email@email.com address"], [2, "name2 email@@email2.com address"], ...]

My goal is to remove the email addresses from the data list (i.e. the list of lists). Unfortunately, the email addresses are all different. They share only one trait: each contains the symbol "@".

I tried to use a list comprehension. However, the best I can do removes the entire element, i.e. [1, "name email@email.com address"] is dropped completely:

newlist = [element for element in data if "@" not in element[1]]

I thought of splitting "name email@email.com address" into a sublist using " " as the delimiter. However, that presents a problem as well: it ruins the format. It would be difficult to regroup the pieces into the original format, because "name email@email.com address" sometimes contains more than three words. For example, it could be "name1 name2 name3 email@email.com email2 email3 address1 address2 address3".

What is the best way of doing this?

EDIT:

To answer Adam Smith's question, I'm looking for

data = [[1, "name address"], [2, "name2 address"], ...]

as my output. In other words, the original format (list of lists, where the sublists contain two elements, one being the number and the other one being "name, address, address1, etc") is preserved without the email addresses.

data = [[1, "name email@email.com address"], [2, "name2 email@@email2.com address"],[3, "name1 name2 name3 email@email.com email2 email3 address1 address2 address3"]]

for ind, d in enumerate(data):
    # Keep the number, then rebuild the string from the words that do not contain "@".
    data[ind] = [d[0], " ".join([x for x in d[1].split() if "@" not in x])]
print(data)

[[1, 'name address'], [2, 'name2 address'], [3, 'name1 name2 name3 email2 email3 address1 address2 address3']]

I think you should split each string on the '@' character and then walk through the resulting pieces in adjacent pairs. In the first piece of a pair, use rfind to locate the last space; everything after it is the start of the email address. In the second piece, find the first space; everything before it is the rest of the email address. Remove both substrings. If a string may contain more than one email address, repeat this for every remaining pair (second and third pieces, third and fourth, and so on) to remove the other addresses as well.
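The pairing idea above could be sketched roughly as follows. This is only one possible reading of that description, not code from the answer itself: the helper name strip_emails is hypothetical, and it assumes words are separated by single spaces and that any run of non-space characters containing "@" should be dropped.

```python
def strip_emails(text):
    """Remove every space-delimited token that contains '@' (hypothetical helper)."""
    parts = text.split("@")
    if len(parts) == 1:
        return text  # no '@' at all, nothing to remove

    kept = []
    last = len(parts) - 1
    for i, part in enumerate(parts):
        if i > 0:
            # This piece starts with the tail of an email: drop up to the first space.
            sp = part.find(" ")
            part = "" if sp == -1 else part[sp + 1:]
        if i < last:
            # This piece ends with the head of an email: drop from the last space on.
            sp = part.rfind(" ")
            part = "" if sp == -1 else part[:sp]
        kept.append(part)

    # Rejoin the surviving pieces, skipping any that became empty.
    return " ".join(p for p in kept if p)


data = [
    [1, "name email@email.com address"],
    [2, "name2 email@@email2.com address"],
    [3, "name1 name2 name3 email@email.com email2 email3 address1 address2 address3"],
]

# Build a new list of lists instead of modifying data in place.
cleaned = [[number, strip_emails(text)] for number, text in data]
print(cleaned)
```

For the sample data this gives the same result as the split-and-filter loop in the other answer, which is considerably shorter; the sketch is mainly useful for seeing how the find/rfind pairing works.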
