
How to remove characters in a string AFTER a given character?

I have a list of strings that I want to remove the URL extensions from. Here's what it looks like:

['google.com', 'google.ru', 'google.ca']

Basically, I want to remove everything after the "." in each one, so that I get back something like this:

['google', 'google', 'google']

My instructions specifically tell me to use the split() function, but I'm confused by that as well. If possible, I also need to remove duplicates, so my final result would be:

['google']

Thanks for the help, sorry if my specifications are odd.

This function removes URL extensions:

def removeurlextensions(L):
    # Keep the part of each string before the first '.'
    L2 = []
    for s in L:
        L2.append(s.split('.')[0])
    return L2

To print your list:

L = ['google.com', 'google.ru', 'google.ca']
print(removeurlextensions(L))
#prints ['google', 'google', 'google']

To remove duplicates, you can use list(set()):

L = ['google.com', 'google.ru', 'google.ca']
print(list(set(removeurlextensions(L))))
#prints ['google']
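
One caveat: a set does not keep the order of the original list. That makes no difference when only one name is left, but it can when several remain. A minimal sketch of an order-preserving alternative, assuming Python 3.7+ (where dicts keep insertion order); the helper name unique_in_order and the sample data are hypothetical, not part of the question:

```python
def unique_in_order(items):
    # dict keys are unique and keep insertion order (Python 3.7+),
    # so this drops duplicates while keeping the first occurrence of each item
    return list(dict.fromkeys(items))

# Sample data is made up for illustration
names = [s.split('.')[0] for s in ['google.com', 'yahoo.com', 'google.ru']]
print(unique_in_order(names))
# prints ['google', 'yahoo']
```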

This will only work if all items are strings:

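# Keep only the part before the first '.' in each string
for i in range(len(my_list)):
    my_list[i] = my_list[i].split('.')[0]
# Build a new list without duplicates, keeping first-seen order
# (removing items from my_list while looping over it would skip elements)
unique_items = []
for item in my_list:
    if item not in unique_items:
        unique_items.append(item)
my_list = unique_items
print(my_list)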

I wrote this from memory, so if there is a bug please let me know.

You can simply use split.

ls = ['google.com', 'google.ru', 'google.ca']
print([i.split('.', 1)[0] for i in ls])
# result = ['google', 'google', 'google']
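
For reference, the second argument to split is maxsplit, the maximum number of splits to perform. With [0] the result is the same either way, but passing 1 stops at the first dot. A quick sketch of the difference; the sample strings are only an illustration and do not come from the question:

```python
s = 'mail.google.co.uk'
all_parts = s.split('.')        # split at every dot
first_split = s.split('.', 1)   # split at the first dot only
print(all_parts)    # prints ['mail', 'google', 'co', 'uk']
print(first_split)  # prints ['mail', 'google.co.uk']

# A string without any dot comes back unchanged
no_dot = 'localhost'.split('.', 1)[0]
print(no_dot)  # prints localhost
```

Note that for a name with a subdomain, taking [0] returns the part before the first dot ('mail' here), which may or may not be what you want.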

And to remove the duplicates, you might want to use set.

mod = [i.split('.', 1)[0] for i in ls]
print(list(set(mod)))
# result = ['google']

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address.

 