I have a list of strings that I want to remove the URL extensions from. Here's what it looks like:
['google.com', 'google.ru', 'google.ca']
Basically, I want to remove everything after the "." in each one, so that I end up with something like this:
['google', 'google', 'google']
My instructions specifically tell me to use the split() function, but I'm confused by that as well. If possible, I also need to remove duplicates, so my final result would be:
['google']
Thanks for the help, sorry if my specifications are odd.
This function removes URL extensions:
def removeurlextensions(L):
    L2 = []
    for x in range(len(L)):
        L2.append(L[x].split('.')[0])
    return L2
To print your list:
L = ['google.com', 'google.ru', 'google.ca']
print(removeurlextensions(L))
#prints ['google', 'google', 'google']
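Since the question mentions being confused by split(), here is a small sketch of what it returns at each step. The variable names are only illustrative and do not come from the code above.

```python
# str.split('.') cuts the string at every '.' and returns the pieces as a list.
url = 'google.com'
parts = url.split('.')
print(parts)  # ['google', 'com']

# Index 0 is the piece before the first '.'.
name = parts[0]
print(name)   # google
```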
To remove duplicates, you can use list(set()):
L = ['google.com', 'google.ru', 'google.ca']
print(list(set(removeurlextensions(L))))
#prints ['google']
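One thing to keep in mind is that a set does not keep the original order, which matters once the list contains more than one distinct name. Below is a minimal sketch of an order-preserving alternative, assuming Python 3.7 or later (where dicts keep insertion order). The sample list and variable names are made up for illustration.

```python
# Hypothetical input with several distinct names.
urls = ['google.com', 'yahoo.com', 'google.ru', 'bing.com', 'yahoo.ca']

# Keep only the part before the first '.'.
names = [u.split('.')[0] for u in urls]

# dict.fromkeys keeps the first occurrence of each key, in order.
unique_names = list(dict.fromkeys(names))
print(unique_names)  # ['google', 'yahoo', 'bing']
```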
This will only work if all items are strings:
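# Keep only the part before the first '.' in each item.
for i in range(len(my_list)):
    my_list[i] = my_list[i].split('.')[0]
# Collect each item the first time it appears, so duplicates are skipped.
already_in_list = []
for item in my_list:
    if item not in already_in_list:
        already_in_list.append(item)
my_list = already_in_list
print(my_list)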
I wrote this from memory, so if there is a bug, please let me know.
You can simply use split.
ls = ['google.com', 'google.ru', 'google.ca']
print([i.split('.', 1)[0] for i in ls])
# result = ['google', 'google', 'google']
And to remove the duplicates, you might want to use set.
mod = [i.split('.', 1)[0] for i in ls]
print(list(set(mod)))
# result = ['google']
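The second argument passed to split here is maxsplit: with a value of 1, the string is cut only at the first ".". For the element at index 0 this gives the same result as a plain split('.'); it just avoids splitting the rest of the string. A short sketch, using a hypothetical host name that is not part of the original question:

```python
host = 'mail.google.co.uk'  # hypothetical example with several dots

print(host.split('.'))     # ['mail', 'google', 'co', 'uk']
print(host.split('.', 1))  # ['mail', 'google.co.uk']

# Either way, index 0 is the text before the first '.'.
first_a = host.split('.')[0]
first_b = host.split('.', 1)[0]
print(first_a == first_b)  # True
```

Both calls return 'mail' here, so this approach assumes the name you want is always the part before the first dot.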