
Return only one string instead of two almost identical ones

I'm trying to get several links from a webpage, but when I print the result I get:

/t54-EXAMPLE-fansub
/t54-EXAMPLE-fansub#55

How can I get only one of those in the output instead of both?

You could split each link on '#' and keep the first part:

>>> '/t54-EXAMPLE-fansub#55'.split('#')  # just to show you the list output
['/t54-EXAMPLE-fansub', '55']
>>> '/t54-EXAMPLE-fansub#55'.split('#')[0]
'/t54-EXAMPLE-fansub'
>>> '/t54-EXAMPLE-fansub'.split('#')[0]
'/t54-EXAMPLE-fansub'
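
As an alternative to splitting by hand, the standard library's urllib.parse.urldefrag separates a URL from its '#fragment'. The following is a minimal sketch; the helper name strip_fragment is hypothetical and not part of the original answer.

```python
from urllib.parse import urldefrag

def strip_fragment(link):
    """Return the link without its '#fragment' part (hypothetical helper)."""
    # urldefrag returns a (url, fragment) named tuple; keep only the url.
    return urldefrag(link).url

print(strip_fragment('/t54-EXAMPLE-fansub#55'))  # link with a fragment
print(strip_fragment('/t54-EXAMPLE-fansub'))     # link without one is returned unchanged
```

Both calls print the same string, so the two near-duplicate links collapse to one value once the fragment is removed.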

I am assuming you have a list called "links" that contains all the links you scraped.

links = ["/t54-EXAMPLE-fansub#55", "/t54-EXAMPLE-fansub", "/t55-EXAMPLE-fansub"]

# Drop everything from the first '#' onward; the set removes the duplicates.
links = {link.split('#')[0] for link in links}
for link in links:
    print(link)

Be careful: this changes the type of links to a set, so the original order is lost and you can no longer index into it. The code is only an example of what you can do: go through the links, strip everything from the first '#' onward, and collect the results in a set so that links you have already encountered are not repeated.
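
If the order of the links matters, a set is not the right container. One possible variation, sketched here under the assumption that Python 3.7+ is used (where dict preserves insertion order), keeps the first occurrence of each link and returns a list. The function name unique_links is hypothetical.

```python
def unique_links(links):
    """Strip '#fragment' parts and drop duplicates, keeping first-seen order."""
    # dict.fromkeys keeps one key per distinct value, in insertion order.
    return list(dict.fromkeys(link.split('#')[0] for link in links))

links = ["/t54-EXAMPLE-fansub#55", "/t54-EXAMPLE-fansub", "/t55-EXAMPLE-fansub"]
for link in unique_links(links):
    print(link)
```

With the sample list above this prints /t54-EXAMPLE-fansub followed by /t55-EXAMPLE-fansub, in the order they first appeared.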
