Searching for part of a string using Python and Beautiful Soup

Question

I am currently using Beautiful Soup to find link text on a website and then pull the links. I am using the following code:

import requests
from bs4 import BeautifulSoup

source = requests.get('http://www.website').text
page = BeautifulSoup(source, 'lxml')
for article in page.find_all('article'):
    for a in article.find_all('a', string=['something']) and article.find_all('a', string=['something']):
        link = a['href']
        print(link)

The issue is that Beautiful Soup only finds the links if I supply the exact link text, which is not always possible. Is there a way to search for a link by a portion of its link text?

Answer 1

Regex example (a compiled pattern is matched anywhere in the link text, so a partial match is enough):

import re
r = re.compile('something|somethingelse')
for a in article.find_all('a', string=r):
    print(a['href'])
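To try the regex approach without a live site, here is a minimal self-contained sketch. The inline HTML, the helper name find_links_by_partial_text, and the choice of the built-in 'html.parser' are assumptions made for illustration; they are not part of the original question.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page.
html = """
<article>
  <a href="/a">Read something new today</a>
  <a href="/b">Unrelated link</a>
  <a href="/c">SOMETHING in capitals</a>
</article>
"""

def find_links_by_partial_text(soup, fragment):
    """Return hrefs of <a> tags whose text contains `fragment` (case-insensitive)."""
    # re.escape keeps characters such as '.' or '?' in the fragment literal.
    pattern = re.compile(re.escape(fragment), re.IGNORECASE)
    # href=True skips anchors that have no href attribute at all.
    return [a['href'] for a in soup.find_all('a', href=True, string=pattern)]

soup = BeautifulSoup(html, 'html.parser')
links = find_links_by_partial_text(soup, 'something')
print(links)
```

Escaping the fragment is a design choice: it treats the search text as a plain substring. Drop re.escape if you actually want regex syntax such as alternation.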

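Starting from the version you have (note that `and` between two find_all results does not combine them; it evaluates to just one of the two lists, so chain them instead):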

from itertools import chain
c = chain(article.find_all('a', string=['something']), 
          article.find_all('a', string=['somethingelse']))
for a in c:
    print(a['href'])

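Function example (substring match):

def any_string(s):
    # s is the tag's .string, which is None when the tag has no single text child
    ok = ['something', 'somethingelse']
    return s is not None and any(k in s for k in ok)

for a in article.find_all('a', string=any_string):
    print(a['href'])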
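One caveat with the string= argument: it is compared against the tag's .string, which is None when the link contains nested markup such as <b>. A possible workaround, sketched here under the assumption that you want to match against all text inside the link, is to pass a function that filters on the tag itself and uses get_text(). The sample HTML and the name link_text_contains are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the second link has nested markup.
html = """
<article>
  <a href="/plain">something plain</a>
  <a href="/nested"><b>something</b> in bold</a>
  <a href="/other">nothing relevant</a>
  <a>something without href</a>
</article>
"""

def link_text_contains(fragment):
    """Build a tag filter for <a href=...> tags whose full text contains `fragment`."""
    fragment = fragment.lower()

    def matches(tag):
        # get_text() joins the text of all descendants, so nested tags are included.
        return (tag.name == 'a'
                and tag.has_attr('href')
                and fragment in tag.get_text().lower())

    return matches

soup = BeautifulSoup(html, 'html.parser')
nested_links = [a['href'] for a in soup.find_all(link_text_contains('something'))]
print(nested_links)
```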
