
How to exclude certain titles using find?

I have a function that gets all the titles from my website. I don't want the titles of some products: those containing the words "OLP NL", "Arcserve", "LicSAPk", or "symantec". Is this the right way to do it?

def get_title ( u ):
    html = requests.get ( u )
    bsObj = BeautifulSoup ( html.content, 'xml' )
    title = str ( bsObj.title ).replace ( '<title>', '' ).replace ( '</title>', '' )
    if (title.find ( 'Arcserve' ) or title.find ( 'OLP NL' ) or title.find ( 'LicSAPk' ) or title.find ( 'Symantec' ) is not -1):
        return 'null'
    else:
        return title

        if (title != 'null'):
            ws1 [ 'B1' ] = title
            meta_desc = get_metaDesc ( u )
            ws1 [ 'C1' ] = meta_desc
            meta_keyWrds = get_metaKeyWrds ( u )
            ws1 [ 'D1' ] = meta_keyWrds
            print ( "writing product no." + str ( i ) )
        else:
            print("skipped product no. " + str ( i ))
            continue

The problem is that the program excludes all of my products, and all I see is "skipped product no.". Why? Not all of them contain these words.

You can change the if statement to (title.find ( 'Arcserve' )!=-1 or title.find ( 'OLP NL' )!=-1 or title.find ('LicSAPk' )!=-1 or title.find ('Symantec' )!=-1), or you can create a function that checks the title for the terms you want to find:

def TermFind(Title):
    terms=['Arcserve','OLP NL','LicSAPk','Symantec']
    disc=False
    for val in terms:
        if Title.find(val)!=-1:
            disc=True
            break
    return disc

When I used your original if statement, it always evaluated to True regardless of the title value. I couldn't find an explanation for this behavior, but you can try checking the questions 'Python != operation vs "is not"' and 'nested "and/or" if statements'. Hope it helps.
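One likely explanation for the always-true condition: str.find returns -1 when the substring is missing, and -1 is truthy in Python, so the first bare title.find ( 'Arcserve' ) already makes the whole `or` chain true. Only the last operand is actually compared against -1. Below is a minimal sketch of that behavior; the sample title is made up for illustration, and `!= -1` stands in for the original `is not -1` (same precedence, but it avoids relying on integer identity).

```python
# Hypothetical title, chosen so that none of the excluded terms occur in it.
title = "Microsoft Office 2019"

# str.find returns the index of the first match, or -1 when nothing matches.
arcserve_pos = title.find("Arcserve")    # -1: not found
microsoft_pos = title.find("Microsoft")  # 0: found at the very start

# In a boolean context every non-zero integer is truthy, including -1,
# so "not found" counts as true and "found at index 0" counts as false.
not_found_is_truthy = bool(arcserve_pos)
found_at_zero_is_truthy = bool(microsoft_pos)

# Shape of the condition from the question: only the last call is compared
# with -1. `or` returns its first truthy operand, which here is the -1
# produced by the very first find call.
original_condition = (
    title.find("Arcserve")
    or title.find("OLP NL")
    or title.find("LicSAPk")
    or title.find("Symantec") != -1
)

# Comparing every result explicitly gives the intended answer.
fixed_condition = (
    title.find("Arcserve") != -1
    or title.find("OLP NL") != -1
    or title.find("LicSAPk") != -1
    or title.find("Symantec") != -1
)

print(bool(original_condition))  # the title would be skipped
print(fixed_condition)           # the title would be kept
```

A side effect of the same rule: a title that starts with an excluded term returns index 0, which is falsy, so the bare find calls get both cases backwards.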

A similar idea using any:

import requests 
from bs4 import BeautifulSoup

url = 'https://www.cdsoft.co.il/index.php?id_product=300610&controller=product'
html = requests.get(url)
bsObj = BeautifulSoup(html.content, 'lxml')
title = str ( bsObj.title ).replace ( '<title>', '' ).replace ( '</title>', '' )
items = ['Arcserve','OLP NL','LicSAPk','Symantec']

if not any(item in title for item in items):
    print(title)
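The question spells one term as "symantec" in the text but 'Symantec' in the code, so a case-insensitive check may be closer to what is wanted. Here is a rough sketch of the same any-based test wrapped in a function that can be tried without a network request. The names EXCLUDED_TERMS, is_excluded and filter_title, as well as the sample titles, are made up for this example. Returning None instead of the string 'null' is my own choice, not something from the question.

```python
# Terms taken from the question; the constant name is hypothetical.
EXCLUDED_TERMS = ["Arcserve", "OLP NL", "LicSAPk", "Symantec"]


def is_excluded(title, terms=EXCLUDED_TERMS):
    """Return True if any term occurs in the title, ignoring case."""
    lowered = title.lower()
    return any(term.lower() in lowered for term in terms)


def filter_title(title, terms=EXCLUDED_TERMS):
    """Return the title unchanged, or None when it should be skipped."""
    return None if is_excluded(title, terms) else title


# Made-up titles, used only to exercise the filter.
sample_titles = [
    "Symantec Endpoint Protection",
    "symantec backup exec",
    "Microsoft Office OLP NL",
    "Adobe Acrobat Pro",
]

kept = [t for t in sample_titles if filter_title(t) is not None]
print(kept)
```

With this shape the calling loop would test `if title is not None:` rather than comparing against 'null', which also avoids a clash if a real page title ever happens to be "null".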
