
How to print the 1st element in an HTML tag

My code gets links/HTML from different "sections" of a page.

It prints 2 links per section, but I only want the first one printed.

The expected output should not contain the links ending with "video", which my code currently includes.

from selenium import webdriver
from bs4 import BeautifulSoup
import time
driver = webdriver.Chrome()
jam=[]
baseurl='https://meetinglibrary.asco.org'
driver.get('https://meetinglibrary.asco.org/results?meetingView=2020%20ASCO%20Virtual%20Scientific%20Program&page=1')
time.sleep(3)
page_source = driver.page_source
soup = BeautifulSoup(page_source,'html.parser')
productlist=soup.find_all('a',class_='ng-star-inserted')
for item in productlist:
    for link in item.find_all('a',href=True):
        jam.append(baseurl+link['href'])
print(jam)

Use os.path.basename to get the last path component of each href, then use the in operator to check whether "video" appears in it:

from selenium import webdriver
from bs4 import BeautifulSoup
import time
import os

driver = webdriver.Chrome()
jam = []
baseurl = 'https://meetinglibrary.asco.org'
driver.get('https://meetinglibrary.asco.org/results?meetingView=2020%20ASCO%20Virtual%20Scientific%20Program&page=1')
time.sleep(3)  # give the page time to render before reading its source
page_source = driver.page_source
soup = BeautifulSoup(page_source, 'html.parser')
productlist = soup.find_all('a', class_='ng-star-inserted')
for item in productlist:
    for link in item.find_all('a', href=True):
        url = link['href']
        # basename is the last path segment, e.g. 'abstract' or 'slide'
        if "video" not in os.path.basename(url):
            jam.append(baseurl + url)
print(jam)

Result:

['https://meetinglibrary.asco.org/record/185955/abstract',
 'https://meetinglibrary.asco.org/record/185955/slide',
 'https://meetinglibrary.asco.org/record/185954/abstract',
 'https://meetinglibrary.asco.org/record/186048/abstract',
 'https://meetinglibrary.asco.org/record/186048/slide',
 'https://meetinglibrary.asco.org/record/190197/slide',
 'https://meetinglibrary.asco.org/record/192623/slide',
 'https://meetinglibrary.asco.org/record/185414/abstract',
 'https://meetinglibrary.asco.org/record/185414/slide',
 'https://meetinglibrary.asco.org/record/185415/abstract',
 'https://meetinglibrary.asco.org/record/185415/slide',
 'https://meetinglibrary.asco.org/record/185473/abstract',
 'https://meetinglibrary.asco.org/record/185473/slide',
 'https://meetinglibrary.asco.org/record/187584/slide',
 'https://meetinglibrary.asco.org/record/188561/slide',
 'https://meetinglibrary.asco.org/record/186710/abstract',
 'https://meetinglibrary.asco.org/record/186710/slide',
 'https://meetinglibrary.asco.org/record/186699/abstract',
 'https://meetinglibrary.asco.org/record/186699/slide',
 'https://meetinglibrary.asco.org/record/186698/abstract',
 'https://meetinglibrary.asco.org/record/186698/slide',
 'https://meetinglibrary.asco.org/record/187720/slide',
 'https://meetinglibrary.asco.org/record/187480/abstract',
 'https://meetinglibrary.asco.org/record/187480/slide',
 'https://meetinglibrary.asco.org/record/191961/slide',
 'https://meetinglibrary.asco.org/record/192626/slide',
 'https://meetinglibrary.asco.org/record/186983/abstract',
 'https://meetinglibrary.asco.org/record/186983/slide',
 'https://meetinglibrary.asco.org/record/188580/abstract',
 'https://meetinglibrary.asco.org/record/188580/slide',
 'https://meetinglibrary.asco.org/record/189047/abstract',
 'https://meetinglibrary.asco.org/record/189047/slide',
 'https://meetinglibrary.asco.org/record/190223/slide',
 'https://meetinglibrary.asco.org/record/190273/slide',
 'https://meetinglibrary.asco.org/record/184812/abstract',
 'https://meetinglibrary.asco.org/record/184812/slide',
 'https://meetinglibrary.asco.org/record/184927/slide',
 'https://meetinglibrary.asco.org/record/184805/abstract',
 'https://meetinglibrary.asco.org/record/184805/slide',
 'https://meetinglibrary.asco.org/record/184811/abstract',
 'https://meetinglibrary.asco.org/record/184811/slide',
 'https://meetinglibrary.asco.org/record/185576/slide',
 'https://meetinglibrary.asco.org/record/190147/slide']
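
The basename check can be tried without launching a browser. The following is only a minimal sketch: the helper name keep_non_video and the sample hrefs are made up for illustration, and the hrefs are assumed to have the /record/<id>/<kind> shape seen in the result above.

```python
import os

BASE_URL = 'https://meetinglibrary.asco.org'

def keep_non_video(hrefs, base_url=BASE_URL):
    """Return absolute URLs for hrefs whose last path segment lacks 'video'.

    keep_non_video is a hypothetical helper, not part of any library.
    """
    return [base_url + href for href in hrefs
            if 'video' not in os.path.basename(href)]

# Hypothetical hrefs, shaped like the ones scraped from the page.
sample_hrefs = [
    '/record/185955/abstract',
    '/record/185955/video',
    '/record/185955/slide',
]

filtered = keep_non_video(sample_hrefs)
for url in filtered:
    print(url)
```

Checking only the basename, rather than the whole href, avoids dropping a link just because "video" happens to appear earlier in its path.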

You can add a condition before appending each link: keep only the links at even indices (the first of each pair) and skip any whose href contains "video".

...
for item in productlist:
    ahrefs = item.find_all('a', href=True)
    for index, link in enumerate(ahrefs):
        # even index = first link of each pair; also skip video links
        if index % 2 == 0 and 'video' not in link['href']:
            jam.append(baseurl + link['href'])
print(jam)
...

Let me know after trying. Good luck.
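
The index parity check depends on the links always arriving in pairs. If that does not hold, one alternative (not taken from the answers above) is to keep the first href seen for each record ID. This is a rough sketch: first_link_per_record is a hypothetical helper, the sample hrefs are invented, and it assumes every href looks like /record/<id>/<kind>.

```python
def first_link_per_record(hrefs):
    """Keep only the first href seen for each record ID, preserving order.

    Assumes hrefs shaped like '/record/<id>/<kind>'; anything else is
    treated as its own record. Hypothetical helper for illustration.
    """
    seen = set()
    firsts = []
    for href in hrefs:
        # '/record/185955/abstract' -> ['record', '185955', 'abstract']
        parts = href.strip('/').split('/')
        record_id = parts[1] if len(parts) > 1 else href
        if record_id not in seen:
            seen.add(record_id)
            firsts.append(href)
    return firsts

# Hypothetical hrefs in page order: two links per section.
sample_hrefs = [
    '/record/185955/abstract',
    '/record/185955/video',
    '/record/185954/abstract',
    '/record/185954/video',
]

first_links = first_link_per_record(sample_hrefs)
print(first_links)
```

With the scraping code above, the hrefs collected from link['href'] could be passed through this helper before prefixing baseurl.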
