How to web scrape meta content - Python web scraping question

I want to scrape only the word "Automobile", not the entire meta tag with its angle brackets.

Desired output: "Automobile"

Can you please tell me how to fix this? Thanks!

from bs4 import BeautifulSoup
import requests
import csv

URL = 'https://www.electrive.com/2022/02/13/skoda-reveals-uk-pricing-for-enyaq-coupe-iv-vrs/'

(response := requests.get(URL)).raise_for_status()
soup = BeautifulSoup(response.text, 'lxml')
category2 = soup.find('meta', property='article:section')
print(category2)

Output:

<meta content="Automobile" property="article:section"/>

Just add ['content'] to the tag returned by soup.find(). Indexing a tag with an attribute name returns that attribute's value.

import requests
from bs4 import BeautifulSoup

URL = 'https://www.electrive.com/2022/02/13/skoda-reveals-uk-pricing-for-enyaq-coupe-iv-vrs/'

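# Fetch the page and fail fast on an HTTP error status.
response = requests.get(URL)
response.raise_for_status()
soup = BeautifulSoup(response.text, 'lxml')
# find() returns the <meta> tag; indexing it by attribute name yields the value.
category2 = soup.find('meta', property='article:section')['content']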
print(category2)

Output:

Automobile
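
One caveat with indexing directly: if a page has no matching meta tag, find() returns None and the ['content'] lookup raises a TypeError. A minimal sketch of a more defensive variant follows. The helper name meta_content and the inline SAMPLE_HTML string are assumptions made for illustration (they are not part of the original question), and the built-in 'html.parser' is used here only so the sketch runs without lxml or a network request.

```python
from bs4 import BeautifulSoup

# Inline sample HTML (an assumption standing in for the downloaded page).
SAMPLE_HTML = """
<html><head>
<meta content="Automobile" property="article:section"/>
</head><body></body></html>
"""


def meta_content(soup, prop, default=None):
    """Return the content attribute of <meta property=prop>, or default.

    Hypothetical helper: avoids a TypeError when the tag is missing.
    """
    tag = soup.find('meta', property=prop)
    if tag is None:
        return default
    # Tag.get() behaves like dict.get(): no KeyError if the attribute is absent.
    return tag.get('content', default)


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
print(meta_content(soup, 'article:section'))                    # Automobile
print(meta_content(soup, 'article:author', default='unknown'))  # unknown
```

With the real page, the same helper should work on the soup built from response.text, so a missing tag on some articles would yield the default instead of stopping the scrape.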
