Scraping text data from different domain urls using Python

Is there any way to scrape only the text data from URLs on different domains in Python?

For example, on this website the text is in a different block than on this page. I would like to write a single function that can scrape the text from both of these websites. Is that possible in Python?

The approach that works across different domains is to scrape the whole text of a page, since that does not depend on how each site structures its markup. You can do that with the following code.

import requests
from bs4 import BeautifulSoup

# Fetch the page and parse it with Python's built-in HTML parser
r = requests.get('https://www.businessinsider.in/tech/news/airbnb-is-getting-ripped-apart-for-asking-renters-to-donate-money-to-landlords/articleshow/76968577.cms')
soup = BeautifulSoup(r.text, 'html.parser')
# .text collects every text node under <html>, whatever the page layout
text = soup.find('html').text
print(text)
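
If you want this as a single reusable function, something like the sketch below might work. The names extract_text and scrape_text, the timeout value, and the choice to drop script, style and noscript tags are my own assumptions, not anything the sites require. Removing those tags keeps JavaScript and CSS out of the result, which the whole-page approach would otherwise include.

```python
import requests
from bs4 import BeautifulSoup


def extract_text(html):
    """Return the visible text of an HTML string, one fragment per line."""
    soup = BeautifulSoup(html, 'html.parser')
    # Drop elements whose contents are never shown as page text.
    for tag in soup(['script', 'style', 'noscript']):
        tag.decompose()
    # strip=True trims each fragment and skips whitespace-only ones.
    return soup.get_text(separator='\n', strip=True)


def scrape_text(url, timeout=10):
    """Fetch a URL and return its visible text (hypothetical helper)."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise an exception on 4xx/5xx responses
    return extract_text(response.text)


if __name__ == '__main__':
    # Offline demo on a small hand-written page, so no network is needed.
    sample = ('<html><head><style>p {color: red}</style></head>'
              '<body><h1>Title</h1><p>Hello <b>world</b></p>'
              '<script>var x = 1;</script></body></html>')
    print(extract_text(sample))
```

You would then call scrape_text(url) once per URL, whichever domain it comes from. Note that this still returns menus, footers and other page furniture along with the article body; isolating only the main article would need site-specific selectors or a dedicated extraction library.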
