
"JavaScript is disabled" error while scraping Twitter with BeautifulSoup in Python

I am new to web scraping. I was trying to scrape Twitter with BeautifulSoup in Python.

Here's my code:

from bs4 import BeautifulSoup
import requests

# Fetch the page's HTML as text
html = requests.get("https://twitter.com/mybmc").text

# Parse the HTML and print it with indentation
soup = BeautifulSoup(html, 'html.parser')
print(soup.prettify())

But instead of the Twitter page I am looking for, I get a large output with an error container that says JavaScript is disabled in this browser. I tried changing my default browser to Chrome, Firefox and Microsoft Edge, but the output was the same.

What should I do in this case?

Twitter seems to be deliberately blocking scrapers of its front end, probably on the view that you should use its REST API to fetch the same data. This has nothing to do with your default browser: requests.get never goes through a browser. It sends a plain HTTP request that identifies itself with the python-requests user agent and returns the raw HTML without executing any JavaScript, so what you get back is the fallback page served to clients that cannot run JavaScript.
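To confirm that this is what happened, you can check whether the fetched HTML is only the no-JavaScript fallback. Below is a minimal sketch using just the standard library; the function name `looks_like_js_fallback` and the default marker text are assumptions for illustration (the marker is taken from the error message quoted in the question), and another site may word its fallback differently.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def looks_like_js_fallback(html, marker="javascript is disabled"):
    """Return True if the page text contains the fallback marker.

    `marker` is an assumed phrase; adjust it to the message the site shows.
    """
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    # Normalise whitespace and case before searching for the marker.
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return marker in text
```

With the code from the question, `looks_like_js_fallback(requests.get("https://twitter.com/mybmc").text)` would be expected to return True as long as the site keeps serving that fallback message.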

I'd suggest practicing on a different page or, if it must be the Twitter front end, using Selenium, perhaps with a standalone browser container, to do the scraping.
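As a rough sketch of the Selenium route: a real browser loads the page, runs its scripts, and hands back the resulting DOM. The helper names `make_headless_chrome` and `fetch_rendered_html` are hypothetical, and the sketch assumes Selenium 4 with a local Chrome install. Waiting for `document.readyState` is only a crude signal; a page that keeps loading content afterwards may need a wait on a specific element instead.

```python
import time


def make_headless_chrome():
    """Create a headless Chrome driver (assumes selenium 4 and Chrome are installed)."""
    # Imported here so the rest of the sketch loads without selenium present.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    return webdriver.Chrome(options=options)


def fetch_rendered_html(driver, url, timeout=10.0, poll=0.25):
    """Load `url` in a browser driver and return the DOM after scripts have run."""
    driver.get(url)
    deadline = time.monotonic() + timeout
    # Poll until the browser reports that the document has finished loading,
    # or give up after `timeout` seconds and return whatever is there.
    while time.monotonic() < deadline:
        if driver.execute_script("return document.readyState") == "complete":
            break
        time.sleep(poll)
    return driver.page_source
```

Typical use would be to create the driver with `make_headless_chrome()`, pass the result of `fetch_rendered_html(driver, url)` to `BeautifulSoup(..., 'html.parser')` as in the question, and call `driver.quit()` when done.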
