I am trying to build a web scraper that tells me the number of times a hashtag is used on Instagram, but I keep getting either an error code on different iterations or "None" for the current response. Here is my code and the HTML.
Python
import requests
from bs4 import BeautifulSoup
url = 'https://www.instagram.com/explore/tags/savethekids/'
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
tag = soup.find("span", {"class": "g47SY "})
print(tag)
That's the code I wrote.
HTML
<span class="-nal3 ">
<span class="g47SY ">22,922</span>
" posts"
</span>
That is the HTML from Instagram
If anyone who actually knows what they are doing could point out what I'm doing wrong and how to fix it, that would be great.
Try this:
import requests
url = 'https://www.instagram.com/explore/tags/savethekids/?__a=1'
response = requests.get(url)
count = response.json().get('graphql', {}).get('hashtag', {}).get('edge_hashtag_to_media', {}).get('count')
print(count)
Output:
22924
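The chained .get calls above return None instead of raising an error when a key is missing, which is useful if Instagram answers with something other than the expected JSON. Here is a minimal offline sketch of that lookup; the helper name hashtag_count and both sample dictionaries are made up for illustration and are not real Instagram responses.

```python
def hashtag_count(payload):
    """Return the media count from a decoded JSON payload, or None if a key is missing."""
    return (
        payload.get('graphql', {})
        .get('hashtag', {})
        .get('edge_hashtag_to_media', {})
        .get('count')
    )

# Hypothetical payloads, shaped after the keys used in the answer above.
sample = {'graphql': {'hashtag': {'edge_hashtag_to_media': {'count': 22924}}}}
unexpected = {'message': 'login required'}

print(hashtag_count(sample))      # 22924
print(hashtag_count(unexpected))  # None
```

In a real script you would pass response.json() to the helper and treat a None result as "the response did not have the expected shape".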
The issue with using requests is that the HTML has not been rendered yet: requests only downloads the initial page source, while the post count is filled in afterwards by JavaScript. Try following a tutorial on scraping Instagram.
This uses a tool called Selenium to get the actual HTML from Instagram.
The following code should get the element you are looking for once you have the Selenium webdriver working.
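from selenium.webdriver import Chrome
from selenium.webdriver.common.by import By

browser = Chrome()
url = 'https://www.instagram.com/explore/tags/savethekids/'
browser.get(url)
# find_element_by_class_name was removed in Selenium 4.3; use find_element with By instead
# .text prints the post count rather than the WebElement object
print(browser.find_element(By.CLASS_NAME, 'g47SY').text)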
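To see why the original requests approach printed None, the difference between unrendered and rendered HTML can be reproduced offline. The sketch below uses only the standard library's html.parser so it runs without extra packages; SpanTextFinder, find_count and the static_html string are made up for illustration, and static_html is only a stand-in for the kind of script-only page that requests might receive.

```python
from html.parser import HTMLParser

class SpanTextFinder(HTMLParser):
    """Collect the text of the first <span> carrying a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.inside = False
        self.text = None

    def handle_starttag(self, tag, attrs):
        # split() ignores the trailing space in class="g47SY "
        classes = (dict(attrs).get('class') or '').split()
        if tag == 'span' and self.class_name in classes and self.text is None:
            self.inside = True

    def handle_data(self, data):
        if self.inside:
            self.text = data.strip()
            self.inside = False

def find_count(html):
    finder = SpanTextFinder('g47SY')
    finder.feed(html)
    return finder.text

# Hypothetical stand-in for an unrendered page: scripts only, no count span yet.
static_html = '<html><body><script>/* page is built here */</script></body></html>'
# The markup from the question, as it looks after the browser has run the scripts.
rendered_html = '<span class="-nal3 "><span class="g47SY ">22,922</span> posts</span>'

print(find_count(static_html))    # None
print(find_count(rendered_html))  # 22,922
```

The same search succeeds or fails depending only on whether the span exists in the HTML it is given, which is why a real browser driven by Selenium finds the element and a plain download does not.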