Python Instagram Web scraper troubles

Question

I am trying to build a web scraper that tells me the number of times a hashtag is used on Instagram, but I keep getting either an error code on different iterations or "None" for the current response. Here is my code and the HTML.

Python

import requests
from bs4 import BeautifulSoup
url = 'https://www.instagram.com/explore/tags/savethekids/'
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
tag = soup.find("span", {"class": "g47SY "})
print(tag)

That's the code I wrote.

HTML

<span class="-nal3 ">
  <span class="g47SY ">22,922</span> 
   " posts"
</span>

That is the HTML from Instagram

If anyone who actually knows what they are doing could point out what I'm doing wrong and how to fix it, that would be great.

Answer 1

Try this:

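import requests

# The ?__a=1 query parameter requests the page data as JSON instead of HTML
url = 'https://www.instagram.com/explore/tags/savethekids/?__a=1'
response = requests.get(url)

# Walk the nested JSON; each .get() falls back to {} so a missing key yields None
count = response.json().get('graphql', {}).get('hashtag', {}).get('edge_hashtag_to_media', {}).get('count')
print(count)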
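
The chained .get() calls still raise an error if one of the intermediate values is present but null, or if the response has a different shape. A minimal sketch of a more defensive lookup, assuming the same graphql / hashtag / edge_hashtag_to_media / count layout used above. The helper name extract_hashtag_count and the sample payload are hypothetical, for illustration only:

```python
def extract_hashtag_count(payload):
    """Return the post count from a parsed JSON payload, or None if absent."""
    node = payload
    for key in ("graphql", "hashtag", "edge_hashtag_to_media", "count"):
        # Stop as soon as the current level is not a dict (e.g. None or a list)
        if not isinstance(node, dict):
            return None
        node = node.get(key)
    return node


# Hypothetical payload shaped like the response this answer expects
sample = {"graphql": {"hashtag": {"edge_hashtag_to_media": {"count": 22922}}}}
print(extract_hashtag_count(sample))  # 22922
print(extract_hashtag_count({}))      # None
```

With this helper you could pass it response.json() and check for None before using the result.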

Answer 2

The issue with requests is that it only downloads the initial HTML; the post count is filled in by JavaScript afterwards, so the span is not in the response yet. Try following a tutorial on scraping Instagram.

This uses a tool called Selenium to get the actual rendered HTML from Instagram.

The following code should get the element you are looking for once you have the Selenium webdriver working.

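from selenium.webdriver import Chrome
from selenium.webdriver.common.by import By

browser = Chrome()
url = 'https://www.instagram.com/explore/tags/savethekids/'
browser.get(url)
# find_element_by_class_name was removed in newer Selenium 4 releases; use By.CLASS_NAME
element = browser.find_element(By.CLASS_NAME, 'g47SY')
print(element.text)  # print the visible text rather than the WebElement object
browser.quit()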
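
Once the element is found, its text is a formatted string such as "22,922" (the value in the question's HTML), so it needs converting before you can treat it as a number. A small sketch, assuming the count uses commas as thousands separators; parse_count is a hypothetical helper name:

```python
def parse_count(text):
    """Convert a count string like '22,922' to an int."""
    # Drop surrounding whitespace and the thousands separators before converting
    return int(text.strip().replace(",", ""))


print(parse_count("22,922"))  # 22922
```

This raises ValueError for text that is not a plain comma-separated integer, which is a reasonable signal that the page layout has changed.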

2 answers

solution1
1 2020-03-03 14:50:57

solution2
-1 2020-03-03 03:36:46
