Python - parsing comments from Steam

Question

I want to practice parsing values from a website. However, when I parse the comments from Steam, I can only get the first page of comments. How do I crawl all of them?

Here is my code:

from bs4 import BeautifulSoup
import urllib.request

# Download the announcement page and parse it (the 'lxml' parser must be installed).
url = 'http://steamcommunity.com/games/dota2/announcements/detail/1449457773770927103'
html = urllib.request.urlopen(url).read()
soup = BeautifulSoup(html, 'lxml')

# Only the first page of comments is present in this HTML.
for t in soup.body.find_all('div', class_='commentthread_comment_text'):
    print(t.text)

Answer 1

If you open your browser's dev console, go to the Network tab, and then click the comments' next button, you'll see that the page makes a request to the following URL:

https://steamcommunity.com/comment/ClanAnnouncement/render/103582791433224455/1449457773770927103/

EDIT:

In the response body you'll see three properties: start, pagesize, and total_count. If you keep increasing the start query parameter by pagesize until it reaches total_count, you'll be able to fetch all the comments:

https://steamcommunity.com/comment/ClanAnnouncement/render/103582791433224455/1449457773770927103/?start=10

https://steamcommunity.com/comment/ClanAnnouncement/render/103582791433224455/1449457773770927103/?start=20
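
A rough sketch of that paging loop in Python might look like the following. It assumes the endpoint returns JSON and that the comment markup sits in a field called comments_html; that field name is a guess, not something confirmed above, so check the real response in the Network tab. The helper names fetch_page, iter_comment_pages, and extract_comments are made up for illustration.

```python
import json
import urllib.request

BASE_URL = ('https://steamcommunity.com/comment/ClanAnnouncement/render/'
            '103582791433224455/1449457773770927103/')


def fetch_page(start):
    """Request one page of comments, beginning at offset `start`."""
    with urllib.request.urlopen(f'{BASE_URL}?start={start}') as response:
        # Assumption: the body is JSON containing start, pagesize, total_count.
        return json.loads(response.read().decode('utf-8'))


def iter_comment_pages(fetch=fetch_page):
    """Yield the HTML fragment of every comment page, one page at a time."""
    start = 0
    while True:
        data = fetch(start)
        # Assumption: 'comments_html' is the field holding the comment markup.
        yield data.get('comments_html', '')
        pagesize = int(data.get('pagesize', 0))
        total_count = int(data.get('total_count', 0))
        start += pagesize
        # Stop when every comment was seen, or if pagesize is missing/zero.
        if pagesize <= 0 or start >= total_count:
            break


def extract_comments(html):
    """Return the text of each comment found in one HTML fragment."""
    # Imported here so the paging logic works even without bs4 installed.
    from bs4 import BeautifulSoup
    soup = BeautifulSoup(html, 'html.parser')
    divs = soup.find_all('div', class_='commentthread_comment_text')
    return [div.get_text(strip=True) for div in divs]
```

To print everything, loop over iter_comment_pages() and call extract_comments(page) on each page. Passing your own fetch function makes it easy to try the loop without hitting the network.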

solution1
0 ACCEPTED 2017-11-28 07:33:17