Python post requests - web scraping
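So I'm trying to access some data for web scraping. However, I've run into trouble changing the time period of the data shown in the code below when I try to extract the chart from the site. Is there any way to extract the chart data, or to switch this snippet from the active data-timeperiod="today" to data-timeperiod="week"?
For some additional information: I tried using the Network tab in Chrome to change this setting through a POST request, but every time I was denied access.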
<div class="fLeft">
<ul class="chartsTimeperiod cleanList floatList clearFix buttonPane">
<li class="active">
<a href="#" data-timeperiod="today" class="active default">
1 d.</a>
</li>
<li class="">
<a href="#" data-timeperiod="week" class="">
1 v.</a>
</li>
<li class="">
<a href="#" data-timeperiod="month" class="">
1 mån.</a>
</li>
<li class="">
<a href="#" data-timeperiod="three_months" class="">
3 mån.</a>
</li>
<li class="">
<a href="#" data-timeperiod="this_year" class="">
i år</a>
</li>
<li class="">
<a href="#" data-timeperiod="year" class="">
1 år</a>
</li>
<li class="last">
<a href="#" data-timeperiod="three_years" class="">
3 år</a>
</li>
</ul>
</div>
Through the Network tab I can see a request with the following payload. Is this what I should use to access the data, or am I on the wrong track?
{"orderbookId":842107,"chartType":"AREA","widthOfPlotContainer":558,"chartResolution":"MINUTE","navigator":true,"percentage":false,"volume":false,"owners":false,"timePeriod":"week","ta":[],"compareIds":[19002]}
Question 2 - example: based on this
<form method="get" class="forumPagerForm">
<label for="pageSizeSelect" class="fLeft marginTop5px">Visa antal inlägg:</label>
<select id="pageSizeSelect" class="pageSizeSelect">
<option >15</option>
<option >25</option>
<option >50</option>
<option >75</option>
<option >100</option>
<option selected="selected">200</option>
</select>
</form>
Attempt:
import requests
janson = {
"orderbookId": '842107',
"chartType": "AREA",
"widthOfPlotContainer": '558',
"chartResolution": "MINUTE",
"navigator": 'true',
"percentage": 'false',
"volume": 'false',
"owners": 'false',
"timePeriod": "week",
"ta": [],
"compareIds": ['19002']
}
s = requests.Session()
s.get('https://www.avanza.se/aktier/om-aktien.html/842107/gabather')
p = s.post('https://www.avanza.se/ab/component/highstockchart/getchart/orderbook', json=janson)
print(p)
and then scrape from the variable p
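One visible difference between the attempt above and the payload captured in the Network tab is the value types: the attempt sends numbers and booleans as strings. Whether that is what causes the rejection is not established in this thread. The sketch below only shows, using the standard json module and a three-field fragment of the payload, that the two dictionaries serialize to different JSON bodies.

```python
import json

# Fragment of the attempt: numbers and booleans quoted as strings.
attempt = {"orderbookId": "842107", "navigator": "true", "compareIds": ["19002"]}

# The same fields with the types seen in the captured request payload.
captured = {"orderbookId": 842107, "navigator": True, "compareIds": [19002]}

print(json.dumps(attempt))
# {"orderbookId": "842107", "navigator": "true", "compareIds": ["19002"]}
print(json.dumps(captured))
# {"orderbookId": 842107, "navigator": true, "compareIds": [19002]}
```

Using Python's own int, bool and list types lets the json= argument of requests produce the same kind of body the browser sends.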
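You want to get the points from the chart, right? If you change the chart's time period from "week" to "month" and watch the network traffic logger, you can see the browser make an HTTP POST request to https://www.avanza.se/ab/component/highstockchart/getchart/orderbook.
Simply imitate that request. Here the time period ("timePeriod") is set to "week", but you should be able to change it to "month", etc. The code below makes the request and prints the first ten points: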
import sys

import requests


def main():
    url = "https://www.avanza.se/ab/component/highstockchart/getchart/orderbook"

    # Same JSON payload the browser sends; "timePeriod" selects the chart range.
    payload = {
        "chartResolution": "MINUTE",
        "chartType": "AREA",
        "compareIds": [19002],
        "navigator": True,
        "orderbookId": 842107,
        "owners": False,
        "percentage": False,
        "ta": [],
        "timePeriod": "week",
        "volume": False,
        "widthOfPlotContainer": 558
    }

    response = requests.post(url, json=payload)
    response.raise_for_status()
    data = response.json()

    # Each point is a pair; print its second element first, then its first.
    for first, second in data["dataPoints"][0:10]:
        print(second, first)

    return 0


if __name__ == "__main__":
    sys.exit(main())
Output:
None 1594103400000
8.36 1594105200000
8.4 1594107000000
8.26 1594108800000
8.3 1594110600000
8.42 1594112400000
8.54 1594114200000
8.5 1594116000000
8.52 1594117800000
8.6 1594119600000
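In the output above, the first column is the value and the second looks like a Unix timestamp in milliseconds. That reading is an assumption based on the magnitude of the numbers; the thread does not state it. Under that assumption, here is a minimal sketch for turning the points into readable UTC datetimes and skipping empty values. The helper name to_series is hypothetical, and the sample pairs are copied from the output above.

```python
from datetime import datetime, timezone


def to_series(data_points):
    """Convert [timestamp_ms, value] pairs to (UTC datetime, value) tuples.

    Assumes the timestamps are Unix epoch milliseconds. Pairs whose
    value is None are skipped.
    """
    series = []
    for timestamp_ms, value in data_points:
        if value is None:
            continue
        moment = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
        series.append((moment, value))
    return series


# First three points from the output above, in [timestamp, value] order.
sample = [
    [1594103400000, None],
    [1594105200000, 8.36],
    [1594107000000, 8.4],
]

for moment, value in to_series(sample):
    print(moment.isoformat(), value)
```

With a live response, data["dataPoints"] would take the place of sample.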
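To switch between periods without editing the dictionary by hand, the payload can be built by a small function. This is only a sketch: build_payload and TIME_PERIODS are hypothetical names, and the allowed values are copied from the data-timeperiod attributes in the question's HTML. Only "week" is shown working in this thread, so whether the endpoint accepts the other values together with "chartResolution": "MINUTE" is an assumption.

```python
# Values of the data-timeperiod attributes from the HTML in the question.
TIME_PERIODS = (
    "today", "week", "month", "three_months",
    "this_year", "year", "three_years",
)


def build_payload(time_period, orderbook_id=842107, compare_ids=(19002,)):
    """Return the chart request payload for the given time period."""
    if time_period not in TIME_PERIODS:
        raise ValueError(f"unknown time period: {time_period!r}")
    return {
        "orderbookId": orderbook_id,
        "chartType": "AREA",
        "widthOfPlotContainer": 558,
        "chartResolution": "MINUTE",  # assumption: kept as in the captured request
        "navigator": True,
        "percentage": False,
        "volume": False,
        "owners": False,
        "timePeriod": time_period,
        "ta": [],
        "compareIds": list(compare_ids),
    }


payload = build_payload("month")
print(payload["timePeriod"])
```

The result can then be sent the same way as before, for example requests.post(url, json=build_payload("month")).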