
Web scraping from a JavaScript-rendered website

I want to scrape the form data from https://www.investing.com/commodities/gold-historical-data, but the form is generated by JavaScript. I recorded the actions with iMacros and got this:

TAG POS=1 TYPE=DIV ATTR=ID:widgetFieldDateRange
    TAG POS=1 TYPE=A ATTR=TXT:20
    TAG POS=2 TYPE=A ATTR=TXT:13
    TAG POS=1 TYPE=A ATTR=ID:applyBtn

Can anyone tell me how to convert this to Python code that I can use with Selenium?

It seems like you need a POST request (Ajax).

How did I find that?

I inspected the XHR requests in the Network tab of the browser's developer tools.

[Screenshot: investing_ajax_post]

The POST data you need is (replace with the dates you want):

curr_id=8830
smlID=300004
st_date=08/09/2017
end_date=08/21/2017
interval_sec=Daily
sort_col=date
sort_ord=DESC
action=historical_data

The IDs in the POST data above are probably specific to this market (gold-historical-data), so for other markets inspect the network traffic again and check the POST data each time.
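As a rough sketch, the POST fields listed above could be built for any date range like this. The helper name build_payload and its parameter names are made up for illustration. The default IDs are the gold values shown above, and the MM/DD/YYYY date format is inferred from the sample values (08/21/2017).

```python
from datetime import date


def build_payload(start, end, curr_id=8830, sml_id=300004):
    """Build the form fields for one date range (hypothetical helper).

    The default IDs are the ones captured for gold-historical-data;
    other markets will most likely need different values.
    """
    if start > end:
        raise ValueError("start date must not be after end date")
    date_format = "%m/%d/%Y"  # MM/DD/YYYY, as in the captured request
    return {
        "curr_id": str(curr_id),
        "smlID": str(sml_id),
        "st_date": start.strftime(date_format),
        "end_date": end.strftime(date_format),
        "interval_sec": "Daily",
        "sort_col": "date",
        "sort_ord": "DESC",
        "action": "historical_data",
    }


payload = build_payload(date(2017, 8, 9), date(2017, 8, 21))
print(payload["st_date"], payload["end_date"])  # prints: 08/09/2017 08/21/2017
```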

How do you implement this in Python?

You need a module called requests.

Specifically, read this
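A minimal sketch of sending that POST with requests might look like the following. The function name fetch_historical_data is hypothetical. The request URL is not given here, so it has to be copied from the same XHR entry in the Network tab. The two headers are an assumption about what the site expects from an Ajax call, not something confirmed above.

```python
def fetch_historical_data(session, url, payload, timeout=10):
    """POST the form fields and return the raw response body.

    session: a requests.Session (or any object with a compatible .post()).
    url:     the Ajax endpoint copied from the Network tab (not given here).
    payload: dict of the POST fields (curr_id, smlID, st_date, ...).
    """
    headers = {
        # Assumption: the site may reject requests that do not look
        # like a browser-issued Ajax call.
        "User-Agent": "Mozilla/5.0",
        "X-Requested-With": "XMLHttpRequest",
    }
    # data= sends the dict form-encoded, like a browser form submission.
    response = session.post(url, data=payload, headers=headers, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return response.text
```

To use it, create a session with requests.Session() and pass it in together with the copied URL and the POST fields. The format of the returned body (probably an HTML fragment containing the table) should be checked before parsing it.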

