
Python web scraping: onclick AJAX request returns nothing with status 200

I am trying to scrape table data from a website. The data I want is "hiding" behind an onclick event.

<a class="text" onclick="javascript:openPAOnSR_RS('some_sku', 'brandname','divId', 'some_args','OPC Page Details');cmTagAndLink('Open Link','OPC Page Details',null,null,null);">The Click</a>

Clicking the link triggers a POST request. Some of its details are below.

Request URL: http://www.somewebsite.com/catalog/tables.do?some_sku=sku&brandKey=brandname&divId=divId
Request Method: POST
Status Code: 200 OK
Remote Address: 23.xxxxxxxxxxx
Referrer Policy: no-referrer-when-downgrade

So I wrote the code below, but it did not return anything.

from urllib.parse import urlencode

import requests
from requests.exceptions import RequestException


def get_page_index():
    # Same parameters the browser sends in the query string.
    string_param = {
        'some_sku': 'sku',
        'brandKey': 'brandname',
        'divId': 'divId'
    }

    url = "http://www.somewebsite.com/catalog/tables.do?" + urlencode(string_param)
    try:
        # The parameters go out twice: in the query string and as form data.
        response = requests.post(url=url, data=string_param)
        if response.status_code == 200:
            print(response.url, response.content)
            return response.text
        return None
    except RequestException as e:
        print(e)
        return None


get_page_index()

I get no output even though the status is 200. How can I get the data "behind" the onclick event?

urllib (like requests) only gives you the HTML the server sends back. It does not run the page's JavaScript, so it cannot trigger the onclick handler. Modules like robobrowser and Scrapy can work with plain HTML check boxes or buttons, but they do not execute JavaScript either.
So the preferable options are:

1) Selenium driving a headless browser such as PhantomJS (PhantomJS is no longer maintained; headless Chrome or Firefox works the same way).

2) Scrapy + Splash
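
A minimal sketch of option 1, assuming Selenium 4 with headless Chrome rather than PhantomJS. The function names, the page URL you pass in, and the idea that the 'divId' argument from the onclick is the id of the element that receives the table are all assumptions for illustration, not something confirmed for this site.

```python
def build_onclick_xpath(function_name):
    """XPath for <a> elements whose onclick attribute calls the given JS function."""
    return f"//a[contains(@onclick, '{function_name}(')]"


def fetch_after_click(page_url, function_name="openPAOnSR_RS",
                      container_id="divId", timeout=10):
    """Open the page, click the link, and return the HTML loaded into the container.

    Assumption: the AJAX response is injected into the element whose id is
    container_id. Adjust this to whatever the real page does.
    """
    # Imported here so build_onclick_xpath stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")  # run Chrome without a visible window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(page_url)
        wait = WebDriverWait(driver, timeout)

        # Wait for the link to be clickable, then let the browser run its onclick JS.
        locator = (By.XPATH, build_onclick_xpath(function_name))
        wait.until(EC.element_to_be_clickable(locator)).click()

        def container_with_content(d):
            # Keep polling until the container exists and has some text in it.
            element = d.find_element(By.ID, container_id)
            return element if element.text.strip() else False

        container = wait.until(container_with_content)
        return container.get_attribute("innerHTML")
    finally:
        driver.quit()  # always close the browser, even on timeout
```

Calling fetch_after_click with the URL of the page that shows "The Click" should return the table markup if those assumptions hold. The HTML can then be parsed with whatever parser you already use.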

May I ask what steps you go through before clicking the link?
Do you click it after entering some information, or do you click it as soon as the page appears?
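
The reason the earlier steps matter: if the page sets cookies or other session state before the click, a bare POST will not carry them. A rough sketch of replaying the request with that state, assuming the server only needs the cookies from the hosting page plus typical XMLHttpRequest headers. That assumption is untested for this site, and the helper names and header choices are illustrative.

```python
def ajax_headers(referer):
    """Headers a browser commonly attaches to an AJAX call (assumed, not site-specific)."""
    return {
        "X-Requested-With": "XMLHttpRequest",
        "Referer": referer,
        "User-Agent": "Mozilla/5.0",
    }


def fetch_table(session, page_url, endpoint, params, timeout=10):
    """Load the hosting page first, then replay the POST with the same session.

    session is expected to be a requests.Session (or any object with
    compatible get/post methods), so cookies from step 1 are reused in step 2.
    """
    # Step 1: visit the page that contains the link so the session collects cookies.
    session.get(page_url, timeout=timeout).raise_for_status()

    # Step 2: send the POST the browser makes on click, parameters in the query string.
    response = session.post(
        endpoint,
        params=params,
        headers=ajax_headers(page_url),
        timeout=timeout,
    )
    response.raise_for_status()
    return response.text
```

With requests installed, this would be called as fetch_table(requests.Session(), page_url, "http://www.somewebsite.com/catalog/tables.do", string_param), where page_url is the address of the page that contains the link. If the body still comes back empty, the content is probably built by JavaScript and a real browser is needed.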


 