
Scraping data using Scrapy

I want to scrape data from http://money.moneygram.com.au/ (the page can also be reached at https://www.moneygram.com/wps/portal/moneygramonline/home/estimator?LC=en-GB ). The data I need is the exchange rate shown on that page. When I fetch the HTML, the option selected in the first drop-down is AUD (Australian dollar), converting to USD (US dollar). How can I select INR (Indian rupee) in the first drop-down instead? Whenever I request this URL, it picks AUD by default. The code I am using is:

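from __future__ import absolute_import

import scrapy
import MySQLdb  # imported in the original snippet; not used yet


class DmozSpider(scrapy.Spider):  # scrapy.Spider replaces the old BaseSpider
    name = "moneygram"
    # The start URL is on moneygram.com.au, which "moneygram.com" alone does not cover.
    allowed_domains = ["moneygram.com", "moneygram.com.au"]
    start_urls = ["http://money.moneygram.com.au/"]

    def parse(self, response):
        # Save the raw page under the host name, e.g. "money.moneygram.com.au".
        filename = response.url.split("/")[-2]
        with open(filename, "wb") as f:
            f.write(response.body)
        # scrapy.Selector replaces the old HtmlXPathSelector.
        hxs = scrapy.Selector(response)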

and the part of the HTML where I want INR to be selected is:

<div class="firstSelector">
     <select id="FromCurrency_dropDown" name="FromCurrency_dropDown" style="width:100%">
     <option value="AED">AED (UAE Dirham)</option>
     <option value="ARS">ARS (Argentine Peso)</option>
     <option selected="selected" value="AUD">AUD (Australian Dollar)</option>  
     <option value="BGN">BGN (Bulgarian Lev)</option>
     <option value="BND">BND (Brunei Dollar)</option>
     <option value="BRL">BRL (Brazilian Real)</option>
     <option value="CAD">CAD (Canadian Dollar)</option>
     <option value="CHF">CHF (Swiss Franc)</option>
     <option value="CLP">CLP (Chilean Peso)</option>
     <option value="CNH">CNH (Chinese Renminbi Off-Shore)</option>
     <option value="CNY">CNY (Chinese Yuan)</option>
     <option value="CZK">CZK (Czech Koruna)</option>
     <option value="DKK">DKK (Danish Kroner)</option>
     <option value="EGP">EGP (Egyptian Pound)</option>
     <option value="EUR">EUR (Euro)</option>
     <option value="FJD">FJD (Fiji Dollar)</option>
     <option value="GBP">GBP (British Pound)</option>
     <option value="HKD">HKD (Hong Kong Dollar)</option>
     <option value="HUF">HUF (Hungarian Forint)</option>
     <option value="IDR">IDR (Indonesian Rupiah)</option>
     <option value="ILS">ILS (Israeli New Shekel)</option>
     <option value="INR">INR (Indian Rupee)</option> <!-- I want this one to be selected -->
     <option value="ISK">ISK (Icelandic Krona)</option>
     <option value="JPY">JPY (Japanese Yen)</option>
     <option value="KRW">KRW (Korean Won)</option>
     <option value="KWD">KWD (Kuwaiti Dinar)</option>
     <option value="LKR">LKR (Sri Lanka Rupee)</option>
     <option value="MAD">MAD (Moroccan Dirham)</option>
     <option value="MGA">MGA (Malagasy Ariary)</option>
     <option value="MXN">MXN (Mexican Peso)</option>
     <option value="MYR">MYR (Malaysian Ringgit)</option>
     <option value="NOK">NOK (Norway Kroner)</option>
     <option value="NZD">NZD (New Zealand Dollar)</option>
     <option value="OMR">OMR (Omani Rial)</option>
     <option value="PEN">PEN (Peruvian Nuevo Sol)</option>
     <option value="PGK">PGK (Papua New Guinea Kina)</option>

</div>

This is an AJAX issue. Changing the drop-down does not change the page URL: the page's JavaScript POSTs the selected parameters to the server, and the server sends the data back.

You can find the exact URL and the POST data with Chrome's developer tools.

The request URL is " http://money.moneygram.com.au/forex-tools/currency-converter-widget-part ".

The POST form parameters are: "FromCurrency=AED&ToCurrency=VND&FromCurrency_dropDown=AED&ToCurrency_dropDown=VND&FromAmount=2561&ToAmount=&X-Requested-With=XMLHttpRequest". To convert from INR, set FromCurrency and FromCurrency_dropDown to INR, the value of the option in your drop-down.
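The parameter string above is an ordinary URL-encoded form body, so it can be built from a dictionary instead of being edited by hand. Here is a minimal sketch using only the standard library. The helper name `build_converter_form` is made up for illustration, and the choice of USD and an amount of 100 is just an example; the field names are the ones captured above.

```python
from urllib.parse import urlencode


def build_converter_form(from_currency, to_currency, amount):
    """Return the converter form fields as a dict of strings.

    Field names are copied from the captured POST data; this helper
    itself is hypothetical and not part of the site or of Scrapy.
    """
    return {
        "FromCurrency": from_currency,
        "ToCurrency": to_currency,
        "FromCurrency_dropDown": from_currency,
        "ToCurrency_dropDown": to_currency,
        "FromAmount": str(amount),
        "ToAmount": "",  # left empty, as in the captured request
        "X-Requested-With": "XMLHttpRequest",
    }


# Encode the fields in the same order as the captured parameter string.
body = urlencode(build_converter_form("INR", "USD", 100))
print(body)
# FromCurrency=INR&ToCurrency=USD&FromCurrency_dropDown=INR&ToCurrency_dropDown=USD&FromAmount=100&ToAmount=&X-Requested-With=XMLHttpRequest
```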

So you can send a POST request with these parameters to this URL from Scrapy, then parse the HTML that comes back to get what you want.
