
Scraping data from an XHR with requests

I want to scrape data from this website, but the results I get differ from those shown on the site. For example, when I run the code with an amount of 14000 and a duration of 48 months, I get a TAE of 7.03, while the value on the site is 6.44. I think the params are set incorrectly. Could someone help me?

I changed the params in several ways, but none of them worked. I do not know how to find the right params.

import requests
from bs4 import BeautifulSoup
import re
import json
import pandas as pd

# First, collect the auth values embedded in the simulator page
r = requests.Session()
response = r.get("https://simuladores.bancosantander.es/SantanderES/loansimulatorweb.aspx?por=webpublica&prv=publico&m=100&cta=1&ls=0#/t0")
soup = BeautifulSoup(response.content, 'html')
key = soup.find_all('script', string=re.compile(r'Afi\.AfiAuth\.Init'))
pattern = r"Afi\.AfiAuth\.Init\((.*?)\)"

# Arguments of the Afi.AfiAuth.Init(...) call: the date-time is the second one, the signature the last
init_args = re.findall(pattern, key[0].text)[0].split(',')
WSSignature = init_args[-1].replace('\'', '')
WSDateTime = init_args[1].replace('\'', '')

headers = {
    'Origin': 'https://simuladores.bancosantander.es',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36',
    'Content-Type': 'application/json;charset=UTF-8',
    'Accept': 'application/json, text/plain, */*',
    'WSSignature': WSSignature,
    'Referer': 'https://simuladores.bancosantander.es/SantanderES/loansimulatorweb.aspx?por=webpublica&prv=publico&m=100&cta=1&ls=0',
    'WSDateTime': WSDateTime,
    'WSClientCode': 'SantanderES',
}

# These are the standard params of a request
params = {'wsInputs': {'finality': 'prestamo coche',
  'productCode': 'p100',
  'capitalOrInstallment': 5000,
  'monthsTerm': 96,
  'mothsInitialTerm': 12,
  'openingCommission': 1.5,
  'minOpeningCommission': 60,
  'financeOpeningCommission': True,
  'interestRate': 5.5,
  'interestRateReferenceIndex': 0,
  'interestRateSecondaryReferenceIndex': 0,
  'interestRateSecondaryWithoutVinculation': 6.5,
  'interestRateSecondaryWithAllVinculation': 0,
  'interestRateSecondary': 6.5,
  'loanDate': '2019-06-13',
  'birthDate': '2001-06-13',
  'financeLoanProtectionInsurance': True,
  'percentageNotaryCosts': 0.003,
  'loanCalculationMethod': 0,
  'calculationBase': 4,
  'frecuencyAmortization': 12,
  'frecuencyInterestPay': 12,
  'calendarConvention': 0,
  'taeCalculationBaseType': 4,
  'lackMode': 0,
  'amortizationCarencyMonths': 0,
  'typeAmortization': 1,
  'insuranceCostSinglePremium': 0,
  'with123': False,
  'electricVehicle': False}}
# The scraping function: posts one simulation request and returns its result
def scrape(amount, duration, params):
    # Note: this updates the shared params dict in place
    params['wsInputs']['capitalOrInstallment'] = amount
    params['wsInputs']['monthsTerm'] = duration
    response = r.post('https://simuladores.bancosantander.es/WS/WSSantanderTotalLoan.asmx/Calculate', headers=headers, data=json.dumps(params))
    return json.loads(response.content)['d']


Amounts = [13000]
Durations = [48, 60, 72, 84, 96]
results = []
for amount in Amounts:
    for duration in Durations:
        result = scrape(amount, duration, params)
        result['Amount'] = amount
        result['Duration'] = duration
        results.append(result)

df = pd.DataFrame(results)

First, as said by @Richard, there is nothing wrong with your code.

The reason you get 7.03% instead of 6.44% is that the loan simulator you are using cheats somewhat (to appear more competitive). The difference comes from how the Comisión de apertura financiada (the financed opening commission) is taken into account: if you set the standard parameter 'openingCommission' to 0, you get 6.45%. What about getting exactly 6.44%? A suggestion follows.
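As a rough illustration of that parameter change, the sketch below builds a modified copy of the request body instead of editing the shared dict. The helper name `with_inputs` is hypothetical, and the `params` dict here is a trimmed-down stand-in for the full one in the question; no request is sent.

```python
import copy


def with_inputs(params, **overrides):
    """Return a deep copy of params with some wsInputs fields replaced."""
    new_params = copy.deepcopy(params)
    new_params['wsInputs'].update(overrides)
    return new_params


# Trimmed-down stand-in for the full params dict from the question.
params = {'wsInputs': {'capitalOrInstallment': 5000,
                       'monthsTerm': 96,
                       'openingCommission': 1.5}}

no_fee = with_inputs(params, capitalOrInstallment=14000, monthsTerm=48,
                     openingCommission=0)
print(no_fee['wsInputs']['openingCommission'])  # 0
print(params['wsInputs']['openingCommission'])  # 1.5, the original is untouched
```

The resulting dict could then be passed to `json.dumps` and posted exactly as in the question's `scrape` function.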


Explanation (using French terminology)

If I compute the TEG and the TAE associated with the parameters {14k€, 48 months, 330.47€/month}, I get 6.26% and 6.44%.

But if I do the same calculation, including the Comisión de apertura financiada of 210€, I get ~7.03% and 7.23%.

[Image: TEG and TAE formulas expressed in terms of i]

where i above (and below) stands for the internal rate of return (IRR), i.e. the rate that solves equation (E1):

[Image: equation (E1)]
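For reference, a common way to write these relations is sketched here. It assumes a capital C repaid in n constant monthly payments M (symbols chosen for illustration), with i the monthly rate, the TEG taken as the proportional annual rate and the TAE as the compounded one. These conventions are an assumption, but they do reproduce the 6.26% and 6.44% pair quoted above.

```latex
\begin{align}
C - \sum_{k=1}^{n} \frac{M}{(1+i)^{k}} &= 0 \tag{E1} \\
\mathrm{TEG} &= 12\, i \\
\mathrm{TAE} &= (1+i)^{12} - 1
\end{align}
```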


This means that you should consider integrating an IRR solver into your workflow, using the available pieces of information (monthly payments, duration, total amount and even fees) to recompute the TAE.
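A minimal sketch of such a solver follows, using plain bisection (one possible choice among several root-finding methods). The function names `monthly_irr` and `annual_rates` are hypothetical, and the conversions TEG = 12·i and TAE = (1+i)^12 − 1 are assumed conventions rather than something confirmed by the simulator.

```python
def monthly_irr(capital, payment, months, tol=1e-12):
    """Monthly rate i at which the discounted payments equal the capital.

    Solved by bisection; requires payment * months > capital so that a
    positive rate exists.
    """
    def gap(i):
        # Present value of `months` constant payments minus the capital;
        # this decreases as i grows.
        return payment * (1 - (1 + i) ** -months) / i - capital

    lo, hi = 1e-9, 1.0  # search bracket for the monthly rate
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) > 0:
            lo = mid  # payments still worth more than the capital: rate too low
        else:
            hi = mid
    return (lo + hi) / 2


def annual_rates(i):
    """Assumed conventions: TEG = 12 * i, TAE = (1 + i) ** 12 - 1, in percent."""
    return 12 * i * 100, ((1 + i) ** 12 - 1) * 100


i = monthly_irr(capital=14000, payment=330.47, months=48)
teg, tae = annual_rates(i)
print(round(teg, 2), round(tae, 2))  # about 6.26 and 6.44
```

To account for fees, the same function can be called with the capital actually received by the borrower (amount net of fees) instead of the nominal amount.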
