Beautiful Soup returns only JavaScript code?

I want to scrape data from the following page: https://dell.secure.force.com/FAP/?c=de&l=de&pt=findareseller I looked for the data in the browser's Network tab but found nothing. I then tried BeautifulSoup, but the response contains only JavaScript and empty tbody tags, even though Inspect Element shows the data in a table.

import requests
from bs4 import BeautifulSoup

url = 'https://dell.secure.force.com/FAP'
headers = {
    'Connection': 'keep-alive'
}
# Query-string parameters taken from the page URL above
params = {
    'c': 'de',
    'l': 'de',
    'pt': 'findareseller'
}
page = requests.get(url, params=params, headers=headers)
soup = BeautifulSoup(page.text, 'html.parser')
soup.find_all('table')  # returns only JavaScript and empty <tbody> tags

Can someone help me figure out how to scrape this data?

soup.find_all('table') returns a list with all table elements.

So to find your specific element, look for a distinct property that sets it apart from all the other tables (like an id or class).

To access an element's attributes, store the result (for example t = soup.find_all('table')) and use t[0].attrs, which returns a dict of them; use, for example, t[0]["width"] to read a single one.

Also, soup.select('table') accepts CSS selectors as its string argument, so you can use selector syntax instead of BeautifulSoup's own search functions.
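
To make the lookups above concrete, here is a minimal sketch run against a made-up HTML snippet. The markup, the ids "nav" and "results", and the class "resellers" are assumptions for illustration only; they are not taken from the Dell page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, used only to illustrate the lookups.
html = """
<table id="nav" width="100%"><tr><td>Menu</td></tr></table>
<table id="results" class="resellers" width="80%">
  <tbody>
    <tr><td>Example Reseller GmbH</td><td>Berlin</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

tables = soup.find_all("table")   # list-like result holding every <table>
first_attrs = tables[0].attrs     # dict of the first table's attributes
first_width = tables[0]["width"]  # single attribute lookup

# Narrow the search with a distinguishing attribute.
by_id = soup.find("table", id="results")
by_class = soup.find_all("table", class_="resellers")

# The same lookup expressed as a CSS selector.
cells = [td.get_text(strip=True) for td in soup.select("table#results tbody td")]
print(cells)
```

Note that none of this helps if the table body is filled in by JavaScript after the page loads, because the HTML that requests receives will not contain the rows.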

Thank you all. I figured out the answer. I used the browser's Network tab to capture the requests made during a search. I found the search URL and, to confirm it was the right one, double-clicked it; it returned exactly the same page. I then copied the request as cURL (bash) and imported it into Postman as "Raw Text". It turned out that the page actually uses a POST request. After switching to POST, I was able to fetch the data I needed. Below is the request with POST.

response = requests.request("POST", url, headers=headers, data=payload)

Then I parsed the response with BeautifulSoup as soup.

st = soup.find('input')['value'] # returns data I needed
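
Put together, the approach might look roughly like the sketch below. The original post does not show the headers or payload, so they are left as parameters here; the helper names fetch_search_page and first_input_value, and the sample markup, are hypothetical. The actual form fields have to be copied from the request captured in the browser.

```python
import requests
from bs4 import BeautifulSoup


def fetch_search_page(url, payload, headers=None, timeout=30):
    """POST the form fields copied from the browser and return the HTML."""
    response = requests.post(url, data=payload, headers=headers, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def first_input_value(html):
    """Return the value attribute of the first <input>, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    field = soup.find("input")
    return field.get("value") if field is not None else None


# Offline demonstration with made-up markup (not the real response).
sample_html = '<form><input type="hidden" name="data" value="example-payload"></form>'
sample_value = first_input_value(sample_html)
print(sample_value)
```

Against the real site, the call would be something like first_input_value(fetch_search_page(url, payload, headers)), with url, payload, and headers taken from the captured request. Using .get("value") with a None check avoids the TypeError or KeyError that soup.find('input')['value'] raises when the input or its value attribute is missing.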
