
How to scrape text from tooltips generated with javascript

I've written the following code to get the positions of all blue markers in the map.

from bs4 import BeautifulSoup
from requests_html import HTMLSession

session = HTMLSession()

url = "https://emf2.bundesnetzagentur.de/karte/Default.aspx?lat=52.4107723&lon=14.2930953&zoom=14"
r = session.get(url)
# Run the page's JavaScript and give the map a few seconds to draw its markers
r.html.render(sleep=3)
data = r.html.html

soup = BeautifulSoup(data, 'html.parser')
# The blue markers are the elements that use this image
blue_triangles = soup.find_all(src="images/funk_hf.png")
for triangle in blue_triangles[1:]:
    triangle_style = triangle['style']
    # Cut the pixel offsets out of "transform: translate3d(x, y, 0px); z-index: ..."
    pixel_position = triangle_style.split('transform: translate3d(')[1].split(', 0px); z')[0]
    print(pixel_position)

session.close()
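The chained split calls above break as soon as the style string deviates slightly from the expected layout. A minimal sketch of a more tolerant alternative using a regular expression is shown here; the helper name parse_translate3d and the sample style string are made up for illustration, and the real style attributes on the page may look different.

```python
import re

# Matches the first two arguments of translate3d(...), e.g. "431px, -27px"
TRANSLATE_RE = re.compile(
    r"translate3d\(\s*(-?\d+(?:\.\d+)?)px,\s*(-?\d+(?:\.\d+)?)px"
)


def parse_translate3d(style):
    """Return the (x, y) pixel offsets from a style string, or None if absent."""
    match = TRANSLATE_RE.search(style)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


# Hypothetical style attribute, only meant to resemble what a marker might carry
sample_style = "margin-left: -12px; transform: translate3d(431px, -27px, 0px); z-index: 210;"
position = parse_translate3d(sample_style)
print(position)
```

Returning None instead of raising makes it easy to skip elements that use the same image but are not positioned with translate3d.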

When I open the URL using a web browser, I see that each blue marker has a unique ID that is shown in a tooltip on mouseover:

[Screenshot: tooltip showing the marker ID on mouseover]

The HTML code of the tooltip appears to be rendered only when a mouseover event is triggered:

[Screenshot: HTML code of the tooltip]

Is there any way of scraping the ID from the tooltip? I was wondering whether it is possible to use the script parameter of render to force a mouseover event, but I couldn't find a way to integrate it into the code:

$('#foo').trigger('mouseover');

The points on the map are loaded by a request to the endpoint https://emf2.bundesnetzagentur.de/karte/Standortservice.asmx/GetStandorteFreigabe with the coordinates of the bounding box (in this case {"Box":{"sued":52.39231101879802,"west":14.248666763305664,"nord":52.42927461241364,"ost":14.337587356567385}} ).
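As a rough sketch, the request for such a box could be prepared like this. The endpoint URL and the key names come from the request described above; the use of POST with a JSON content type is an assumption (check the actual request in the browser's network tab), and build_box_request is a made-up helper name.

```python
import json
import urllib.request

ENDPOINT = "https://emf2.bundesnetzagentur.de/karte/Standortservice.asmx/GetStandorteFreigabe"


def build_box_request(south, west, north, east):
    """Prepare (but do not send) a request for all locations inside the box."""
    payload = {"Box": {"sued": south, "west": west, "nord": north, "ost": east}}
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the service expects a JSON body sent via POST
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


request = build_box_request(
    52.39231101879802, 14.248666763305664, 52.42927461241364, 14.337587356567385
)
print(request.get_method(), request.full_url)

# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(request) as response:
#     encrypted = json.load(response)
```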

The response is JSON, and the location data inside it is encrypted with AES. The decryption code is available in the JS script that is loaded with the page (functions CryptParams and DecryptData ).

After decryption we get this nice data: "[{"Titel":"018126","Lng":14.311666,"Lat":52.428888,"fID":1076,"sonderseite":false},{"Titel":"011720","Lng":14.259722,"Lat":52.423054,"fID":2196,"sonderseite":false},{"Titel":"87011082","Lng":14.275832,"Lat":52.401666,"fID":560919,"sonderseite":false}]"
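Once the payload is decrypted, reading it takes only a few lines. The sketch below uses the sample data shown above and assumes that the Titel field is the ID displayed in the tooltip, which has not been verified here; parse_locations is a made-up helper name.

```python
import json

# The decrypted sample data from above
decrypted = (
    '[{"Titel":"018126","Lng":14.311666,"Lat":52.428888,"fID":1076,"sonderseite":false},'
    '{"Titel":"011720","Lng":14.259722,"Lat":52.423054,"fID":2196,"sonderseite":false},'
    '{"Titel":"87011082","Lng":14.275832,"Lat":52.401666,"fID":560919,"sonderseite":false}]'
)


def parse_locations(decrypted_json):
    """Return a list of (title, latitude, longitude) tuples."""
    return [
        (item["Titel"], item["Lat"], item["Lng"])
        for item in json.loads(decrypted_json)
    ]


locations = parse_locations(decrypted)
for title, lat, lng in locations:
    print(title, lat, lng)
```

Because the data already contains geographic coordinates, there is no need to work with the pixel positions of the rendered markers at all.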

You have two options:

  1. Use Selenium or similar software to render the JS and parse the resulting DOM;

  2. Write a parser that sends a request to the GetStandorteFreigabe endpoint and decrypts its response (port the decryption code from JS to Python).
