
Scrape map coordinates using BeautifulSoup

I am trying to scrape the coordinates where a Flickr photo was taken. I tried to capture this 'a' element:

<a class="static-maps" href="https://www.flickr.com/map/?fLat=13.387866&amp;fLon=77.699174&amp;zl=13&amp;everyone_nearby=1" data-rapid_p="163"> 

using the following code:

import requests
from bs4 import BeautifulSoup

url = 'https://www.flickr.com/photos/hellosaurav/8739282947/in/photolist-ayo8gy-brAbpk-nREjXv-eyQCtp-ovie9F-rdhF3m-eB8g6z-a3jhb9-9jUqhk-evcaBQ-j7iARL-oFd27B-cZ4VaN-mfP6NR-odhcpL-hy2vMX-mHGWoM-n9ARnM-9rxT1W-oqPqDQ-6tmgQ1-oNbZXw-pogsa7-eAeMz9-asB1Qu-o3qgcx-pr6ZGC-dfTh3p-pRuMsf-9yqjrG-bS4AkB-5iDTpA-pSVfhM-ejg7mc-oKWkZX-vDvqdR-nvb2zt-oYDWki-chB5ZY-p14ReR-oJSier-n9MyRk-rGAdSf-exgySN-sFkcTb-hE2tfg-ryeRC5-rqYLen-7zAafa-p3vS3U/'
r = requests.get(url)

soup = BeautifulSoup(r.content, 'html.parser')

# header: photo title, with commas replaced so it is safe for CSV-style output
block = soup.find("div", {"class": "title-desc-block"})
header = block.find("h1") if block is not None else None
if header is not None:
    header = header.text.strip().replace(',', '|')

# the map link that carries fLat/fLon in its href
amap = soup.find("a", {"class": "static-maps"})

print(amap)

The code prints "None".

Does anyone have an idea why BeautifulSoup can't find this link?

This link is created by JavaScript. BeautifulSoup doesn't render pages, so it can't run JavaScript.
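To make that concrete, here is a minimal sketch using only the standard library's html.parser module (the same parser BeautifulSoup uses when given 'html.parser'). The page in SAMPLE_PAGE is a made-up stand-in, not the real Flickr markup, and AnchorCollector is a hypothetical helper. It shows that a static parser sees the script only as text: the 'a' element the script would create never appears as a tag.

```python
from html.parser import HTMLParser

# Hypothetical page: the map link exists only after the script runs.
# This is NOT the real Flickr source, just an illustration.
SAMPLE_PAGE = """<html><body>
<div id="map-holder"></div>
<script>
var geo = {"latitude":13.387866,"longitude":77.699174,"accuracy":16};
var link = document.createElement("a");
link.className = "static-maps";
link.href = "/map/?fLat=" + geo.latitude + "&fLon=" + geo.longitude;
document.getElementById("map-holder").appendChild(link);
</script>
</body></html>"""


class AnchorCollector(HTMLParser):
    """Record every <a> start tag and all text found inside <script>."""

    def __init__(self):
        super().__init__()
        self.anchor_classes = []   # class attribute of each <a> tag seen
        self.script_text = []      # raw text chunks found inside <script>
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchor_classes.append(dict(attrs).get("class"))
        elif tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.script_text.append(data)


parser = AnchorCollector()
parser.feed(SAMPLE_PAGE)
parser.close()

print(parser.anchor_classes)                         # no <a> tags were parsed
print('"latitude"' in "".join(parser.script_text))   # but the data is in the script text
```

The script is never executed, so the link is never built, yet the coordinate values are still sitting in the script text where a plain text search can reach them.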

The coordinates are in the downloaded page source as JavaScript text, so you can search for them there.
BeautifulSoup can't help you with that job, though. Use regular expressions.

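import re

# r is the requests response from the question's code;
# r.text is the decoded page source, so plain string patterns work
print(re.findall(r'"latitude":(.+?),', r.text))
print(re.findall(r'"longitude":(.+?),', r.text))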
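As a slightly stricter variant of the same idea, the sketch below matches only numeric values and converts them to floats. The function name extract_coordinates and the SAMPLE_SOURCE string are assumptions for illustration; the real page may format these fields differently (for example, with quoted values), so the pattern may need adjusting.

```python
import re

# Hypothetical excerpt of the page source; the real Flickr markup may differ.
SAMPLE_SOURCE = (
    '{"id":"8739282947","latitude":13.387866,'
    '"longitude":77.699174,"accuracy":16}'
)

# Optional quote, optional minus sign, digits, optional decimal part.
_NUMBER = r'\s*"?(-?\d+(?:\.\d+)?)'


def extract_coordinates(source):
    """Return (latitude, longitude) as floats, or None if either is missing."""
    lat = re.search(r'"latitude":' + _NUMBER, source)
    lon = re.search(r'"longitude":' + _NUMBER, source)
    if lat is None or lon is None:
        return None
    return float(lat.group(1)), float(lon.group(1))


print(extract_coordinates(SAMPLE_SOURCE))
```

With the response from the question's code, the call would be extract_coordinates(r.text). Returning None when a field is missing keeps the caller from crashing on photos that have no location data.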
