
Python website source extraction

I'm using urllib2 to download the source of a website, but something is going wrong. The source comes from this website: http://www.starfm.com/ All I want to do is download the entire HTML and then parse it to extract the "Now playing" section of the page.

But when I download the source with this code:

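import urllib2  # Python 2 standard library

# Fetch the HTML exactly as the server sends it (no JavaScript is executed)
response = urllib2.urlopen('http://www.starfm.com/')
html = response.read()

# Save the downloaded source to a file; "with" closes it automatically
with open("C:\\users\\Leonardo\\Desktop\\source.txt", "w") as a:
    a.write(html)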

the saved source does not show the current artist in the website's "Now playing" section.

Why? What should I do?

Thanks so much in advance.

-Leonardo

The "Now playing" text is generated by JavaScript; the page may load that information in its onload() event. In that case your code only reads the static HTML the server returns, and urllib2 never runs the script that fills in the artist.

Maybe this question will help you:

Get page generated with Javascript in Python
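
To make the point above concrete, here is a minimal sketch of what a plain HTTP download "sees". It assumes Python 3 (where urllib2 became urllib.request) and uses only the standard-library html.parser. The sample page, the element id "now-playing", and the function name loadNowPlaying are invented for illustration; they are not taken from starfm.com.

```python
from html.parser import HTMLParser


class ElementTextParser(HTMLParser):
    """Collects the text found inside the element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # > 0 while we are inside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # Note: void tags such as <br> without a closing slash are not
        # handled specially in this sketch.
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def static_text(html, element_id):
    """Return the text inside element_id as it appears in the raw HTML."""
    parser = ElementTextParser(element_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


# Hypothetical page: the placeholder stays empty until a browser runs the script.
SAMPLE_HTML = """
<html><body onload="loadNowPlaying()">
  <div id="now-playing"></div>
  <script>
    function loadNowPlaying() {
      document.getElementById("now-playing").textContent = "Some Artist";
    }
  </script>
</body></html>
"""

print(repr(static_text(SAMPLE_HTML, "now-playing")))  # prints ''
```

The artist name exists only inside the script source, so a parser working on the downloaded HTML finds an empty placeholder. To get the filled-in value you would need something that actually executes the JavaScript (for example a browser-automation tool), or you would need to request the data from wherever the script itself fetches it.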
