Extracting website source code with Python

Question

I'm using urllib2 to download the source of a website, but something is going wrong. The source comes from this website: http://www.starfm.com/. All I want to do is download the entire HTML and then parse it to extract the "Now playing" section.

But when I download the source with this code:

import urllib2

response = urllib2.urlopen('http://www.starfm.com/')
html = response.read()
# Save the raw page source exactly as the server sent it
with open("C:\\users\\Leonardo\\Desktop\\source.txt", "wb") as a:
    a.write(html)

the saved source does not show the current artist in the website's "Now playing" section.

Why? What should I do?

Thanks so much in advance.

-Leonardo

Answer 1

"Now playing" comes from JavaScript; the page probably loads that info in its onload() event. urllib2 does not execute scripts, so your code reads only the static HTML that the server sends.
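To illustrate the point, here is a minimal sketch in Python 3 using only the standard library. The markup in SAMPLE_HTML is hypothetical (it is not the real starfm.com source), and the element id "now-playing" and the helper names are made up for the example. It shows that an element filled in by a script is still empty in the raw HTML, even though the artist name appears inside the script text.

```python
from html.parser import HTMLParser

# Hypothetical markup: what a server might send before any script has run.
SAMPLE_HTML = """
<html><body>
  <div id="now-playing"></div>
  <script>
    // A browser would run this and fill the div; a plain download never does.
    document.getElementById("now-playing").textContent = "Some Artist";
  </script>
</body></html>
"""


class ElementText(HTMLParser):
    """Collect the text found inside the element with a given id."""

    def __init__(self, target_id):
        HTMLParser.__init__(self)
        self.target_id = target_id
        self.depth = 0      # > 0 while we are inside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def element_text(html, target_id):
    """Return the stripped text of the element with id == target_id."""
    parser = ElementText(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()
```

Calling element_text(SAMPLE_HTML, "now-playing") returns an empty string, because nothing has executed the script. The depth counter is deliberately simple and does not handle void tags such as br inside the target element.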

Maybe this question will help you:

Get page generated with Javascript in Python
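One approach that sometimes works, though it is not confirmed for this site, is to skip the page and request the data the script itself loads. The sketch below assumes the browser's network tab reveals a JSON URL for the track info. NOW_PLAYING_URL, the "artist" and "title" keys, and both function names are hypothetical placeholders, not anything taken from starfm.com.

```python
import json

try:
    from urllib2 import urlopen          # Python 2, as in the question
except ImportError:
    from urllib.request import urlopen   # Python 3 equivalent

# Hypothetical endpoint: replace it with whatever URL the browser's
# network tab shows the page requesting. This address is not real.
NOW_PLAYING_URL = "http://www.example.com/now_playing.json"


def parse_now_playing(raw):
    """Decode a payload assumed to look like {"artist": ..., "title": ...}."""
    data = json.loads(raw.decode("utf-8"))
    return data.get("artist"), data.get("title")


def fetch_now_playing(url=NOW_PLAYING_URL):
    """Download the assumed JSON endpoint and return (artist, title)."""
    response = urlopen(url)
    try:
        return parse_now_playing(response.read())
    finally:
        response.close()
```

If the page builds the section some other way, with no separate data request, a tool that actually runs the JavaScript (a headless browser, for example) would be needed instead.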

Answer 1: 0, accepted 2014-03-30 21:35:25
