
Saving dynamic content from web page?

Is it possible to save dynamic text from a website and dump it into a file on my server? Specifically, I'd like to capture the song title shown on http://www.z1035.com/player.php and collect all the song titles in a file on my server. What methods could I use to do this?

What you're referring to is generally known as 'scraping'. Here's an article about one way to do it with PHP:

http://www.developertutorials.com/blog/php/easy-screen-scraping-in-php-simple-html-dom-library-simplehtmldom-398/

Python's urllib module makes scraping pretty easy, in my opinion.

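import re
import urllib.request

url = "http://www.z1035.com/player.php"
# Fetch the page and decode the response bytes into text
with urllib.request.urlopen(url) as f:
    t = f.read().decode("utf-8", errors="replace")
# Use a regular expression here: the pattern comes first, then the text to search.
# "some (pattern)" is a placeholder; group(1) needs one capture group in the pattern.
m = re.search(r"some (pattern)", t)
if m:
    print(m.group(1))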

This will load the external resource as if it were a local file, and allow you to parse it as necessary.
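To make the "parse it as necessary" step concrete, here is a minimal sketch of pulling a title out of the downloaded text and appending it to a file. The `<span id="song">` markup, the `TITLE_PATTERN` name, and both helper functions are assumptions for illustration only; the real page's HTML would need to be inspected to write a pattern that actually matches.

```python
import re

# Hypothetical pattern: assumes the title sits in <span id="song">...</span>.
# The real page's markup is not known here and must be checked first.
TITLE_PATTERN = re.compile(r'<span id="song">(.*?)</span>', re.DOTALL)


def extract_title(html):
    """Return the first captured title in html, or None if nothing matches."""
    m = TITLE_PATTERN.search(html)
    return m.group(1).strip() if m else None


def append_title(path, title):
    """Append one title per line to the file at path."""
    with open(path, "a", encoding="utf-8") as out:
        out.write(title + "\n")


if __name__ == "__main__":
    # Stand-in for the text returned by urlopen(...).read().decode(...)
    sample = '<div><span id="song"> Example Song </span></div>'
    print(extract_title(sample))
```

Returning None when nothing matches avoids the AttributeError you would get from calling group() on a failed search, which matters if the page layout changes.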

Once upon a time I wanted to save all the tracklistings for a radio show I listened to. I used Python to download a list of all the tracklistings, and then to programmatically visit each and append the contents to a file. It was very handy, and took probably 20 lines.
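That original script isn't shown here, but the approach described (visit each link, append the contents to a file) might look roughly like the sketch below. The function names and the `fetch_page` parameter are hypothetical, and the list of URLs is assumed to have been collected already.

```python
import urllib.request


def fetch(url):
    """Download url and return its body as text."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def append_pages(urls, out_path, fetch_page=fetch):
    """Fetch each URL in turn and append its contents to out_path."""
    with open(out_path, "a", encoding="utf-8") as out:
        for url in urls:
            out.write(fetch_page(url))
            out.write("\n")
```

Passing the fetch function as a parameter is just a convenience so the loop can be tried out with canned text before pointing it at a live site.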
