
Python: save a list of URLs to a .txt file

Hello, I am trying to write a Python function that saves a list of URLs in a .txt file.

Example: visit http://forum.domain.com/ and save every URL containing viewtopic.php?t= to a .txt file:

http://forum.domain.com/viewtopic.php?t=1333
http://forum.domain.com/viewtopic.php?t=2333

I use the code below, but it does not save anything. I am very new to Python; can someone help me create this?

web_obj = opener.open('http://forum.domain.com/')
data = web_obj.read()

fl_url_list = open('urllist.txt', 'r')
url_arr = fl_url_list.readlines()
fl_url_list.close()

This is far from trivial and can have quite a few corner cases (I suppose the page you're referring to is a web page).

To give you a few pointers, you need to:

  • download the web page: you're already doing it (in data)
  • extract the URLs: this is the hard part. Most probably you'll want to use an HTML parser, extract the <a> tags, fetch each href attribute, and put the values into a list. Then filter that list to keep only the URLs formatted the way you want (say, containing viewtopic). Let's say you end up with them in urlList
  • then open a file for writing text (thus mode wt, not r)
  • write the content: f.write('\n'.join(urlList))
  • close the file
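
The steps above could be sketched roughly as follows. This is only a minimal illustration, assuming Python 3 and its standard-library html.parser; the names LinkCollector, extract_topic_urls and save_urls are made up for this example, and real forum pages may need extra handling (session IDs in links, pagination, odd encodings).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_topic_urls(html, base_url, marker="viewtopic.php?t="):
    """Return the links in html that contain marker, without duplicates."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    url_list = []
    for url in parser.links:
        if marker in url and url not in url_list:
            url_list.append(url)
    return url_list


def save_urls(url_list, path="urllist.txt"):
    """Write one URL per line; mode "w" opens the file for writing text."""
    # The with-block closes the file automatically.
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(url_list))
```

With the page downloaded as in the question, data is most likely a bytes object in Python 3, so it would need decoding first, for example extract_topic_urls(data.decode('utf-8', errors='replace'), 'http://forum.domain.com/'), and the resulting list can then be passed to save_urls.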

I advise you to try following these steps and to ask specific questions when you get stuck on a particular issue.


 