
How to extract text from a list of URLs and save each one separately

I have a list of 100 URLs, and every URL points to a page that contains text. I want to extract the text from each URL and save it as text1, text2, text3, and so on. So far, this is all I have managed:

import urllib.request

list_of_urls = ['abc.com', 'def.com', 'sssj.com', ... and so on]

text = []
data = urllib.request.urlopen('abc.com')
for line in data:
    line = line.decode('utf-8')
    text.append(line)

The code above only works for one URL. I want to loop over all the URLs in my list and store their output in text1, text2, text3, and so on.

I'm not sure exactly how you want to store the separate texts, but this code creates a dict whose keys are text1, text2, ... and whose values are lists of the lines from each text.

import urllib.request

list_of_urls = ['abc.com', 'def.com', 'sssj.com', ... and so on]

result = {}
# start=1 so the keys are text1, text2, ... rather than text0, text1, ...
for idx, url in enumerate(list_of_urls, start=1):
    data = urllib.request.urlopen(url)
    text = []
    for line in data:
        # each line arrives as bytes; decode it to str
        text.append(line.decode('utf-8'))

    result[f"text{idx}"] = text
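
If "save them separately" means separate files on disk rather than dict entries, the same loop can write each page to its own file. The following is only a sketch under a few assumptions that are not in the question: the helper names fetch_text and save_texts, the output directory "texts", the .txt extension, and the 10-second timeout are all made up for illustration. It also assumes every URL includes a scheme such as https://, since urlopen raises ValueError for a bare 'abc.com'.

```python
import urllib.request
from pathlib import Path


def fetch_text(url, timeout=10):
    """Download one URL and return its body decoded as UTF-8 (hypothetical helper)."""
    # The URL needs a scheme such as "https://"; a bare "abc.com" raises ValueError.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # errors="replace" keeps going if a page is not valid UTF-8.
        return response.read().decode("utf-8", errors="replace")


def save_texts(urls, out_dir="texts", fetch=fetch_text):
    """Write the text of each URL to text1.txt, text2.txt, ... inside out_dir."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = {}
    for idx, url in enumerate(urls, start=1):
        try:
            text = fetch(url)
        except (OSError, ValueError) as exc:
            # urllib.error.URLError is a subclass of OSError; skip pages that fail.
            print(f"skipping {url}: {exc}")
            continue
        target = out_path / f"text{idx}.txt"
        target.write_text(text, encoding="utf-8")
        saved[url] = target
    return saved
```

Calling save_texts(list_of_urls) would then leave one file per URL in the "texts" directory. A URL that fails to download is skipped, so its number is simply missing from the output; that keeps the file number aligned with the URL's position in the list.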
