Open online txt file using python codecs.open

I am trying to open an online txt file using codecs.open. The code I have now is:

url = r'https://www.sec.gov/Archives/edgar/data/20/0000893220-96-000500.txt'
soup = BeautifulSoup(codecs.open(url, 'r',encoding='utf-8'), "lxml")

However, Python keeps raising an OSError:

OSError: [Errno 22] Invalid argument: 'https://www.sec.gov/Archives/edgar/data/20/0000893220-96-000500.txt'

I tried replacing "/" with "\\", but it still does not work. Is there any way to solve this? Since I have thousands of links to open, I would rather not download the online text files to my local drive.

I would appreciate it very much if someone could help.

Thanks!

codecs.open only opens files on the local filesystem, so it cannot take a URL. Is something like this what you're thinking of?

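`from urllib.request import urlopen

# Fetch the remote file over HTTPS and decode the bytes to text
response = urlopen('https://www.sec.gov/Archives/edgar/data/20/0000893220-96-000500.txt')
html = response.read().decode('utf-8')

# Optional: keep a local copy (skip this if you only need the text in memory)
with open('yourfile.txt', 'w', encoding='utf-8') as file:
    file.write(html)`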
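
If you would rather not write anything to disk, a small helper along these lines might be enough. This is only a sketch: `read_remote_text` and `USER_AGENT` are names made up for illustration, the User-Agent value is a placeholder, and sending that header at all is an assumption on my part (some servers reject requests that do not identify themselves). The `errors="replace"` default is likewise a guess that older filings may contain bytes that are not valid UTF-8.

```python
from urllib.request import Request, urlopen

# Hypothetical placeholder; substitute a string that identifies you.
USER_AGENT = "example-research-script contact@example.com"


def read_remote_text(url, encoding="utf-8", errors="replace", timeout=30):
    """Return the decoded body of `url` without writing it to disk."""
    request = Request(url, headers={"User-Agent": USER_AGENT})
    with urlopen(request, timeout=timeout) as response:
        raw = response.read()
    # errors="replace" substitutes a marker for any byte that is not valid
    # in `encoding` instead of raising UnicodeDecodeError.
    return raw.decode(encoding, errors=errors)


# Usage (performs a network request, so it is left commented out):
# from bs4 import BeautifulSoup
# url = "https://www.sec.gov/Archives/edgar/data/20/0000893220-96-000500.txt"
# soup = BeautifulSoup(read_remote_text(url), "lxml")
```

The returned string can be passed straight to BeautifulSoup in place of the codecs.open call, and for thousands of links you can call the helper in a loop, ideally with a short pause between requests.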
