I am trying to open an online txt file using codecs.open. The code I have now is:
url = r'https://www.sec.gov/Archives/edgar/data/20/0000893220-96-000500.txt'
soup = BeautifulSoup(codecs.open(url, 'r',encoding='utf-8'), "lxml")
However, Python keeps raising an OSError:
OSError: [Errno 22] Invalid argument: 'https://www.sec.gov/Archives/edgar/data/20/0000893220-96-000500.txt'
I tried replacing "/" with "\\", but it still does not work. Is there any way to solve this? I have thousands of links to open, so I would rather not download the text files to my local drive.
I will appreciate it very much if someone can help here.
Thanks!
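Is something like this what you have in mind? codecs.open only opens paths on the local file system, so a URL has to be fetched over HTTP first, for example with urllib: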
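`from urllib.request import urlopen
# Download the file contents into memory
response = urlopen('https://www.sec.gov/Archives/edgar/data/20/0000893220-96-000500.txt')
html = response.read().decode('utf-8')
# Optional: keep a local copy (the file must be opened for writing)
with open('yourfile.txt', 'w', encoding='utf-8') as file:
    file.write(html)`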
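If the goal is to process thousands of links without saving anything to disk, a minimal sketch along these lines might help. The names `fetch_text` and `fetch_all`, the `opener` parameter, and the User-Agent string are all made up for illustration. As far as I know, sec.gov may reject requests that do not send a descriptive User-Agent, so treat that header value as a placeholder to replace with your own.

```python
from urllib.request import Request, urlopen


def fetch_text(url, user_agent="example-script contact@example.com", opener=urlopen):
    """Download url and return its body as a str, without writing to disk."""
    # The User-Agent value is a hypothetical placeholder, not a required format.
    request = Request(url, headers={"User-Agent": user_agent})
    with opener(request) as response:
        raw = response.read()
    # errors="replace" keeps a single undecodable byte from aborting the run.
    return raw.decode("utf-8", errors="replace")


def fetch_all(urls, opener=urlopen):
    """Yield (url, text) pairs lazily, so only one document is held at a time."""
    for url in urls:
        yield url, fetch_text(url, opener=opener)
```

Each returned string can be passed straight to the parser, e.g. `BeautifulSoup(text, "lxml")` inside `for url, text in fetch_all(urls):`. The `opener` argument exists only so the function can be exercised without network access. With this many requests it is probably wise to add a short pause between them.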