
Read csv file directly from a website in Python 3

I am trying to read a CSV file directly from a website (from a downloadable link) and then fetch one of its columns as a list, so that I can work with it further. I have not been able to get it working. The closest I have come is:

import csv
import urllib.request as urlRequest

url = "https://www.nseindia.com/content/indices/ind_nifty50list.csv"
# pretend to be a chrome 47 browser on a windows 10 machine
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"}
req = urlRequest.Request(url, headers = headers)
# open the url 
x = urlRequest.urlopen(req)
sourceCode = x.read()

You are pretty close to the goal.

Just split the downloaded CSV data into lines and pass them to csv.DictReader(), then pick out the column you need:

import csv
import urllib.request as urlRequest

url = "https://www.nseindia.com/content/indices/ind_nifty50list.csv"
# pretend to be a Chrome 47 browser on a Windows 10 machine
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"}
req = urlRequest.Request(url, headers=headers)
# open the URL and decode the response bytes into text
x = urlRequest.urlopen(req)
sourceCode = x.read().decode('utf-8')

# the header row supplies the dictionary keys for every following row
cr = csv.DictReader(sourceCode.splitlines())
l = [row['Series'] for row in cr]

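Note that x.read() returns bytes, not str, and the csv module in Python 3 expects text. The response therefore has to be decoded before it is split into lines, whether or not the file contains non-ASCII characters:

 x.read().decode('utf-8') # or whichever encoding the file uses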
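
The decode-then-parse step can also be wrapped in a small helper, which makes it testable without a network call. The following is only a sketch: the function names column_from_csv_bytes and fetch_column, the short "Mozilla/5.0" user agent, the sample data, and the 'utf-8-sig' fallback (which also strips a leading byte-order mark, if the file has one) are assumptions for illustration, not part of the original answer.

```python
import csv
import io
import urllib.request


def column_from_csv_bytes(raw, column, encoding="utf-8-sig"):
    """Decode raw CSV bytes and return the values of one named column."""
    text = raw.decode(encoding)
    # newline="" leaves line endings untouched so the csv module handles them
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return [row[column] for row in reader]


def fetch_column(url, column, user_agent="Mozilla/5.0"):
    """Download a CSV file and return one column (hypothetical helper)."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req) as resp:
        # use the charset the server declares, if any; otherwise assume UTF-8
        charset = resp.headers.get_content_charset() or "utf-8-sig"
        return column_from_csv_bytes(resp.read(), column, charset)


# made-up sample bytes standing in for a downloaded file
sample = b"\xef\xbb\xbfCompany Name,Series\r\nAlpha Ltd,EQ\r\nBeta Ltd,EQ\r\n"
series = column_from_csv_bytes(sample, "Series")
print(series)
```

With a real link, fetch_column(url, 'Series') would return the same kind of list, provided the server accepts the request and the file has a 'Series' header.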


 