Unknown URL type using urllib2.urlopen()

I am trying to do the following:

  • open a CSV file containing a list of URLs (GET requests)
  • read the CSV file and write the entries to a list
  • open every single URL and read the response
  • write the responses back to a new CSV file

I get the following error:

Traceback (most recent call last):
  File "C:\Users\l.buinui\Desktop\request2.py", line 16, in <module>
    req = urllib2.urlopen(url)
  File "C:\Python27\lib\urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 404, in open
    response = self._open(req, data)
  File "C:\Python27\lib\urllib2.py", line 427, in _open
    'unknown_open', req)
  File "C:\Python27\lib\urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 1247, in unknown_open
    raise URLError('unknown url type: %s' % type)
URLError: <urlopen error unknown url type: ['http>

Here is the code I am using:

import urllib2
import urllib
import csv

# Open and read the source file and write entries to a list called link_list
source_file=open("source_new.csv", "rb")
source = csv.reader(source_file, delimiter=";")
link_list = [row for row in source]
source_file.close()

# Create an output file which contains the answer of the GET-Request
out=open("output.csv", "wb")
output = csv.writer(out, delimiter=";")
for row in link_list:
    url = str(row)
    req = urllib2.urlopen(url)
    output.writerow(req.read())

out.close()

What is going wrong there?

Thanks in advance for any hints.

Cheers

Each row is a list (here containing only one element, the URL), so str(row) produces the text ['http://...'] rather than the URL itself. urlopen then reads everything before the first colon as the URL type, which is why the error reports ['http. Passing row[0] instead passes the string containing the URL.
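To make the difference concrete, here is a small sketch of what str(row) and row[0] produce. The URL is a made-up placeholder, and splitting on the first colon is only an approximation of how the URL type gets extracted, not urllib2's actual code.

```python
# A row as csv.reader would return it for a one-column file.
# The URL is a hypothetical placeholder, not taken from the question.
row = ["http://example.com/page?id=1"]

# str() on a list gives its printable form, brackets and quotes included.
as_text = str(row)

# Rough approximation of scheme detection: everything before the first colon.
scheme_guess = as_text.split(":", 1)[0]

# Indexing the list gives the plain URL string.
url = row[0]

print(as_text)       # the list's repr, not a URL
print(scheme_guess)  # the bogus "URL type" seen in the traceback
print(url)           # the actual URL
```

The bogus scheme printed in the middle matches the text at the end of the traceback, which is a quick way to recognise this mistake.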

csv.reader returns a list for each row it reads, no matter how many items are in the row.
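This can be checked without touching the file system. The sketch below feeds csv.reader an in-memory file with one URL per line (the URLs are placeholders) and is written for Python 3, where io.StringIO stands in for the opened file.

```python
import csv
import io

# In-memory stand-in for a one-column CSV file; the URLs are placeholders.
data = io.StringIO("http://example.com/a\nhttp://example.com/b\n")

# Every row comes back as a list, even though there is only one column.
rows = list(csv.reader(data, delimiter=";"))

# Take the first field of each non-empty row to get plain URL strings.
urls = [row[0] for row in rows if row]

print(rows)
print(urls)
```

Building the list of URL strings up front, as in the last step, means the loop that opens them never has to deal with the row lists at all.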

It's working now. If I reference row[0] directly in the loop, there are no problems.

import urllib2
import csv

# Open and read the source file and write entries to a list called link_list
source_file = open("source.csv", "rb")
source = csv.reader(source_file, delimiter=";")
link_list = [row for row in source]
source_file.close()

# Create an output file which contains the answer of the GET-Request
out = open("output.csv", "wb")
output = csv.writer(out)
for row in link_list:
    # row is a list; row[0] is the URL string
    req = urllib2.urlopen(row[0])
    answer = req.read()
    # writerow expects a sequence of fields, so wrap the answer in a list
    output.writerow([answer])

out.close()
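For anyone on Python 3, where urllib2 no longer exists and urlopen lives in urllib.request, the same loop might look roughly like the sketch below. The names fetch_body and copy_responses are made up for illustration, and decoding the response as UTF-8 is an assumption about the servers being queried.

```python
import csv
from urllib.request import urlopen


def fetch_body(url):
    """Return the body of a GET request as text (assumes UTF-8 content)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def copy_responses(source, out, fetch=fetch_body):
    """Read one URL per row from `source`, write one response per row to `out`.

    `fetch` is a parameter so the loop can be tried out without network access.
    """
    reader = csv.reader(source, delimiter=";")
    writer = csv.writer(out)
    for row in reader:
        if not row:  # skip blank lines
            continue
        # row is a list; row[0] is the URL string
        writer.writerow([fetch(row[0])])
```

It would be called with the two files opened in text mode, for example open("source.csv", newline="") and open("output.csv", "w", newline=""), ideally inside a with statement so both are closed even if a request fails.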
