HTTP error when using urllib.request

Question

I am trying to do a profanity check test. The code I have written so far is:

import urllib.request  

def read_text ():
    file = open (r"C:\Users\Kashif\Downloads\abc.txt")
    file_print = file.read ()
    print (file_print)
    file.close ()
    check_profanity (file_print)

def check_profanity (file_print):
    connection = urllib.request.urlopen ("http://www.purgomalum.com/service/containsprofanity?text="+file_print)
    output = connection.read ()
    print ("The Output is "+output)
    connection.close ()
    read_text ()

But I get the error below:

urllib.error.HTTPError: HTTP Error 400: Bad Request

I don't know what I am doing wrong.

I am using Python 3.6.1.

Answer 1

The HTTP error you're getting usually means something is wrong with the way you are requesting data from the server. According to the HTTP spec:

400 Bad Request

The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.

Concretely, in your example the problem seems to be that the data you're sending in the URL is not URL-encoded. The file contents most likely contain spaces or newlines, which are not valid in a raw query string. Try using the quote_plus function from the urllib.parse module to make your request acceptable:

from urllib.parse import quote_plus

...

encoded_file_print = quote_plus(file_print)
url = "http://www.purgomalum.com/service/containsprofanity?text=" + encoded_file_print
connection = urllib.request.urlopen(url)
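To see what the encoding actually changes, here is a small sketch you can run without touching the network. The sample string is made up for illustration and simply stands in for whatever abc.txt contains.

```python
from urllib.parse import quote_plus

# Hypothetical sample text standing in for the contents of abc.txt.
sample = "hello world\nsecond line & more"

# quote_plus percent-encodes unsafe characters and turns spaces into '+'.
encoded = quote_plus(sample)
print(encoded)  # hello+world%0Asecond+line+%26+more

url = "http://www.purgomalum.com/service/containsprofanity?text=" + encoded
print(url)
```

The raw string contains a space, a newline, and an ampersand, none of which can appear as-is in the query string. After encoding, the URL contains only characters that are safe to send.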

If that doesn't work, then the problem might be with the contents of your file. Try it first with a simple example to verify that your script works, and only then try it with the file's content.
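One way to do that check is sketched below. The helper name try_request and its opener parameter are my own additions for illustration, not part of your code. Instead of letting the exception propagate, it reports the status code and reason when the server rejects the request.

```python
import urllib.error
import urllib.request
from urllib.parse import quote_plus

SERVICE = "http://www.purgomalum.com/service/containsprofanity?text="

def try_request(text, opener=urllib.request.urlopen):
    """Send text to the service; return the reply or a short error summary.

    opener is a hypothetical hook so the function can be tried with a
    stand-in for urlopen; by default it uses the real one.
    """
    url = SERVICE + quote_plus(text)
    try:
        with opener(url) as connection:
            return connection.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # err.code and err.reason show what the server rejected.
        return "HTTP %s: %s" % (err.code, err.reason)
```

Calling try_request("this is a test") with a short literal string first shows whether the request itself works. If it does, and the same call with the file's content still fails, the file content is the likely culprit.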

Apart from the above, there are a couple of other issues with your code:

No spaces are needed between a function name and its parentheses, as in file.close () or def read_text ():.
Decode the content after reading it to convert bytes to a string: output = connection.read().decode('utf-8'). Without this, "The Output is " + output fails because a str cannot be concatenated with bytes.
The way you're calling the functions makes them call each other endlessly. read_text calls check_profanity, which at the end calls read_text, which calls check_profanity again, and so on. Remove the extra function calls and just use return to hand back the output of each function:
```
content = read_text()
has_profanity = check_profanity(content)
print("has profanity? %s" % has_profanity)
```
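Putting these points together, the whole script might look roughly like the sketch below. It is only one possible arrangement. The path parameter, the opener parameter, and the main function are additions for illustration, and the sketch assumes the service replies with the plain text true or false.

```python
import urllib.request
from urllib.parse import quote_plus

SERVICE_URL = "http://www.purgomalum.com/service/containsprofanity?text="

def read_text(path):
    """Return the contents of the file at path."""
    with open(path) as file:  # closed automatically, even on errors
        return file.read()

def check_profanity(text, opener=urllib.request.urlopen):
    """Return True if the service reports profanity in text.

    Assumes the reply body is the text 'true' or 'false'. opener is a
    hypothetical hook that defaults to the real urlopen.
    """
    url = SERVICE_URL + quote_plus(text)
    with opener(url) as connection:
        output = connection.read().decode("utf-8")  # bytes -> str
    return output.strip() == "true"

def main():
    # Path taken from the question; adjust it to your own file.
    content = read_text(r"C:\Users\Kashif\Downloads\abc.txt")
    has_profanity = check_profanity(content)
    print("has profanity? %s" % has_profanity)
```

Call main() once at the bottom of the script to run it. Each function now does one job and returns its result, so neither calls the other back.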

Answer 1 (2, accepted, 2017-04-15 15:08:46)
