SyntaxError: Non-ASCII character (Python)

Question

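Can somebody tell me which character in the following is a non-ASCII character:

Columns(str) -- comma-seperated list of values. Works only if format is tab or xls. For UnitprotKB, some possible columns are: id, entry name, length, organism. Some column names must be followed by a database name (i.e. ‘database(PDB)’). Again see uniprot website for more details. See also _valid_columns for the full list of column keyword.

Essentially, I am defining a class and trying to give it a docstring that describes how it works: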

def test(self,uniprot_id):
    '''
    Same as the UniProt.search() method arguments:
    search(query, frmt='tab', columns=None, include=False, sort='score', compress=False, limit=None, offset=None, maxTrials=10)


    query (str) -- query must be a valid uniprot query. See http://www.uniprot.org/help/text-search, http://www.uniprot.org/help/query-fields See also example below
    frmt (str) -- a valid format amongst html, tab, xls, asta, gff, txt, xml, rdf, list, rss. If tab or xls, you can also provide the columns argument. (default is tab)
    include (bool) -- include isoform sequences when the frmt parameter is fasta. Include description when frmt is rdf.
    sort (str) -- by score by default. Set to None to bypass this behaviour
    compress (bool) -- gzip the results
    limit (int) -- Maximum number of results to retrieve.
    offset (int) -- Offset of the first result, typically used together with the limit parameter.
    maxTrials (int) -- this request is unstable, so we may want to try several time.
    Columns(str) -- comma-seperated list of values. Works only if format is tab or xls. For UnitprotKB, some possible columns are: id, entry name, length, organism. Some column names must be followed by a database name (i.e. ‘database(PDB)’). Again see uniprot website for more details. See also _valid_columns for the full list of column keyword. '

    '''        
    u = UniProt()
    uniprot_entry = u.search(uniprot_id)
    return uniprot_entry

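Without line 52, i.e. the line starting with "Columns" in the quoted comment block, this works as expected, but as soon as I describe what "columns" is I get the following error: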

SyntaxError: Non-ASCII character '\xe2' in file /home/cw00137/Documents/Python/Identify_gene.py on line 52, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

Does anybody know what is going on?

Answer 1

You are using "fancy" curly quotes in that line:

>>> u'‘database(PDB)’'
u'\u2018database(PDB)\u2019'

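The opening character is U+2018 LEFT SINGLE QUOTATION MARK and the closing one is U+2019 RIGHT SINGLE QUOTATION MARK.

Either use ASCII quotes (U+0027 APOSTROPHE or U+0022 QUOTATION MARK) or declare a source encoding other than ASCII.

You are also using a U+2013 EN DASH: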

>>> u'Columns(str) –'
u'Columns(str) \u2013'

Replace it with a U+002D HYPHEN-MINUS.

All three characters encode to UTF-8 with a leading E2 byte:
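As a rough illustration of the replacement suggested above, the following sketch maps the three typographic characters named in this answer to their ASCII counterparts. It is written for Python 3, and the names `ASCII_REPLACEMENTS` and `to_ascii_punctuation` are hypothetical helpers, not part of any library.

```python
# Map the three typographic characters named in this answer to ASCII.
ASCII_REPLACEMENTS = {
    "\u2018": "'",  # LEFT SINGLE QUOTATION MARK -> APOSTROPHE
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK -> APOSTROPHE
    "\u2013": "-",  # EN DASH -> HYPHEN-MINUS
}


def to_ascii_punctuation(text):
    """Return text with the characters above replaced by ASCII ones."""
    return text.translate(str.maketrans(ASCII_REPLACEMENTS))


# A shortened version of the offending docstring line.
line = "Columns(str) \u2013 (i.e. \u2018database(PDB)\u2019)"
fixed = to_ascii_punctuation(line)
print(fixed)
```

Any other non-ASCII character passes through unchanged, so this only covers the cases discussed here.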

>>> u'\u2013 \u2018 \u2019'.encode('utf8')
'\xe2\x80\x93 \xe2\x80\x98 \xe2\x80\x99'

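That is the byte you then see reflected in the SyntaxError exception message.

You may want to avoid these characters in the first place. It could be that your OS is substituting them as you type, or that you are writing your code in a word processor rather than a plain text editor and it is substituting them for you. You probably want to switch that feature off.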
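If it is not obvious where such characters are hiding, a small scan can list them. The sketch below assumes Python 3 and that the source has already been read into a string; `find_non_ascii` is a hypothetical helper name, and the sample text is a made-up two-line snippet rather than the asker's file.

```python
import unicodedata


def find_non_ascii(source):
    """Yield (line_no, column, char, name) for each non-ASCII character."""
    for line_no, line in enumerate(source.splitlines(), start=1):
        for column, char in enumerate(line, start=1):
            if ord(char) > 127:
                yield line_no, column, char, unicodedata.name(char, "UNKNOWN")


# Made-up sample containing the two curly quotes from the question.
sample = "def f():\n    '''see \u2018database(PDB)\u2019'''\n"
hits = list(find_non_ascii(sample))
for line_no, column, char, name in hits:
    print(f"line {line_no}, column {column}: U+{ord(char):04X} {name}")
```

For a real file, the string could come from something like `open(path, encoding="utf-8").read()`, assuming the file is actually saved as UTF-8.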

Answer 2

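I ran into the same problem with the same error before. Python 2 assumes ASCII as the source encoding by default. You can try declaring the following comment on the first or second line of your .py file: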

# -*- coding: utf-8 -*-
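To check how such a declaration is interpreted, the standard library's `tokenize.detect_encoding` can be used. This is only a sketch under Python 3, where the default without a declaration is UTF-8 rather than ASCII, so it illustrates how the comment is read rather than reproducing the Python 2 error. `declared_encoding` is a hypothetical helper name.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the source encoding Python 3 infers from the first two lines."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# The declaration may sit on the first line ...
with_declaration = b"# -*- coding: latin-1 -*-\nx = 1\n"
print(declared_encoding(with_declaration))

# ... or on the second line, for example after a shebang.
after_shebang = b"#!/usr/bin/env python\n# -*- coding: utf-8 -*-\nx = 1\n"
print(declared_encoding(after_shebang))
```

The returned name is normalized by Python, so it may not match the spelling used in the comment exactly.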

Answer 1
4, accepted, 2014-11-10 10:11:29

Answer 2
1, 2016-12-03 15:40:23
