Can't read Cyrillic symbols in browser URL using Flask (Python)

Good afternoon, everyone!

Basically, I have a problem reading Cyrillic letters when running my Python code with the Flask framework.

What happens in the URL in Chrome:

"mamapapa" has to be in Cyrillic: "мамапапа"

Here's my code:

# -*- coding: utf-8 -*-
from flask import Flask, jsonify
from unidecode import unidecode
from venv import logger

app = Flask(__name__)
names1=""


@app.route('/hashtags/<string:names>', methods=['GET'])
def get_hashtags(names):

    names1 = "#" + 'names'
    np= logger.start(names1)
    return jsonify({'Сегментация хэштегов': unidecode(np)})



if __name__ == '__main__':
    app.run(port=9876)

As you can see, I've already used unidecode to transliterate Cyrillic letters into Latin ones in the line return jsonify({'Сегментация хэштегов': unidecode(np)}), but that's not exactly what I wanted to do. My main goal is to get actual Cyrillic symbols (like привет or США) in the output.

As far as I've read, it isn't possible to use Cyrillic letters in a browser URL? Is that true, or is there any way to achieve my goal and get Cyrillic output?

Maybe it has something to do with UTF-8 encoding/decoding?

Thanks in advance!

You are right. The characters allowed in a URL are limited: plain Latin letters (a-z, lower case and upper case), digits, and the marks -, _, ., ~. Other characters are escaped with % followed by two hex digits [source: RFC 3986, Appendix A]. Some people also use other characters creatively (e.g. +, parentheses, etc.), beyond those that actually split the path and arguments.
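To make the escaping rule concrete, here is a minimal sketch using the standard library's urllib.parse. It assumes Python 3.7 or later (where ~ is treated as unreserved) and reuses the word from the question; it illustrates the encoding and is not the code the browser or Flask runs.

```python
from urllib.parse import quote, unquote

# Unreserved characters (letters, digits, "-", "_", ".", "~") pass through unchanged.
unreserved = "abcXYZ019-_.~"
print(quote(unreserved))

# Cyrillic text is first encoded as UTF-8, then every byte becomes %XX.
# Each of these letters takes two bytes in UTF-8, hence two %XX groups per letter.
word = "мамапапа"
escaped = quote(word)
print(escaped)  # %D0%BC%D0%B0%D0%BC%D0%B0%D0%BF%D0%B0%D0%BF%D0%B0

# unquote() reverses the escaping (UTF-8 is its default encoding).
print(unquote(escaped))
```

The escaped form is what travels over the wire; the Cyrillic form is only what is shown to a person.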

Historically, URLs were designed to be used by machines (http://www. is not the most human-readable prefix), hence the escaping: you can put any bytes in a URL, but such bytes are encoded with %.

Domains can use other characters (outside ASCII), but in this case too it is just an encoding standard (to ASCII): DNS and the protocols still use ASCII-only characters.
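The domain case can be sketched with Python's built-in idna codec, which implements the older IDNA 2003 rules. The domain name below is a hypothetical example chosen only for illustration.

```python
# Hypothetical internationalized domain name, used only for illustration.
domain = "пример.рф"

# The "idna" codec converts every non-ASCII label into an ASCII "xn--" label.
ascii_domain = domain.encode("idna")
print(ascii_domain)

# Decoding restores the Unicode form that a browser would display.
print(ascii_domain.decode("idna"))
```

The ASCII form is what DNS actually looks up; the Cyrillic form exists only for display.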

Browsers can do the escaping automatically and may display the URL unescaped, but the real URL is escaped. You should test, for your (or your clients') use case, whether such automatic escaping works with their mail clients/browsers.

In any case, with HTML you can display a URL differently from the real URL (in links), and with JavaScript you have some control over the URL displayed in the URL bar.

Give unquote a try. Example:

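# Python 3
from urllib.parse import unquote
# 'УРЛ' ("URL" in Russian) stands in for your percent-encoded string
string = unquote('УРЛ')

# Python 2
import urllib
url = 'УРЛ'  # stand-in for your percent-encoded string, as above
# unquote() returns a byte string in Python 2; decode it to get unicode text
urllib.unquote(url).decode('utf8')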
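Once the text has been unquoted, the other place where Cyrillic can turn into escapes is JSON serialization. The sketch below uses only the standard json module with the key from the question. That Flask's jsonify follows the same default is an assumption that depends on the Flask version: older releases expose a JSON_AS_ASCII config key, and Flask 2.2+ exposes app.json.ensure_ascii, so check the documentation for the version you run.

```python
import json

payload = {"Сегментация хэштегов": "привет"}

# Default: non-ASCII characters are written as \uXXXX escapes (still valid JSON).
escaped = json.dumps(payload)
print(escaped)

# ensure_ascii=False keeps the Cyrillic characters as they are in the output.
readable = json.dumps(payload, ensure_ascii=False)
print(readable)

# Both forms decode to the same data; only the textual representation differs.
assert json.loads(escaped) == json.loads(readable) == payload
```

Either form is correct for a JSON client; the second is simply easier to read when inspecting the response by eye.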
