
Python default locale (unsupported locale setting)

This seems like a weird problem, and it's causing me some heartburn, because I'm using a library that stashes the current locale and then tries to set it back to what it stashed.

$ docker run --rm -it python:3.6 bash
root@bcee8785c2e1:/# locale
LANG=C.UTF-8
LANGUAGE=
LC_CTYPE="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_PAPER="C.UTF-8"
LC_NAME="C.UTF-8"
LC_ADDRESS="C.UTF-8"
LC_TELEPHONE="C.UTF-8"
LC_MEASUREMENT="C.UTF-8"
LC_IDENTIFICATION="C.UTF-8"
LC_ALL=
root@bcee8785c2e1:/# locale -a
C
C.UTF-8
POSIX
root@bcee8785c2e1:/# python
Python 3.6.9 (default, Jul 13 2019, 14:51:44) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> curr = locale.getlocale()
>>> curr
('en_US', 'UTF-8')
>>> locale.setlocale(locale.LC_ALL, curr)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/locale.py", line 598, in setlocale
    return _setlocale(category, locale)
locale.Error: unsupported locale setting
>>>

I'm not sure why getlocale is returning en_US. It's not anywhere in my environment variables (and I'm not sure where else it could be in my shell).

In any case, I can't call setlocale with the value from getlocale, which seems weird to me.

Does anyone have any guidance here?

Much appreciated!

For the first part: does it matter? As far as I know, you never see a difference until you call setlocale(), so on to the second part.

You should use:

import locale
curr = locale.getdefaultlocale()
locale.setlocale(locale.LC_ALL, curr)

So use getdefaultlocale(), not just getlocale(). I don't fully understand why both exist either. It may be a Python bug that fails to recognize C.xxx locales.
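For a library that only needs to stash the locale and put it back, one alternative is to avoid getlocale() and getdefaultlocale() altogether. The sketch below relies on the documented behaviour that calling locale.setlocale() with no second argument only queries the current setting and returns it as a string. The helper name preserved_locale is hypothetical, not something from the question or the library involved.

```python
import locale
from contextlib import contextmanager


@contextmanager
def preserved_locale(category=locale.LC_ALL):
    """Hypothetical helper: restore the locale for `category` on exit."""
    # With no second argument, setlocale() only queries. It returns the
    # string reported by the C library, without going through the alias
    # normalization that getlocale() applies.
    saved = locale.setlocale(category)
    try:
        yield saved
    finally:
        # The queried string can be handed straight back to setlocale().
        locale.setlocale(category, saved)


with preserved_locale() as saved:
    locale.setlocale(locale.LC_ALL, "C")  # "C" is always available
    # ... locale-sensitive work would go here ...

restored = locale.setlocale(locale.LC_ALL)
```

After the with block, restored should equal saved, because the string that was queried is the same one that gets set again.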

C.UTF-8: a recent, non-portable Debianism

The intention behind C.UTF-8 is good, but the implementation is not quite there yet. For now, avoid it until it stabilizes.

Some discussion of context

A Red Hat discussion about including it, which means it's not quite there yet (at the time of writing, at least). Note in particular that Nick Coghlan, a Python core developer, suggests that Python doesn't get locales right in some contexts like this one.

A Haskell discussion showing that portable, cross-platform tooling (in this case haskell-stack, but by implication also Docker) becomes harder and less reliable with C.UTF-8 usage.

The Intention

Debian (also) initiated C.UTF-8, and the intention is correct.

Today's Linux systems are intensively localized: a slew of locales, fine-grained LC_* choices, and so on. But all of this is not on by default: if the locale system is broken, the system is broken. The reason a broken locale system is not as drastic in its effects as, say, a broken kernel, fstab or grub is...

The C locale

The C locale (synonym: POSIX) is guaranteed to always be available as a fallback if other things break. So, for example, you won't see localized errors but English ones, and not mojibake, empty rectangles or question marks!

By and large you get these kinds of warnings, not errors, and otherwise things keep working.

But C = POSIX implies legacy ASCII rather than UTF-8 everywhere, an undesired side effect of legacy.

To make that legacy less and less necessary, even as a fallback, Debian introduced the always-available C.UTF-8 locale.

The catch? It's always available...

Only in Debian

Which means recent Debian, and recent derivatives like Ubuntu. But not (yet) other systems.

In short, C.UTF-8 is not universal, not portable, fragile, and therefore best avoided... at least for now, and at least on client-server or virtualized (containerized) systems like Docker. The...

Practical Upshot

You need to explicitly install old-fashioned locales like en_US.UTF-8. (People wanting a reasonable international English locale and not wanting en_US may wish to check out en_DK.UTF-8.)
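Until such a locale is installed, Python code can at least probe what the system offers instead of failing with "unsupported locale setting". This is a minimal sketch, assuming a preference order of my own choosing; the helper name first_available is hypothetical. It uses only locale.setlocale() and locale.Error, and leans on the fact that the C locale is always there as a last resort.

```python
import locale


def first_available(candidates, category=locale.LC_ALL):
    """Hypothetical helper: set and return the first locale that works."""
    for name in candidates:
        try:
            # setlocale() raises locale.Error if the locale is not installed.
            return locale.setlocale(category, name)
        except locale.Error:
            continue
    return None


# Assumed preference order: a conventional UTF-8 locale first, then the
# Debian-specific C.UTF-8, then plain C, which is always available.
chosen = first_available(["en_US.UTF-8", "C.UTF-8", "C"])
print("Using locale:", chosen)
```

In the python:3.6 image from the question, where locale -a lists only C, C.UTF-8 and POSIX, the first candidate would be skipped and the second one used.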

Yeah, that involves some amount of

Getting your hands dirty

Here is a collection of references on Docker-oriented locale setup.

I don't approve of one anti-pattern that repeats in the above, but expanding on it would go too far afield from this question, so, very briefly:

Setting the locale should usually involve setting only LANG. Setting LC_ALL, especially along with LANG, is a no-no.

From the Debian wiki:

⚠️ WARNING

Using LC_ALL is strongly discouraged as it overrides everything. Please use it only when testing and never set it in a startup file.
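To see why LC_ALL "overrides everything", it helps to spell out the lookup order that POSIX describes for each category: LC_ALL first, then the category's own variable, then LANG, with empty values treated as unset. The function below is only an illustrative model of that order, not what the C library literally runs; effective_locale is a hypothetical name, and the final fallback to "C" is an assumption (it matches glibc, but POSIX leaves the default to the implementation).

```python
def effective_locale(env, category):
    """Hypothetical helper: model the POSIX lookup order for one category."""
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:  # empty strings count as unset
            return value
    return "C"  # assumption: fallback when nothing is set


# The environment from the question: only LANG carries a value.
question_env = {"LANG": "C.UTF-8", "LANGUAGE": "", "LC_ALL": ""}
from_lang = effective_locale(question_env, "LC_TIME")

# With LC_ALL set, the more specific LC_TIME choice is ignored.
mixed_env = {"LANG": "en_US.UTF-8", "LC_TIME": "en_DK.UTF-8", "LC_ALL": "C"}
overridden = effective_locale(mixed_env, "LC_TIME")

# Without LC_ALL, the per-category variable wins over LANG.
del mixed_env["LC_ALL"]
specific = effective_locale(mixed_env, "LC_TIME")
```

This is why setting only LANG is the gentler choice: it supplies a default while leaving room for individual LC_* variables to refine it.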

