
python - locale in dateutil / parser

I set the locale:

locale.setlocale(locale.LC_TIME, ('de', 'UTF-8'))

The string to parse is:

Montag, 11. April 2016 19:35:57

I use:

note_date = parser.parse(result.group(2))

but I get the following error:

Traceback (most recent call last):
  File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1531, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 938, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/adieball/Dropbox/Multiverse/Programming/python/repositories/kindle/kindle2en.py", line 250, in <module>
    main(sys.argv[1:])
  File "/Users/adieball/Dropbox/Multiverse/Programming/python/repositories/kindle/kindle2en.py", line 154, in main
    note_date = parser.parse(result.group(2))
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/dateutil/parser.py", line 1164, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/dateutil/parser.py", line 555, in parse
    raise ValueError("Unknown string format")
ValueError: Unknown string format

A debugging session shows that the parser is not using the "correct" (German) dateutil values; it is still using the English ones.


I'm sure I'm missing something obvious here, but I can't find it.

Thanks.

dateutil.parser doesn't use locale. You'll need to subclass dateutil.parser.parserinfo and construct a German equivalent:

from dateutil import parser

class GermanParserInfo(parser.parserinfo):
    WEEKDAYS = [("Mo.", "Montag"),
                ("Di.", "Dienstag"),
                ("Mi.", "Mittwoch"),
                ("Do.", "Donnerstag"),
                ("Fr.", "Freitag"),
                ("Sa.", "Samstag"),
                ("So.", "Sonntag")]

s = 'Montag, 11. April 2016 19:35:57'
note_date = parser.parse(s, parserinfo=GermanParserInfo())

You'd need to extend this to also work for other values, such as month names. The example string only parses as-is because "April" is spelled the same in English and German.
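As a minimal sketch of that extension, the class below adds a MONTHS table next to the weekday names. The abbreviations are my own assumption (undotted forms such as "Mo" and "Mrz"), not something dateutil ships, and the second test string is made up for illustration.

```python
from datetime import datetime
from dateutil import parser


class GermanParserInfo(parser.parserinfo):
    # Assumed abbreviations; adjust them to whatever your input actually uses.
    WEEKDAYS = [("Mo", "Montag"),
                ("Di", "Dienstag"),
                ("Mi", "Mittwoch"),
                ("Do", "Donnerstag"),
                ("Fr", "Freitag"),
                ("Sa", "Samstag"),
                ("So", "Sonntag")]
    # One tuple per month, in calendar order (January first).
    MONTHS = [("Jan", "Januar"),
              ("Feb", "Februar"),
              ("Mrz", "März"),
              ("Apr", "April"),
              ("Mai",),
              ("Jun", "Juni"),
              ("Jul", "Juli"),
              ("Aug", "August"),
              ("Sep", "September"),
              ("Okt", "Oktober"),
              ("Nov", "November"),
              ("Dez", "Dezember")]


german = GermanParserInfo()
note_date = parser.parse("Montag, 11. April 2016 19:35:57", parserinfo=german)
other_date = parser.parse("Dienstag, 3. Oktober 2017 08:15:00", parserinfo=german)
print(note_date, other_date)
```

Because this table replaces the default one, English-only spellings such as "October" are no longer recognised by this parserinfo.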

In another answer, I posted a simple locale-aware parserinfo class. It isn't a complete solution for every language in the world, but it solved all my localization problems.

Here it is:

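import calendar
from dateutil import parser

class LocaleParserInfo(parser.parserinfo):
    # The names are read from the locale that is active when this class is
    # defined, so call locale.setlocale() before importing this module.
    # list() keeps the pairs reusable; a bare zip() is exhausted after one use.
    WEEKDAYS = list(zip(calendar.day_abbr, calendar.day_name))
    MONTHS = list(zip(calendar.month_abbr, calendar.month_name))[1:]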

And you can use it like this:

In [1]: import locale; locale.setlocale(locale.LC_ALL, "pt_BR.utf8")

In [2]: from localeparserinfo import LocaleParserInfo

In [3]: from dateutil.parser import parse

In [4]: parse("Ter, 01 Out 2013 14:26:00 -0300", parserinfo=LocaleParserInfo())
Out[4]: datetime.datetime(2013, 10, 1, 14, 26, tzinfo=tzoffset(None, -10800))
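The class-level tables are filled in once, when the class body runs. If the locale may change afterwards, one possible variant (a sketch of my own, with the hypothetical name CurrentLocaleParserInfo) reads the calendar names each time an instance is created. The sample string is built from the calendar names so that it matches whichever LC_TIME locale is active.

```python
import calendar
from datetime import datetime, timedelta
from dateutil import parser


class CurrentLocaleParserInfo(parser.parserinfo):
    """Hypothetical variant: look up the locale's names at instantiation."""

    def __init__(self, dayfirst=False, yearfirst=False):
        # Instance attributes shadow the class-level tables that
        # parserinfo.__init__ converts into its lookup dictionaries.
        self.WEEKDAYS = list(zip(calendar.day_abbr, calendar.day_name))
        self.MONTHS = [(calendar.month_abbr[i], calendar.month_name[i])
                       for i in range(1, 13)]
        super().__init__(dayfirst, yearfirst)


# import locale; locale.setlocale(locale.LC_TIME, "pt_BR.utf8")  # optional
info = CurrentLocaleParserInfo()

# 1 October 2013 was a Tuesday: day index 1, month index 10.
text = "{}, 01 {} 2013 14:26:00 -0300".format(calendar.day_abbr[1],
                                             calendar.month_abbr[10])
stamp = parser.parse(text, parserinfo=info)
print(text, "->", stamp)
```

Whether the abbreviations a given locale reports match the text you need to parse is something to check per locale.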

Test it and take a look at the class variables in the original parserinfo, especially the HMS variable. You may need to override other variables as well.
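To see which tables exist, you can print the class attributes of dateutil.parser.parserinfo. The German HMS override that follows is only an illustration; the class name and the word lists are my own assumptions.

```python
from dateutil import parser

# Default lookup tables defined on parserinfo (all English).
for name in ("JUMP", "WEEKDAYS", "MONTHS", "HMS", "AMPM", "UTCZONE", "PERTAIN"):
    print(name, "=", getattr(parser.parserinfo, name))


class GermanUnitsInfo(parser.parserinfo):
    # Hypothetical German unit words: hours, minutes, seconds, in that order.
    HMS = [("h", "Stunde", "Stunden"),
           ("m", "Minute", "Minuten"),
           ("s", "Sekunde", "Sekunden")]


units = GermanUnitsInfo()
# hms() returns the index of the matching unit (0 = hours, 1 = minutes, ...).
print(units.hms("Stunden"), units.hms("Minute"), units.hms("Sekunden"))
```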

Multiple languages - allow English and German month names

Here is a way to support multiple languages at once. I know there are other possibilities using calendar.day_abbr and calendar.day_name, but this was the most convenient one for me. Just combine all the month names and list them together, one tuple per month. dateutil.parser will then accept any of them.

from dateutil import parser as dateparser

class LocaleParserInfo(dateparser.parserinfo):
        MONTHS = [('Jan', 'Januar', 'January', 'Jänner'),
                  ('Feb', 'Februar', 'February'),
                  ('Mrz', 'März', 'March', 'Mar'),
                  ('Apr', 'April'),
                  ('Mai', 'May'),
                  ('Jun', 'Juni', 'June'),
                  ('Jul', 'Juli', 'July'),
                  ('Aug', 'August'),
                  ('Sep', 'September'),
                  ('Okt', 'Oktober', 'October', 'Oct'),
                  ('Nov', 'November'),
                  ('Dez', 'Dezember', 'Dec', 'December')]

parsed_date = dateparser.parse("31 Jänner 2022", dayfirst=True, parserinfo=LocaleParserInfo())
parsed_date = dateparser.parse("31.December 2022", dayfirst=True, parserinfo=LocaleParserInfo())
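Instead of retyping the English names, the combined table can also be derived from the defaults. This is a sketch under my own assumptions: GERMAN_MONTHS and BilingualParserInfo are hypothetical names, and the German spellings are the ones listed above.

```python
from datetime import datetime
from dateutil import parser as dateparser

# Extra German spellings per month, January first (assumed list).
GERMAN_MONTHS = [("Januar", "Jänner"), ("Februar",), ("Mrz", "März"),
                 ("April",), ("Mai",), ("Juni",), ("Juli",), ("August",),
                 ("September",), ("Okt", "Oktober"), ("November",),
                 ("Dez", "Dezember")]


class BilingualParserInfo(dateparser.parserinfo):
    # Append the German names to each default (English) month tuple.
    MONTHS = [tuple(english) + german
              for english, german in zip(dateparser.parserinfo.MONTHS,
                                         GERMAN_MONTHS)]


info = BilingualParserInfo()
german_date = dateparser.parse("31 Jänner 2022", parserinfo=info)
english_date = dateparser.parse("31 January 2022", parserinfo=info)
short_date = dateparser.parse("3 Okt 2022", parserinfo=info)
print(german_date, english_date, short_date)
```

Duplicated spellings such as "April" or "August" are harmless, since each name simply maps to the same month index again.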
