I need to use pyparsing to parse text that contains Unicode characters. I tried the simple example from the project's GitHub repository with a French word containing an accented character (cédille), and it fails.
My code:
from pyparsing import Word, alphas
greet = Word(alphas) + "," + Word(alphas) + "!"
hello = "Hello, cédille!"
greet.parseString(hello)
and it raises this error:
pyparsing.ParseException: Expected "!" (at char 8), (line:1, col:9)
Is there a way to solve this problem?
Pyparsing has the pyparsing_unicode module, which defines a number of Unicode character ranges, with definitions for alphas, nums, and so on within each range. Ranges include CJK, Cyrillic, Devanagari, Hebrew, Arabic, and others. The greetingInGreek.py and greetingInKorean.py examples in the examples directory show a couple of these in action.
Your example, using the Latin1 set, will look like:
from pyparsing import Word, pyparsing_unicode as ppu
intl_alphas = ppu.Latin1.alphas
greet = Word(intl_alphas) + "," + Word(intl_alphas) + "!"
hello = "Hello, cédille!"
print(greet.parseString(hello))
Prints:
['Hello', ',', 'cédille', '!']
alphas8bit will probably be kept for legacy support, but new applications should use pyparsing_unicode.Latin1.alphas.
alphas is apparently English / pure ASCII only. The following appears to work:
from pyparsing import Word, alphas, alphas8bit
greet = Word(alphas+alphas8bit) + "," + Word(alphas+alphas8bit) + "!"
hello = "Hello, cédille!"
greet.parseString(hello)
This is Unicode, so there is nothing particularly "8-bit" about the character é. But if the documentation is at least approximately correct, alphas8bit only covers the accented letters of Latin-1, so I would expect this to still break with slightly more exotic accented characters: anything not available in Latin-1, such as Czech or Polish accented characters, or, to go to the extreme, Vietnamese.
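To see which characters fall outside Latin-1, here is a small standard-library sketch. The helper name in_latin1 and the sample words are made up for illustration, and the check only tests whether a character is encodable as Latin-1. It does not inspect what alphas8bit actually contains.

```python
def in_latin1(ch: str) -> bool:
    """Return True if the character can be encoded as Latin-1 (ISO 8859-1)."""
    try:
        ch.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# Sample words (illustrative only): French, Czech, Polish, Vietnamese
for word in ("cédille", "žluťoučký", "żółć", "Việt"):
    # Collect the characters that Latin-1 cannot represent
    outside = [ch for ch in word if not in_latin1(ch)]
    print(word, "-> outside Latin-1:", outside)
```

Any character reported as outside Latin-1 is one that a Latin-1-based character set would presumably not match.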
Maybe explore the unicodedata module to get a proper enumeration of "alphabetic" characters, or find a third-party module which exposes this Unicode feature properly.
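A minimal sketch of the unicodedata idea, assuming that "alphabetic" can be approximated by the Unicode general categories starting with L (letters). The function name unicode_letters and the choice to stop at the Basic Multilingual Plane are my own assumptions, not something pyparsing prescribes.

```python
import sys
import unicodedata


def unicode_letters(limit: int = sys.maxunicode) -> str:
    """Return all code points up to `limit` whose general category is a letter (L*)."""
    return "".join(
        ch
        for ch in map(chr, range(limit + 1))
        if unicodedata.category(ch).startswith("L")
    )


# Restrict to the Basic Multilingual Plane to keep the string smaller
bmp_letters = unicode_letters(0xFFFF)

print(len(bmp_letters))
for ch in "éłčệ":
    # Each of these precomposed accented letters should be reported as present
    print(ch, ch in bmp_letters)
```

The resulting string could then, in principle, be passed to Word(...) in place of alphas. Note that this only covers precomposed letters. Text that uses combining marks (category Mn) for its accents would need those added as well.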