
What does this line mean in bash?

spkgender=$(perl -ane ' s/.*gender\:\W*(.).*/lc($1)/ei && print; ' <$rdm)

It is a regex substitution that extracts the first letter of the value and lowercases it (m from 'Gender: Male'), but it doesn't work for Unicode.

How can I make it work with Unicode?

It doesn't work for 'Gender: Мужской'; it looks like \W "eats" all the Unicode characters.

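Use the /u regex modifier, which makes the pattern follow Unicode rules. Source: https://perldoc.perl.org/perlre.html
Note that /u only changes how the pattern matches; the input still has to be decoded as UTF-8 (for example with the -C option mentioned below), otherwise \W and . see individual bytes rather than characters.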

spkgender=$(perl -ane ' s/.*gender\:\W*(.).*/lc($1)/uei && print; ' <$rdm)

Alternatively, use a POSIX character class: instead of \W use [[:blank:]], which matches only horizontal whitespace such as spaces and tabs. As far as I know, it supports Unicode.

Also, please make sure you are using Unicode correctly in general. Reference: https://perldoc.perl.org/perlunicode.html

When the string has come from an external source marked as Unicode
The -C command line option can specify that certain inputs to the program are Unicode, and the values of this can be read by your Perl code, see ${^UNICODE} in perlvar.
