I have a file whose encoding differs from the machine's. When using a regex, . does not match characters that are non-printable in the current character set.
The following prints 0:
echo -e "\xfc" | awk '{ print match( $0, "^.*$" ) }'
How can I match all characters, including non-printable ones?
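I can confirm that it doesn't work with the de_DE.UTF-8 locale, but both de_DE.iso88591 and C print a 1. (A lone 0xFC byte is not a valid UTF-8 sequence, which is probably why . does not match it there.) I can't tell you why, but the [:alpha:] character class matches: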
echo -e "\xfc" | awk '{ print match( $0, "^([[:alpha:]]|.)*$" ) }'
Or maybe you could change the locale settings for that awk call:
OLDLANG=$LANG; export LANG=de_DE.iso88591; echo -e "\xfc" | awk '{ print match( $0, "^.*$" ) }'; export LANG=$OLDLANG
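A shorter way to limit the locale change to the one awk call is to put the assignment directly in front of the command, so nothing has to be saved and restored. This is only a sketch, assuming a POSIX shell. It uses LC_ALL instead of LANG because LC_ALL takes precedence over both LANG and LC_CTYPE, and printf with an octal escape instead of echo -e, which not every shell supports. The variable name result is just for illustration.

```shell
# The LC_ALL=C prefix applies to this single awk process only;
# the shell's own locale settings are left untouched.
# printf '\374\n' emits the single byte 0xFC (octal 374) plus a newline.
result=$(printf '\374\n' | LC_ALL=C awk '{ print match($0, "^.*$") }')

# In the C locale every byte counts as a character, so this prints 1.
echo "$result"
```

If you need awk to interpret the data as ISO 8859-1 rather than as raw bytes (for example so that character classes like [:alpha:] behave sensibly), use an installed Latin-1 locale such as the de_DE.iso88591 from above in place of C.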
See also: "Using special characters in a string argument to the awk match function" and "Current locale settings".