
cut command in bash terminating on quotation marks

I am trying to read a file in which each line contains an email address followed by a nickname. I want to extract the nickname, which is surrounded by parentheses, like this:

email@somewhere.com (Tom)

My thought was to use cut to get at the word Tom, but this is foiled when I end up with something like the following:

email2@somewhereElse.com ("Bob")

Because Bob has quotes around it, the cut command fails as follows:

cut: <file>: Illegal byte sequence

Does anyone know a better way of doing this, or a way to solve this problem?

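Reset your locale to C, which treats input as a raw, uninterpreted byte sequence, to avoid Illegal byte sequence errors. locale charmap shows the encoding currently in effect; prefixing a command with LC_ALL=C overrides the locale for that command only.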

locale charmap
LC_ALL=C cut ... | LC_ALL=C sort ...
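To make the elided pipeline above concrete, here is a minimal sketch of how the nickname could be pulled out with cut under the C locale. The question does not show the actual cut invocation, so the delimiters, the helper name nicknames_c_locale, and the sample input are all assumptions. The \223 and \224 bytes are Windows-1252 curly quotes, which are not valid UTF-8; that is one plausible cause of the error, not a confirmed one.

```shell
# nicknames_c_locale: hypothetical helper, reads lines on stdin and prints
# one nickname per line. Every stage runs with LC_ALL=C so bytes are never
# decoded as multibyte characters.
nicknames_c_locale() {
    # Field 2 of "(" is everything after the open paren;
    # field 1 of ")" then drops the close paren and anything after it.
    LC_ALL=C cut -d '(' -f 2 |
        LC_ALL=C cut -d ')' -f 1 |
        # Strip ASCII double quotes and the assumed curly-quote bytes.
        LC_ALL=C tr -d '"\223\224'
}

# Assumed sample input; the second line carries non-UTF-8 quote bytes.
printf 'email@somewhere.com (Tom)\nemail2@somewhereElse.com (\223Bob\224)\n' |
    nicknames_c_locale
```

If the file uses a different encoding, the bytes passed to tr would need adjusting.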

I think that

grep -o '(.*)' emailFile 

should do it. In words: "Go through all lines in the file. Look for a sequence that starts with an open paren, then any characters until a close paren. Echo the part that matches to stdout."
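One caveat that may matter: .* is greedy, so if a line ever holds more than one parenthesised group, the match runs from the first open paren to the last close paren. The sketch below compares the pattern above with a bounded variant, ([^)]*), on a made-up line; the input and variable names are assumptions for illustration only.

```shell
# Hypothetical input line with two parenthesised groups.
line='email@somewhere.com (Tom) (work)'

# Greedy: .* runs to the last ")" on the line, so both groups
# come back as a single match.
greedy=$(printf '%s\n' "$line" | grep -o '(.*)')

# Bounded: [^)]* cannot cross a ")", so each group is its own match
# and grep -o prints them on separate lines.
bounded=$(printf '%s\n' "$line" | grep -o '([^)]*)')

printf 'greedy:\n%s\n' "$greedy"
printf 'bounded:\n%s\n' "$bounded"
```

For files where every line has exactly one group, the two patterns behave the same.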

The grep command preserves the quotes around the nickname, as well as the parentheses. If you don't want those, you can strip them:

grep -o '(.*)' emailFile | sed 's/[(")]//g'

("replace any of the characters between square brackets with nothing, everywhere")
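As an alternative sketch, the match and the stripping could probably be folded into a single sed call that captures only the text between the parentheses. This is not the pipeline above, just one possible variant; the helper name extract_nickname and the sample lines are assumptions, and it only handles plain ASCII double quotes.

```shell
# extract_nickname: hypothetical helper; reads lines on stdin and prints the
# text inside the last (...) group of each line, minus optional ASCII quotes.
extract_nickname() {
    # "\{0,1\} matches an optional double quote; \([^")]*\) captures the name.
    # -n plus the p flag prints only lines where the substitution matched.
    sed -n 's/.*("\{0,1\}\([^")]*\)"\{0,1\}).*/\1/p'
}

# Assumed sample input, mirroring the two lines from the question.
printf '%s\n' 'email@somewhere.com (Tom)' 'email2@somewhereElse.com ("Bob")' |
    extract_nickname
```

Lines without any parentheses produce no output at all, which may or may not be what you want.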

perl -lne 'print $1 if /\(([^)]*)\)/' emailFile

