Extract & store Strings with uneven spaces using AWK

I have a file containing data like the sample below. I want to extract the first and last columns and store them in variables. I can print them with the command awk -F" {2,}" '{print $1,$NF}' filename.txt, but I am unable to store them in variables using awk -v.

The main problem is that the first column can contain spaces between words, and awk treats it as 3 columns when I use awk -v.

Please suggest how I can achieve this.


XML 2144 11270 2846 3385074

Java 7356 272651 242949 1350596

C++ 671 46497 42702 179366

C/C++ Header 671 16932 57837 44248

XSD 216 3131 807 27634

Korn Shell 129 3686 4279 12431

IDL 90 1098 0 8697

Perl 17 717 795 5698

Python 37 1102 786 4640

Ant 62 596 154 4015

XSLT 18 117 13 2153

make 14 414 1659 1833

Bourne Again Shell 32 532 469 1830

JavaScript 10 204 35 1160

CSS 5 95 45 735

SKILL 2 77 0 523

HTML 11 70 49 494

SQL 9 39 89 71

C Shell 3 13 25 31

D 1 5 15 10

SUM: 11498 359246 355554 5031239

The -v VAR=value parameter is evaluated before the awk code executes. It's not actually part of the code, so you can't reference fields because they don't exist yet. Instead, set the variable in code:

awk '{ Lang=$1; Last=$NF; print Lang, Last; }'
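To make the direction of -v concrete, here is a minimal sketch. It passes a shell value into awk, while the fields are read inside the main block. The sample line, the suffix variable and the unit name are made up for illustration and are not part of the question.

```shell
# Hypothetical sample line in the same shape as the data in the question.
line='Python 37 1102 786 4640'
suffix='lines'   # hypothetical shell value handed to awk via -v

# -v copies a shell value INTO awk before any input is read;
# fields such as $1 and $NF only exist inside the main block.
result=$(printf '%s\n' "$line" |
  awk -v unit="$suffix" '{ Lang = $1; Last = $NF; print Lang, Last, unit }')

echo "$result"   # prints: Python 4640 lines
```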

Also, setting those variables within awk won't affect bash's variables. Environments are hierarchical: each child environment inherits some state from the parent environment, but it never flows back upwards. The only way to get state from a child is for the child to print it in a format that the parent can handle. For example, you can pipe the above command to while read LANG LAST; do ...; done to read the awk output into variables.
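The same rule applies to the loop itself: in bash with default settings, a while loop at the end of a pipeline runs in a child process, so anything it sets is gone once the loop ends. One way around that, sketched here under the assumption that you want totals after the loop, is to feed the awk output through a here-document so the loop runs in the current shell. The variable names (data, name, last, count, total) are illustrative only; lowercase names are used on purpose because LANG is also the locale environment variable.

```shell
# Two sample rows copied from the question's data.
data='XML 2144 11270 2846 3385074
Java 7356 272651 242949 1350596'

count=0
total=0
# The loop reads from a here-document, so it runs in the current shell
# and count/total are still set after "done".
while read -r name last; do
  count=$((count + 1))
  total=$((total + last))
done <<EOF
$(printf '%s\n' "$data" | awk '{ print $1, $NF }')
EOF

echo "$count rows, last-column total $total"
```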

It seems from your comment that you're trying to mix awk and shell in a way that doesn't quite make sense. So the correct full code (for getting the variables in a bash loop) would be:

cat loc.txt | awk '{ Lang=$1; Last=$NF; print Lang, Last; }' | while read LANG LAST; do ...; done

Or if it's a fixed number of fields, you can skip awk entirely:

cat loc.txt | while read LANG _ _ _ LAST; do ...; done

where the "_" is just a throwaway variable name for the fields you want to ignore. Using an underscore as a placeholder is a convention in several programming languages. Note that in bash $_ is also a special parameter that the shell overwrites after each command, so don't rely on reading the value back with echo $_ . You'd give it a real name, and name each field differently, if you cared about the middle values.

Neither of these solutions cares about how much whitespace there is. Awk doesn't care unless you tell it to, and neither does the shell.
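One caveat: because both approaches split on every run of whitespace, a multi-word name such as "Korn Shell" yields only its first word as the first field. A possible workaround, assuming every row ends with exactly four numeric columns as in the sample, is to rebuild the name from all fields except the last four and print it with a delimiter that cannot occur in the data. The "|" delimiter and the variable names here are my own choices, not something from the question.

```shell
# Rows copied from the question's data, including multi-word names.
data='Korn Shell 129 3686 4279 12431
C/C++ Header 671 16932 57837 44248
XML 2144 11270 2846 3385074'

# Assumption: the last four fields are always the numeric columns,
# so fields 1 .. NF-4 together form the name.
parsed=$(printf '%s\n' "$data" | awk '{
  name = $1
  for (i = 2; i <= NF - 4; i++) name = name " " $i
  print name "|" $NF
}')

# Split on "|" only, so spaces inside the name survive.
while IFS='|' read -r name last; do
  printf '%s -> %s\n' "$name" "$last"
done <<EOF
$parsed
EOF
```

This joins the words of the name with single spaces, so any extra spacing inside a name is normalised.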
