awk/sed regex, extract a column that has the delimiter

Question

I have a file with this format: two columns of numbers at the beginning, two columns of numbers at the end, and one column in the middle that holds a name. The name can itself contain spaces, which is also the column delimiter, and that messes things up.

Is there a regex that extracts the name column correctly? Alternatively, is there a way to use sed to replace (or remove) the spaces in that column so that I can extract it easily?

Example:

 1 2 name 3 4
 12 12 name1 name2 3 4
 12 12 name1 name2 name3 name4 3 4 
 3 4 name 3 4

The output that I want is:

name 
name1_name2
name1_name2_name3_name4
name

Thanks,

Amir,

Answer 1

One solution using awk, which prints fields 3 through NF-2 joined with underscores:

awk '{ for (i = 3; i < NF - 2; i++) printf "%s_", $i; print $(NF - 2) }' foo

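Here is the same thing using sed. Note that it strips every run of digits and spaces at both ends of the line, so a name that starts or ends with a purely numeric word would lose that word:

sed -e 's/^[0-9 ]*//' -e 's/ [0-9 ]*$//' -e 's/ /_/g' foo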

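The same thing written with POSIX character classes. The trailing pattern must start with at least one whitespace character; otherwise it would also eat digits at the end of the name (name2 would become name):

sed -e 's/^[[:digit:][:space:]]*//' -e 's/[[:space:]][[:digit:][:space:]]*$//' -e 's/ /_/g' foo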
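The awk loop from this answer can be wrapped in a small shell function that builds the joined name in a variable before printing it. This is only a minimal sketch, assuming every line has exactly two numeric columns on each side of the name; the function name `extract_name` is made up for illustration.

```shell
# extract_name (hypothetical helper): reads lines on stdin and prints
# fields 3..NF-2 joined with "_". Lines with fewer than 5 fields are skipped.
extract_name() {
    awk 'NF >= 5 {
        out = $3                       # first word of the name
        for (i = 4; i <= NF - 2; i++)  # remaining words of the name, if any
            out = out "_" $i
        print out
    }'
}

printf '%s\n' ' 1 2 name 3 4' ' 12 12 name1 name2 3 4 ' | extract_name
# prints:
# name
# name1_name2
```

Because awk's default field splitting ignores leading and trailing whitespace, the stray spaces in the sample input do not matter here.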

Answer 2

sed 's/^[0-9]\+ [0-9]\+ \(.*\) [0-9]\+ [0-9]\+$/\1/;s/ /_/g'
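The `\+` operator in that command is a GNU extension to basic regular expressions, and the pattern expects no leading or trailing whitespace on the line. A possible variant, sketched here under the assumption that the input may carry stray spaces as in the question's sample, uses the POSIX interval `\{1,\}` and strips exactly two numeric columns from each end. The function name `extract_name_sed` is hypothetical.

```shell
# extract_name_sed (hypothetical helper): reads lines on stdin.
#  1st expression: drop optional leading blanks plus the first two numeric columns
#  2nd expression: drop the last two numeric columns plus optional trailing blanks
#  3rd expression: turn each remaining run of whitespace into one underscore
extract_name_sed() {
    sed -e 's/^[[:space:]]*[0-9]\{1,\}[[:space:]]\{1,\}[0-9]\{1,\}[[:space:]]\{1,\}//' \
        -e 's/[[:space:]]\{1,\}[0-9]\{1,\}[[:space:]]\{1,\}[0-9]\{1,\}[[:space:]]*$//' \
        -e 's/[[:space:]]\{1,\}/_/g'
}

printf '%s\n' ' 12 12 name1 name2 3 4 ' | extract_name_sed
# prints: name1_name2
```

Since only two numeric columns are removed per side, a name made of numeric words (for example `12 12 7 8 3 4`) should survive as `7_8`.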

Answer 3

Another awk approach, without looping:

 awk 'BEGIN{OFS="_"}{$1=$2=$NF=$(NF-1)="";gsub(/__/,"")}1' yourFile

Test:

kent$  cat t
 1 2 name 3 4
 12 12 name1 name2 3 4
 12 12 name1 name2 name3 name4 3 4 
 3 4 name 3 4 

kent$  awk 'BEGIN{OFS="_"}{$1=$2=$NF=$(NF-1)="";gsub(/__/,"")}1' t
name
name1_name2
name1_name2_name3_name4
name
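One caveat with `gsub(/__/,"")` is that it removes every double underscore, including one that is part of a name. A possible variant, sketched here as an assumption rather than as the answer's own method, anchors the two substitutions so that only the separators left behind by the emptied fields are removed. The function name `extract_name_noloop` is hypothetical.

```shell
# extract_name_noloop (hypothetical helper): reads lines on stdin.
# Emptying the four numeric fields makes awk rebuild $0 with OFS="_",
# leaving "__" at the start and at the end of the record; the anchored
# sub() calls remove exactly those two leftovers.
extract_name_noloop() {
    awk 'BEGIN { OFS = "_" }
         NF >= 5 {
             $1 = $2 = $(NF - 1) = $NF = ""
             sub(/^__/, "")
             sub(/__$/, "")
             print
         }'
}

printf '%s\n' ' 12 12 a__b c 3 4 ' | extract_name_noloop
# prints: a__b_c
```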

Answer 4

A couple of Perl options:

perl -lne  '/\d+ \d+ (.+) \d+ \d+/ and do {($_ = $1) =~ s/ /_/g; print}'
perl -lape  'for (1..2) {shift @F; pop @F}; $_ = join "_", @F'

4 answers

solution1
2 ACCEPTED 2011-10-23 20:29:57

solution2
1 2011-10-23 20:32:04

solution3
1 2011-10-23 20:41:13

solution4
0 2011-10-24 13:17:33
