Simple filtering with `grep`, `awk`, `sed`, or whatever else is capable

Question

I have a file, each line of which can be described by this grammar:

<text> <colon> <fullpath> <comma> <"by"> <text> <colon> <text> <colon> <text> <colon> <text>

E.g.,

needs fixing (Sunday): src/foo/io.c, by Smith : in progress : <... random comment ...>

How do I get the <fullpath> portion, which lies between the first <colon> and the first <comma>?

(I'm not very inclined to write a program to parse this, though it looks like it could be done easily with javacc. I'm hoping to use built-in tools like sed, awk, ...)

Answer 1

Or with a regex substitution:

sed -n 's/^[^:]*: *\([^:,]*\),.*/\1/p' file

This uses basic regular expression syntax, where grouping parentheses are written with backslashes. If you use the -E option (extended regular expressions) instead, take out the backslashes before the parentheses. Or just go with Perl:

perl -nle 'print $1 if m/:\s*(.*?),/' file
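For comparison, here is a sketch of the same substitution written in extended regular expression syntax, which both GNU and BSD sed accept with -E. The function name extract_path is made up for illustration, and the ` *` after the colon is an addition that drops the blank in front of the path.

```shell
# extract_path is a hypothetical helper name, not part of any tool.
# It reads lines on stdin and prints the text between the first colon
# and the first comma after it, without leading spaces.
# Lines that do not match the pattern are not printed (-n plus the p flag).
extract_path() {
    sed -nE 's/^[^:]*: *([^,]*),.*/\1/p'
}

printf '%s\n' 'needs fixing (Sunday): src/foo/io.c, by Smith : in progress : <... random comment ...>' | extract_path
```

To run it over a whole file, redirect the file into the function: extract_path < file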

Answer 2

Assuming the input will be similar to what you have above:

awk '{print $4}' | tr -d ,

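To process an entire file, put the file name right after the awk program, before the pipe: awk '{print $4}' file | tr -d ,. Note that this relies on the path being the fourth whitespace-separated field, so it only works when the leading text is always exactly three words.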
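If the leading text can vary in length, one possible variation is to split on the separators themselves rather than on whitespace. This is only a sketch: the name extract_path_awk is hypothetical, and it assumes the leading text contains no comma and no colon other than the one that ends it.

```shell
# extract_path_awk is a hypothetical helper name.
# Fields are split on ':' or ',', so field 2 is the text between
# the first colon and the first comma (assuming no earlier comma).
# Lines with fewer than two separators are skipped.
extract_path_awk() {
    awk -F'[:,]' 'NF > 2 { sub(/^ +/, "", $2); print $2 }'
}

printf '%s\n' 'needs fixing (Sunday): src/foo/io.c, by Smith : in progress : <... random comment ...>' | extract_path_awk
```

Because the path is taken whole from between the separators, no tr step is needed to strip the comma.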

Answer 3

If you're using a bash script to parse this stuff, you don't even need tools like awk or sed.

$ text="needs fixing (Sunday): src/foo/io.c, by Smith : in progress : <... comment ...>"
$ text=${text%%,*}
$ text=${text#*: }
$ echo "$text"
src/foo/io.c

Read about this on the bash man page under Parameter Expansion.
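The same two expansions can be applied to every line of a file with a read loop. The sketch below uses a made-up function name, print_paths, and assumes each line has the ": " and "," separators described in the question; a line without them is printed unchanged.

```shell
# print_paths is a hypothetical helper name; it reads lines on stdin.
print_paths() {
    while IFS= read -r line; do
        line=${line%%,*}    # remove the longest suffix starting at a comma
        line=${line#*: }    # remove the shortest prefix ending in ": "
        printf '%s\n' "$line"
    done
}

printf '%s\n' 'needs fixing (Sunday): src/foo/io.c, by Smith : in progress : <... comment ...>' | print_paths
```

For a file, the call would be: print_paths < file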

Answer 4

With GNU grep:

grep -oP '(?<=: ).*?(?=,)'

Because -o prints every match, this may print more than one substring per line if a later ": " is followed by another comma.
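One way to limit the output to the first path on each line is to anchor the pattern at the start of the line and use \K to discard the prefix from the reported match. This is a sketch that, like the command above, assumes GNU grep with PCRE support (-P); the function name first_path is hypothetical.

```shell
# first_path is a hypothetical helper name; needs GNU grep with -P.
# ^[^:]*:   matches from the line start through the first colon and a space
# \K        drops everything matched so far from the reported match
# [^,]+     the path itself, up to (not including) the next comma
first_path() {
    grep -oP '^[^:]*: \K[^,]+(?=,)'
}

printf '%s\n' 'needs fixing (Sunday): src/foo/io.c, by Smith : in progress : one, two' | first_path
```

Since the pattern can only match at the beginning of a line, commas inside the trailing comment no longer produce extra output.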

4 answers

solution1
2 2012-09-26 18:10:50

solution2
1 ACCEPTED 2012-09-26 17:37:58

solution3
1 2012-09-26 18:35:41

solution4
1 2012-09-26 20:16:23