AWK: Extract a string between two different patterns

Question

I need to extract a string from one column of my CSV file.

My file looks like this:

col1;col2;col3;cleavage=10-11;
col1;col2;col3;cleavage=1-2;
col1;col2;col3;cleavage=100-101;
col1;col2;col3;none;

The delimiter of my file is ";", but in column 4 I want to extract the string between "cleavage=" and "-". What I did was print the 2 characters after "cleavage=", but the value is not always 2 characters long.

This is what I did:

awk -F "\"*;\"*" '{if (match($4,"cleavage=")) print $1";"$2";"$3";"substr($4,RSTART+9,2); else print $1";"$2";"$3";0"}' file

I figured out that the following should be the right kind of command, but how should I integrate it into the previous one?

awk "/Pattern1/,/Pattern2/ { print }" inputFile

Thanks for your help! :)

EDIT: My actual output is:

col1;col2;col3;10;
col1;col2;col3;1-;
col1;col2;col3;10;
col1;col2;col3;0;

But what I would like is:

col1;col2;col3;10;
col1;col2;col3;1;
col1;col2;col3;100;
col1;col2;col3;0;

Answer 1

You can use awk with multiple delimiters as the field separator:

awk -F '[;=-]' -v OFS=';' '{print $1, $2, $3, ($4 == "cleavage") ? $5 : 0, ""}' file
col1;col2;col3;10;
col1;col2;col3;1;
col1;col2;col3;100;
col1;col2;col3;0;
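
To see why $5 holds the wanted value, it may help to print how the bracket expression [;=-] splits a single line. This is only an illustrative sketch; the variable names line and fields are made up for the demo and are not part of the answer.

```shell
# Show how the separator class [;=-] splits one sample line into fields.
# "line" and "fields" are hypothetical names used only for this demo.
line='col1;col2;col3;cleavage=10-11;'
fields=$(printf '%s\n' "$line" |
  awk -F '[;=-]' '{ for (i = 1; i <= NF; i++) printf "%d=<%s>\n", i, $i }')
printf '%s\n' "$fields"
```

Every ";", "=" and "-" ends a field, so "cleavage" lands in $4 and the number before the dash in $5. A line whose fourth column is "none" has no "=" or "-", so $4 is "none" and the ternary falls back to 0.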

EDIT: In case - or = can be present in fields before $4, you can use:

awk -F ';' -v OFS=';' '{split($4, a, /[=-]/);
           print $1, $2, $3, (a[1] == "cleavage") ? a[2] : 0, ""}' file
col1;col2;col3;10;
col1;col2;col3;1;
col1;col2;col3;100;
col1;col2;col3;0;

Answer 2

The exact format is unclear, but this works for your example and will still work if = and - appear in other fields.

GNU awk (for the third argument to match):

awk '{match($0,/(.*);[^-0-9]*([0-9]*)[^;]*;$/,a);print a[1] ";" (a[2]+0) ";"}' file

col1;col2;col3;10;
col1;col2;col3;1;
col1;col2;col3;100;
col1;col2;col3;0;

or sed:

sed 's/;[^-0-9]*\([0-9]\{1,\}\)[^;]*;$/;\1;/;t;s/[^;]*;$/0;/' file
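
For awk implementations that lack the three-argument match, a minimal sketch in the same spirit can rely on the standard two-argument match together with RSTART and RLENGTH. It assumes, as in the sample, that the value sits in column 4 and that digits follow "cleavage=" directly. The function name extract_cleavage and the variable result are hypothetical.

```shell
# Portable sketch: keep only the digits that follow "cleavage=" in column 4.
# extract_cleavage is a hypothetical helper name, not part of the answer above.
extract_cleavage() {
  awk -F ';' -v OFS=';' '{
    # match() sets RSTART and RLENGTH; "cleavage=" is 9 characters long
    if (match($4, /cleavage=[0-9]+/))
      $4 = substr($4, RSTART + 9, RLENGTH - 9)
    else
      $4 = 0
    print
  }' "$@"
}

result=$(printf '%s\n' \
  'col1;col2;col3;cleavage=10-11;' \
  'col1;col2;col3;cleavage=1-2;' \
  'col1;col2;col3;cleavage=100-101;' \
  'col1;col2;col3;none;' | extract_cleavage)
printf '%s\n' "$result"
```

Because the regular expression includes the digits, RLENGTH - 9 is the length of the number, so the substr call no longer depends on a fixed width of 2 as in the question's command.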

Answer 3

I came up with this one-liner:

 awk -F';' -v OFS=";" '{sub(/cleavage=/,"",$(NF-1));
                        sub(/-.*/,"",$(NF-1));$(NF-1)+=0}7' file

It gives:

col1;col2;col3;10;
col1;col2;col3;1;
col1;col2;col3;100;
col1;col2;col3;0;
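
The one-liner leans on two awk idioms: adding 0 forces a field to a number, so a non-numeric string such as "none" becomes 0, and a bare non-zero pattern such as 7 prints the (rebuilt) record. A small sketch of the numeric coercion, with a made-up input line and a hypothetical variable name coerced:

```shell
# Adding 0 converts a field to a number; non-numeric text becomes 0.
# The input line and the name "coerced" are illustrative only.
coerced=$(printf '%s\n' 'none;100' | awk -F ';' '{ print $1 + 0, $2 + 0 }')
printf '%s\n' "$coerced"
```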

3 answers

Answer 1
1 accepted 2015-10-20 13:40:54

Answer 2
1 2015-10-20 13:48:11

Answer 3
0 2015-10-20 13:47:05
