Extract multiple values using sed

I am trying to use sed to extract multiple values from a string in which the values are separated by ",".

Working example:

Input :
    echo "abc-de-aa-zzzz-1.2.3-4" | sed -E 's/(^([a-z]{3}-[a-z]{1,5}-[a-z]{1,5}-[a-z]{1,15})).*/\1/'

Output: 
     abc-de-aa-zzzz

I need help with the expression below.

Non-working example:

Input:
    echo "abc-de-aa-zzzz-1.2.3-4,abc-de-aa-kkkk-1.2.5-4" | sed -E 's/(^([a-z]{3}-[a-z]{1,5}-[a-z]{1,5}-[a-z]{1,15})).*/\1/'

 Current output:
      abc-de-aa-zzzz

 Expected output:
      abc-de-aa-zzzz,abc-de-aa-kkkk

 Output in this form would work as well:
      abc-de-aa-zzzz
      abc-de-aa-kkkk

Thanks,

Jason

Sample input:

echo $x
abc-de-aa-zzzz-1.2.3-4,abc-de-aa-kkkk-1.2.5-4

awk-only solution:

 echo $x  |awk  'BEGIN{RS=",";FS=OFS="-"}{NF=4}1'
 abc-de-aa-zzzz
 abc-de-aa-kkkk

Or this, if you want the output to be comma-separated (note that it leaves one extra comma at the end):

echo $x  |awk  'BEGIN{ORS=RS=",";FS=OFS="-"}{NF=4}1'
abc-de-aa-zzzz,abc-de-aa-kkkk,
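
If the extra comma is a problem, one possible workaround is to strip it in a final sed step. The sketch below wraps the command above in a function; the name join_names is hypothetical, and like the original command it assumes an awk that rebuilds the record when NF is assigned (gawk and mawk do).

```shell
# join_names (hypothetical helper): keep the first four "-"-separated
# fields of every comma-separated item and join the results with commas.
join_names() {
    printf '%s\n' "$1" |
        awk 'BEGIN{ORS=RS=",";FS=OFS="-"}{NF=4}1' |
        sed 's/,$//'    # remove the comma that ORS adds after the last record
}

join_names "abc-de-aa-zzzz-1.2.3-4,abc-de-aa-kkkk-1.2.5-4"
# abc-de-aa-zzzz,abc-de-aa-kkkk
```

Depending on the sed implementation, the result may or may not end with a newline, which makes no difference inside a command substitution.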

A quick-and-dirty solution using tr and awk:

echo $x |tr ',' '\n' |awk -F'-' -v OFS='-' '{NF=4}1'
abc-de-aa-zzzz
abc-de-aa-kkkk

This can be done with pure Bash parameter expansion, without any external tools such as awk or sed, although it takes two levels of extraction. You can run the commands directly on the command line.

# Read the input string into a bash array with a comma delimiter
$ IFS="," read -ra inputString <<< "abc-de-aa-zzzz-1.2.3-4,abc-de-aa-kkkk-1.2.5-4"

# For each string, strip the last two '-'-delimited components
# from the end (one "%-*" expansion per component)
$ for eachString in "${inputString[@]}"; do tempString="${eachString%-*}"; \
       tempString="${tempString%-*}"; printf "%s\n" "$tempString"; done
abc-de-aa-zzzz
abc-de-aa-kkkk
$
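
If the comma-separated form is wanted instead, the same two-step suffix removal can be combined with a join. This is only a sketch: the function name strip_versions is made up for illustration, and it deliberately avoids arrays so that it should also run under a plain POSIX sh (its variables are therefore global).

```shell
# strip_versions (hypothetical helper): remove the last two "-"-delimited
# components from every comma-separated item and re-join with commas.
strip_versions() {
    rest=$1
    out=
    while [ -n "$rest" ]; do
        item=${rest%%,*}               # first comma-separated item
        case $rest in
            *,*) rest=${rest#*,} ;;    # drop that item and its comma
            *)   rest= ;;              # last item: nothing left to process
        esac
        item=${item%-*}                # drop the trailing "-4"
        item=${item%-*}                # drop the "-1.2.3"
        out=${out:+$out,}$item         # join with commas, no trailing comma
    done
    printf '%s\n' "$out"
}

strip_versions "abc-de-aa-zzzz-1.2.3-4,abc-de-aa-kkkk-1.2.5-4"
# abc-de-aa-zzzz,abc-de-aa-kkkk
```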

One way is to delete only the part that is not needed. In this case the deletion pattern is a - followed by three groups of digits delimited by . and then a - and a final sequence of digits.

$ echo "abc-de-aa-zzzz-1.2.3-4,abc-de-aa-kkkk-1.2.5-4" | sed -E 's/-([0-9]+\.){2}[0-9]+-[0-9]+//g'
abc-de-aa-zzzz,abc-de-aa-kkkk
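
Because of the g flag, the substitution is applied to every match on the line, so it should not matter how many values the string holds. A small sketch with a third, made-up value (abc-de-aa-mmmm-10.20.30-40 is not from the question, and strip_suffix is a hypothetical name):

```shell
# strip_suffix (hypothetical helper): delete every "-N.N.N-N" version suffix
# from the lines read on standard input.
strip_suffix() {
    sed -E 's/-([0-9]+\.){2}[0-9]+-[0-9]+//g'
}

echo "abc-de-aa-zzzz-1.2.3-4,abc-de-aa-kkkk-1.2.5-4,abc-de-aa-mmmm-10.20.30-40" | strip_suffix
# abc-de-aa-zzzz,abc-de-aa-kkkk,abc-de-aa-mmmm
```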


Alternative solutions that extract what is required:

Using grep with PCRE:

$ echo "abc-de-aa-zzzz-1.2.3-4,abc-de-aa-kkkk-1.2.5-4" | grep -oP '(^|,)\K([^-]+\-){3}[^-]+'
abc-de-aa-zzzz
abc-de-aa-kkkk

Using GNU sed (the g flag splits on every comma, not just the first):

$ echo "abc-de-aa-zzzz-1.2.3-4,abc-de-aa-kkkk-1.2.5-4" | sed 's/,/\n/g' | sed -E 's/^(([^-]+\-){3}[^-]+).*/\1/'
abc-de-aa-zzzz
abc-de-aa-kkkk
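
The two sed steps can probably be collapsed into a single call that keeps the commas, which is the first output format the question asks for. The sketch below is one way to do that, not necessarily the most robust one: it assumes that the wanted part is always the first four "-"-separated fields of each item, and first_four is a hypothetical name.

```shell
# first_four (hypothetical helper): within each comma-separated item, keep
# the first four "-"-separated fields and drop the rest of the item.
first_four() {
    # ([^,-]+-){3}[^,-]+  four fields that contain neither "," nor "-"
    # [^,]*               the remainder of the item, up to the next comma
    sed -E 's/(([^,-]+-){3}[^,-]+)[^,]*/\1/g'
}

echo "abc-de-aa-zzzz-1.2.3-4,abc-de-aa-kkkk-1.2.5-4" | first_four
# abc-de-aa-zzzz,abc-de-aa-kkkk
```

The commas are never part of a match, so they pass through unchanged and no re-joining step is needed.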


In case you need to combine the output into a single line delimited by a comma:

$ echo "abc-de-aa-zzzz-1.2.3-4,abc-de-aa-kkkk-1.2.5-4" | grep -oP '(^|,)\K([^-]+\-){3}[^-]+' | paste -s -d,
abc-de-aa-zzzz,abc-de-aa-kkkk

With awk:

awk -F, '{for(i=1;i<=NF;i++){sub(/-[0-9].*/,"",$i);print $i}}'

Sample:

echo "abc-de-aa-zzzz-1.2.3-4,abc-de-aa-kkkk-1.2.5-4" | awk -F, '{for(i=1;i<=NF;i++){sub(/-[0-9].*/,"",$i);print $i}}'
abc-de-aa-zzzz
abc-de-aa-kkkk
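
A possible variation for comma-separated output: edit each field in place and let awk re-join the record with OFS. This sketch assumes, like the command above, that no name component starts with a digit; trim_fields is a hypothetical name.

```shell
# trim_fields (hypothetical helper): in every comma-separated field, delete
# everything from the first "-<digit>" onward, then print the re-joined line.
trim_fields() {
    awk 'BEGIN { FS = OFS = "," }
         { for (i = 1; i <= NF; i++) sub(/-[0-9].*/, "", $i) } 1'
}

echo "abc-de-aa-zzzz-1.2.3-4,abc-de-aa-kkkk-1.2.5-4" | trim_fields
# abc-de-aa-zzzz,abc-de-aa-kkkk
```

Modifying a field makes awk rebuild the record with OFS, so there is no trailing comma to clean up.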
