
Sed multiple conditions regex matching

I am writing a bash script that takes a text file as input, deletes every line containing a dash ("-") or any digit (anywhere in the line), and writes the result to a new file.

I have tried several approaches without success.

I'm stuck on the correct regex for "delete all lines containing a number OR a dash"; I can't get it to work.

Here's my code:

wget -q awsfile1.csv.zip                      # downloads file
unzip "awsfile1".zip                          # unzips it
cut -d, -f 2 file1.csv > file2.csv            # cuts it
sort file2.csv > file2.txt                    # translates csv into text
printf "Removing lines containing numbers.\n" # prints output
sed 's/[0-9][0-9]*/Number/g'  file2.txt > file2-b.txt  # doesn't do anything, file is empty on the output

Thanks.

You can combine the cut and filter steps into a single awk script and sort afterwards:

... get and unzip file
$ awk -F, '$2!~/[-0-9]/{print $2}' file | sort

This prints field 2 only if it contains no digits or hyphens.
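As a quick check of that filter, here is a minimal sketch run against made-up rows. The sample data and the variable name `result` are assumptions for illustration and do not come from the original file.

```shell
# Hypothetical three-column CSV rows standing in for the unzipped file.
# awk keeps field 2 only when it contains no digit and no hyphen; sort orders what is left.
result=$(printf '%s\n' '1,gamma,x' '2,be-ta,y' '3,alpha7,z' '4,alpha,w' |
  awk -F, '$2 !~ /[-0-9]/ { print $2 }' |
  sort)
printf '%s\n' "$result"
```

With these rows, only `alpha` and `gamma` should remain, since `be-ta` contains a hyphen and `alpha7` contains a digit.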

This might work for you (GNU sed):

sed -E 'h;s/\S+/\n&\n/2;/\n.*[-0-9].*\n/d;x' file

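Copy the current line to the hold space, isolate the 2nd field between newlines, and delete the line if that field contains a digit or a hyphen; otherwise swap the original line back in. Note that \S+ treats fields as whitespace-separated, so this assumes the columns are delimited by spaces or tabs rather than commas.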
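To see the hold-space approach in action, the following sketch feeds it a few made-up whitespace-separated rows. The rows and the variable name `kept` are assumptions for illustration, and GNU sed is assumed for `\S` and `\n`.

```shell
# Hypothetical whitespace-separated rows; GNU sed is assumed for \S and \n.
# h saves the line, s///2 wraps the 2nd field in newlines, d drops the line
# when that field has a digit or hyphen, and x restores the saved original.
kept=$(printf '%s\n' 'a foo x' 'b ba-r y' 'c baz9 z' 'd-1 qux w' |
  sed -E 'h;s/\S+/\n&\n/2;/\n.*[-0-9].*\n/d;x')
printf '%s\n' "$kept"
```

The rows `a foo x` and `d-1 qux w` should survive. The last one is kept even though its first field contains a hyphen and a digit, because only the 2nd field is tested.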

NB: This prints the original line. If you only want the 2nd field, use:

sed -E 's/\S+/\n&\n/2;s/.*\n(.*)\n.*/\1/;/[-0-9]/d' file
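For comparison, the sketch below runs the field-only variant next to a plain whole-line filter, which may be closer to what the question literally describes ("anywhere in the line"). The sample rows and the variable names `rows`, `fields`, and `whole` are assumptions for illustration; the first command assumes GNU sed.

```shell
# Hypothetical whitespace-separated rows, not taken from the original file.
rows=$(printf '%s\n' 'a foo x' 'b ba-r y' 'c baz9 z' 'd-1 qux w')

# Field-only output (GNU sed): reduce each line to its 2nd field,
# then delete the result if it contains a digit or hyphen.
fields=$(printf '%s\n' "$rows" |
  sed -E 's/\S+/\n&\n/2;s/.*\n(.*)\n.*/\1/;/[-0-9]/d')

# Whole-line filter: delete any line with a digit or hyphen anywhere.
whole=$(printf '%s\n' "$rows" | sed '/[-0-9]/d')

printf 'fields:\n%s\nwhole:\n%s\n' "$fields" "$whole"
```

Note that the `s/[0-9][0-9]*/Number/g` command in the question substitutes text rather than deleting lines; deleting requires an address followed by `d`, as in the last command here.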
