Replace strings in file1 with empty space if strings not found in file2
This question: https://unix.stackexchange.com/questions/20322/replace-string-with-contents-of-a-file-using-sed replaces a fixed string in file1 with the contents of file2.
I want to do this the other way around, plus an inversion: remove from file1 every character that does not appear in file2.
If I have file1:
A:B
B:B
C:
D:
E:A
and file2:
D
E
:
then I want to be left with:
:
:
:
D:
E:
If anyone has any pointers that would be great. Bonus points if this can be done on a specific column of file1 while preserving the rest of file1.
i.e. if I have three columns:
A:B A:B A:B
B:B B:B B:B
C: C: C:
D: D: D:
E:A E:A E:A
I would end up with (target column 2):
A:B : A:B
B:B : B:B
C: : C:
D: D: D:
E:A E: E:A
tr makes this trivial:
$ tr -cd "$(cat file2)" < file1
:
:
:
D:
E:
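One caveat with the tr approach: `$(cat file2)` drops trailing newlines, so the newline only ends up in the kept set here because file2 has more than one line. With a single-line file2, every line of file1 would be joined together. Below is a minimal sketch that guards against this, assuming a POSIX shell. The helper name `keep_listed_chars` and the scratch directory are hypothetical and used only for illustration.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'A:B' 'B:B' 'C:' 'D:' 'E:A' > "$workdir/file1"
printf '%s\n' 'D' 'E' ':' > "$workdir/file2"

# keep_listed_chars LIST_FILE (hypothetical helper):
# copy stdin to stdout, deleting every character not listed in LIST_FILE.
# The literal '\n' is appended so that line breaks always survive,
# even when LIST_FILE contains only one line.
keep_listed_chars() {
    tr -cd "$(cat "$1")"'\n'
}

keep_listed_chars "$workdir/file2" < "$workdir/file1"
```

Characters that tr treats specially inside a set, such as a backslash or a `-` between two other characters, would still need escaping in file2.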
$ cat tst.awk
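BEGIN { FS=OFS="\t" }
# First file (file2): remember each allowed character.
NR == FNR {
    goodChars[$1]
    next
}
# Second file (file1): rebuild column 2 from its allowed characters only.
{
    goodStr = ""
    for (i=1; i<=length($2); i++) {
        char = substr($2,i,1)
        if (char in goodChars) {
            goodStr = goodStr char
        }
    }
    $2 = goodStr
    print
}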
$ awk -f tst.awk file2 file1
A:B : A:B
B:B : B:B
C: : C:
D: D: D:
E:A E: E:A
The above assumes your input file is tab-separated, as it appears to be; otherwise just remove the BEGIN section.
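The target column is hard-coded as `$2` in the script above. Here is a sketch of the same idea with the column passed in as an awk variable, assuming tab-separated input. The function name `filter_column`, the variable `col`, and the file name `file3` are hypothetical and not part of the original answer.

```shell
#!/bin/sh
# Recreate the sample inputs in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'D' 'E' ':' > "$workdir/file2"
for row in 'A:B' 'B:B' 'C:' 'D:' 'E:A'; do
    printf '%s\t%s\t%s\n' "$row" "$row" "$row"
done > "$workdir/file3"

# filter_column COL LIST_FILE DATA_FILE (hypothetical helper):
# in column COL of DATA_FILE keep only characters listed in LIST_FILE.
filter_column() {
    awk -v col="$1" '
        BEGIN { FS = OFS = "\t" }
        # First file: one allowed character per line.
        NR == FNR { good[$1]; next }
        {
            kept = ""
            n = length($col)
            for (i = 1; i <= n; i++) {
                c = substr($col, i, 1)
                if (c in good) kept = kept c
            }
            $col = kept      # assigning a field rebuilds the record with OFS
            print
        }
    ' "$2" "$3"
}

filter_column 2 "$workdir/file2" "$workdir/file3"
```

Calling `filter_column 3 ...` instead would filter the third column and leave the first two untouched.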
This might work for you (GNU sed):
sed -z 's/\n//g;s/.*/s#[^&]##g/' file2 | sed -f - file1
Convert file2 into a sed script and run it against file1. This concatenates the characters in file2 and places them in a negated character class inside a global sed substitution command, i.e. the command removes from file1 every character that does not occur in file2.
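To see what the first sed actually generates, the script can be printed before it is piped into the second sed. This is a small sketch assuming GNU sed (for `-z` and `-f -`). The helper name `make_filter_script` is hypothetical.

```shell
#!/bin/sh
# Recreate the sample files in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'A:B' 'B:B' 'C:' 'D:' 'E:A' > "$workdir/file1"
printf '%s\n' 'D' 'E' ':' > "$workdir/file2"

# make_filter_script LIST_FILE (hypothetical helper):
# slurp LIST_FILE as one record (-z), drop its newlines, and wrap the
# remaining characters in a negated bracket expression.
make_filter_script() {
    sed -z 's/\n//g;s/.*/s#[^&]##g/' "$1"
}

# Show the generated script; for the sample file2 it is: s#[^DE:]##g
make_filter_script "$workdir/file2"; echo

# Run the generated script against file1, as in the answer.
make_filter_script "$workdir/file2" | sed -f - "$workdir/file1"
```

Because the characters are pasted into the script verbatim, a `#` or `]` in file2 would likely need special handling.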
To cater for the second problem, add newlines to the negated character class, isolate the second column, make a copy, apply the same code and, using pattern matching, replace the second column with the amended value:
sed -z 's/\n//g;s/.*/s#[^&\\n]##g/' file2 |
sed -Ee 's/\S+/\n&\n/2;h' -f - -e 'H;g;s/\n.*\n(.*)\n.*\n(.*)\n.*/\2\1/' file3
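The "isolate the second column" step can be inspected on its own. This sketch, which assumes GNU sed, wraps the second run of non-whitespace characters in newlines and prints the result with `l` so that tabs and line ends are visible. The helper name `isolate_second_column` is hypothetical.

```shell
#!/bin/sh
# isolate_second_column (hypothetical helper): surround the 2nd
# whitespace-delimited field of each input line with newlines.
isolate_second_column() {
    sed -E 's/\S+/\n&\n/2'
}

# 'sed -n l' prints each line unambiguously: \t for a tab, $ at line end.
printf 'E:A\tE:A\tE:A\n' | isolate_second_column | sed -n l
```

The second field ends up on a line of its own, with the surrounding tabs left attached to the first and third fields. This lets the later substitution swap in the filtered copy of the field without disturbing the other columns.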