Linux Text File Manipulation with sed/awk

I have a list in the following format:

77 Infinite Dust
4 Illusion Dust
12 Dream Shard
29 Star's Sorrow

I need to change this to:

77 <a href="http://www.wowhead.com/?search=Infinite Dust">Infinite Dust</a>
4 <a href="http://www.wowhead.com/?search=Illusion Dust">Illusion Dust</a>
12 <a href="http://www.wowhead.com/?search=Dream Shard">Dream Shard</a>
29 <a href="http://www.wowhead.com/?search=Star's Sorrow">Star's Sorrow</a>

I've managed to get this list into the right format, just missing the numbers, by using:

sed 's|^[0-9]*.|<a href="http://www.wowhead.com/?search=|g' filename | sed 's|$|">|g' | sed 's#<a[ \t][ \t]*href[ \t]*=[ \t]*".*search=\([^"]*\)">#&\1</a>#'

But I can't figure out how to get it to keep the numbers at the start of each line. Any help appreciated, thanks!

You can do this with sed by mapping the parts of the line to groups. In sed, the A and B in (A)--(B) map to \1 and \2, with the added wrinkle that the parentheses need to be escaped, e.g.

sed 's/\([0-9]*\)\ \(.*\)$/\1 -- \2/g' testfile

maps the digits up to the space to group 1 and everything following to group 2. You can then arrange groups 1 and 2 however you like, e.g. by changing the sed replacement to something like

 \1 <a href.....\2">\2</a>
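For comparison, here is a sketch of the same grouping written with extended regular expressions, where the parentheses need no backslashes. It assumes a sed that accepts `-E` (GNU and BSD sed both do), and `split_line` is a hypothetical helper name used only for illustration.

```shell
# split_line is a hypothetical helper: it reads "NUMBER NAME" lines on stdin
# and prints "NUMBER -- NAME", using unescaped extended-regex groups.
split_line() {
    sed -E 's/^([0-9]+) (.*)$/\1 -- \2/'
}

printf '%s\n' '77 Infinite Dust' "29 Star's Sorrow" | split_line
```

Using `[0-9]+` instead of `[0-9]*` means a line must actually start with a digit before it is rewritten.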

If you had told us what you were ultimately trying to do in your last question, we would have told you a much easier way to do so.

As I said in my answer to your last question, you can have sed remember a part of the pattern and refer to that part as \1, \2, etc.

You need to remember the number and the rest of the line separately, so the pattern is \([0-9]*\) \(.*\), which is basically zero or more digits, followed by a space, followed by any number of characters.

So your sed command becomes:

sed -e 's|\([0-9]*\) \(.*\)|\1 <a href="http://www.wowhead.com/?search=\2">\2</a>|'

That command does everything you want in one go.
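A quick way to try the command is to feed it the sample list from the question. This is only a sketch: `linkify` is a hypothetical wrapper name, and the pattern is anchored with `^` and requires at least one digit, which is a slight tightening of the command above rather than part of the original answer.

```shell
# linkify is a hypothetical wrapper around the one-shot substitution.
# Assumes every line is "NUMBER<space>ITEM NAME"; lines that do not start
# with a number followed by a space pass through unchanged.
linkify() {
    sed -e 's|^\([0-9][0-9]*\) \(.*\)$|\1 <a href="http://www.wowhead.com/?search=\2">\2</a>|'
}

linkify <<'EOF'
77 Infinite Dust
4 Illusion Dust
12 Dream Shard
29 Star's Sorrow
EOF
```

Using `|` as the delimiter keeps the slashes in the URL from needing escapes.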

awk '
{
    # Rebuild the item name from fields 2..NF, joined by single spaces
    s = $2
    for (i = 3; i <= NF; i++) s = s " " $i
    # Explicit format string, so a "%" in the data is not treated as a format
    printf "%s <a href=\"http://www.wowhead.com/?search=%s\">%s</a>\n", $1, s, s
} ' file

output

$ ./shell.sh
77 <a href="http://www.wowhead.com/?search=Infinite Dust">Infinite Dust</a>
4 <a href="http://www.wowhead.com/?search=Illusion Dust">Illusion Dust</a>
12 <a href="http://www.wowhead.com/?search=Dream Shard">Dream Shard</a>
29 <a href="http://www.wowhead.com/?search=Star's Sorrow">Star's Sorrow</a>

With awk it would be something like:

{  
   rest = substr($0, length($1)+2, length($0));
   printf("%d <a href=\"http://www.wowhead.com/?search=%s\">%s</a>\n", $1, rest, rest); 
}
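The script body above still needs an awk invocation around it. A minimal sketch, assuming fields are separated by exactly one space and lines have no leading whitespace; `linkify_awk` is a hypothetical name, and `%s` is used for the number instead of `%d` so the first field is printed exactly as it appears in the input.

```shell
# linkify_awk is a hypothetical wrapper; it reads the list on stdin.
# Assumes "NUMBER<single space>ITEM NAME" with no leading whitespace.
linkify_awk() {
    awk '{
        # Everything after the first field and the space that follows it
        rest = substr($0, length($1) + 2)
        printf("%s <a href=\"http://www.wowhead.com/?search=%s\">%s</a>\n", $1, rest, rest)
    }'
}

printf '%s\n' '12 Dream Shard' | linkify_awk
```

Taking `rest` with `substr` rather than rebuilding it from fields keeps any internal spacing in the item name intact.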

In sed, you can use the & character to place the matched pattern in the replacement text. For example:

echo xyz | sed 's/^xyz/abc &/'

would output

abc xyz

So in your example,

sed 's|^[0-9]*.|& <a href ....
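One way to carry the `&` idea through to a full command is to match the item name instead of the number, so that `&` can be used twice in the replacement. This is a variant sketch, not the elided command above: it assumes item names never start with a digit or a space, and `linkify_amp` is a hypothetical name.

```shell
# linkify_amp is a hypothetical helper. The pattern starts matching at the
# first character that is neither a digit nor a space (assumed to be the
# start of the item name) and runs to the end of the line; & is that match.
linkify_amp() {
    sed 's|[^0-9 ].*|<a href="http://www.wowhead.com/?search=&">&</a>|'
}

printf '%s\n' '77 Infinite Dust' | linkify_amp
```

The number and the space after it are never part of the match, so they are left in place without any capture groups.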
