
Error finding matches and creating new lines in a text file using sed

I have this text in a file:

random text 1.- random text 2. random text 3.- random 22 text 4. random text

I want to insert a newline before each number that is followed by a dot. This is my code:

for number in {1..4}
do
var=$(sed -n "s/.*\([^0-9]$number\.[^0-9]\).*/\1/p" file)
echo $var
sed -i "s/$var/\n$var/g" file 
done

This is the result I get:

random text
 1.- random text
 2. random text
 3.- random
 2. text
 4. random text

I do not understand why it creates a new line before the number 22, which is not followed by a dot. The expected result would be this:

random text
1.- random text
2. random text
3.- random 22 text
4. random text

Could someone help me and explain where my mistake is? Thank you very much.

When number=2 you get var=' 2. ' (the capture group includes the character matched by each [^0-9] on either side of 2.).

This gets fed into the last sed command as s/ 2. /\n 2. /g. In the pattern, the unescaped . matches any single character, not just a literal dot, which is why ' 2. ' also matches ' 22 '. To add to the confusion, the . in the replacement is taken literally, so ' 22 ' is replaced with a newline followed by ' 2. '.

Consider:

$ echo '22' | sed 's/2./2./'
2.
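To make the difference visible side by side, here is a small sketch that runs the same input through an unescaped and an escaped pattern. The input string and the variable names are just for illustration.

```shell
# Unescaped: "." matches any character, so "22" matches the pattern "2."
unescaped=$(echo '22 2.' | sed 's/2./X/g')

# Escaped: "\." matches only a literal dot, so "22" is left alone
escaped=$(echo '22 2.' | sed 's/2\./X/g')

echo "$unescaped"   # X X
echo "$escaped"     # 22 X
```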

One quick fix for the current code is to use parameter substitution to add a backslash that escapes the . in $var:

for number in {1..4}
do
    var=$(sed -n "s/.*\([^0-9]$number\.[^0-9]\).*/\1/p" file)
    var="${var/./\\.}"      # replace "." with "\."
    echo $var
    sed -i "s/$var/\n$var/g" file
done

$ cat file
random text
 1.- random text
 2. random text
 3.- random 22 text
 4. random text
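Note that ${var/./\\.} is a bash feature and only escapes the first . in $var. If the captured text might contain other characters that are special in a sed pattern, one option is to escape them all with sed itself. This is only a sketch: escape_bre is a made-up helper name, and it assumes basic regular expressions with / as the delimiter of the later s command.

```shell
# Hypothetical helper: put a backslash in front of every character that is
# special in a basic regular expression, plus the "/" delimiter.
# The bracket expression lists: ] [ \ . * ^ $ /
escape_bre() {
    printf '%s' "$1" | sed 's|[][\.*^$/]|\\&|g'
}

var=' 2. '
safe=$(escape_bre "$var")
printf '[%s]\n' "$safe"   # [ 2\. ]
```

The escaped value is meant for the pattern side of s/$safe/.../ only; the replacement side has its own special characters (& and backslash).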

The leading space is what the first [^0-9] matched in the first sed command. Since it is inside the parentheses, it is part of the capture group and so ends up in the \1 reference. Try moving the left paren one character to the right, e.g.:

for number in {1..4}
do
    # replace this:
    #var=$(sed -n "s/.*\([^0-9]$number\.[^0-9]\).*/\1/p" file)

    # with this:
    var=$(sed -n "s/.*[^0-9]\($number\.[^0-9]\).*/\1/p" file)

    var="${var/./\\.}"
    echo $var
    sed -i "s/[[:space:]]*$var/\n$var/g" file
done

NOTE: I've added [[:space:]]* to match any white space before $var; because the replacement ( \n$var ) does not put that white space back, it is removed from the end of what is now the line before $var.

The results:

$ cat file
random text
1.- random text
2. random text
3.- random 22 text
4. random text
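The effect of the [[:space:]]* prefix is easier to see in isolation. In this sketch the sample line and variable names are made up, and the newline in the replacement is written as backslash plus a real line break, which should also work with sed implementations that do not understand \n in the replacement.

```shell
line='random text 2. more'

# Without the prefix, the space before "2." stays at the end of the first line
without_ws=$(printf '%s\n' "$line" | sed 's/2\. /\
2. /')

# With the prefix, that space is part of the match and is not put back
with_ws=$(printf '%s\n' "$line" | sed 's/[[:space:]]*2\. /\
2. /')

printf '[%s]\n' "$without_ws"
printf '[%s]\n' "$with_ws"
```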

I'd write it using a single sed command:

sed 's/ \([0-9][0-9]*\.\)/\
\1/g' file
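As a quick check, the same command can be run on the sample line from the question through a pipe instead of a file. The variable names here are only for the demonstration.

```shell
input='random text 1.- random text 2. random text 3.- random 22 text 4. random text'

# A space, then one or more digits, then a literal dot: the space is dropped
# and a newline is put in front of the captured "N."
output=$(printf '%s\n' "$input" | sed 's/ \([0-9][0-9]*\.\)/\
\1/g')

printf '%s\n' "$output"
```

Because 22 is followed by a space rather than a dot, it is left alone. Keep in mind the pattern has no way to tell a list marker from a decimal, so something like " 3.14" in the text would be split as well.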

Using sed in a single pass:

$ sed -i.bak 's/[0-9]*\.[^0-9]*/\n&/g' file
$ cat file
random text
1.- random text
2. random text
3.- random 22 text
4. random text
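One thing to watch with this pattern: [0-9]* also matches zero digits, so a bare dot (for example the end of a sentence) gets a newline in front of it too. The sample text has no such dot, so it does not show up there. A sketch of the difference, using a made-up sample line and requiring at least one digit in the variant:

```shell
sample='first item. 2. second item'

# Original pattern: [0-9]* may match nothing, so the "." after "item" matches
loose=$(printf '%s\n' "$sample" | sed 's/[0-9]*\.[^0-9]*/\
&/g')

# Variant: [0-9][0-9]* needs at least one digit before the dot
strict=$(printf '%s\n' "$sample" | sed 's/[0-9][0-9]*\.[^0-9]*/\
&/g')

printf '%s\n---\n%s\n' "$loose" "$strict"
```

Neither version removes the white space before the number, so the preceding line keeps a trailing space.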
