I have a file with 3 lines, shown below. Using Linux, how can I split the variable part of each line and append each piece to the same line?

Question

Using Linux, how can I get the desired output below for the given input?

Input file:

Line1: StringA1, stringB1| stringC1, stringD1, stringE1
Line2: StringA2, stringB2| stringC2, stringD2
Line3: StringA3, stringB3| stringC3, stringD3, stringE3, stringF3

My output should be:

StringA1, stringB1| stringC1
StringA1, stringB1| stringD1
StringA1, stringB1| stringE1
StringA2, stringB2| stringC2
StringA2, stringB2| stringD2
StringA3, stringB3| stringC3
StringA3, stringB3| stringD3
StringA3, stringB3| stringE3
StringA3, stringB3| stringF3

Answer 1

Assumptions:

all lines have at least 3 fields
the lines do not contain the string Line#: (otherwise we just need to modify the proposed script)

Sample data:

$ cat strings.dat
StringA1, stringB1| stringC1, stringD1, stringE1
StringA2, stringB2| stringC2, stringD2
StringA3, stringB3| stringC3, stringD3, stringE3, stringF3

One awk solution:

awk -F"[,|]" '
{ for ( i=3;i<=NF;i++ )
      { printf "%s,%s|%s\n", $1, $2, $i }
}' strings.dat

Where:

-F"[,|]" - use comma and pipe ( ,| ) as input delimiters
for ( i=3;i<=NF;i++ ) - loop over fields 3 through the last field (NF == number of fields == index of the last field)
{ printf ... } - print the 1st, 2nd and ith fields

Results of running the above:

StringA1, stringB1| stringC1
StringA1, stringB1| stringD1
StringA1, stringB1| stringE1
StringA2, stringB2| stringC2
StringA2, stringB2| stringD2
StringA3, stringB3| stringC3
StringA3, stringB3| stringD3
StringA3, stringB3| stringE3
StringA3, stringB3| stringF3
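
The same idea can also be written by splitting on the pipe first and then splitting only the right-hand side on commas, so the prefix is printed exactly as it appears in the input. This is a minimal sketch, not part of the original answer: it assumes each line contains exactly one pipe and has no Line#: prefix, and the function name expand_by_pipe is made up for illustration.

```shell
# Hypothetical helper: read lines from stdin and emit one output line per
# comma-separated item found after the pipe, each prefixed with the text
# that precedes the pipe.
expand_by_pipe() {
    awk -F'|' '{
        n = split($2, items, ",")        # split only the part after the pipe
        for (i = 1; i <= n; i++)
            printf "%s|%s\n", $1, items[i]
    }'
}

# Example call on one line of the sample data
printf '%s\n' 'StringA1, stringB1| stringC1, stringD1, stringE1' | expand_by_pipe
```

Because the prefix is never split, this variant keeps working if the part before the pipe has more or fewer than two comma-separated fields.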

Answer 2

A solution in sed is possible, but it becomes hard to read and hard to maintain:

sed -E 's/,/\v/; :a; s/(.*\|)(.*),(.*)$/\1\2\r\1\3/;ta; s/\v/,/g;s/\r/\n/g' inputfile

Explanation:
s/,/\v/ Protect the first , (the one in the prefix, which must be kept) by temporarily replacing it with a vertical tab \v . The remaining commas are the ones to split on.
:a Label marking the start of the loop; ta jumps back here for as long as the substitution before it finds a match.
(.*\|)(.*),(.*)$ Match 3 substrings: the starter (everything up to and including the | ), the middle part until the last , and the end part.
\r Use the carriage return (the Windows CR) as a marker where we want a newline when finished.
\1 The first remembered string (in the example, StringA1, stringB1| ).
/\1\2\r\1\3/ Replace the last , with a newline marker followed by the starter.
ta; Repeat until all replacements are done.
s/\v/,/g; Restore the , characters.
s/\r/\n/g Replace each newline marker with a real newline.

Other options are awk or a while loop. For a large file I recommend awk; perhaps you want to try this yourself before someone posts an answer.
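
For reference, the while loop alternative mentioned above could look roughly like this. It is only a sketch under stated assumptions: every line contains exactly one pipe, there are no blank lines and no Line#: prefix, and the function name expand_with_loop is invented here. It uses only POSIX shell parameter expansion, so the variables it sets are global.

```shell
# Hypothetical helper: read lines from stdin and expand them using only
# shell built-ins (no awk or sed).
expand_with_loop() {
    while IFS= read -r line; do
        prefix=${line%%|*}      # everything before the first pipe
        rest=${line#*|},        # everything after it, plus a trailing comma as sentinel
        while [ -n "$rest" ]; do
            printf '%s|%s\n' "$prefix" "${rest%%,*}"   # emit the next item
            rest=${rest#*,}                            # drop that item from the list
        done
    done
}

# Example call on one line of the sample data
printf '%s\n' 'StringA2, stringB2| stringC2, stringD2' | expand_with_loop
```

A shell loop like this processes the input line by line in the shell itself, which is why awk is the better choice for large files.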

Answer 3

To produce your desired output when splitting on [,|] , you must also remove the leading Line#: prefix from field 1 before printing the results. I see two ways to do that. The first simply splits field 1 into an array using ' ' as the separator; the second uses a combination of substr, match & length . The first, using the split() command, is the simpler one, e.g.

awk -F '[,|]' '{
    split ($1, arr, / /)
    for (i=3; i<=NF; i++) {
        printf "%s,%s|%s\n", arr[2], $2, $i
    }
}' file

For the second, you can remove split() above and replace arr[2] with:

substr($1,match($1,/ /)+1,length($1)-match($1,/ /))
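
Written out in full, the substr variant might look like the sketch below. The wrapper function strip_and_expand and the variable names sp and name are illustrative only. One detail worth noting: match() returns 0 when field 1 contains no space, so the same expression then yields the whole field, which means this form also copes with input that has no Line#: prefix.

```shell
# Hypothetical wrapper: read lines from stdin, drop everything in field 1 up
# to and including its first space, then print one line per remaining field.
strip_and_expand() {
    awk -F '[,|]' '{
        sp = match($1, / /)                          # position of first space, 0 if none
        name = substr($1, sp + 1, length($1) - sp)   # text after that space
        for (i = 3; i <= NF; i++)
            printf "%s,%s|%s\n", name, $2, $i
    }'
}

# Example call on one line of the original input
printf '%s\n' 'Line2: StringA2, stringB2| stringC2, stringD2' | strip_and_expand
```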

If your data file might not include "Line[0-9]: " as the prefix for each line, you can use the following as your printf to handle either case:

printf "%s,%s|%s\n", arr[2]=="" ? arr[1] : arr[2], $2, $i

The results are the same either way, but using split() would be the recommended way.

Example Use/Output

Using the proposed awk solution with your data file (named file here; adjust as needed), you can just select-copy/middle-mouse-paste the command into an xterm with the file in the current directory to obtain the results, e.g.

$ awk -F '[,|]' '{
>     split ($1, arr, / /)
>     for (i=3; i<=NF; i++) {
>         printf "%s,%s|%s\n", arr[2], $2, $i
>     }
> }' file
StringA1, stringB1| stringC1
StringA1, stringB1| stringD1
StringA1, stringB1| stringE1
StringA2, stringB2| stringC2
StringA2, stringB2| stringD2
StringA3, stringB3| stringC3
StringA3, stringB3| stringD3
StringA3, stringB3| stringE3
StringA3, stringB3| stringF3

Look things over and let me know if you have further questions.

Answer 1
2 2020-02-07 21:44:37

Answer 2
1 2020-02-07 15:33:35

Answer 3
0 2020-02-08 00:09:10
