I have a file with 3 lines as follows. Using Linux, how can I get the split variables of a line and append them to the same line?

Using Linux, how can I get the desired output below for the given input? Input file:

Line1: StringA1, stringB1| stringC1, stringD1, stringE1
Line2: StringA2, stringB2| stringC2, stringD2
Line3: StringA3, stringB3| stringC3, stringD3, stringE3, stringF3

My output should be:

StringA1, stringB1| stringC1
StringA1, stringB1| stringD1
StringA1, stringB1| stringE1
StringA2, stringB2| stringC2
StringA2, stringB2| stringD2
StringA3, stringB3| stringC3
StringA3, stringB3| stringD3
StringA3, stringB3| stringE3
StringA3, stringB3| stringF3

Assumptions:

  • all lines have at least 3 fields
  • the lines do not contain the string Line#: (otherwise we just need to modify the proposed script)

Sample data:

$ cat strings.dat
StringA1, stringB1| stringC1, stringD1, stringE1
StringA2, stringB2| stringC2, stringD2
StringA3, stringB3| stringC3, stringD3, stringE3, stringF3

One awk solution:

awk -F"[,|]" '
{ for ( i=3;i<=NF;i++ )
      { printf "%s,%s|%s\n", $1, $2, $i }
}' strings.dat

Where:

  • -F"[,|]" - use comma and pipe ( ,| ) as input delimiters
  • for ( i=3;i<=NF;i++ ) - loop over fields 3 through the end of the line (NF == number of fields == index of the last field)
  • { printf ... } - print the 1st, 2nd and ith fields

Results of running the above:

StringA1, stringB1| stringC1
StringA1, stringB1| stringD1
StringA1, stringB1| stringE1
StringA2, stringB2| stringC2
StringA2, stringB2| stringD2
StringA3, stringB3| stringC3
StringA3, stringB3| stringD3
StringA3, stringB3| stringE3
StringA3, stringB3| stringF3
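
If the input does carry the Line#: prefix shown in the question, one possible adjustment is to strip that prefix before the loop. This is only a sketch, not part of the original answer: the function name expand_fields and the prefix pattern ^Line[0-9]+: are assumptions made for illustration. Calling sub() without a target edits $0, which makes awk re-split the record into fields.

```shell
# Sketch: drop an optional "Line<N>: " prefix, then emit one line per trailing field.
# expand_fields is a hypothetical helper name; it reads its input from stdin.
expand_fields() {
    awk -F'[,|]' '
    {
        sub(/^Line[0-9]+: /, "")      # assumed prefix format; editing $0 re-splits the fields
        for (i = 3; i <= NF; i++)     # fields 3..NF are the items after the pipe
            printf "%s,%s|%s\n", $1, $2, $i
    }'
}

# Example call with one line from the question
printf '%s\n' 'Line1: StringA1, stringB1| stringC1, stringD1, stringE1' | expand_fields
```

Lines without the prefix pass through sub() unchanged, so the same function should work on strings.dat as well.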

When you write a solution in sed, it becomes hard to read and hard to maintain:

sed -E 's/,/\v/; :a; s/(.*\|)(.*),(.*)$/\1\2\r\1\3/;ta; s/\v/,/g;s/\r/\n/g' inputfile

Explanation:
s/,/\v/ Temporarily replace the first , (the one in the starter, which must be kept) with a vertical tab so the loop does not split on it.
:a Label marking the start of the loop; ta branches back here as long as a replacement was made.
(.*\|)(.*),(.*)$ Match 3 substrings: the starter (up to and including the | ), the middle part until the last , and the end part.
\r Use a carriage return (CR) as a marker where we want a newline when finished.
\1 The first remembered string (in the example, StringA1, stringB1| ).
/\1\2\r\1\3/ Replace the last , with the newline marker followed by the starter.
ta; Repeat until all replacements are done.
s/\v/,/g; Restore the , characters.
s/\r/\n/g Replace each newline marker with a real newline.

Other options are awk or a while loop. For a large file I recommend awk; perhaps you want to try this yourself before someone posts an answer.
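
The while-loop route mentioned above could look roughly like the following. It is a sketch under stated assumptions: the input has no Line#: prefix, each line contains exactly one | , and the shell is POSIX-compatible. The name split_lines is hypothetical.

```shell
# Sketch: pure-shell loop, one output line per comma-separated item after the "|".
# split_lines is a hypothetical helper name; it reads its input from stdin.
split_lines() {
    while IFS= read -r line; do
        prefix=${line%%"|"*}      # everything before the first "|"
        rest=${line#*"|"},        # everything after it, plus a trailing "," as terminator
        while [ -n "$rest" ]; do
            printf '%s|%s\n' "$prefix" "${rest%%,*}"   # emit the next item
            rest=${rest#*,}                            # drop that item and its comma
        done
    done
}

# Example call with one line from the sample data
printf '%s\n' 'StringA2, stringB2| stringC2, stringD2' | split_lines
```

The shell reads and expands one line at a time, which is typically much slower than awk on large files, so this is mainly useful for small inputs.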

In order to produce your desired output, if you are splitting on [,|] , you must also remove the beginning of field 1 (the Line#: prefix) before outputting the results. I see two ways to do that. The first simply splits field 1 into an array using ' ' as the field separator; the second uses a combination of substr, match & length . The first is the simple way of doing it, using the split() function, e.g.

awk -F '[,|]' '{
    split ($1, arr, / /)
    for (i=3; i<=NF; i++) {
        printf "%s,%s|%s\n", arr[2], $2, $i
    }
}' file

For the second, you can remove split() above and replace arr[2] with:

substr($1,match($1,/ /)+1,length($1)-match($1,/ /))

If your data file may not include "Line[0-9]: " as the prefix for each line, you can use the following as your printf to handle either case:

printf "%s,%s|%s\n", arr[2]=="" ? arr[1] : arr[2], $2, $i

The results are the same either way, but using split() is the recommended way.
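
For reference, the substr/match variant written out in full might look like this. It is a sketch only, and strip_prefix_fields is a hypothetical name. When field 1 contains no space, match() returns 0 and the expression yields the whole of field 1, so this form should also cope with input that lacks the prefix.

```shell
# Sketch: the substr/match alternative to split(), wrapped in a function.
# strip_prefix_fields is a hypothetical helper name; it reads its input from stdin.
strip_prefix_fields() {
    awk -F'[,|]' '{
        p = match($1, / /)                          # position of the first space, 0 if none
        first = substr($1, p + 1, length($1) - p)   # text after that space (all of $1 when p is 0)
        for (i = 3; i <= NF; i++)
            printf "%s,%s|%s\n", first, $2, $i
    }'
}

# Example call with one line from the question
printf '%s\n' 'Line3: StringA3, stringB3| stringC3, stringD3' | strip_prefix_fields
```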

Example Use/Output

Using the proposed awk solution with your data file (named file here; adjust as needed), you can just select-copy/middle-mouse-paste the command into an xterm with the file in the current directory to obtain the results, e.g.

$ awk -F '[,|]' '{
>     split ($1, arr, / /)
>     for (i=3; i<=NF; i++) {
>         printf "%s,%s|%s\n", arr[2], $2, $i
>     }
> }' file
StringA1, stringB1| stringC1
StringA1, stringB1| stringD1
StringA1, stringB1| stringE1
StringA2, stringB2| stringC2
StringA2, stringB2| stringD2
StringA3, stringB3| stringC3
StringA3, stringB3| stringD3
StringA3, stringB3| stringE3
StringA3, stringB3| stringF3

Look things over and let me know if you have further questions.
