
Reformat a long file using awk/sed

I have a very long file. Its contents look like this:

myserver1
kernel_version
os

myserver2
kernel_version
os

myserver3
kernel_version
os
...

There are more than 10,000 entries, with 3 lines for each host: hostname, kernel version and OS version.

I would like to have output like this instead:

myserver1, kernel_version, os
myserver2, kernel_version, os
myserver3, kernel_version, os
...

What is the best awk/sed command to produce this output?

With sed:

$ sed '/^$/d;N;N;s/\n/, /g' infile
myserver1, kernel_version, os
myserver2, kernel_version, os
myserver3, kernel_version, os

This works as follows:

/^$/d       # Delete line if empty (skips rest of commands)
N           # Append second line to pattern space
N           # Append third line to pattern space
s/\n/, /g   # Replace newlines by comma and a blank
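
The same steps can be sketched in Python, which may help when checking the logic on a small sample. This is only an illustration of the idea, not a replacement for the sed command; the function name `join_groups_like_sed` and the sample lines are assumptions made for the example.

```python
def join_groups_like_sed(lines):
    """Group lines the way the sed script above does (illustrative sketch)."""
    it = iter(lines)
    out = []
    for line in it:
        line = line.rstrip("\n")
        if line == "":            # /^$/d : drop an empty line and start over
            continue
        group = [line]
        for _ in range(2):        # N;N : pull in the next two lines
            nxt = next(it, None)
            if nxt is None:       # input ended early: this sketch keeps the partial group
                break
            group.append(nxt.rstrip("\n"))
        out.append(", ".join(group))  # s/\n/, /g : join with comma and blank
    return out


sample = ["myserver1", "kernel_version", "os", "",
          "myserver2", "kernel_version", "os", ""]
print("\n".join(join_groups_like_sed(sample)))
```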

If you want the criterion for skipping a line to be its line number (4, 8, 12, ...) rather than "empty line", you can replace the first command (this is a GNU extension):

sed '4~4d;N;N;s/\n/, /g' infile
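
To make the difference concrete, here is a rough Python sketch of the line-number variant: every fourth line is dropped no matter what it contains. The function name `join_skipping_every_fourth` is hypothetical and chosen only for this example.

```python
def join_skipping_every_fourth(lines):
    """Drop lines 4, 8, 12, ... and join each remaining group of three."""
    out, group = [], []
    for lineno, line in enumerate(lines, start=1):
        if lineno % 4 == 0:       # same lines that the address 4~4 selects
            continue
        group.append(line.rstrip("\n"))
        if len(group) == 3:
            out.append(", ".join(group))
            group = []
    return out


# The separator lines do not have to be empty for this variant to work.
print(join_skipping_every_fourth(
    ["myserver1", "kernel_version", "os", "----",
     "myserver2", "kernel_version", "os", "----"]))
```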

You can also use paste (the output is separated by commas only, without a following space):

paste -d ',,\0' - - - - <file
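
What the paste command does here is read the input four lines at a time and glue them together with the delimiters ",", "," and an empty string, in that order. A minimal Python sketch of that grouping follows; the name `paste_four` and its `delims` parameter are assumptions for illustration.

```python
from itertools import zip_longest


def paste_four(lines, delims=(",", ",", "")):
    """Join every four input lines into one row, one delimiter per gap."""
    it = iter(line.rstrip("\n") for line in lines)
    rows = []
    # Passing the same iterator four times yields consecutive groups of four.
    for chunk in zip_longest(it, it, it, it, fillvalue=""):
        row = chunk[0]
        for delim, cell in zip(delims, chunk[1:]):
            row += delim + cell
        rows.append(row)
    return rows


print(paste_four(["myserver1", "kernel_version", "os", "",
                  "myserver2", "kernel_version", "os", ""]))
```

Because the fourth line of each group is empty and its delimiter is the empty string, nothing is appended after the OS field.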

You can use:

awk 'BEGIN{RS="";OFS=", "} {print $1,$2,$3}' data.txt

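This defines the record separator as an empty line (paragraph mode) and the output field separator (OFS) as ", ". Note that fields are still split on any whitespace, so if a kernel or OS line can contain spaces, also set FS="\n".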
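
As a rough model of what awk does with these settings, the Python sketch below splits the text into blank-line-separated records, splits each record on whitespace, and joins the first three fields with ", ". The function name `paragraphs_to_rows` and its `ofs` parameter are assumptions for illustration, and the sketch only approximates awk (for example, it does not pad missing fields).

```python
import re


def paragraphs_to_rows(text, ofs=", "):
    """Approximate RS="" with default field splitting and a custom OFS."""
    rows = []
    # Records are separated by one or more blank lines.
    for record in re.split(r"\n\n+", text.strip("\n")):
        fields = record.split()   # any run of whitespace, including newlines
        if not fields:
            continue
        rows.append(ofs.join(fields[:3]))  # like: print $1,$2,$3
    return rows


text = "myserver1\nkernel_version\nos\n\nmyserver2\nkernel_version\nos\n"
print("\n".join(paragraphs_to_rows(text)))
```

Since splitting is on whitespace, a line that itself contains a space would count as two fields in this model.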

You can also use:

awk 'BEGIN{RS="";OFS=", "} {$1=$1; print $0}' data.txt

$1=$1 forces the record to be rebuilt, so that $0 is reassembled from its fields using OFS.

While awk/sed can help you perform this task, a better way would be to use Python to process this data, assuming it is installed on the *NIX system you are working on.

You could use the following in Python to process this quite easily:

import csv

column_num = 3  # number of columns in your end-state data

# newline="" lets the csv module control line endings itself
with open("</path/to/your/input/file>", "r") as infile, \
        open("/path/to/output/file", "w", newline="") as outfile:
    output_writer = csv.writer(outfile)
    row = []
    for line in infile:
        stripped = line.strip()  # remove the newline (\n) and surrounding whitespace
        if not stripped:
            continue  # skip the empty separator lines
        row.append(stripped)
        if len(row) == column_num:
            output_writer.writerow(row)  # output the list as a CSV row
            row = []  # clear the row list for the next host
    if row:
        output_writer.writerow(row)  # write a trailing incomplete group, if any
