
Using awk to append to the row above the current

I have a text file where some text from the first field rolls over onto the next row.

Example

Company Name LLC
Company Name2
LLC
Very Good company name but rolls
over

I am able to get the rows that have rolled over with

awk '{ if (NF ==1) print $0}'

I am looking for a way to append that text onto the previous row (NR-1).

Correct output

Company Name LLC
Company Name2 LLC
Very Good company name but rolls over

awk -v ORS= '
    NR>1 { print (NF>1 ? "\n" : OFS) }
    1;
    END{ print "\n" }
' input_file
  • set ORS to the empty string so print doesn't emit implicit newlines
  • on every line except the first, print the appropriate delimiter (OFS on overflow, otherwise a newline)
  • print the actual line
  • at the end, print a newline
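To make the steps above visible, here is a small sketch that runs the same idea with OFS set to " + ", so each join shows up in the output. The function name show_joins and the " + " delimiter are illustrative assumptions, not part of the answer. The ternary is parenthesized because an unparenthesized > inside print is treated as output redirection.

```shell
# Hypothetical helper: same separator-before-line technique, with a visible OFS.
show_joins() {
    awk -v ORS= -v OFS=' + ' '
        NR>1 { print (NF>1 ? "\n" : OFS) }   # delimiter that precedes this line
        1                                    # the line itself, with no newline added
        END  { print "\n" }                  # terminate the last output row
    ' "$@"
}

printf '%s\n' 'Company Name LLC' 'Company Name2' 'LLC' | show_joins
```

On this three-line input the second row should come out as "Company Name2 + LLC", which shows that the overflow line was attached with OFS rather than a newline.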

With your shown samples and attempts, please try the following tac + awk code.

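tac Input_file |
awk '
NF==1{                                  # overflow line: stash it for the row above
  val = $0 (val != "" ? OFS val : "")
  next
}
NF>1{                                   # full row: append any stashed overflow
  print $0 (val != "" ? OFS val : "")   # no trailing OFS when nothing is stashed
  val = ""
}' |
tac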

Preliminarily, you are not making use of awk's pattern-action syntax and defaults; awk NF==1 has the same effect as the command you posted.
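As a quick check of that equivalence, the sketch below feeds the same lines to both forms. The sample function is a hypothetical helper that just prints three lines from the question's example.

```shell
# Hypothetical helper that emits part of the sample input from the question.
sample() { printf '%s\n' 'Company Name LLC' 'Company Name2' 'LLC'; }

sample | awk '{ if (NF == 1) print $0 }'   # explicit form from the question
sample | awk 'NF==1'                       # pattern only; the default action prints the record
```

Both commands select only the single-field line, because a pattern with no action defaults to printing the whole record.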

But for your question, in awk you need to buffer the previous line and then decide how to use it:

awk 'NF==1{print p,$0; p=""; next} length(p){print p} {p=$0} END{if (length(p)) print p}'

Or, less efficiently but more simply, you can do

tac | awk 'NF==1{getline t; print t,$0; next} 1' | tac
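The buffering idea in this answer can also be spelled out in a longer form. The following is only a sketch under stated assumptions: the function name join_overflow and the variables buf and have are made up for illustration, and, unlike the one-liner, it glues several consecutive single-field lines onto the same row.

```shell
# Hypothetical helper: buffer one row and flush it when the next full row starts.
join_overflow() {
    awk '
        NF == 1 && have {            # overflow line: glue it onto the buffered row
            buf = buf OFS $0
            next
        }
        have { print buf }           # a new row starts: flush the previous one
        { buf = $0; have = 1 }       # buffer the current row
        END { if (have) print buf }  # flush the last buffered row, if any
    ' "$@"
}

printf '%s\n' 'Company Name LLC' 'Company Name2' 'LLC' \
    'Very Good company name but rolls' 'over' | join_overflow
```

Keeping an explicit have flag, rather than testing length(buf), is a design choice that lets a buffered empty line pass through unchanged and avoids printing a stray empty row at the end.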

I would use GNU AWK for this task in the following way. Let file.txt content be

Company Name LLC
Company Name2
LLC
Very Good company name but rolls
over

then

awk 'BEGIN{RS=ORS=""}{print gensub(/\n([^[:space:]]+)(\n|$)/, " \\1\\2", "g")}' file.txt

gives output

Company Name LLC
Company Name2 LLC
Very Good company name but rolls over

Explanation: I set RS to the empty string so GNU AWK treats the text between blank lines as records; in this case the whole content is a single record. Then I use the gensub function to find runs of non-whitespace characters that follow a newline and occupy a whole line. The newline before such a run is replaced with a space. The 1st capturing group holds the non-whitespace characters. The 2nd capturing group is an alternation, because such a run may be terminated by a newline or by the end of the file; whatever it matched is put back in the replacement. Disclaimer: this solution assumes an empty (blank) line is never present in your file.

(tested in gawk 4.2.1)

awk '
    {NF>1 ? a[NR]=$0 : a[NR-1]=a[NR-1] FS $1}
    END{for(i=1; i<=NR; i++) if(i in a) print a[i]}
' file

Company Name LLC
Company Name2 LLC
Very Good company name but rolls over
  • NF>1 tests whether the line has more than one field
    • true: a[NR]=$0 adds the line to the array under its line number
    • false: a[NR-1]=a[NR-1] FS $1 appends the single field to the previous element (NR-1)
  • for(i=1; i<=NR; i++) if(i in a) print a[i] prints the stored elements in input order; a plain for(i in a) visits keys in an unspecified order
$ awk 'NR>1{printf "%s%s", prev, (NF==1 ? OFS : ORS)} {prev=$0} END{print prev}' file
Company Name LLC
Company Name2 LLC
Very Good company name but rolls over
