
Append and replace using awk/sed

I have this file:

2016,05,P,0002    ,CJGLOPSD8                                                    
00,BBF,BBDFTP999,051000100,GBP,   ,       -2705248.00                           
00,BBF,BBDFTP999,059999998,GBP,   ,       -3479679.38                           
00,BBF,BBDFTP999,061505141,GBP,   ,             -0.40                           
00,BBF,BBDFTP999,061505142,GBP,   ,        6207621.00                           
00,BBF,BBDFTP999,061505405,GBP,   ,             -0.16                           
00,BBF,BBDFTP999,061552000,GBP,   ,             -0.24                           
00,BBF,BBDFTP999,061559010,GBP,   ,             -0.44                           
00,BBF,BBDFTP999,062108021,GBP,   ,             -0.34                           
00,BBF,BBDFTP999,063502007,GBP,   ,             -0.28  

I want to programmatically (in Unix, or Informatica if possible) grab the first two fields in the top row, concatenate them, append the result to the end of each line, and remove that first row.

Like so:

00,BBF,BBDFTP999,051000100,GBP,,-2705248.00,201605                          
00,BBF,BBDFTP999,059999998,GBP,,-3479679.38,201605                           
00,BBF,BBDFTP999,061505141,GBP,,-0.40,201605                           
00,BBF,BBDFTP999,061505142,GBP,,6207621.00,201605                           
00,BBF,BBDFTP999,061505405,GBP,,-0.16,201605                           
00,BBF,BBDFTP999,061552000,GBP,,-0.24,201605                          
00,BBF,BBDFTP999,061559010,GBP,,-0.44,201605                         
00,BBF,BBDFTP999,062108021,GBP,,-0.34,201605                        
00,BBF,BBDFTP999,063502007,GBP,,-0.28,201605

This is my current attempt:

awk -vvar1=`cat OF\ OPSDOWN8.CSV | head -1 | cut -d',' -f1` -vvar2=`cat OF\ OPSDOWN8.CSV | head -1 | cut -d',' -f2` 'BEGIN {FS=OFS=","} {print $0, var 1var2}' OF\ OPSDOWN8.CSV> OF_OPSDOWN8.csv

Any pointers? I've tried looking around the forum but can only find answers to part of my question.

Thanks for your help.

Use this awk:

awk 'BEGIN{FS=OFS=","} NR==1{val=$1$2;next} {gsub(/ */,"");print $0,val}' file

Explanation:

  • BEGIN{FS=OFS=","} - Sets both FS (the input field separator) and OFS (the output field separator) to a comma.
  • NR==1{val=$1$2;next} - Applies to line 1 only. $1 and $2 are the first and second fields; their concatenation is stored in val, and next skips to the following line so the header itself is never printed.
  • gsub(/ */,"") - Removes all spaces from the current line.
  • print $0,val - Prints $0 (the whole line) followed by the value stored in val, separated by OFS.
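The steps above can be tried on a small sample. This is only a sketch: the temporary file and the names tmp and result are assumptions made for illustration, and the script uses / +/ instead of / */, which removes the same spaces without also matching empty strings.

```shell
# Hypothetical sample data written to a temporary file (not the asker's real file).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
2016,05,P,0002    ,CJGLOPSD8
00,BBF,BBDFTP999,051000100,GBP,   ,       -2705248.00
00,BBF,BBDFTP999,061505141,GBP,   ,             -0.40
EOF

result=$(awk '
BEGIN { FS = OFS = "," }        # split and join fields on commas
NR == 1 { val = $1 $2; next }   # header: store "2016" "05" as "201605", print nothing
{
    gsub(/ +/, "")              # strip the space padding from the whole line
    print $0, val               # data line, OFS, then the stored value
}' "$tmp")
rm -f "$tmp"

printf '%s\n' "$result"
# 00,BBF,BBDFTP999,051000100,GBP,,-2705248.00,201605
# 00,BBF,BBDFTP999,061505141,GBP,,-0.40,201605
```

Splitting the one-liner over several lines does not change its behaviour; it only leaves room for the comments.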

I would use the following awk command:

awk 'NR==1{d=$1$2;next}{$(NF+1)=d;gsub(/[[:space:]]/,"")}1' FS=, OFS=, file

Explanation:

  • NR==1{d=$1$2;next} applies to line 1 and sets a variable d(ate) to the concatenation of the first and second fields. The variable is used when processing the remaining lines. next tells awk to move on to the next line right away without processing any further instructions for this line.

  • {$(NF+1)=d;gsub(/[[:space:]]/,"")}1 appends a new field to the line: NF is the number of fields, so assigning d to $(NF+1) effectively adds a field. gsub() is used to remove whitespace. The 1 at the end always evaluates to true and makes awk print the modified line.

  • FS=, is a command-line argument. It sets the input field delimiter to a comma.

  • OFS=, is a command-line argument. It sets the output field delimiter to a comma.

Output:

00,BBF,BBDFTP999,051000100,GBP,,-2705248.00,201605
00,BBF,BBDFTP999,059999998,GBP,,-3479679.38,201605
00,BBF,BBDFTP999,061505141,GBP,,-0.40,201605
00,BBF,BBDFTP999,061505142,GBP,,6207621.00,201605
00,BBF,BBDFTP999,061505405,GBP,,-0.16,201605
00,BBF,BBDFTP999,061552000,GBP,,-0.24,201605
00,BBF,BBDFTP999,061559010,GBP,,-0.44,201605
00,BBF,BBDFTP999,062108021,GBP,,-0.34,201605
00,BBF,BBDFTP999,063502007,GBP,,-0.28,201605
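To see the $(NF+1) assignment and the trailing 1 in isolation, here is a reduced sketch of the awk command above with the gsub() call left out, so the padding survives and only the appended field changes. The two-line input and the names tmp and appended are made up for illustration.

```shell
# Hypothetical two-line input: a header plus one padded data line.
tmp=$(mktemp)
printf '%s\n' '2016,05,P' 'a, b ,c' > "$tmp"

appended=$(awk '
NR == 1 { d = $1 $2; next }   # header line: remember 201605, emit nothing
{ $(NF + 1) = d }             # new last field; awk rebuilds $0 using OFS
1                             # bare true pattern: the default action is print
' FS=, OFS=, "$tmp")
rm -f "$tmp"

printf '%s\n' "$appended"
# a, b ,c,201605
```

Because FS=, and OFS=, appear before the file name, awk applies them before it reads the first line of that file.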

With sed:

sed '1{s/\([^,]*\),\([^,]*\),.*/\1\2/;h;d};/.*/G;s/\n/,/;s/ //g' file

In ERE mode:

sed -r '1{s/([^,]*),([^,]*),.*/\1\2/;h;d};/.*/G;s/\n/,/;s/ //g' file

Output:

00,BBF,BBDFTP999,051000100,GBP,,-2705248.00,201605
00,BBF,BBDFTP999,059999998,GBP,,-3479679.38,201605
00,BBF,BBDFTP999,061505141,GBP,,-0.40,201605
00,BBF,BBDFTP999,061505142,GBP,,6207621.00,201605
00,BBF,BBDFTP999,061505405,GBP,,-0.16,201605
00,BBF,BBDFTP999,061552000,GBP,,-0.24,201605
00,BBF,BBDFTP999,061559010,GBP,,-0.44,201605
00,BBF,BBDFTP999,062108021,GBP,,-0.34,201605
00,BBF,BBDFTP999,063502007,GBP,,-0.28,201605
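For readers less used to the hold space, the BRE one-liner can be written out one command per line. This is a sketch on made-up input; the names tmp and joined are assumptions. The /.*/ address in the one-liner matches every line, so a bare G is used here.

```shell
# Hypothetical sample: a header plus one space-padded data line.
tmp=$(mktemp)
printf '%s\n' '2016,05,P,0002    ,X' '00,BBF,GBP,   ,   -0.40' > "$tmp"

joined=$(sed '
1 {
    # Reduce the header to its first two fields glued together.
    s/\([^,]*\),\([^,]*\),.*/\1\2/
    # Copy it to the hold space, then drop the line without printing it.
    h
    d
}
# Every other line: append a newline plus the hold space contents.
G
# Turn that newline into the separating comma.
s/\n/,/
# Remove the space padding.
s/ //g
' "$tmp")
rm -f "$tmp"

printf '%s\n' "$joined"
# 00,BBF,GBP,,-0.40,201605
```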

This might work for you (GNU sed):

sed '1s/,//;1s/,.*//;1h;1d;s/ //g;G;s/\n/,/' file

For the first line only: remove the first comma, remove everything from the next comma to the end of the line, store the amended line in the hold space (HS), and then delete the current line (the d ends processing of that line immediately). For subsequent lines: remove all spaces, append the HS, and replace the newline (introduced by the G command) with a comma.

Or if you prefer:

sed '1{s/,//;s/,.*//;h;d};s/ //g;G;s/\n/,/' file
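The two substitutions applied to the first line can be checked on their own. A small sketch, with header, step1 and step2 as illustrative variable names:

```shell
# The header line from the question, trimmed of trailing padding.
header='2016,05,P,0002    ,CJGLOPSD8'

# First substitution: delete only the first comma, joining fields 1 and 2.
step1=$(printf '%s\n' "$header" | sed 's/,//')

# Second substitution: delete from the next comma to the end of the line.
step2=$(printf '%s\n' "$step1" | sed 's/,.*//')

printf '%s\n' "$step1"   # 201605,P,0002    ,CJGLOPSD8
printf '%s\n' "$step2"   # 201605
```

Without the g flag, s/,// replaces only the first match, which is why the remaining commas survive the first step.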

If you want to use Informatica for this, use two Source Qualifiers. Read the file twice: in one SQ read just the first line (filter out the rest), and in the second SQ read the whole file except the first line (skip header). Join the two on a dummy port and you're done.
