
Combine CSV files and concatenate some columns into a single column via AIX awk, sed, ksh

I have multiple files like this:

File_1.csv:
"Job Id", "Batch Id","Id","Success","Created","Error","Col1","Col2","Col3"
 aaabbb111,xxxyyy999,"false","false","Horrible_Error: Really Bad Error occured: yeah", "Val1", "Val2", "Val3"
 cccddd222,pppqqq888,"","false","Horrible_Error: Anoter Bad Error occured: ouch", "Val1", "Val2", "Val3"

File_2.csv:
"Job Id", "Batch Id","Id","Success","Created","Error","Col1","Col2","Col3","Col4", "Col5"
 aaabbb111,xxxyyy999,"false","false","Horrible_Error: Really Bad Error occured: oops","Val1","Val2","Val3","Val4","Val5"
 cccddd222,pppqqq888,"","false","Horrible_Error: Anoter Bad Error occured: oh-no", "Val1","Val1","Val2","Val3","Val4","Val5"

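The first 6 columns of every file always have the same names. The remaining columns vary in name and number, and I want to capture them as a single column, wrapped in double quotes, square brackets, or braces, or delimited in some other way that marks them as one value.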

I need to be able to combine these files into a single file that looks like this. The header is optional and shown for illustration only:

"File_Name"|"Job Id"|"Batch Id"|"Id"|"Success"|"Created"|"Error"|"Tran_Header"|"Tran_Record" 
File_1.csv|aaabbb111|xxxyyy999|"false"|"false"|"Horrible_Error: Really Bad Error occured: yeah"|["Col1","Col2","Col3"]|["Val1","Val2","Val3"]
File_1.csv|cccddd222|pppqqq888|""|"false"|"Horrible_Error: Anoter Bad Error occured: ouch"|["Col1","Col2","Col3"]|["Val1","Val2","Val3"]
File_2.csv|aaabbb111|xxxyyy999|"false"|"false"|"Horrible_Error: Really Bad Error occured: oops"|["Col1","Col2","Col3","Col4", "Col5"]|["Val1","Val2","Val3","Val4","Val5"]
File_2.csv|cccddd222|pppqqq888|""|"false"|"Horrible_Error: Anoter Bad Error occured: oh-no"|["Col1","Col2","Col3","Col4", "Col5"]|["Val1","Val1","Val2","Val3","Val4","Val5"]

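I tried the following to combine the files, but this code sometimes chokes when replacing the double quotes, and my ETL tool then chokes in turn when parsing the concatenated set of columns (I also can't figure out how to capture the headers into a separate column):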

outdirectory=/some/directory
outfilename=some_file_name.csv
for i in *.csv
do
    filename=$(echo "${i}")

    tail +2 "${i}" | sed -e 's/,/#|#/1' -e 's/,/#|#/1' -e 's/,/#|#/1' -e 's/,/#|#/1' -e 's/,/#|#/1' -e 's/,/#|#/1' -e s/\"//g -e "s/^/#${filename}/" -e s/$/#/ | sed s/#/\"/g >> "${outdirectory}/${outfilename}" 

    mv $i $srcdir/
done

Any help or ideas would be appreciated. I know next to nothing about UNIX shell scripting. Almost forgot: I am on AIX v6.2.

A solution using awk (I used gnu-awk):

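awk 'BEGIN{FS=",";OFS="|"}
{
  if(FNR==1){
    # First line of each file is its header.
    if(NR==1){
      # Print the combined header only once, for the very first file.
      print "\"File_Name\"",$1,$2,$3,$4,$5,$6,"\"Tran_Header\"","\"Tran_Record\"";
    }
    # Blank the six fixed columns; awk rebuilds $0 joined with OFS ("|").
    $1=$2=$3=$4=$5=$6="";
    # Collapse each run of "|" into one comma, then drop the leading comma.
    gsub("[|]+",",",$0);
    gsub("^,","",$0);
    titleCol = $0;
  }else{
    # Data rows in the sample carry five leading values before the variable ones.
    temp = FILENAME OFS $1 OFS $2 OFS $3 OFS $4 OFS $5 OFS "["titleCol"]";
    $1=$2=$3=$4=$5="";
    gsub("[|]+",",",$0);
    gsub("^,","",$0);
    print temp OFS "["$0"]";
  }
}' *.csv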

You get:

"File_Name"|"Job Id"|"Batch Id"|"Id"|"Success"|"Created"|"Error"|"Tran_Header"|"Tran_Record"
File_1.csv|aaabbb111|xxxyyy999|"false"|"false"|"Horrible_Error: Really Bad Error occured: yeah"|["Col1","Col2","Col3"]|["Val1","Val2","Val3"]
File_1.csv|cccddd222|pppqqq888|""|"false"|"Horrible_Error: Anoter Bad Error occured: ouch"|["Col1","Col2","Col3"]|["Val1","Val2","Val3"]
File_2.csv|aaabbb111|xxxyyy999|"false"|"false"|"Horrible_Error: Really Bad Error occured: oops"|["Col1","Col2","Col3","Col4","Col5"]|["Val1","Val2","Val3","Val4","Val5"]
File_2.csv|cccddd222|pppqqq888|""|"false"|"Horrible_Error: Anoter Bad Error occured: oh-no"|["Col1","Col2","Col3","Col4","Col5"]|["Val1","Val1","Val2","Val3","Val4","Val5"]
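
Since the question is about AIX rather than gnu-awk, here is a rough variant of the same idea that sticks to POSIX awk features (user-defined functions, `gsub`, `FNR`/`NR`/`FILENAME`). It is a sketch, not something tested on AIX. The function name `combine_csv` and the variable `nfix` are made up for illustration. It assumes header and data rows line up column for column (the sample rows as posted have one leading value fewer than the header, which this sketch does not try to correct), and, like the script above, it splits on every comma, so a comma inside a quoted value will still break it. It builds the bracketed lists field by field instead of rewriting `$0`, and it trims blanks around each field.

```shell
# combine_csv is a hypothetical helper name, not part of the original answer.
# Usage: combine_csv file1.csv file2.csv ... > combined.txt
combine_csv() {
  # nfix = number of fixed leading columns (an assumption; adjust as needed)
  awk -v nfix=6 '
    # Strip leading and trailing blanks from one field.
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }

    # Join fields from..NF into a bracketed, comma-separated list.
    function bracket(from,    i, out, sep) {
      for (i = from; i <= NF; i++) {
        out = out sep trim($i)
        sep = ","
      }
      return "[" out "]"
    }

    # Join fields 1..n with the pipe delimiter.
    function lead(n,    i, out, sep) {
      for (i = 1; i <= n; i++) {
        out = out sep trim($i)
        sep = "|"
      }
      return out
    }

    BEGIN { FS = "," }

    FNR == 1 {                     # header line of each input file
      if (NR == 1)                 # print the combined header only once
        print "\"File_Name\"|" lead(nfix) "|\"Tran_Header\"|\"Tran_Record\""
      titles = bracket(nfix + 1)   # remember the extra column names of this file
      next
    }

    # Data line: file name, fixed columns, header list, value list.
    { print FILENAME "|" lead(nfix) "|" titles "|" bracket(nfix + 1) }
  ' "$@"
}
```

It could then be called as `combine_csv *.csv > "${outdirectory}/${outfilename}"`, provided the output directory is not the one being globbed; otherwise the output file would be picked up as an input.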

