简体   繁体   中英

How to combine multiple csv files into a single excel sheet using unix shell scripting?

Eg we have 3 csv files 1.csv , 2.csv , 3.csv . I want the output as All.xls

containing

the 3 csv files 1.csv , 2.csv and 3.csv in their respective tab.

Yes the are multiple ways to do what you want. Perl, Python and Ruby have the appropriate modules. Probably other scripting languages also. Depends on which scripting language you are comfortable with.

Here is a pointer to one way of doing what you want using Python: Python script to convert CSV files to Excel

You can use this well maintained Perl script:

https://metacpan.org/release/Text-CSV_XS

which exists in most Linux distributions (although with different names):

https://repology.org/project/perl:text-csv-xs/versions

Here is the syntax

csv2xlsx -o ~/All.xlsx 1.csv 2.csv 3.csv

You can try awk '.a[$0]++'./*.csv >./all.xls This command will combine all the csv files in current folder and create a new file: all.xls with single header row.

Just do the following:

1-open an empty excel file then go to file then options then choose customize Ribbon then select developer then press ok

2-Now you have developer tap showing in your ribbon so all that you need to do is to select it then press on VisualBasic icon

3-Microsoft Visual basic for applications window will pop, select insert then modules

4-paste the following code into the empty window:

Sub CSVtoXLS()
'UpdatebyExtendoffice20170814
    Dim xFd As FileDialog
    Dim xSPath As String
    Dim xCSVFile As String
    Dim xWsheet As String
    Application.DisplayAlerts = False
    Application.StatusBar = True
    xWsheet = ActiveWorkbook.Name
    Set xFd = Application.FileDialog(msoFileDialogFolderPicker)
    xFd.Title = "Select a folder:"
    If xFd.Show = -1 Then
        xSPath = xFd.SelectedItems(1)
    Else
        Exit Sub
    End If
    If Right(xSPath, 1) <> "\" Then xSPath = xSPath + "\"
    xCSVFile = Dir(xSPath & "*.csv")
    Do While xCSVFile <> ""
        Application.StatusBar = "Converting: " & xCSVFile
        Workbooks.Open Filename:=xSPath & xCSVFile
        ActiveWorkbook.SaveAs Replace(xSPath & xCSVFile, ".csv", ".xlsx", vbTextCompare), xlWorkbookDefault
        ActiveWorkbook.Close
        Windows(xWsheet).Activate
        xCSVFile = Dir
    Loop
    Application.StatusBar = False
    Application.DisplayAlerts = True
End Sub

5-press f5 to run, navigate to your CSV files then press ok and wait for the VBA script to do its magic = this will transform CSV files to xlsx

6-Now delete the previous VBA script in modules and add the following script

Sub CSVtoXLS()
'UpdatebyExtendoffice20170814
    Dim xFd As FileDialog
    Dim xSPath As String
    Dim xCSVFile As String
    Dim xWsheet As String
    Application.DisplayAlerts = False
    Application.StatusBar = True
    xWsheet = ActiveWorkbook.Name
    Set xFd = Application.FileDialog(msoFileDialogFolderPicker)
    xFd.Title = "Select a folder:"
    If xFd.Show = -1 Then
        xSPath = xFd.SelectedItems(1)
    Else
        Exit Sub
    End If
    If Right(xSPath, 1) <> "\" Then xSPath = xSPath + "\"
    xCSVFile = Dir(xSPath & "*.csv")
    Do While xCSVFile <> ""
        Application.StatusBar = "Converting: " & xCSVFile
        Workbooks.Open Filename:=xSPath & xCSVFile
        ActiveWorkbook.SaveAs Replace(xSPath & xCSVFile, ".csv", ".xlsx", vbTextCompare), xlWorkbookDefault
        ActiveWorkbook.Close
        Windows(xWsheet).Activate
        xCSVFile = Dir
    Loop
    Application.StatusBar = False
    Application.DisplayAlerts = True
End Sub

7-press f5 to run, navigate to your files then press ok and wait for the VBA script to do its magic = this will transform xlsx files to one excel file

The command to cat files together to produce a new file is cat . However, if you simply did a

 cat *csv >All.xls

you would also have header lines in the middle of the resulting files. There are two ways to work around this problem:

The first involves that you create temporary files out of each csv file, where the header line is missing, before putting together the pieces. This can be done using the tail command, for example

tail -n +2 2.csv >2_without_header.csv

The second possibility may or may not be applicable in your case. If - as it is often the case with CSV files - the order of the lines doesn't matter and duplicate lines can be ignored and - as it is likely in your case - the headers are identical, you could simply do a

sort -u *csv >All.xls

Eg we have 3 csv files 1.csv , 2.csv , 3.csv . I want the output as All.xls

containing

the 3 csv files 1.csv , 2.csv and 3.csv in their respective tab.

Could you help me? Thank you in advance.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM