
Extracting specific folders in multiple tar.gz files recursively

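I am working with Synthea, the open synthetic patient and population health data.

The dataset ships as a single 21 GB tar.gz, which extracts to a set of tar.gz files that represent the data in several data formats.

After extraction, the source folder looks like this: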

|-- output_11_20170528T113605.tar.gz
|-- output_1_20170524T232103.tar.gz
|-- output_12_20170528T195303.tar.gz
|-- output_2_20170525T073836.tar.gz
|-- output_3_20170525T161555.tar.gz
|-- output_4_20170526T004637.tar.gz
|-- output_5_20170526T091439.tar.gz
|-- output_6_20170526T173337.tar.gz
|-- output_7_20170527T015508.tar.gz
|-- output_8_20170527T102552.tar.gz
|-- output_9_20170527T185007.tar.gz

I tried extracting only the CSV files with the following command, which works for a single file:

tar -zxvf output_1_20170524T232103.tar.gz "output_1*csv*" -C ../synthea_output_folder
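
Before extracting, it can help to check which members of an archive a csv filter would actually pick up. The following is a minimal sketch, not part of the original post: `list_csv_members` is a hypothetical helper name, and it assumes the CSV files sit under a directory named `csv` inside each archive, as the `output_1*csv*` pattern suggests.

```shell
# Hypothetical helper: print the members of one .tar.gz that live under a csv/ directory.
# Assumes the archive layout is <top-level>/csv/<file>.csv.
list_csv_members() {
    # -t lists members without extracting anything;
    # grep keeps only paths that contain "csv" as a full path component.
    tar -tzf "$1" | grep -E '(^|/)csv(/|$)'
}
```

If the archive is laid out as assumed, `list_csv_members output_1_20170524T232103.tar.gz` should print only paths under `output_1/csv/`.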

Ideally I would build a shell script that iterates over these files and extracts the CSV folder from each tar.gz file, so that they appear in synthea_output_folder like this:

|-- output_11/csv
|-- output_1/csv
|-- output_12/csv
|-- output_2/csv
|-- output_3/csv
|-- output_4/csv
|-- output_5/csv
|-- output_6/csv
|-- output_7/csv
|-- output_8/csv
|-- output_9/csv

I found a shell loop that extracts every archive in turn, but I can't figure out how to pull just the CSV folder out of each file:

for f in *.tar.gz; do tar -xzvf "$f"; done

Possible solution

After modifying the shell code above, I managed to extract only the csv folders by adding the csv wildcard pattern:

for f in *.tar.gz; do tar -xzvf "$f" "*csv*" -C ../synthea_output; done

The output now looks like this:

|-- output_1
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_10
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_11
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_12
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_2
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_3
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_4
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_5
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_6
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_7
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
|-- output_8
|   `-- csv
|       |-- allergies.csv
|       |-- careplans.csv
|       |-- conditions.csv
|       |-- encounters.csv
|       |-- immunizations.csv
|       |-- medications.csv
|       |-- observations.csv
|       |-- patients.csv
|       `-- procedures.csv
`-- output_9
    `-- csv
        |-- allergies.csv
        |-- careplans.csv
        |-- conditions.csv
        |-- encounters.csv
        |-- immunizations.csv
        |-- medications.csv
        |-- observations.csv
        |-- patients.csv
        `-- procedures.csv
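
The loop above depends on how the installed tar treats member patterns and the position of `-C`. GNU tar matches extraction member names literally unless `--wildcards` is given and documents `-C` as order-sensitive, while bsdtar (the default on macOS) appears to treat member names as patterns by default. The sketch below is one way to make the loop less dependent on those differences. The function name `extract_csv_dirs` and the `*/csv/*` pattern are assumptions for illustration, based on the `output_N/csv` layout shown above.

```shell
# Hypothetical helper: extract only the csv/ folder of every .tar.gz in src_dir into dest_dir.
# Assumes each archive is laid out as <top-level>/csv/<file>.csv.
extract_csv_dirs() {
    src_dir=$1
    dest_dir=$2

    # tar -C fails if the target directory does not exist yet.
    mkdir -p "$dest_dir" || return 1

    # GNU tar needs --wildcards to treat member names as patterns;
    # other implementations (e.g. bsdtar) are assumed to match patterns by default.
    wildcard_opt=""
    if tar --version 2>/dev/null | grep -q 'GNU tar'; then
        wildcard_opt="--wildcards"
    fi

    for archive in "$src_dir"/*.tar.gz; do
        # Skip the literal pattern when no archive matches the glob.
        [ -e "$archive" ] || continue
        # -C comes before the pattern so that it applies to the extracted members.
        # $wildcard_opt is left unquoted on purpose: it expands to nothing when empty.
        tar -xzf "$archive" -C "$dest_dir" $wildcard_opt '*/csv/*' || return 1
    done
}
```

Run from the directory that holds the archives, `extract_csv_dirs . ../synthea_output` would be expected to produce the same `output_N/csv` tree as shown above.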
