
Extracting the first 200 lines from each .gz file and saving the result in a different folder

I have a folder called /home/myusername/originalFiles that contains a large number of large *.gz files, both directly and in its subfolders. Without deleting or modifying any of the *.gz files, I need to:

a) For each file f in /home/myusername/originalFiles (and subfolders), expand it,

b) Extract the first 200 lines from the expanded f

c) Convert that "200 lines" file from b) into a gz file again

d) Copy the "gzipped 200 lines" file from c) into another folder called /home/myusername/newSampleFiles, preserving the folder structure and file names of /home/myusername/originalFiles. So, if the original file f was in a subfolder like /home/myusername/originalFiles/year2020, then the respective "gzipped 200 lines" file from c) must be in /home/myusername/newSampleFiles/year2020 and have the same name and extension as the original file in /home/myusername/originalFiles

e) Do not keep any expanded file obtained at a)

f) Do this using only Linux commands

I tried

find . -type f -name "*.gz" -print | xargs -I@ sh -c 'head -n200 @ > /home/myusername/newSampleFiles/@'

but I got the error message:

/home/myusername/newSampleFiles/./someFile.txt.gz: No such file or directory

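src=/home/myusername/originalFiles
dst=/home/myusername/newSampleFiles
# IFS= and -r keep spaces and backslashes in file names intact.
while IFS= read -r file; do
    out="$dst/${file#"$src"/}"      # same relative path, rooted at the new folder
    mkdir -p "$(dirname "$out")"    # create the destination subfolder if missing
    # Decompress to stdout, keep 200 lines, recompress: no expanded file is written.
    gzip -cd "$file" | head -n 200 | gzip -c > "$out"
done < <(find "$src" -type f -name "*.gz")

Feed the output of find into a while loop that reads each path into the variable file. Parameter expansion (${file#"$src"/}) strips the source folder prefix, and the remaining relative path is appended to the destination folder to give out. mkdir -p creates any destination folder that does not exist yet, which is a likely cause of the "No such file or directory" error in the question. The file is then decompressed to standard output, cut to its first 200 lines and compressed again in a single pipeline, so the original *.gz file is only read and no expanded file is left behind. Note that the attempt in the question runs head directly on the compressed file, so it would copy compressed bytes rather than the first 200 lines of text.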
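As a variation on the loop above, here is a minimal sketch that wraps the same steps in a reusable bash function and passes file names null-delimited, so that names containing newlines are handled as well. The function name sample_gz and its three arguments are made up for this illustration, and the sketch assumes GNU or BSD find (for -print0) and that the destination folder is not inside the source folder.

```shell
#!/usr/bin/env bash
# sample_gz SRC DST [LINES]
# Hypothetical helper (the name is an assumption, not from the original answer).
# For every *.gz file under SRC, writes a gzipped copy of its first LINES lines
# (default 200) to the same relative path under DST. SRC is only read.
sample_gz() {
    local src=${1%/} dst=${2%/} lines=${3:-200}
    local file out
    # -print0 and read -d '' separate names with NUL bytes instead of newlines.
    find "$src" -type f -name '*.gz' -print0 |
    while IFS= read -r -d '' file; do
        out="$dst/${file#"$src"/}"       # relative path re-rooted under DST
        mkdir -p "$(dirname "$out")"     # create missing subfolders
        # Stream: decompress, keep the first lines, recompress. No temp file.
        gzip -cd "$file" | head -n "$lines" | gzip -c > "$out"
    done
}
```

It could then be called as: sample_gz /home/myusername/originalFiles /home/myusername/newSampleFiles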
