We would like to zip the files in a directory, limiting the number of records per zip file, and name each zip file after the directory structure.
Example: we have two directories named after dates (2021-10-01 and 2021-10-02). Each of these parent directories contains subdirectories named after countries, and each country directory contains a number of files.
2021-10-01/USA, 2021-10-01/UK
2021-10-02/USA, 2021-10-02/UK
We would like to zip each country directory with a limited number of records per archive, and name the zip file parentdirectory_countrydirectory.zip (for example, 2021-10-01_USA.zip).
My script accepts the dates as parameters and passes them to an SQL query, which extracts the data from the database into date-named parent directories with country subdirectories containing the files. I am skipping the SQL query part of my script here.
#!/bin/bash
startd=$1
endd=$2
compress () {
    startd=$(date -d $startd +%Y%m%d)
    endd=$(date -d $endd +%Y%m%d)
    while [[ $startd -le $endd ]]
    do
        tempdate=$(date -d $startd +"%Y-%m-%d")
        dirl+=" $tempdate"
        startd=$(date -d"$startd + 1 day" +"%Y%m%d")
    done
    echo $dirl
    for j in $dirl
    do
        if [ -d "$j" ]; then
            cd $j
            for d in *
            do
                zip ${j}_${d}.zip $d
                mv ${j}_${d}.zip ../
            done
        else
            echo "no data extracted on: $j"
        fi
        cd ..
    done
}
I would like to zip the files with a limit on the number of records per archive. The name could be parentdirectory_subdirectory1.zip, with the number incrementing and the same naming convention otherwise.
Note: "number of records" means the files in the subdirectories, which are extracted by the SQL query. The USA subdirectory may contain thousands of files, so I want to split the archive by subdirectory files, for example 200 files per zip, and create the files with the same naming convention: 2021-10-01_USA.zip, 2021-10-01_USA1.zip, etc.
This is a bit tricky to do in Bash, but you can use e.g. xargs to conveniently split a long list of items into manageable chunks. The challenge then is to pass in a new file name for each zip file. Here's one quick and dirty attempt.
compress () {
    local startd=$(date -d "$1" +%Y%m%d)
    local endd=$(date -d "$2" +%Y%m%d)
    local mm
    local j
    local d
    while [[ $startd -le $endd ]]
    do
        # Build yyyy-mm-dd from yyyymmdd using parameter expansions only
        mm=${startd#????}
        j="${startd%????}-${mm%??}-${mm#??}"
        startd=$(date -d "$startd + 1 day" +"%Y%m%d")
        if [ -d "$j" ]; then
            for d in "$j"/*/; do
                d=${d%/}      # strip the trailing slash
                d=${d##*/}    # keep only the country directory name
                [ -d "$j/$d" ] || continue
                # bash -c rather than sh -c: for ((...)) is a Bash extension
                printf '%s\0' "$j/$d"/* |
                xargs -r -0 -n 200 bash -c '
                    for ((i=0; i<=99; i++)); do
                        test -e "$0${i#0}.zip" || break
                    done
                    zip -j "$0${i#0}.zip" "$@"' "${j}_${d}"
            done
        else
            echo "$0: no data extracted on: $j" >&2
        fi
    done
}
Random observations:
- Rather than calling date again to insert dashes for the yyyy-mm-dd format, use a series of parameter expansions. It's a bit tedious code-wise, but avoids calling an external process for something which the shell can do much quicker with internal facilities.
- Use zip -j to remove directory names from the input files so that we don't have to cd into and back out of each directory. (This is slightly error-prone if you have directory symlinks.)
- Print diagnostic messages to standard error with >&2 and include the name of the script which created the message in the message itself.

The real meat is in the slightly complex xargs invocation.
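To make the parameter expansion observation concrete, here is a minimal sketch of the yyyymmdd to yyyy-mm-dd conversion in isolation. The function name to_dashed is made up for this illustration; it is not part of the script above.

```shell
# Hypothetical helper: turn yyyymmdd into yyyy-mm-dd without calling date.
# The body runs in a subshell ( ... ) so its variables do not leak out.
to_dashed () (
    ymd=$1
    mmdd=${ymd#????}    # strip the four-digit year, leaving mmdd
    # year = all but the last four characters, month = mmdd minus its last two
    # characters, day = mmdd minus its first two characters
    printf '%s-%s-%s\n' "${ymd%????}" "${mmdd%??}" "${mmdd#??}"
)

to_dashed 20211001    # prints 2021-10-01
```

Each expansion only trims a fixed number of characters from one end of the string, so this assumes the input is always exactly eight digits.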
We printf the file names to be zipped as null-separated items so that we can correctly cope with arbitrary file names. (See http://mywiki.wooledge.org/BashFAQ/020 for details.) The -0 argument to xargs is a GNU extension to enable this. The -r argument simply says to do nothing if there is no input (i.e. there were no files in the directory; you probably want shopt -s nullglob too).
The -n 200 says to restrict input to a maximum of 200 files at a time, and we then pass those 200 or fewer file names to the inline -c script.
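If the chunking is hard to picture, this small sketch may help. It feeds seven made-up file names (one containing a space) through the same printf and xargs combination, but with three items per chunk and an echo in place of zip. The function name chunk_demo and the file names are only for illustration.

```shell
# Hypothetical demo: null-terminated names in, at most three per invocation.
chunk_demo () {
    printf '%s\0' "a 1.txt" b.txt c.txt d.txt e.txt f.txt g.txt |
    xargs -r -0 -n 3 sh -c 'echo "$# file(s): $*"' chunk
}

chunk_demo
# 3 file(s): a 1.txt b.txt c.txt
# 3 file(s): d.txt e.txt f.txt
# 1 file(s): g.txt
```

The name with a space stays a single argument because the items are separated by null bytes rather than whitespace.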
That script receives the base name of the zip file we want to create as $0. (This is just a hack to avoid having to separately shift off an argument from the argument list it receives; the first argument after the -c script string is otherwise usually unused, so we use it to smuggle in this value.) It uses a simple for loop to find the first unused name with this prefix, using an empty string as the suffix for the very first one.
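The name-finding loop can be tried on its own. The following is only a sketch: next_zip_name is a hypothetical wrapper, it prints the chosen name instead of running zip, and it calls bash explicitly because the for ((...)) syntax is a Bash extension.

```shell
# Hypothetical wrapper: print the first unused name among
# base.zip, base1.zip, base2.zip, ... for the base name given as $1.
next_zip_name () {
    bash -c '
        for ((i=0; i<=99; i++)); do
            test -e "$0${i#0}.zip" || break    # ${i#0} is empty when i is 0
        done
        echo "$0${i#0}.zip"' "$1"
}

tmp=$(mktemp -d)
touch "$tmp/2021-10-01_USA.zip" "$tmp/2021-10-01_USA1.zip"
next_zip_name "$tmp/2021-10-01_USA"    # prints a path ending in 2021-10-01_USA2.zip
rm -r "$tmp"
```

Here the base name arrives as $0 inside the inline script, exactly as in the xargs call, and "$@" would hold the file names if any were passed after it.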
(Maybe change this. I think your proposed convention is slightly confusing. I would prefer to have xxx.zip solely if there is only a single file in the set, and xxx1.zip, xxx2.zip, etc. when there are several.)
Once we have established the file name, we simply zip the files we receive as arguments into that file. xargs takes care of portioning the input file set into chunks of the desired size and calling the inline -c script as many times as necessary.
This is probably a bit intimidating at first; this would be a fair bit easier in a modern scripting language like Python.