Linux: copy the next 'n' files from one folder to another in a bash/python script
We have 17k files named file1.csv, file2.csv, file3.csv ... file17000.csv. All of these files should be copied from one folder to another. The goal is to create a Linux bash or Python script that copies these files in batches of 'n' CSV files every 5 minutes, without copying again any batch that has already been copied.
The idea is:
copy file1.csv file2.csv file3.csv file4.csv file5.csv to destination_dir
sleep for 300 seconds
copy file6.csv file7.csv file8.csv file9.csv file10.csv to destination_dir
sleep for 300 seconds
...
copy file16996.csv file16997.csv file16998.csv file16999.csv file17000.csv to destination_dir
For a small number of files, we used the script below, which copies the files between two given numbers:
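#!/bin/bash
source_dir='/source_dir'
target_dir='/target_dir'
echo "beginning number: $1"
echo "final number: $2"
# Copy file$1.csv through file$2.csv; eval is needed because brace
# expansion happens before $1 and $2 are substituted
for f in $(eval ls "$source_dir"/file{$1..$2}.csv);
do
    cp "$f" "$target_dir"
done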
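Can anyone suggest how to adapt the script so that each pass works on the next 'n' csv files?
Any comments and suggestions would be greatly appreciated.
Does this help?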
import os
import time
from shutil import copy

def copy_n_files(src, dest, n, start=1):
    # Copy file{start}.csv .. file{start+n-1}.csv, skipping numbers that do not exist
    for file_num in range(start, start + n):
        path = os.path.join(src, f"file{file_num}.csv")
        if os.path.isfile(path):
            copy(path, dest)  # dest is an existing directory

SRC_DIR = "src"
DEST_DIR = "dest"
num_files = len([f for f in os.listdir(SRC_DIR) if os.path.isfile(os.path.join(SRC_DIR, f)) and f.endswith(".csv")])
step_size = 10  # number of files you want to copy in one go
sleep_time = 300  # number of seconds you want to sleep for
for i in range(1, num_files + 1, step_size):  # the files are numbered from 1
    copy_n_files(SRC_DIR, DEST_DIR, step_size, i)
    time.sleep(sleep_time)
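The question also asks that files which were already copied are not copied again. A minimal sketch of one way to do that, assuming a file counts as copied once a file with the same name exists in the destination directory. The helper names `file_number`, `next_batch` and `copy_next_batch` are made up for this example.

```python
import os
import re
import shutil

def file_number(name):
    """Return the integer N in 'file<N>.csv', or None for any other name."""
    match = re.fullmatch(r"file(\d+)\.csv", name)
    return int(match.group(1)) if match else None

def next_batch(src, dest, n):
    """Return up to n source file names, in numeric order, not yet present in dest."""
    done = set(os.listdir(dest))
    pending = [f for f in os.listdir(src)
               if file_number(f) is not None and f not in done]
    pending.sort(key=file_number)
    return pending[:n]

def copy_next_batch(src, dest, n):
    """Copy the next n pending files into dest and return their names."""
    batch = next_batch(src, dest, n)
    for name in batch:
        shutil.copy(os.path.join(src, name), dest)
    return batch
```

Calling `copy_next_batch(SRC_DIR, DEST_DIR, 5)` every 300 seconds (from a loop or from cron) until it returns an empty list would then resume where the previous pass stopped, even after a restart.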
A bash version; set the batch_size variable to whatever you like:
#!/bin/bash
source_dir='/source_dir'
target_dir='/target_dir'
# Store the names in an array, version-sorted so file2.csv precedes file10.csv;
# this relies on the file names containing no whitespace
all_csv_files=( $(ls -1v "$source_dir"/file*.csv) )
batch_size=5
sleep_break=300
file_counter=0
echo "Found ${#all_csv_files[@]} files"
for f in "${all_csv_files[@]}"
do
    cp "$f" "$target_dir"
    file_counter=$((file_counter + 1))
    if [ "$file_counter" -eq "$batch_size" ]
    then
        echo "Take a break $(date)"
        file_counter=0
        sleep "$sleep_break"
    fi
done
echo Done
Using bash:
max=$(printf '%s\n' file*.csv | grep -Eo '[[:digit:]]+' | sort -n | tail -1) # Work out the maximum file number (numeric sort, since the glob order is lexical)
n=5 # Set the batch number of files to copy in one go
for ((i=1;i<=max;i=i+n)); # Loop from one to max file in batches of n
do
    sleep 300
    p=$((i+n-1)); # Set the upper limit for batch file copying
    for ((k=i;k<=p && k<=max;k++)); # Stop at max if the last batch is short
    do
        cp "file$k.csv" destination_dir # Copy files using lower and upper limits of files for this pass
    done
done
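Glob expansion and plain `ls` order names lexically, so file10.csv sorts before file2.csv, and the last name in such a list is not necessarily the one with the highest number. The difference can be illustrated with a short Python sketch; `numeric_key` is a hypothetical helper and the names are sample values, not taken from a real directory.

```python
import re

def numeric_key(name):
    """Sort key: the first run of digits in the name as an integer (-1 if none)."""
    match = re.search(r"\d+", name)
    return int(match.group()) if match else -1

names = ["file10.csv", "file2.csv", "file1.csv", "file17000.csv", "file9999.csv"]

lexical = sorted(names)                   # character-by-character order
numeric = sorted(names, key=numeric_key)  # order by file number
highest = max(numeric_key(n) for n in names)

print(lexical[-1])   # last name in lexical order
print(numeric[-1])   # last name in numeric order
print(highest)
```

In lexical order the last entry is file9999.csv, whereas the numeric key puts file17000.csv last, which is why the maximum should be taken after a numeric sort.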
In the end, we got it done by updating the script provided by Marcel. We added a while loop that reads the file list into an array, and it works the way we expected:
#!/bin/bash
all_csv_files=()
source_dir='/source_dir'
target_dir='/target_dir'
# Read NUL-delimited names into the array; find returns them in no particular
# order, so sort -zV (GNU sort) puts them in file-number order
while IFS= read -r -d $'\0'; do
    all_csv_files+=("$REPLY")
done < <(find "$source_dir" -name "file*.csv" -print0 | sort -zV)
echo ${#all_csv_files[@]}
batch_size=5
sleep_break=60
file_counter=0
echo "Found ${#all_csv_files[@]} files"
for f in "${all_csv_files[@]}"
do
    cp "$f" "$target_dir"
    echo "$f"
    file_counter=$((file_counter + 1))
    if [ "$file_counter" -eq "$batch_size" ]
    then
        echo "Take a break $(date)"
        file_counter=0
        sleep "$sleep_break"
    fi
done
echo Done
Thank you all for your suggestions!
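For comparison, the counter-and-reset batching used in the scripts can also be expressed by slicing the file list into chunks. The following is only a sketch under stated assumptions: `batches` and `run_in_batches` are illustrative names, the `action` callback stands in for the copy step, and the `sleep` parameter exists only so the pause can be replaced in a dry run.

```python
import time

def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def run_in_batches(items, size, action, pause_seconds, sleep=time.sleep):
    """Apply `action` to every item, pausing between batches but not after the last."""
    chunks = list(batches(items, size))
    for index, chunk in enumerate(chunks):
        for item in chunk:
            action(item)
        if index < len(chunks) - 1:
            sleep(pause_seconds)
```

With this shape a final, shorter batch needs no special handling, and the script exits right after the last copy instead of sleeping once more.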