Linux: copy the next 'n' files from one folder to another in a bash/Python script

Question

We have 17k files named file1.csv, file2.csv, file3.csv ... file17000.csv. All of these files need to be copied from one folder to another. The goal is to create a Linux bash or Python script that copies them in batches of 'n' CSV files every 5 minutes, without re-copying files that have already been copied.

The idea is:

copy file1.csv file2.csv file3.csv  file4.csv  file5.csv to destination_dir
sleep for 300 seconds
copy file6.csv file7.csv file8.csv file9.csv file10.csv to destination_dir
sleep for 300 seconds
...
copy file16996.csv file16997.csv file16998.csv file16999.csv file17000.csv to destination_dir

For a small number of files, we used the script below, which copies the files whose numbers fall between two given values:

#!/bin/bash
source_dir='/source_dir'
target_dir='/target_dir'
echo "beginning number:$1"
echo $1
echo "finite number:$2"
echo $2
for f in $(eval ls $source_dir/file{$1..$2}.csv);
do
cp $f $target_dir
done

Can anyone suggest how to make the script move on to the next 'n' CSV files on each pass?

Any comments and suggestions would be much appreciated.

Answer 1

Does this help?

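import os
import time

from shutil import copy

def copy_n_files(src, dest, n, start=1):
    # copy file<start>.csv .. file<start+n-1>.csv, skipping numbers that do not exist
    for file_num in range(start, start + n):
        src_file = os.path.join(src, f"file{file_num}.csv")
        if os.path.isfile(src_file):
            copy(src_file, dest)  # copy() accepts a directory as the destination

SRC_DIR = "src"
DEST_DIR = "dest"


# count the CSV files in the source directory
num_files = len([f for f in os.listdir(SRC_DIR) if os.path.isfile(os.path.join(SRC_DIR, f)) and f.endswith(".csv")])
step_size = 10 # number of files you want to copy in one go
sleep_time = 300 # number of seconds you want to sleep for

# the files are numbered from 1, so the batches start at 1, 1 + step_size, ...
for i in range(1, num_files + 1, step_size):
    copy_n_files(SRC_DIR, DEST_DIR, step_size, i)
    time.sleep(sleep_time)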
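
The question also asks that files which were already copied are not copied again. One possible way to get that, sketched here under the assumption that a file counts as copied once a file of the same name exists in the destination, is to pick the next n missing files on every pass. The helper name copy_next_batch is hypothetical and not part of the answer above.

```python
import os
import re
import shutil

def copy_next_batch(src, dest, n):
    """Copy up to n not-yet-copied file<N>.csv files from src to dest, lowest N first."""
    pattern = re.compile(r"file(\d+)\.csv")
    numbered = []
    for name in os.listdir(src):
        match = pattern.fullmatch(name)
        if match and os.path.isfile(os.path.join(src, name)):
            numbered.append((int(match.group(1)), name))
    # Assumption: a file is "already copied" if the same name exists in dest.
    pending = [name for _, name in sorted(numbered)
               if not os.path.exists(os.path.join(dest, name))]
    batch = pending[:n]
    for name in batch:
        shutil.copy(os.path.join(src, name), os.path.join(dest, name))
    return batch  # an empty list means there is nothing left to copy
```

Calling this in a loop with time.sleep(300) between calls, and stopping when it returns an empty list, should let an interrupted run pick up where it left off.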

Answer 2

A bash version, with a batch_size variable you can set to whatever you like:

#!/bin/bash
source_dir='/source_dir'
target_dir='/target_dir'

all_csv_files=($(ls -1v $source_dir/file*.csv))  # store the version-sorted list in an array so the count and the loop below work
batch_size=5
sleep_break=300

file_counter=0
echo Found ${#all_csv_files[@]} files

for f in "${all_csv_files[@]}"
do
    cp $f $target_dir
    let file_counter++
    if [ $file_counter == $batch_size ] 
    then
        echo Take a break `date`
        file_counter=0
        sleep $sleep_break
    fi
done

echo Done
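
The counter logic above copies the list in consecutive chunks of batch_size and pauses after each full chunk. For comparison, here is a minimal Python sketch of the same chunking. The helper chunked is hypothetical and the file names are sample values, not taken from a real directory.

```python
def chunked(items, size):
    """Yield consecutive slices of items, each holding at most size elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Sample data: 12 names split into batches of 5 (the last batch is shorter).
names = [f"file{k}.csv" for k in range(1, 13)]
for batch in chunked(names, 5):
    print(batch[0], "...", batch[-1], f"({len(batch)} files)")
```

In a real script each batch would be copied and followed by a sleep, as the bash loop does.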

Answer 3

Using bash:

max=$(printf '%s\n' file*.csv | grep -Eo '[[:digit:]]+' | sort -n | tail -1)          # Work out the maximum file number (numeric sort, run from the source directory)
n=5                                                                                   # Set the batch number of files to copy in one go
for ((i=1;i<=max;i=i+$n));                                                            # Loop from one to max file in batches of n
do 
  sleep 300
  p=$(($i+($n-1)));                                                                   # Set the upper limit for batch file copying
  for ((k=i;k<=p && k<=max;k++));
  do
     cp "file$k.csv" destination_dir                                                  # Copy files using lower and upper limits of files for this pass
  done
done
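
The batch loop needs the highest file number, and that number has to be found numerically: in plain string order file9999.csv sorts after file17000.csv. A minimal Python sketch of the difference follows. The helper max_file_number is hypothetical and the names are sample values.

```python
import re

def max_file_number(names):
    """Return the largest N among names shaped like file<N>.csv, or 0 if none match."""
    matches = (re.fullmatch(r"file(\d+)\.csv", name) for name in names)
    numbers = [int(m.group(1)) for m in matches if m]
    return max(numbers, default=0)

names = ["file9999.csv", "file17000.csv", "file2.csv"]
print(sorted(names)[-1])       # last name in string order: file9999.csv
print(max_file_number(names))  # numeric maximum: 17000
```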

Answer 4

In the end, we got it working by updating the script provided by Marcel. We added a while loop that reads the file list into an array, and it works the way we expected:

#!/bin/bash
all_csv_files=()
source_dir='/source_dir'
target_dir='/target_dir'
while IFS=  read -r -d $'\0'; do
    all_csv_files+=("$REPLY")
done < <(find $source_dir -name "file*.csv" -print0)

echo ${#all_csv_files[@]}

batch_size=5
sleep_break=60
file_counter=0

echo Found ${#all_csv_files[@]} files

for f in "${all_csv_files[@]}"
do
    cp $f $target_dir
    echo $f
    let file_counter++
    if [ $file_counter == $batch_size ]
    then
        echo "Take a break $(date)"
        file_counter=0
        sleep $sleep_break
    fi
done
echo Done

Thanks, everyone, for the suggestions!

4 solutions

Solution 1
0 2021-03-01 12:21:57

Solution 2
0 2021-03-01 12:38:50

Solution 3
0 2021-03-01 12:42:08

Solution 4
0 Accepted 2021-05-06 07:55:29
