Iterate through two external file lists and execute commands in a bash script
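I am creating a script that takes filelist1 (a list of tar files) and filelist2 (a list of directories). I need to iterate through these file lists and mv the first file in filelist1 into the first dir in filelist2, the second file into the second dir, and so on. There I will extract the archive and perform other activities on the files in that folder. I am trying to automate this because I will have 130+ tar files daily, each containing 75 to 200 files that must be processed. Below is the script I am working on (WIP):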
#############################################################################
#############################################################################
#
# Incremental load script v1
# Created 02/09/2015 NHR
#
#############################################################################
#############################################################################
#
# Clean up before running
#
# "/u02/hdfs_staging/ios/incremental/TOPACTR_DeltaFiles"
#
if [ -f filelist1 ] ; then
rm filelist1
fi
if [ -f filelist2 ] ; then
rm filelist2
fi
#
# Create filelist containing name of files parsed for dir's loaded from kdwxxxx
#
for i in *tar
do
echo "$i" | rev | cut -d"." -f2 | rev >> filelist1
done
#
# Create work dir's for extracting tar files into for each date
#
while IFS= read -r file
do
[ ! -d "$file" ] && mkdir "$file"
done < "/u02/hdfs_staging/ios/incremental/TOPACTR_DeltaFiles/filelist1"
#
# Create filelist2 containing name of files parsed to copy
# tar files to dir's for extraction
#
shopt -s nullglob # Bash extension, so that empty glob matches will work
for file in ./*.tar ; do # Use this, NOT "for file in *"
echo "$file" >> filelist2
done
#
# Copy and Decompress tar files in these new dir's
# HERE IS WHERE I NEED TO LOOP THROUGH THE FILELIST1 AND FILELIST2
# AND PERFORM ADDITIONAL COMMANDS
#
#
# Execute hive load to external table script to load incremental files to ios_incremental.
# The ios_incremental database tables for these files is in place.
#
#hive -e CREATE EXTERNAL TABLE $filelist
#
# Run hive SQL script to add changed files to ios_staging tables.
# This will be called from a hql script file and will require variables
# for each table involved. This view combines record sets from both the
# Base (base_table) and Change (incremental_table) tables and is reduced
# only to the most recent records for each unique .id. It is defined as
# follows:
#
#hive -e
# CREATE VIEW reconcile_view AS
# SELECT t1.* FROM
# (SELECT * FROM base_table
# UNION ALL
# SELECT * FROM incremental_table) t1
# JOIN
# (SELECT id, max(modified_date) max_modified FROM
# (SELECT * FROM base_table
# UNION ALL
# SELECT * FROM incremental_table) t2
# GROUP BY id) s
# ON t1.id = s.id AND t1.modified_date = s.max_modified;
#
#
# Copy updated ios_staging data to update ios_prod db
#
#
# Clean and Archive files to get ready for next incremental load
#
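I think what you are looking for is a way to iterate over two lists at the same time.
Here is one way to do it. It assumes that the file names contain no newlines or colons (changing the colon to some other delimiter is easy):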
paste -d: filelist1 filelist2 | while IFS=: read -r file1 file2; do
some_command "$file1" "$file2"
# ...
done
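One caveat with this approach: when one list is shorter than the other, `paste` behaves as if the shorter file contained empty lines, so the loop would receive an empty `file1` or `file2`. A minimal sketch of a length check that could run before the loop follows. `same_length` is a hypothetical helper name, and the demo lists are made up for illustration; neither is part of the original answer.

```bash
#!/usr/bin/env bash
# same_length is a hypothetical helper: it succeeds only if both files
# contain the same number of lines, and returns 2 if a file is unreadable.
same_length() {
    local n1 n2
    n1=$(wc -l < "$1") || return 2
    n2=$(wc -l < "$2") || return 2
    [ "$n1" -eq "$n2" ]
}

# Demo with throwaway lists in a temporary directory.
workdir=$(mktemp -d)
printf '%s\n' one.tar two.tar three.tar > "$workdir/filelist1"
printf '%s\n' one two > "$workdir/filelist2"

if same_length "$workdir/filelist1" "$workdir/filelist2"; then
    status="lists line up"
else
    status="lists differ in length"
fi
echo "$status"
```

In a real run the script would probably abort on a mismatch rather than just print a message, since a shifted pairing would move archives into the wrong directories.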
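A more defensive solution is to put the lists into arrays instead of files and then iterate with a for loop. (I have omitted the creation of the arrays; there are plenty of examples on SO):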
for ((i=0;i<${#filearray1[@]};++i)); do
file1="${filearray1[i]}"
file2="${filearray2[i]}"
some_command "$file1" "$file2"
# ...
done
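The array creation left out above could be sketched as follows, assuming the lists already exist as files with one entry per line and that bash 4.0 or newer is available (`mapfile` is not present in older versions). The demo file names are invented for illustration.

```bash
#!/usr/bin/env bash
# Throwaway demo lists; the entries are made-up names, one per line.
workdir=$(mktemp -d)
printf '%s\n' 'first archive.tar' 'second.tar' > "$workdir/filelist1"
printf '%s\n' 'first archive' 'second' > "$workdir/filelist2"

# mapfile (bash 4+) reads each line into one array element;
# -t strips the trailing newline from every element.
mapfile -t filearray1 < "$workdir/filelist1"
mapfile -t filearray2 < "$workdir/filelist2"

# Walk both arrays by index, exactly as in the loop above.
for ((i = 0; i < ${#filearray1[@]}; ++i)); do
    printf '%s -> %s\n' "${filearray1[i]}" "${filearray2[i]}"
done
```

Because each line becomes a single element, names containing spaces or colons survive intact; only embedded newlines remain a problem. If the lists never need to exist as files, a glob assignment such as `filearray1=( ./*.tar )` avoids even that limitation.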
Maybe something like this (with an obvious lack of error checking):
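exec 3< filelist1 4< filelist2
# IFS= and -r keep leading whitespace and backslashes in names intact;
# the loop ends as soon as either list runs out of entries
while IFS= read -r -u3 tarfile && IFS= read -r -u4 destination
do
    mv "${tarfile}" "${destination}"/.
    ( cd "${destination}" || exit
      # ... other stuff
    ) # subshell is to avoid having to cd back where you came from
done
exec 3<&- 4<&-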
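As a rough illustration of what the missing error checking might look like, here is a self-contained sketch that builds two small archives in a scratch directory and then runs the same two-descriptor loop with checks around `mkdir`, `mv`, and the extraction. All file and directory names are hypothetical, and using `tar -xf` for the "other stuff" step is an assumption based on the question's description.

```bash
#!/usr/bin/env bash
# Scratch setup: two tiny archives and the two lists (made-up names).
workdir=$(mktemp -d) && cd "$workdir" || exit 1
echo "sample" > data.txt
tar -cf first.tar data.txt
tar -cf second.tar data.txt
printf '%s\n' first.tar second.tar > filelist1
printf '%s\n' first second > filelist2

exec 3< filelist1 4< filelist2
# Read one entry from each list per iteration; stop when either runs out.
while IFS= read -r -u3 tarfile && IFS= read -r -u4 destination; do
    if ! mkdir -p "$destination"; then
        echo "cannot create $destination" >&2
        continue
    fi
    if ! mv -- "$tarfile" "$destination"/; then
        echo "cannot move $tarfile" >&2
        continue
    fi
    (
        # The subshell keeps the directory change local to this step.
        cd "$destination" || exit 1
        tar -xf "$(basename "$tarfile")"
    ) || echo "extraction failed for $tarfile" >&2
done
exec 3<&- 4<&-
```

Skipping a failed pair with `continue` keeps the remaining archives moving; aborting the whole run on the first failure would be an equally reasonable choice for a daily batch.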