
Iterate through two external file lists and execute commands in a bash script

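I am creating a script that takes filelist1 (a list of tar files) and filelist2 (a list of directories). I need to read through both lists and mv the first file in filelist1 into the first dir in filelist2, the second file into the second dir, and so on. In each dir I will then extract the tar file and run further steps on the files in that folder. I am trying to automate this because I will receive 130+ tar files per day, each containing 75 to 200 files that must be processed. Below is the script I am working on (WIP):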

 #############################################################################
 #############################################################################
 #
 #  Incremental load script v1
 #  Created 02/09/2015 NHR
 #
 #############################################################################
 #############################################################################

 #
 # Clean up before running
 #
 # "/u02/hdfs_staging/ios/incremental/TOPACTR_DeltaFiles"
 #


 if [ -f filelist1 ]  ; then
    rm filelist1
 fi

 if [ -f filelist2 ] ; then
    rm filelist2
 fi

 #
 # Create filelist containing name of files parsed for dir's loaded from kdwxxxx
 #
 for i in *tar
     do
         echo "$i" | rev | cut -d"." -f2 | rev >> filelist1
     done

 #
 # Create work dir's for extracting tar files into for each date
 #
 while IFS= read -r file
     do
         [ ! -d "$file"  ] && mkdir "$file"
     done < "/u02/hdfs_staging/ios/incremental/TOPACTR_DeltaFiles/filelist1"

 #
 # Create filelist2 containing name of files parsed to copy
 # tar files to dir's for extraction
 #
 shopt -s nullglob                 # Bash extension, so that empty glob matches will work
   for file in ./*.tar ; do        # Use this, NOT "for file in *"
      echo  "$file" >> filelist2
   done

 #
 # Copy and Decompress tar files in these new dir's
 # HERE IS WHERE I NEED TO LOOP THROUGH THE FILELIST1 AND FILELIST2
 # AND PERFORM ADDITIONAL COMMANDS
 #



 #
 # Execute hive load to external table script to load incremental files to ios_incremental.
 # The ios_incremental database tables for these files is in place.
 #


 #hive -e CREATE EXTERNAL TABLE $filelist


 #
 # Run hive SQL script to add changed files to ios_staging tables.
 # This will be called from a hql script file and will require variables
 # for each table involved. This view combines record sets from both the
 # Base (base_table) and Change (incremental_table) tables and is reduced
 # only to the most recent records for each unique .id.  It is defined as
 # follows:
 #

 #hive -e
 # CREATE VIEW reconcile_view AS
 #    SELECT t1.* FROM
 #    (SELECT * FROM base_table
 #          UNION ALL
 #          SELECT * FROM incremental_table) t1
 #    JOIN
 #       (SELECT id, max(modified_date) max_modified FROM
 #           (SELECT * FROM base_table
 #           UNION ALL
 #           SELECT * FROM incremental_table) t2
 #       GROUP BY id) s
 #    ON t1.id = s.id AND t1.modified_date = s.max_modified;
 #


 #
 # Copy updated ios_staging data to update ios_prod db
 #



 #
 # Clean and Archive files to get ready for next incremental load
 #

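I think what you are looking for is a way to iterate over two lists at the same time.

Here is one way to do it. It assumes that the filenames contain no newlines or colons (changing the colon to some other delimiter is easy):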

paste -d: filelist1 filelist2 | while IFS=: read -r file1 file2; do
  some_command "$file1" "$file2"
  # ...
done
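One thing to watch with paste: if the two lists differ in length, it pads the shorter one with empty fields, so the loop quietly sees an empty name. Here is a rough sketch of the same loop with a guard for that case. The function name pair_with_paste and the choice of `|` as the delimiter are my own assumptions, and it only prints the pairs instead of running a real command:

```bash
#!/usr/bin/env bash
# Sketch only: pair_with_paste is a hypothetical helper name.
# Assumes no file name contains a newline or the '|' delimiter.

pair_with_paste() {
  local list1=$1 list2=$2 file1 file2
  paste -d'|' "$list1" "$list2" | while IFS='|' read -r file1 file2; do
    # paste pads the shorter list with empty fields; skip such rows.
    if [[ -z $file1 || -z $file2 ]]; then
      echo "unpaired entry: '$file1' '$file2'" >&2
      continue
    fi
    # Replace this printf with the real command for each pair.
    printf '%s -> %s\n' "$file1" "$file2"
  done
}
```

Calling `pair_with_paste filelist1 filelist2` prints one `tarfile -> directory` line per complete pair and reports incomplete pairs on stderr.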

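A more defensive solution is to put the lists into arrays instead of files and then iterate with a for loop. (I have left out the creation of the arrays; there are plenty of examples on SO):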

for ((i=0;i<${#filearray1[@]};++i)); do
  file1="${filearray1[i]}"
  file2="${filearray2[i]}"
  some_command "$file1" "$file2"
  # ...
done
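For the array creation that was left out, one possible approach is mapfile, assuming bash 4 or later and one name per line in each list. This is just a sketch: pair_lists is a made-up function name, and it prints the pairs rather than running a real command. It also refuses to run if the two arrays differ in length:

```bash
#!/usr/bin/env bash
# Sketch only: pair_lists is a hypothetical helper name.
# Assumes bash 4+ (for mapfile) and one name per line in each list file.

pair_lists() {
  local list1=$1 list2=$2 i
  local -a filearray1 filearray2

  # -t strips the trailing newline from every line read.
  mapfile -t filearray1 < "$list1"
  mapfile -t filearray2 < "$list2"

  if (( ${#filearray1[@]} != ${#filearray2[@]} )); then
    echo "list lengths differ: ${#filearray1[@]} vs ${#filearray2[@]}" >&2
    return 1
  fi

  for ((i = 0; i < ${#filearray1[@]}; ++i)); do
    # Replace this printf with the real command for each pair.
    printf '%s -> %s\n' "${filearray1[i]}" "${filearray2[i]}"
  done
}
```

Since the names never pass through a delimiter-separated line, colons and spaces in file names are not a problem here; only newlines inside a name would still break the list files themselves.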

Maybe something like this (with a conspicuous lack of error checking):

exec 3< filelist1 4< filelist2

while IFS= read -r -u3 tarfile
do
  IFS= read -r -u4 destination
  mv "${tarfile}" "${destination}"/.
  ( cd "${destination}" || exit
    # ... other stuff
  ) # subshell is to avoid having to cd back where you came from
done

exec 3<&- 4<&-
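If you want some of the missing error checking, a variant along these lines might work. It is only a sketch under a few assumptions: process_pairs is a hypothetical function name, the first list holds tar file names and the second the matching destination directories, and the "other stuff" is reduced to a plain `tar -xf`. The descriptors are attached to the loop itself rather than opened with exec:

```bash
#!/usr/bin/env bash
# Sketch only: process_pairs is a hypothetical helper name.
# Assumes list 1 holds tar file names and list 2 the matching directories,
# one per line, in the same order.

process_pairs() {
  local tarlist=$1 dirlist=$2 tarfile destination status=0

  [[ -r $tarlist && -r $dirlist ]] || { echo "cannot read lists" >&2; return 1; }

  while IFS= read -r -u3 tarfile; do
    if ! IFS= read -r -u4 destination; then
      echo "no destination listed for $tarfile" >&2
      return 1
    fi
    if ! { mkdir -p -- "$destination" && mv -- "$tarfile" "$destination"/; }; then
      echo "could not move $tarfile into $destination" >&2
      status=1
      continue
    fi
    # Subshell so the cd does not affect the next iteration.
    if ! ( cd -- "$destination" && tar -xf "$(basename -- "$tarfile")" ); then
      echo "extraction failed for $tarfile" >&2
      status=1
    fi
  done 3< "$tarlist" 4< "$dirlist"

  return "$status"
}
```

Run it as `process_pairs filelist1 filelist2` from the directory that holds the tar files. A failed move or extraction is reported and the loop carries on with the next pair; the function returns non-zero at the end if anything went wrong.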

