Rename the most recent file in each group

I am trying to write a script that detects the latest file in each group and adds a prefix to its original name.

ll $DIR
asset_10.0.0.1_2017.11.19 #latest
asset_10.0.0.1_2017.10.28
asset_10.0.0.2_2017.10.02 #latest
asset_10.0.0.2_2017.08.15
asset_10.1.0.1_2017.11.10 #latest
...

Two questions:

1) How do I find the latest file of each group?

2) How do I rename a file by adding only a prefix?

I tried the following command, but it looks for the latest file in the entire directory, and it doesn't keep the original name so that a prefix can be added to it:

find $DIR -type f ! -name 'asset*' -print | sort -n | tail -n 1 | xargs -I '{}' cp -p '{}' $DIR...

What would be the best approach to achieve this (keeping xargs if possible)?

Selecting the latest entry in each group

You can use sort to select only the latest entry in each group:

find . -print0 | sort -r -z | sort -t_ -k2,2 -u -z | xargs ...

First, sort all files in reverse lexicographical order, so that the latest entry appears first within each group. Then sort on the group name only (the second field, -k2,2, when splitting on underscores via -t_) and print unique groups. That leaves only the first entry of each group, which is also the latest.

Note that this relies on the second sort leaving entries with equal keys in their input order. GNU sort does so when -u is given, because -u disables the last-resort comparison of whole lines; adding -s (stable) makes that requirement explicit. Also note that we can't use uniq here, because uniq offers no way to specify a custom field delimiter (it is always whitespace).
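To see the two sort passes in isolation, here is a minimal sketch that feeds the sample names from the question through them. The function name latest_per_group is made up for illustration, and LC_ALL=C and -s are additions that are not in the original pipeline: the first pins byte-wise ordering, the second states the stability requirement explicitly. It assumes a sort that supports -s (GNU and BSD do).

```shell
# Sketch: read names on stdin, print only the newest name of each group.
# latest_per_group is an illustrative name, not part of the original answer.
latest_per_group() {
  # Pass 1: reverse order, so the newest date of each group comes first.
  # Pass 2: stable sort on the group field; -u keeps the first line per key.
  LC_ALL=C sort -r | LC_ALL=C sort -s -t_ -k2,2 -u
}

printf '%s\n' \
  asset_10.0.0.1_2017.10.28 \
  asset_10.0.0.1_2017.11.19 \
  asset_10.0.0.2_2017.08.15 \
  asset_10.0.0.2_2017.10.02 \
  asset_10.1.0.1_2017.11.10 | latest_per_group
```

With this input, the output should be the three entries marked #latest in the question.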

Copying with prefix

To add a prefix to each filename found, we need to split each path that find produces into a directory and a filename (basename), because the prefix must be added to the filename only. The xargs part above could look like:

... | xargs -0 -I '{}' sh -c 'd="${1%/*}"; f="${1##*/}"; cp -p "$d/$f" "$d/prefix_$f"' _ '{}'

Path splitting is done with shell parameter expansion, namely prefix (${1##*/}) and suffix (${1%/*}) substring removal.
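The two expansions can be tried on their own. The path below is a made-up example (with a space in the directory name, to show that quoting keeps it intact), and the variable names are only for illustration:

```shell
# Hypothetical example path; any path that contains a slash behaves the same.
path='./some dir/asset_10.0.0.1_2017.11.19'

dir=${path%/*}     # remove the shortest suffix matching "/*" -> directory part
file=${path##*/}   # remove the longest prefix matching "*/"  -> filename part

printf 'dir=%s\nfile=%s\nnew=%s\n' "$dir" "$file" "$dir/prefix_$file"
```

Note that for a path without any slash, ${path%/*} leaves the string unchanged, so this split assumes paths as find prints them (for example ./name).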


Note the use of NUL-terminated output (paths) in find (-print0 instead of -print), and the accompanying use of -z in sort and -0 in xargs. That way the complete pipeline properly handles filenames (paths) containing "special" characters such as newlines.
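Putting the pieces together, one possible end-to-end version is sketched below. The function name copy_latest_with_prefix, the -maxdepth 1, -type f and -name 'asset_*' filters, LC_ALL=C, and -s are all assumptions added for this sketch, not part of the answer above. It also assumes tools that support -print0, -z and -0 (GNU or BSD), and it changes into the directory first so that underscores in the directory path cannot shift the field numbering.

```shell
# Sketch: copy the newest file of each group to "<prefix><name>" in the same
# directory. copy_latest_with_prefix and its arguments are illustrative names.
#   $1 = directory to scan, $2 = prefix to add
copy_latest_with_prefix() (
  # The function body runs in a subshell, so the cd does not leak out.
  cd "$1" || exit 1
  find . -maxdepth 1 -type f -name 'asset_*' -print0 |
    LC_ALL=C sort -r -z |
    LC_ALL=C sort -s -t_ -k2,2 -u -z |
    xargs -0 -I '{}' sh -c \
      'd=${1%/*}; f=${1##*/}; cp -p "$d/$f" "$d/$2$f"' _ '{}' "$2"
)

# Usage (the prefix "latest_" is a placeholder):
#   copy_latest_with_prefix "$DIR" latest_
```

Replacing cp -p with mv would rename instead of copy; cp is kept here because it is what the answer uses and it leaves the originals untouched.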

If you want to do this in bash alone, rather than using external tools like find and sort, you'll need to parse the "fields" in each filename.

Something like this might work:

declare -A o=()                         # declare an associative array (req bash 4)

for f in asset_*; do                    # step through the list of files,
  IFS=_ read -r -a a <<<"$f"            # assign filename elements to an array
  b="${a[0]}_${a[1]}"                   # define a "base" of the first two elements
  if [[ "${a[2]}" > "${o[$b]}" ]]; then # compare the date with the last value
    o[$b]="${a[2]}"                     # for this base and reassign if needed
  fi
done

for i in "${!o[@]}"; do                 # now that we're done, step through results
  printf "%s_%s\n" "$i" "${o[$i]}"      # and print them.
done

This doesn't exactly sort; it just goes through the list of files and keeps the highest-sorting date for each filename base.
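The loop above only prints the winners. As a rough sketch of how the same idea could go on to add the prefix, the variant below wraps it in a function and copies each group's newest file. The name prefix_latest, its arguments, and the use of cp -p are assumptions for illustration. Like the original, it needs bash 4 or later for associative arrays, and it assumes names of the form asset_<group>_<date>.

```shell
#!/usr/bin/env bash
# Sketch (bash 4+): copy the newest file of each group to "<prefix><name>".
# prefix_latest and its arguments are illustrative names.
#   $1 = directory to scan, $2 = prefix to add
prefix_latest() {
  local dir=$1 prefix=$2 f name base date
  local -A newest=()                  # group base -> highest date seen so far

  for f in "$dir"/asset_*; do
    [[ -f $f ]] || continue           # skip non-files and an unmatched glob
    name=${f##*/}                     # strip the directory part
    base=${name%_*}                   # e.g. asset_10.0.0.1
    date=${name##*_}                  # e.g. 2017.11.19
    if [[ "$date" > "${newest[$base]:-}" ]]; then
      newest[$base]=$date             # string comparison works for YYYY.MM.DD
    fi
  done

  for base in "${!newest[@]}"; do
    name=${base}_${newest[$base]}
    cp -p -- "$dir/$name" "$dir/$prefix$name"
  done
}

# Usage (the prefix "latest_" is a placeholder):
#   prefix_latest "$DIR" latest_
```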
