
Delete all files except the newest 3 in bash script

Question: How do you delete all files in a directory except the newest 3?

Finding the newest 3 files is simple:

ls -t | head -3

But I need to find all files except the newest 3. How do I do that, and how do I delete those files in the same line, without writing an unnecessary for loop?

I'm using Debian Wheezy and bash scripts for this.

This will list all files except the newest three:

ls -t | tail -n +4

This will delete those files:

ls -t | tail -n +4 | xargs rm --
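
A quick way to check that +4 is the right offset is to run only the listing half of the command in a scratch directory first. The sketch below assumes GNU touch and uses made-up names (demo, f1 to f5); it lists candidates and deletes nothing.

```shell
# Scratch directory with five files of staggered mtimes (names are illustrative).
demo=$(mktemp -d)
for i in 1 2 3 4 5; do
  touch -d "$i days ago" "$demo/f$i"   # f1 is the newest, f5 the oldest
done

# tail -n +4 starts printing at line 4, so the three newest entries are skipped.
old_files=$(ls -t "$demo" | tail -n +4)
printf '%s\n' "$old_files"
```

Only once the printed list looks right would you append the xargs rm -- stage.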

This will also list dotfiles:

ls -At | tail -n +4

and this will delete them, dotfiles included:

ls -At | tail -n +4 | xargs rm --

But beware: parsing ls can be dangerous when the filenames contain unusual characters such as newlines or spaces. If you are certain that your filenames contain no such characters, parsing ls is quite safe, even more so for a one-off script.
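
To make the danger concrete, here is a small sketch of what happens to a single file name containing a space when it travels through xargs. The directory and file name are invented for the example, and the -d option is specific to GNU xargs.

```shell
demo=$(mktemp -d)
touch "$demo/my report.txt"   # one file, with a space in its name

# Default xargs splits on any whitespace: one name becomes two arguments.
split_count=$(ls "$demo" | xargs printf '%s\n' | wc -l)

# GNU xargs -d '\n' splits on newlines only, so the name stays in one piece
# (a name that itself contains a newline would still be broken apart).
whole_count=$(ls "$demo" | xargs -d '\n' printf '%s\n' | wc -l)

echo "default xargs: $split_count arguments, newline-delimited: $whole_count"
```

With plain xargs rm, the same split would make rm look for two files that do not exist.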

If you are developing a script for repeated use, you should not parse the output of ls at all; use the methods described here instead: http://mywiki.wooledge.org/ParsingLs

Solution without the problems of "ls" (strangely named files)

This is a combination of ceving's and anubhava's answers, neither of which worked for me as written. Because I was looking for a script that runs every day to back up files into an archive, I wanted to avoid the problems with ls (someone could have saved an oddly named file in my backup folder). So I modified those solutions to fit my needs.

My solution deletes all files except the three newest.

# -d '\n' (GNU xargs) splits on newlines only, so names with spaces survive
find . -type f -printf '%T@\t%p\n' |
sort -t $'\t' -g |
head -n -3 |
cut -d $'\t' -f 2- |
xargs -d '\n' rm --

Some explanation:

find lists all regular files (not directories) under the current folder, subfolders included; add -maxdepth 1 to stay in the current folder only. Each file is printed with its modification timestamp.
sort sorts the lines by timestamp (oldest on top).
head prints every line except the last 3, i.e. everything but the three newest files.
cut removes the timestamps.
xargs passes the selected file names to rm.
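
The stages can be tried without touching any files. This sketch feeds the sort, head and cut stages five hand-written "timestamp, tab, path" records in place of real find output; the timestamps and the a.log to e.log names are invented for illustration.

```shell
# Five hypothetical records in the same shape that find -printf '%T@\t%p\n' emits.
records=$(printf '%s\t%s\n' \
  500.5 ./e.log  100.5 ./a.log  300.5 ./c.log  200.5 ./b.log  400.5 ./d.log)

tab=$(printf '\t')

# sort -g: oldest first; head -n -3: drop the three newest; cut: strip the timestamp.
to_delete=$(printf '%s\n' "$records" | sort -t "$tab" -g | head -n -3 | cut -d "$tab" -f 2-)
printf '%s\n' "$to_delete"
```

What is left after cut is the list that would be handed to rm: the two oldest paths.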

For you to verify my solution:

(
touch -d "6 days ago" test_6_days_old
touch -d "7 days ago" test_7_days_old
touch -d "8 days ago" test_8_days_old
touch -d "9 days ago" test_9_days_old
touch -d "10 days ago" test_10_days_old
)

This creates 5 files with different timestamps in the current folder. Run this script first and then the code for deleting old files.
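
For a check that cannot harm real data, both steps can be run inside a throwaway directory. This is only a sketch: the demo variable and the mktemp location are additions for the example, and xargs is told to split on newlines only (a GNU option) so that names with spaces would also be handled.

```shell
demo=$(mktemp -d)
(
  cd "$demo" || exit 1
  # Same five test files as above, created in the scratch directory.
  for d in 6 7 8 9 10; do
    touch -d "$d days ago" "test_${d}_days_old"
  done
  # Delete everything except the three newest files.
  find . -type f -printf '%T@\t%p\n' |
    sort -t "$(printf '\t')" -g |
    head -n -3 |
    cut -f 2- |
    xargs -d '\n' rm --
)
remaining=$(ls "$demo")
printf '%s\n' "$remaining"
```

The files left over should be the 6, 7 and 8 day old ones.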

The following looks a bit complicated, but it is careful to be correct even with unusual or intentionally malicious filenames. Unfortunately, it requires GNU tools:

count=0
while IFS= read -r -d ' ' && IFS= read -r -d '' filename; do
  (( ++count > 3 )) && printf '%s\0' "$filename"
done < <(find . -maxdepth 1 -type f -printf '%T@ %P\0' | sort -g -r -z) \
     | xargs -0 rm -f --

Explaining how this works:

  • find emits <mtime> <filename><NUL> for each file in the current directory.
  • sort -g -r -z does a general (floating-point, as opposed to integer) numeric sort on the first column (times), newest first, with the records separated by NULs. Without -r the oldest files would come first and the loop would keep the three oldest instead.
  • The first read in the while loop strips off the mtime (no longer needed after the sort is done).
  • The second read in the while loop reads the filename (running until the NUL).
  • The loop increments, and then checks, a counter; once the counter shows that we are past the three newest files we want to keep, we print the filename, delimited by a NUL.
  • xargs -0 then appends that filename to the argv list it is collecting to invoke rm with.
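
The reason for the NUL delimiters shows up as soon as a file name contains a newline. The following sketch counts how many records a single such file produces in each format; the scratch directory and the file name are invented for the example, and -printf is a GNU find feature.

```shell
demo=$(mktemp -d)
# A legal but awkward name: it contains a newline.
touch "$demo/two
lines.txt"

# Newline-terminated records make the single file look like two entries...
nl_records=$(find "$demo" -maxdepth 1 -type f -printf '%T@ %P\n' | wc -l)

# ...while NUL-terminated records keep it as one (count the NUL bytes).
nul_records=$(find "$demo" -maxdepth 1 -type f -printf '%T@ %P\0' |
  sort -g -z | tr -cd '\0' | wc -c)

echo "newline-delimited: $nl_records, NUL-delimited: $nul_records"
```
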

If you want a one-liner:

ls -t | tail -n +4 | xargs -I {} rm {}

In zsh:

rm /files/to/delete/*(Om[1,-4])

If you want to include dotfiles, replace the parenthesized part with (Om[1,-4]D).

I think this works correctly with arbitrary characters in the filenames (I just checked with a newline).

Explanation: The parentheses contain glob qualifiers. O means "order by, descending" and m means mtime, so Om puts the oldest files first (see man zshexpn for other sorting keys; it is a large manpage, so search for "be sorted"). [1,-4] returns only the matches at one-based index 1 to (last + 1 - 4), i.e. everything except the final three entries (note the -4 for deleting all but 3).

ls -t | tail -n +4 | xargs -I {} rm {}

Michael Ballent's answer works best, as

ls -t | tail -n +4 | xargs rm --

throws an error for me if I have fewer than 3 files.
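
The error appears because GNU xargs runs its command once even when it receives no input, so rm is started without operands. The sketch below compares three invocations on empty input, using echo as a harmless stand-in for rm. It assumes GNU xargs; other implementations may differ, and -r is a GNU extension.

```shell
# Empty input: GNU xargs still runs the command once, with no arguments.
runs_plain=$(xargs echo ran </dev/null | wc -l)

# -r (--no-run-if-empty) skips the command when there is nothing to pass.
runs_r=$(xargs -r echo ran </dev/null | wc -l)

# -I {} runs the command once per input line, so no input means no run.
runs_I=$(xargs -I {} echo ran {} </dev/null | wc -l)

echo "plain: $runs_plain, -r: $runs_r, -I: $runs_I"
```

So ls -t | tail -n +4 | xargs -r rm -- is another way to stay quiet when there is nothing to delete.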

Don't use ls -t, as it is unsafe for filenames that may contain whitespace or special glob characters.

You can delete all but the 3 newest files in the current directory using only GNU utilities:

find . -maxdepth 1 -type f -printf '%T@\t%p\0' |
sort -z -nrk1 |
tail -z -n +4 |
cut -z -f2- |
xargs -0 rm -f --
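
Before letting a pipeline like this delete anything, it can be run as a preview by swapping the final rm stage for tr, which turns the NUL separators into newlines for display. A sketch in a scratch directory, with invented names f1 to f5 and assuming a GNU coreutils recent enough for tail -z and cut -z:

```shell
demo=$(mktemp -d)
for i in 1 2 3 4 5; do
  touch -d "$i days ago" "$demo/f$i"   # f1 is the newest, f5 the oldest
done

# Same selection stages, but print the candidates instead of removing them.
preview=$(cd "$demo" && find . -maxdepth 1 -type f -printf '%T@\t%p\0' |
  sort -z -nrk1 |
  tail -z -n +4 |
  cut -z -f2- |
  tr '\0' '\n')
printf '%s\n' "$preview"
```

The preview is for human eyes only: once the NULs are replaced, names containing newlines are ambiguous again, so the real run should keep xargs -0.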

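Recursive script with an arbitrary number of files to keep per directory

Also handles files/dirs with spaces and other odd characters. (Directory names are read NUL-delimited; the per-directory file listing is still line-based, so a file name containing a newline can throw off the selection.)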

#!/bin/bash
if (( $# != 2 )); then
  echo "Usage: $0 </path/to/top-level/dir> <num files to keep per dir>" >&2
  exit 1
fi

while IFS= read -r -d $'\0' dir; do
  # Find the newest file that must go: entry num+1 when sorted newest first
  nthOldest=$(find "$dir" -maxdepth 1 -type f -printf '%T@\0%p\n' | sort -t '\0' -rg \
    | awk -F '\0' -v num="$2" 'NR==num+1{print $2}')

  if [[ -f "$nthOldest" ]]; then
    # Remove that file and everything that is not newer than it
    find "$dir" -maxdepth 1 -type f ! -newer "$nthOldest" -exec rm {} +
  fi
done < <(find "$1" -type d -print0)

Proof of concept

$ tree test/
test/
├── sub1
│   ├── sub1_0_days_old.txt
│   ├── sub1_1_days_old.txt
│   ├── sub1_2_days_old.txt
│   ├── sub1_3_days_old.txt
│   └── sub1\ 4\ days\ old\ with\ spaces.txt
├── sub2\ with\ spaces
│   ├── sub2_0_days_old.txt
│   ├── sub2_1_days_old.txt
│   ├── sub2_2_days_old.txt
│   └── sub2\ 3\ days\ old\ with\ spaces.txt
└── tld_0_days_old.txt

2 directories, 10 files
$ ./keepNewest.sh test/ 2
$ tree test/
test/
├── sub1
│   ├── sub1_0_days_old.txt
│   └── sub1_1_days_old.txt
├── sub2\ with\ spaces
│   ├── sub2_0_days_old.txt
│   └── sub2_1_days_old.txt
└── tld_0_days_old.txt

2 directories, 5 files

This uses find instead of ls, with a Schwartzian transform.

find . -type f -printf '%T@\t%p\n' |
sort -t $'\t' -g |
tail -3 |
cut -d $'\t' -f 2-

find searches the files, decorates each with a time stamp, and uses a tab to separate the two values. sort splits the input at the tab and performs a general numeric sort, which sorts floating-point numbers correctly. tail keeps the last three lines (the newest files), and cut undecorates.

The problem with decorations in general is finding a suitable delimiter that cannot occur in the input, here the file names. The code above uses a tab; the NUL-delimited answers in this thread avoid the problem entirely, because a NUL character cannot appear in a file name.

As an extension to the answer by flohall: if you want to remove all folders except the newest three, use the following:

# -d '\n' (GNU xargs) splits on newlines only, so names with spaces survive
find . -maxdepth 1 -mindepth 1 -type d -printf '%T@\t%p\n' |
 sort -t $'\t' -g |
 head -n -3 |
 cut -d $'\t' -f 2- |
 xargs -d '\n' rm -rf --

-mindepth 1 excludes the starting folder (.) itself, and -maxdepth 1 keeps find from descending into subfolders.
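
The effect of the two depth options is easy to see on a tiny directory tree. In this sketch the folder names a, b and nested are made up for the example:

```shell
demo=$(mktemp -d)
mkdir -p "$demo/a/nested" "$demo/b"

# No depth limits: ".", both folders and the nested one are all listed.
all=$(cd "$demo" && find . -type d | wc -l)

# With both limits only the direct subfolders remain.
direct=$(cd "$demo" && find . -mindepth 1 -maxdepth 1 -type d | sort | xargs)

echo "unrestricted: $all entries, restricted: $direct"
```

Without -mindepth 1 the starting folder . would itself be a candidate for rm -rf, which is why it matters in the command above.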

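The following worked for me (cheers 🍾). Note that ll is typically an interactive shell alias for ls -l, so it is normally unavailable in scripts; with ls -l the offset is +5 rather than +4 because of the leading "total" line: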

rm -rf $(ll -t | tail -n +5 | awk '{ print $9}')
