Delete all files except the newest 3 in bash script

Question: How do you delete all files in a directory except the newest 3?

Finding the newest 3 files is simple:

ls -t | head -3

But I need to find all files except the newest 3 files. How do I do that, and how do I delete these files in the same line without making an unnecessary for loop for that?

I'm using Debian Wheezy and bash scripts for this.

This will list all files except the newest three:

ls -t | tail -n +4

This will delete those files:

ls -t | tail -n +4 | xargs rm --

This will also list dotfiles:

ls -At | tail -n +4

and delete with dotfiles:

ls -At | tail -n +4 | xargs rm --

But beware: parsing ls can be dangerous when filenames contain unusual characters such as newlines or spaces. If you are certain that your filenames do not contain such characters, then parsing ls is quite safe, even more so for a one-off script.

If you are developing a script for repeated use, you should not parse the output of ls. Use the methods described here instead: http://mywiki.wooledge.org/ParsingLs
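To make the warning concrete, here is a small sketch that counts the same two files once via ls and once via NUL-terminated find output. The directory and file names are made up for the demo, and it assumes mktemp plus GNU find and tr are available:

```bash
# Throwaway directory so nothing real is touched (hypothetical demo names).
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt" "$demo_dir/with
newline.txt"

# Counting lines of ls output: the second name is spread over two lines,
# so a line-based tool no longer sees one entry per file.
ls_count=$(ls "$demo_dir" | wc -l)

# Counting NUL terminators from find: exactly one per file,
# whatever characters the name contains.
nul_count=$(find "$demo_dir" -maxdepth 1 -type f -print0 | tr -cd '\0' | wc -c)

echo "ls sees $ls_count lines, find sees $nul_count files"
```

Any pipeline that feeds such ls output to tail and rm would treat the two halves of the name as separate files.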

Solution without problems with "ls" (strange named files)

This is a combination of ceving's and anubhava's answers. Neither solution worked for me. Because I was looking for a script that runs every day to back up files into an archive, I wanted to avoid problems with ls (someone could have saved an oddly named file in my backup folder). So I modified the mentioned solutions to fit my needs.

My solution deletes all files, except the three newest files.

find . -type f -printf '%T@\t%p\n' |
sort -t $'\t' -g |
head -n -3 |
cut -d $'\t' -f 2- |
xargs -d '\n' rm --

Some explanation:

find lists all files (not directories) in the current folder and, since no -maxdepth is given, in its subfolders. Each one is printed with its modification timestamp.
sort sorts the lines by timestamp (oldest on top).
head -n -3 prints every line except the last 3, i.e. everything but the three newest files.
cut removes the timestamps.
xargs passes the selected files to rm.

For you to verify my solution:

(
touch -d "6 days ago" test_6_days_old
touch -d "7 days ago" test_7_days_old
touch -d "8 days ago" test_8_days_old
touch -d "9 days ago" test_9_days_old
touch -d "10 days ago" test_10_days_old
)

This creates 5 files with different timestamps in the current folder. Run this script first and then the code for deleting old files.
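The whole check can also be run in one go. This is only a sketch under a few assumptions of mine: it works in a scratch directory created with mktemp (the work_dir name is hypothetical), limits find to that directory with -maxdepth 1, and passes -d '\n' to GNU xargs so that names with spaces survive:

```bash
# Scratch directory so the test cannot delete anything real.
work_dir=$(mktemp -d)
tab=$(printf '\t')

# Five files, 6 to 10 days old (touch -d is a GNU extension).
for age in 6 7 8 9 10; do
  touch -d "$age days ago" "$work_dir/test_${age}_days_old"
done

# Decorate with mtime, sort oldest first, drop the last three lines
# (the newest files), strip the timestamps, delete the rest.
find "$work_dir" -maxdepth 1 -type f -printf '%T@\t%p\n' |
  sort -t "$tab" -g |
  head -n -3 |
  cut -f 2- |
  xargs -d '\n' rm --

# Only the 6, 7 and 8 day old files should be left.
ls "$work_dir"
```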

The following looks a bit complicated, but is very cautious to be correct, even with unusual or intentionally malicious filenames. Unfortunately, it requires GNU tools:

count=0
while IFS= read -r -d ' ' && IFS= read -r -d '' filename; do
  (( ++count > 3 )) && printf '%s\0' "$filename"
done < <(find . -maxdepth 1 -type f -printf '%T@ %P\0' | sort -g -r -z) \
     | xargs -0 rm -f --

Explaining how this works:

  • Find emits <mtime> <filename><NUL> for each file in the current directory.
  • sort -g -r -z does a general (floating-point, as opposed to integer) numeric sort based on the first column (times), newest first because of -r, with the records separated by NULs. Without -r the oldest files would come first and the loop would keep the three oldest instead.
  • The first read in the while loop strips off the mtime (no longer needed after sort is done).
  • The second read in the while loop reads the filename (running until the NUL).
  • The loop increments, and then checks, a counter; if the counter's state indicates that we're past the first three (newest) entries, then we print the filename, delimited by a NUL.
  • xargs -0 then appends that filename into the argv list it's collecting to invoke rm with.

If you want a one-liner:

ls -t | tail -n +4 | xargs -I {} rm {}

In zsh:

rm /files/to/delete/*(Om[1,-4])

If you want to include dotfiles, replace the parenthesized part with (Om[1,-4]D).

I think this works correctly with arbitrary chars in the filenames (just checked with newline).

Explanation: The parentheses contain Glob Qualifiers. O means "order by, descending", m means mtime (See man zshexpn for other sorting keys - large manpage; search for "be sorted"). [1,-4] returns only the matches at one-based index 1 to (last + 1 - 4) (note the -4 for deleting all but 3).

ls -t | tail -n +4 | xargs -I {} rm {}

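Michael Ballent's answer works best, because

ls -t | tail -n +4 | xargs rm --

throws an error for me when there are 3 files or fewer: tail then outputs nothing, and GNU xargs still runs rm once, without any file operands.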
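If you prefer the shorter xargs rm -- form, GNU xargs has -r (--no-run-if-empty), which skips the command when there is no input. A minimal sketch, assuming GNU xargs and a hypothetical scratch directory that holds fewer files than the number to keep (it still parses ls, so the usual caveats about odd filenames apply):

```bash
# Scratch directory with only two files, fewer than the three to keep.
scratch=$(mktemp -d)
touch "$scratch/a" "$scratch/b"

# tail -n +4 prints nothing here. With -r, xargs does not start rm at all,
# so there is no "missing operand" error and the pipeline exits with 0.
(cd "$scratch" && ls -t | tail -n +4 | xargs -r rm --)
status=$?

echo "exit status: $status"
```

Note that -r is a GNU extension; BSD xargs already behaves this way by default.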

Don't use ls -t, as it is unsafe for filenames that may contain whitespace or special glob characters.

You can do this using only GNU utilities to delete all but the 3 newest files in the current directory:

find . -maxdepth 1 -type f -printf '%T@\t%p\0' |
sort -z -nrk1 |
tail -z -n +4 |
cut -z -f2- |
xargs -0 rm -f --
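The same pipeline can be wrapped in a function so that the directory and the number of files to keep become parameters. This is just a sketch: keep_newest is a name I made up, it needs the same GNU tools as above (find -printf, sort -z, tail -z, cut -z), and xargs -r is my addition so that rm is not started when there is nothing to delete:

```bash
# keep_newest DIR N
# Delete every regular file directly inside DIR except the N newest ones.
keep_newest() {
  local dir=$1 keep=$2
  find "$dir" -maxdepth 1 -type f -printf '%T@\t%p\0' |
    sort -z -nrk1 |                   # newest first, NUL-separated records
    tail -z -n "+$((keep + 1))" |     # skip the first N records
    cut -z -f2- |                     # drop the timestamp column
    xargs -0 -r rm -f --
}
```

For example, keep_newest /path/to/backups 3 would leave the three most recently modified files in that (hypothetical) directory. Because every record ends in a NUL, names containing spaces, tabs or newlines pass through unchanged.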

Recursive script with arbitrary num of files to keep per-directory

Also handles files/dirs with spaces, newlines and other odd characters

#!/bin/bash
if (( $# != 2 )); then
  echo "Usage: $0 </path/to/top-level/dir> <num files to keep per dir>" >&2
  exit 1
fi

while IFS= read -r -d $'\0' dir; do
  # Find the (num+1)-th newest file; it and every file not newer than it get deleted
  nthOldest=$(find "$dir" -maxdepth 1 -type f -printf '%T@\0%p\n' | sort -t '\0' -rg \
    | awk -F '\0' -v num="$2" 'NR==num+1{print $2}')

  if [[ -f "$nthOldest" ]]; then
    find "$dir" -maxdepth 1 -type f ! -newer "$nthOldest" -exec rm {} +
  fi
done < <(find "$1" -type d -print0)

Proof of concept

$ tree test/
test/
├── sub1
│   ├── sub1_0_days_old.txt
│   ├── sub1_1_days_old.txt
│   ├── sub1_2_days_old.txt
│   ├── sub1_3_days_old.txt
│   └── sub1\ 4\ days\ old\ with\ spaces.txt
├── sub2\ with\ spaces
│   ├── sub2_0_days_old.txt
│   ├── sub2_1_days_old.txt
│   ├── sub2_2_days_old.txt
│   └── sub2\ 3\ days\ old\ with\ spaces.txt
└── tld_0_days_old.txt

2 directories, 10 files
$ ./keepNewest.sh test/ 2
$ tree test/
test/
├── sub1
│   ├── sub1_0_days_old.txt
│   └── sub1_1_days_old.txt
├── sub2\ with\ spaces
│   ├── sub2_0_days_old.txt
│   └── sub2_1_days_old.txt
└── tld_0_days_old.txt

2 directories, 5 files

This uses find instead of ls with a Schwartzian transform.

find . -type f -printf '%T@\t%p\n' |
sort -t $'\t' -g |
tail -3 |
cut -d $'\t' -f 2-

find searches the files and decorates them with a time stamp, using a tab character to separate the two values. sort splits the input at the tab and performs a general numeric sort, which sorts floating point numbers correctly. tail keeps the last three lines (the newest files) and cut undecorates.

The problem with decorations in general is to find a suitable delimiter that cannot be part of the input, here the file names. Another answer uses the NUL character for this.
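A sketch of the same decorate, sort, undecorate steps with NUL-terminated records, so that newlines in file names cannot split a record. The function name newest_three is made up, and the -z options of sort, tail and cut require reasonably recent GNU coreutils:

```bash
# newest_three [DIR]
# Print the three newest regular files in DIR (default: .), NUL-terminated.
newest_three() {
  local tab
  tab=$(printf '\t')
  find "${1:-.}" -maxdepth 1 -type f -printf '%T@\t%p\0' |  # decorate
    sort -z -t "$tab" -k 1,1g |                             # oldest first
    tail -z -n 3 |                                          # keep the newest 3
    cut -z -f 2-                                            # undecorate
}

# For display only, turn the NULs into newlines:
newest_three . | tr '\0' '\n'
```

The tab still separates the time stamp from the name inside a record, but cut -f 2- keeps everything after the first tab, so a tab inside a file name is harmless.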

As an extension to the answer by flohall: if you want to remove all folders except the newest three, use the following:

find . -maxdepth 1 -mindepth 1 -type d -printf '%T@\t%p\n' |
 sort -t $'\t' -g |
 head -n -3 |
 cut -d $'\t' -f 2- |
 xargs -d '\n' rm -rf --

-mindepth 1 excludes the current folder itself (.), and -maxdepth 1 keeps find from descending into subfolders.

The following worked for me: (cheers 🍾)

rm -rf $(ll -t | tail -n +5 | awk '{ print $9}')
