
Linux: delete files that don't contain a specific number of lines

How can I delete the files in a directory that have more or fewer than a specified number of lines (all files have the ".txt" suffix)?

This bash script should do the trick. Save it as "rmlc.sh".

Sample usage:

rmlc.sh -more 20 *.txt   # Remove all .txt files with more than 20 lines
rmlc.sh -less 15 *       # Remove ALL files with fewer than 15 lines

Note that if the rmlc.sh script is in the current directory, it is protected against deletion.


#!/bin/bash

# rmlc.sh - Remove by line count

SCRIPTNAME="rmlc.sh"
IFS=""

# Parse arguments 
if [ $# -lt 3 ]; then
    echo "Usage:"
    echo "$SCRIPTNAME [-more|-less] [numlines] file1 file2..."
    exit 1
fi

if [ "$1" == "-more" ]; then
    COMPARE="-gt"
elif [ "$1" == "-less" ]; then
    COMPARE="-lt"
else
    echo "First argument must be -more or -less"
    exit 1
fi

LINECOUNT=$2

# Discard non-filename arguments
shift 2

for filename in "$@"; do
    # Make sure we're dealing with a regular file first
    if [ ! -f "$filename" ]; then
        echo "Ignoring $filename"
        continue
    fi

    # We probably don't want to delete ourselves if script is in current dir
    if [ "$filename" == "$SCRIPTNAME" ]; then
        continue
    fi

    # Feed wc via stdin so that the output doesn't include the filename
    lines=$(wc -l < "$filename")

    # Check criteria and delete
    if [ "$lines" "$COMPARE" "$LINECOUNT" ]; then
        echo "Deleting $filename"
        rm -- "$filename"
    fi
done

Played a bit with the answer from 0x6adb015. This works for me:

NUMLINES=10   # not LINES: bash can overwrite that variable with the terminal height
for f in *.txt; do
  a=$(wc -l < "$f")
  if [ "$a" -ne "$NUMLINES" ]
  then
    rm -f "$f"
  fi
done

This one-liner should also do it:

find . -name '*.txt' | xargs wc -l | awk '{if($1 > 1000 && index($2, "txt")>0 ) print $2}' | xargs rm

In the example above, files with more than 1000 lines are deleted.

Choose > or < and the number of lines accordingly.

Try this bash script:

NUMLINES=10   # not LINES: bash can overwrite that variable with the terminal height
for f in *.txt; do
  if [ "$(cat "$f" | wc -l)" -ne "$NUMLINES" ]; then
     rm -f "$f"
  fi
done

(Not tested)

EDIT: Use a pipe to feed wc, since wc otherwise prints the filename as well.
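
The difference is easy to check on a throwaway file: given a filename argument, `wc -l` prints the count followed by the name, while reading from standard input it prints the count alone. A small sketch (the helper name `count_lines` is made up for illustration and is not part of the answers here):

```shell
# count_lines FILE (hypothetical helper): print only the number of lines in FILE.
count_lines() {
    # Redirecting the file to stdin keeps the filename out of wc's output;
    # the arithmetic expansion strips the padding some wc versions add.
    echo $(( $(wc -l < "$1") ))
}

tmp=$(mktemp)
printf 'a\nb\nc\n' > "$tmp"

wc -l "$tmp"        # count followed by the file name
count_lines "$tmp"  # count alone
rm -f "$tmp"
```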

My command-line mashing is pretty rusty, but I think something like this will work safely even if your filenames have spaces in them (change the "10" in the grep to whatever number of lines you want). Adjust as needed. You'd need to tweak it if newlines in filenames are possible.

find . -name \*.txt -type f -exec wc -l {} \; | grep -v "^10 .*$" | cut --complement -f 1 -d " " | tr '\012' '\000' | xargs -0 rm -f
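
If filenames may contain newlines as well, one way around parsing the `wc` output is to let `find` hand the names straight to a small inline shell script. This is only a sketch: the function name `rm_txt_unless_lines` and its two arguments are invented here, and it assumes a POSIX `find` and `sh`.

```shell
# rm_txt_unless_lines DIR N (hypothetical helper):
# delete every regular *.txt file under DIR that does not have exactly N lines.
rm_txt_unless_lines() {
    find "$1" -name '*.txt' -type f -exec sh -c '
        want=$1
        shift
        for f in "$@"; do
            # Reading via stdin keeps the filename out of the wc output
            if [ "$(wc -l < "$f")" -ne "$want" ]; then
                rm -f -- "$f"
            fi
        done
    ' sh "$2" {} +
}
```

For example, `rm_txt_unless_lines . 10` would keep only the .txt files with exactly 10 lines. Because the names are passed as arguments and never printed and re-read, spaces and newlines in them should not matter.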

Here is a one-liner option. RLINES is the number of lines to use for removal.

rm $(find "$DIR" -type f -exec wc -l {} \; | grep "^$RLINES " | awk '{print $2}')

A bit late since the question was asked. I just had the same question, and this is what I came up with, along the lines of Chad Campbell's answer:

find "$DIR" -name '*.txt' -exec wc -l {} \; | grep -v "^$LINES " | awk '{print $2}' | xargs rm
  • The first part looks for all the files in DIR ending in .txt and prints the number of lines in each.
  • The second part selects all the files that do not have the required number of lines (LINES).
  • The third part prints just the file names.
  • And the fourth part deletes those files.
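
Before wiring in the final `xargs rm`, the first three parts can be run on their own as a dry run. A minimal sketch, assuming filenames without whitespace (as the `awk '{print $2}'` step already does); the function name `list_mismatched` and its two arguments are made up for this example:

```shell
# list_mismatched DIR N (hypothetical helper): print the *.txt files under
# DIR whose line count is not exactly N, without deleting anything.
list_mismatched() {
    find "$1" -name '*.txt' -type f -exec wc -l {} \; |  # part 1: "<count> <name>" per file
        grep -v "^ *$2 " |                                # part 2: drop files with exactly N lines
        awk '{print $2}'                                  # part 3: keep only the file name
}
```

`list_mismatched . 10` prints the candidates; once the list looks right, it can be piped into `xargs rm` as in the command above.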
