
Hadoop FileSystem - How to delete all files that are of zero size in a given HDFS directory?

I have an HDFS directory A at the path /user/A.

How do I delete all files within A that are of zero size?

To delete only files (directories are skipped):

hdfs dfs -rm $(hdfs dfs -ls -R /user/A/ | grep -v "^d" | awk '{if ($5 == 0) print $8}')

Check what the listing returns before deleting anything. Many formats written to HDFS include metadata or marker files that are legitimately 0 bytes (for example _SUCCESS, _temporary, etc. alongside Parquet output), and you may not want to remove those. Run the listing on its own first:

hdfs dfs -ls -R /user/A/ | grep -v "^d" | awk '{if ($5 == 0) print $8}'
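The preview-then-delete workflow above can be wrapped in a small script. This is only a sketch, not a tested production tool: the function names `zero_byte_files` and `delete_zero_byte_files` and the `--delete` switch are made up for illustration, and it assumes the usual 8-column `hdfs dfs -ls -R` output (permissions, replication, owner, group, size, date, time, path) and paths without whitespace.

```shell
# Read `hdfs dfs -ls -R` output on stdin and print the paths of zero-byte files.
# NF >= 8   : skip short lines (e.g. a "Found N items" header), whose missing
#             fifth field would otherwise compare equal to 0 in awk.
# $1 !~ /^d/: skip directories, which are also listed with size 0.
zero_byte_files() {
  awk 'NF >= 8 && $1 !~ /^d/ && $5 == 0 { print $8 }'
}

# Hypothetical helper: always prints the candidates, and removes them only
# when the second argument is "--delete".
delete_zero_byte_files() {
  dir=$1
  mode=${2:-}
  files=$(hdfs dfs -ls -R "$dir" | zero_byte_files)
  if [ -z "$files" ]; then
    # Nothing matched, so avoid calling `hdfs dfs -rm` with no arguments.
    echo "no zero-byte files under $dir"
    return 0
  fi
  printf '%s\n' "$files"
  if [ "$mode" = "--delete" ]; then
    printf '%s\n' "$files" | xargs hdfs dfs -rm
  fi
}
```

Calling `delete_zero_byte_files /user/A` only prints the candidates; `delete_zero_byte_files /user/A --delete` prints them and then removes them. Keeping the awk filter in its own function makes it possible to try it against a saved listing without touching the cluster.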

This might help. It restricts the deletion to zero-byte files whose path contains "part-":

hdfs dfs -ls -R /path/to/directory/ | grep part- | awk '{ if ($5 == 0) print $8 }' | xargs hdfs dfs -rm
