How to remove all files from an HDFS location except one?
I want to remove all files from an HDFS location except one, but I am unable to find any solution for it.
I have tried

shopt -s extglob

then

hadoop fs -rm location/!(filename)

but it did not work.
The best option would be to copy the specific file to some other directory, delete all the remaining files in the target directory, and then move the specific file back to the same directory.
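A minimal sketch of that copy-delete-move sequence, wrapped in a function (the directory, filename, and scratch path in the example call are hypothetical placeholders):

```shell
#!/usr/bin/env bash
# Sketch of the copy-delete-move approach; all paths here are placeholders.
# keep_one DIR FILENAME TMPDIR: keep DIR/FILENAME, delete everything else in DIR.
keep_one() {
  local dir="$1" keep="$2" tmp="$3"
  hadoop fs -mkdir -p "$tmp"            # scratch directory on HDFS
  hadoop fs -cp "$dir/$keep" "$tmp/"    # stash the file to keep
  hadoop fs -rm "$dir/*"                # HDFS expands the glob server-side
  hadoop fs -mv "$tmp/$keep" "$dir/"    # move the kept file back
}

# Example call (hypothetical paths):
# keep_one /user/xxxx/dev/hadoop/external/csvfiles keep_me.csv /tmp/keep_one_scratch
```

Note that the glob in `hadoop fs -rm "$dir/*"` is quoted, so the local shell leaves it alone and HDFS itself expands it against the remote directory, which is why this works where a local extglob pattern does not.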
Otherwise, there are a couple of other ways to do the same thing.
Below is a sample shell script that deletes all files except those matching a given pattern.
#!/bin/bash
echo "Executing the shell script"
# List the directory, drop the file(s) matching the pattern, delete the rest.
for file in $(hadoop fs -ls /user/xxxx/dev/hadoop/external/csvfiles | grep -v 'a_file_pattern_to_search' | awk '{print $8}')
do
    hadoop fs -rm "$file"
done
echo "shell script ends"
This lists all the files and then uses grep with the -v option to get all files other than your specific pattern or filename.
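The filtering stage can be checked locally without a cluster. The pipeline below feeds sample lines imitating `hadoop fs -ls` output (whose eighth field is the HDFS path, hence `awk '{print $8}'`) through the same grep/awk stages; the paths are made up:

```shell
# Local dry run of the filtering stage; the sample lines imitate
# `hadoop fs -ls` output, where field 8 is the HDFS path.
printf '%s\n' \
  '-rw-r--r--   3 user group 1024 2023-01-01 10:00 /csvfiles/a_file_pattern_to_search.csv' \
  '-rw-r--r--   3 user group 2048 2023-01-01 10:01 /csvfiles/other1.csv' \
  '-rw-r--r--   3 user group  512 2023-01-01 10:02 /csvfiles/other2.csv' \
  | grep -v 'a_file_pattern_to_search' | awk '{print $8}'
# prints:
# /csvfiles/other1.csv
# /csvfiles/other2.csv
```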
Using the following code, I was able to remove all files from the HDFS location at once, except the file that is needed:
file_arr=()
for file in $(hadoop fs -ls /tmp/table_name/ | grep -v 'part-' | awk '{print $8}')
do
file_arr+=("$file")
done
hadoop fs -rm "${file_arr[@]}"
I came up with a solution following Vikrant Rana's. It does not require the rm command to execute multiple times, and also doesn't need to store the files in any array, reducing lines of code and effort:
hadoop fs -ls /user/xxxx/dev/hadoop/external/csvfiles | grep -v 'a_file_pattern_to_search' | awk '{print $8}' | xargs hadoop fs -rm
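The plumbing of this one-liner can also be dry-run locally: sample lines stand in for `hadoop fs -ls`, and putting `echo` in front of `hadoop fs -rm` prints the command instead of running it (paths are made up). It also shows that xargs batches all surviving paths into a single rm invocation:

```shell
# Dry run of the xargs one-liner. -r (GNU xargs) skips the command
# entirely when nothing survives the filter.
printf '%s\n' \
  '-rw-r--r--   3 user group 10 2023-01-01 10:00 /csvfiles/a_file_pattern_to_search.csv' \
  '-rw-r--r--   3 user group 20 2023-01-01 10:01 /csvfiles/old1.csv' \
  '-rw-r--r--   3 user group 30 2023-01-01 10:02 /csvfiles/old2.csv' \
  | grep -v 'a_file_pattern_to_search' | awk '{print $8}' \
  | xargs -r echo hadoop fs -rm
# prints: hadoop fs -rm /csvfiles/old1.csv /csvfiles/old2.csv
```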