Bash script, print filenames that contain a string
I have a folder with a couple of files that I need to organize/manipulate depending on whether both of them exist or only one of them exists.
In my folder, matched by folder1/checkthese/*.bam, the files are:
file1_aln.bam
file1_aln_sorted.bam
I have a script that checks whether I have the unsorted file (which is just *_aln.bam) and the sorted file (*_aln_sorted.bam), but I am having trouble getting my script to run correctly depending on whether they both exist or not.
Here is my mini script:
for files in folder1/checkthese/*.bam
do
if [[ ${files} =~ "_aln.bam" ]] && [[ ${files} =~ "_aln_sorted.bam" ]]
then
echo "both files exist, need to delete unsorted file only"
echo "REMOVE $(basename ${files/_aln*}_aln.bam)"
rm -f ${files/_aln*}_aln.bam
elif [[ ${files} =~ "_aln_sorted.bam" ]] && [[ ! ${files} =~ "_aln.bam" ]]
then
echo "Only sorted file exists, all good"
fi
done
But this is the output I get:
Only sorted file exists, all good.
But clearly the unsorted file exists, so for some reason it is skipping the first part of my loop and not removing the _aln.bam file. I am not sure how to change the condition in my elif statement so that if ONLY the _aln_sorted.bam file exists, then all is good and I don't need to delete anything. I think I should not be using the && for my elif statement, but I thought the ! is essentially the boolean NOT for this.
Dude, your comparison can't do what you want.
Your first comparison checks for a file whose name contains both the _aln.bam and the _aln_sorted.bam strings. The second checks for a file whose name contains _aln_sorted.bam and doesn't contain _aln.bam!
So both comparisons are applied to the same single file name on every iteration of the loop!
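To make the point above concrete, here is a small sketch that runs the two substring tests against each of the two example names from the question. It touches no files; the variable names are made up for the illustration.

```shell
#!/usr/bin/env bash
# Each name is tested on its own, so "contains both substrings" is never true
# for either of the example file names.
for name in file1_aln.bam file1_aln_sorted.bam; do
    # A quoted right-hand side of =~ is matched as a literal substring.
    [[ $name =~ "_aln.bam" ]]        && has_unsorted=yes || has_unsorted=no
    [[ $name =~ "_aln_sorted.bam" ]] && has_sorted=yes   || has_sorted=no
    echo "$name: contains _aln.bam=$has_unsorted, contains _aln_sorted.bam=$has_sorted"
done
```

Neither name satisfies both tests at once, which is why the first branch of the original script never runs, while the sorted file always lands in the elif branch.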
You need this:
#!/bin/bash
for file in /full_path/folder1/checkthese/*.bam
do
if [[ ${file} =~ "_aln.bam" ]]
then
echo "Unsorted file was found! It will be removed!"
echo "Removing the file named ${file}"
rm -f -- "${file}"
echo "File removed!"
elif [[ ${file} =~ "_aln_sorted.bam" ]]
then
echo "${file} is a sorted file!"
fi
done
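Note that the loop above removes every unsorted file it finds. If the unsorted file should only go when its sorted counterpart exists, as the question describes, one possible variation is to loop over the sorted files and derive the unsorted name from each. This is only a sketch: the scratch directory and the file1/file2/file3 names are invented for the demo.

```shell
#!/usr/bin/env bash
# Sketch: delete an unsorted file only when its sorted sibling exists.
set -euo pipefail
shopt -s nullglob             # an empty directory yields zero loop iterations

dir=$(mktemp -d)              # hypothetical scratch directory
touch "$dir/file1_aln.bam" "$dir/file1_aln_sorted.bam"   # both exist
touch "$dir/file2_aln_sorted.bam"                        # sorted only
touch "$dir/file3_aln.bam"                               # unsorted only

for sorted in "$dir"/*_aln_sorted.bam; do
    # Strip the suffix: file1_aln_sorted.bam -> file1_aln.bam
    unsorted="${sorted%_sorted.bam}.bam"
    if [[ -f "$unsorted" ]]; then
        echo "both exist, removing $(basename "$unsorted")"
        rm -f -- "$unsorted"
    else
        echo "only $(basename "$sorted") exists, all good"
    fi
done
```

Here file1_aln.bam is removed, file2_aln_sorted.bam is reported as fine, and file3_aln.bam is left alone because it has no sorted counterpart yet.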
-----------EDIT--------------------
Okay, I fixed my original script so that it does not use booleans to check for strings in the filename but instead checks whether the files exist. This worked for me:
Originally I had this script as well but ran into similar problems:
for files in folder1/checkthese/*.bam
do
if [ -f "${files/_aln*}_aln.bam" ] && [ -f "${files/_aln*}_aln_sorted.bam" ]
then
echo "both files exist, need to delete unsorted file only"
echo "REMOVE $(basename "${files/_aln*}_aln.bam")"
rm -f "${files/_aln*}_aln.bam"
elif [ -f "${files/_aln*}_aln_sorted.bam" ] && [ ! -f "${files/_aln*}_aln.bam" ]
then
echo "Only sorted file exists, all good"
fi
done
Output works now.
I will present a slightly less conventional solution, stressing two points:
First create some test files:
mkdir data
seq 1 5 | xargs -I{} touch 'data/file_{}_aln.bam'
# first three of them have their sorted equivalents
seq 1 3 | xargs -I{} touch 'data/file_{}_aln_sorted.bam'
First let's check which files I'd delete:
find data -name '*.bam' | sort | sed 's/_sorted//' | uniq -d
The complement is the set of files I still have to sort:
find data -name '*.bam' | sort | sed 's/_sorted//' | uniq -u
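The two pipelines above rely on how uniq treats adjacent lines: -d prints only lines that occur more than once, -u only lines that occur exactly once. A minimal sketch on three made-up names (no files involved) shows the effect:

```shell
#!/usr/bin/env bash
# After "_sorted" is stripped, a sorted/unsorted pair collapses into two identical lines.
names=$(printf '%s\n' a_aln.bam a_aln_sorted.bam b_aln.bam | sed 's/_sorted//' | sort)

to_delete=$(uniq -d <<<"$names")   # names seen twice: both variants exist
to_sort=$(uniq -u <<<"$names")     # names seen once: no sorted variant yet

echo "safe to delete: $to_delete"
echo "still to sort:  $to_sort"
```

With this input, a_aln.bam is reported as safe to delete and b_aln.bam as still to sort.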
After checking, I can do something like this to delete the files:
find data -name '*.bam' | sort | sed 's/_sorted//' | uniq -d | xargs rm
The final check of whether all unsorted files are gone can easily be done with:
ls data/*_aln.bam
# or to get some numeric results:
ls data/*_aln.bam | wc -l
Of course the usual caveats apply: use sensible file names, or you have to use find -print0 | xargs -0 and deal with the consequences.
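For file names that are not so sensible, a null-delimited variant might look like the sketch below. It assumes the GNU versions of find, sed, sort, uniq and xargs, since the -z, -0 and -r options used here are GNU extensions; the scratch directory and file names are invented for the demo.

```shell
#!/usr/bin/env bash
# Sketch: the same delete pipeline, but safe for names containing spaces or newlines.
set -euo pipefail

workdir=$(mktemp -d)          # hypothetical scratch directory
touch "$workdir/file 1_aln.bam" "$workdir/file 1_aln_sorted.bam"   # a pair, with a space in the name
touch "$workdir/file 2_aln.bam"                                     # unsorted only

# Strip the "_sorted" suffix first and sort afterwards, so both members of a
# pair become identical strings and are guaranteed to be adjacent for uniq -d.
find "$workdir" -name '*.bam' -print0 \
  | sed -z 's/_sorted\.bam$/.bam/' \
  | sort -z \
  | uniq -zd \
  | xargs -0 -r rm --

ls "$workdir"
```

Sorting after the substitution is a deliberate choice in this sketch: if the list is sorted first, an unrelated name can sort between the two members of a pair, and uniq would then miss the duplicate.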