Running sed on a variable in bash script

Apologies for a seemingly inane question, but I have spent the whole day trying to figure this out and it is driving me up the wall. I'm trying to write a seemingly simple bash script that takes the list of files in a directory from ls, replaces part of each file name using sed, reduces the list to unique names, and passes them on to some command. Like so:

inputs=`ls *.ext`
echo $inputs
test1_R1.ext  test1_R2.ext  test2_R1.ext  test2_R2.ext

Now I would like to put this through sed to replace 1.ext and 2.ext with * to get test1_R* etc. Then I'd like to remove the resulting duplicates by running sort -u, to arrive at the following $outputs variable:

echo $outputs
test1_R* test2_R*

And pass this on to a command, like so:

cat $outputs

I can do something like this on the command line:

ls *.ext | sed s/..ext/\*/g | sort -u
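As a side note on the pattern itself: an unescaped `.` in a sed expression matches any character, so `..ext` can also match in the middle of a name. The sketch below compares it with an anchored variant; the file name `context_R1.ext` is hypothetical and chosen only to show the difference.

```shell
#!/bin/sh
# Hypothetical file name (not from the question) that contains "ext" twice.
name='context_R1.ext'

# Loose pattern: any two characters followed by "ext", replaced everywhere (g).
loose=$(printf '%s\n' "$name" | sed 's/..ext/*/g')

# Stricter pattern: one character, a literal dot, "ext", only at the end of the line.
strict=$(printf '%s\n' "$name" | sed 's/.\.ext$/*/')

printf 'loose:  %s\n' "$loose"
printf 'strict: %s\n' "$strict"
```

For the four file names in the question both patterns give the same result; the anchored form only matters when "ext" can appear elsewhere in a name.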

But if I try to assign the above to a variable in the script, it just returns the output of ls. I have tried several ways of doing it: including the whole pipe in the script; running each command separately, assigning its output to a variable and passing that variable to the next command; and writing the output to a file, then passing the file to the next command. So far none of this has achieved what I'm aiming for. I think my problem (apart from general cluelessness around bash scripting) lies in my inability to run sed on a variable within a script. There seems to be a lot of advice around on how to pass variables into the pattern or replacement string in sed, but it all seems to take files as input. I understand that this might not be the proper way of doing it anyway, so I would really appreciate it if someone could suggest an elegant way to achieve what I'm trying to do.

Many thanks!

Update 2/06/2014

Hi Barmar, thanks for your answer. I can't say it solved the problem, but it helped pinpoint it. It seems the problem is in my use of the asterisk. I have to say, I'm very puzzled. The actual file names I've got are:

test1_R1.fastq.gz test1_R2.fastq.gz test2_R1.fastq.gz test2_R2.fastq.gz

If I use the code you suggested, which seems to me the right way to do it:

ins=$(ls *.fastq.gz | sed 's/..fastq.gz/\\*/g' | sort -u)

sed doesn't seem to do anything, and I get the output of ls:

test1_R1.fastq.gz test1_R2.fastq.gz test2_R1.fastq.gz test2_R2.fastq.gz

Now if I replace that backslash with anything else, sed works, but it also returns whatever character I put in front of (or after) the asterisk:

ins=$(ls *.fastq.gz | sed 's/..fastq.gz/"*/g' | sort -u)
test1_R"* test2_R"*

That's odd enough, but surely I can just put an "R" in front of the asterisk and include the R in the search pattern, right? Wrong! Whichever way I do that: 's/R..fastq.gz/R*/g', 's/...fastq.gz/R*/g' or 's/[A-Z]..fastq.gz/R*/g', I'm back to the original names! And even if I end up with something like test1_RR* test2_RR* and run it through sed again to replace "_R" with "_" or "RR" with "R", I have no luck and I'm back to the original names. And yet I can replace the rest of the file name with no problem, just not in a way that gets me the test1_R* I need.

I have a feeling I should be escaping that * in some very clever way, but nothing I've tried seems to work. Thanks again for your help!
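One likely explanation for these symptoms is that sed does its job, but printing the variable unquoted (echo $ins) lets the shell expand test1_R* back into the matching file names before echo runs. The sketch below is a minimal reproduction under that assumption; the scratch directory and the `quoted`/`unquoted` variable names are made up for the demo, and the sed pattern is a slightly stricter variant of the one above.

```shell
#!/bin/sh
# Hypothetical scratch directory holding the four file names from the question.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch test1_R1.fastq.gz test1_R2.fastq.gz test2_R1.fastq.gz test2_R2.fastq.gz

# Replace the character before ".fastq.gz", plus the suffix, with a literal "*".
ins=$(ls *.fastq.gz | sed 's/.\.fastq\.gz$/*/' | sort -u)

# Quoted: the contents of the variable are printed untouched.
quoted=$(echo "$ins")

# Unquoted: the shell word-splits and glob-expands the value before echo sees it.
unquoted=$(echo $ins)

printf 'quoted:\n%s\n' "$quoted"
printf 'unquoted:\n%s\n' "$unquoted"
```

With the files present, the quoted form shows test1_R* and test2_R*, while the unquoted form shows the four original names. This would also account for test1_R"* and test1_RR* surviving: those patterns match no existing file, so the shell leaves them as they are.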

This is how you capture the result of the whole pipeline in a variable:

var=$(ls *.ext | sed s/..ext/\*/g | sort -u)
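A self-contained sketch of the same idea, carried through to the cat step from the question, might look like the following. The scratch directory, the file contents and the `combined` variable are assumptions made for the demo, and the sed expression is single-quoted here so the shell never touches the asterisk.

```shell
#!/bin/sh
# Hypothetical scratch directory; file contents are invented for the demo.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for f in test1_R1 test1_R2 test2_R1 test2_R2; do
    echo "data from $f" > "$f.ext"
done

# Capture the whole pipeline; single quotes keep the sed script away from the shell.
var=$(ls *.ext | sed 's/.\.ext$/*/' | sort -u)

# Inspect the variable with quotes so the patterns are shown as stored.
echo "$var"

# Leaving $var unquoted here is deliberate: each pattern expands to its files.
combined=$(cat $var)
echo "$combined"
```

Relying on unquoted expansion like this works for simple names such as the ones in the question, but it breaks on file names containing whitespace or other glob characters, so treat it as an illustration rather than a robust script.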
