My input file looks like this:
Chr1 1
Chr1 2
Chr2 3
I want to split the input file into multiple files according to the Chr value in the first column.
There should be two output files. Output file 1 (named tmpChr1):
Chr1 1
Chr1 2
Output file 2 (named tmpChr2):
Chr2 3
Here's the code so far:
#!/bin/bash
for((chrom=1;chrom<30;chrom++)); do
echo Chr${chrom}
chr=Chr${chrom}
awk "\$1==$chr{print \$1}" input.txt > tmp$chr
done
The line awk "\$1==$chr{print \$1}"
is the problem; awk seems to require quotation marks around $chr to match $1 correctly.
awk '$1=="Chr1"{print $1}'
works and tmpChr1 is made
awk '$1=="$chr"{print $1}'
doesn't work,
and neither does awk "$1=='$chr'{print $1}"
I'm really struggling with the quoting. Could anyone shed some light on what I should do?
Never use double quotes around an awk script and never allow shell variables to expand as part of the body of an awk script. See http://cfajohnson.com/shell/cus-faq-2.html#Q24
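To make that concrete: in the double-quoted attempt the shell expands $chr before awk runs, so awk sees $1==Chr1, where Chr1 is an unset awk variable rather than a string. In the single-quoted attempt awk compares $1 against the literal text "$chr". If you did want to keep the loop, a minimal sketch of the usual fix is to pass the value in with -v and keep the script in single quotes. The awk variable name chr is just my choice, the loop is written in portable sh instead of the bash for((...)) form, and the pattern prints the whole line because that is what the expected output shows:

```shell
#!/bin/sh
# Recreate the sample input from the question in the current directory.
printf 'Chr1 1\nChr1 2\nChr2 3\n' > input.txt

chrom=1
while [ "$chrom" -lt 30 ]; do
    chr="Chr${chrom}"
    # -v copies the shell value into the awk variable "chr";
    # the single-quoted script itself is never expanded by the shell.
    # A pattern with no action prints every matching line in full.
    awk -v chr="$chr" '$1 == chr' input.txt > "tmp$chr"
    chrom=$((chrom + 1))
done
```

Note that this still reads input.txt 29 times and leaves an empty tmpChrN file for every N that never appears in the input.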
You are WAY off the mark with your general approach though. All you need is this awk script:
awk '{print > ("tmp"$1)}' file
Look:
$ ls
file
$ cat file
Chr1 1
Chr1 2
Chr2 3
$ awk '{print > ("tmp"$1)}' file
$ ls
file tmpChr1 tmpChr2
$ cat tmpChr1
Chr1 1
Chr1 2
$ cat tmpChr2
Chr2 3
Any time you write a loop in shell just to manipulate text, you have the wrong approach. UNIX shell is an environment from which to call tools, with a language to sequence those calls. The UNIX tool to manipulate text is awk. So if you need to manipulate text in UNIX, write an awk script and call it from shell. That's all.