
Matching first column of file with awk, difficulty with quotes

My input file looks like this:

Chr1 1
Chr1 2
Chr2 3

I want to split the input file into multiple files according to the Chr value in the first column.

There should be two output files. Output file 1 (named tmpChr1):

Chr1 1
Chr1 2

Output file 2 (named tmpChr2):

Chr2 3

Here's the code so far:

#!/bin/bash

for((chrom=1;chrom<30;chrom++)); do
echo Chr${chrom}
chr=Chr${chrom}
awk "\$1==$chr{print \$1}" input.txt > tmp$chr
done

The line awk "\$1==$chr{print \$1}" is the problem; awk seems to require quotes around $chr to correctly match $1.

awk '$1=="Chr1"{print $1}' works and tmpChr1 is made.

awk '$1=="$chr"{print $1}' doesn't work either,

and neither does awk "$1=='$chr'{print $1}".

I'm really struggling with the quoting. Could anyone shed some light on what I should do?

Never use double quotes around an awk script and never allow shell variables to expand as part of the body of an awk script. See http://cfajohnson.com/shell/cus-faq-2.html#Q24

You are WAY off the mark with your general approach though. All you need is this awk script:
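For reference, here is a minimal sketch of how the loop from the question could follow that rule, assuming the goal is to write each matching line in full (the question's code prints only $1). The shell value is handed to awk with -v, so the script itself stays inside single quotes. The sample input.txt is recreated here only to make the sketch self-contained.

```shell
#!/bin/bash
# Recreate the sample input from the question.
printf 'Chr1 1\nChr1 2\nChr2 3\n' > input.txt

for ((chrom = 1; chrom < 30; chrom++)); do
    chr="Chr${chrom}"
    # -v copies the shell value into the awk variable "chr".
    # The single-quoted script is never expanded by the shell,
    # and a pattern with no action prints the whole matching line.
    awk -v chr="$chr" '$1 == chr' input.txt > "tmp$chr"
done
```

Note that this still creates an empty tmpChrN file for every N with no matching lines, and that -v interprets backslash escapes in the value, which does not matter for names like Chr1.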

awk '{print > ("tmp"$1)}' file

Look:

$ ls
file
$ cat file
Chr1 1
Chr1 2
Chr2 3
$ awk '{print > ("tmp"$1)}' file
$ ls
file  tmpChr1  tmpChr2
$ cat tmpChr1
Chr1 1
Chr1 2
$ cat tmpChr2
Chr2 3
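One caveat that goes beyond the demonstration above: some awk implementations limit how many output files may be open at once, which could matter if the first column has many distinct values. A hedged sketch of a variant that closes each file after every write follows; it assumes that appending to the output files is acceptable, so stale tmp files are removed first.

```shell
# Recreate the sample input and clear old output, since ">>" appends.
printf 'Chr1 1\nChr1 2\nChr2 3\n' > file
rm -f tmpChr1 tmpChr2

awk '{
    out = "tmp" $1   # output name built from the first column
    print >> out     # append, because the file is reopened for each line
    close(out)       # release the file descriptor straight away
}' file
```

Opening and closing a file for every line is slower than the one-liner, so this is only worth using when the simpler form actually hits the open-file limit.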

Any time you write a loop in shell just to manipulate text you have the wrong approach. UNIX shell is an environment from which to call tools with a language to sequence those calls. The UNIX tool to manipulate text is awk. So if you need to manipulate text in UNIX, write an awk script and call it from shell, that's all.
