Bash for loop variable for awk

Question

This is my first time posting on Stack Overflow, after mostly searching for solutions and reading posts. I am trying to run a loop in bash so that I can do a string search over a bunch of different files with the extension .u.clean. I want to look through these files for the string "H#" or "h#", with # being 1-28, and write the output to a file named after the number that was searched for. I am doing two separate searches on two fields ($5 and $0), and I want to output the total number of unique matches to a file "temp"#.txt. After this I want to do some math on the two numbers that are written to the file. So far I have gotten this far:

for i in {1..28}; do
    awk -v var="$i" -F"\t"  ' $19 ~ "_[hH]"var {print $0}' */*.u.clean | \
        sort | uniq | wc -l > 'temp'$i'.txt' | \
        awk -v var="$i" -F"\t"  ' $19 ~ "_[hH]"var {print $5}' */*.u.clean | \
        sort | uniq | wc -l >> 'chris'$i'.txt'
done

The problem is that the numbers are coming out wrong. I am getting a total of 28 "temp"#".txt" files, but the numbers in them are not the correct counts. I also don't know how to do a mathematical operation once I have the files with the numbers in them. Can someone help me out or point me in the right direction? Thanks for any help.

EDIT:

Here is what some of the input might look like:

112 E 03 294168 FBLN7_rs335586251.5 GG
01/23/2013 2 3 VSD control 130123_CR_CH5_H26 1 A.Conservative

17 D 11 294319 FBLN7_rs335586251.5 GG
06/26/2012 2 3 VSD control
120626_CR_CH5_H3 1 A.Conservative

22 B 01 294703 FBLN7_rs335586251.5 GG
06/26/2012 2 2 VSD control
120626_CR_CH5_H4 1 A.Conservative

103 A 07 295033 FBLN7_rs335586251.5 GG
01/23/2013 2 1 VSD control
130123_CR_CH5_H23 1 A.Conservative

44 G 07 295119 Tbx5_rs61931008.5 GG
07/11/2012 2 5 ASD control
120711_CR_CH5_H12 1 A.Conservative

42 H 12 295201 JAG1_rs1232607.5 GG
07/11/2012 1 2 ASD control
120711_CR_CH5_H12 1 A.Conservative

I am trying to find a count of how many times, in field 19 (the field with the text Tbx5_rs61931008.5), each occurrence of H# occurs, with # being from 1-28, and output that number to a separate file for each H#. Then I want to know, within these matches of H#, how many unique values of field 5 there are, and output that number to the same file for each H#. I hope this is clear; let me know if it is not. Thanks.
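For reference, the two counts described above could be computed in one awk pass per number. This is only a sketch under stated assumptions: the input is tab-separated, the H# tag sits at the end of field 19, and count_tag is a hypothetical helper name, not something from the original post. One thing it does differently from the loop in the question is anchor the number with "$", because an unanchored "_[hH]"var with var=1 also matches _H10 through _H19.

```shell
#!/usr/bin/env bash
# count_tag N FILE...   (hypothetical helper, not from the original post)
# Prints two lines:
#   1. the number of unique rows whose field 19 ends in _H<N> or _h<N>
#   2. the number of unique field-5 values among those rows
# Assumptions: tab-separated input, tag at the END of field 19.
count_tag() {
    local n=$1
    shift
    awk -F'\t' -v n="$n" '
        # The trailing "$" keeps H1 from also matching H10..H19.
        $19 ~ ("_[hH]" n "$") {
            if (!($0 in rows)) { rows[$0] = 1; nrows++ }   # unique whole rows
            if (!($5 in ids))  { ids[$5]  = 1; nids++ }    # unique field-5 values
        }
        END { print nrows + 0; print nids + 0 }
    ' "$@"
}

# Usage, writing temp1.txt .. temp28.txt as in the question:
#   for i in {1..28}; do count_tag "$i" */*.u.clean > "temp$i.txt"; done
```

Both numbers land in the same file, one per line, so later arithmetic can read them back with something like awk 'NR==1{a=$1} NR==2{b=$1} END{print a-b}' temp1.txt (the subtraction is just a placeholder for whatever math is needed).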

Answer 1

This seems a bit complicated for what you are trying to achieve. I would suggest using find and grep:

find . -name "*.u.clean" -exec egrep -c '([Hh][1-9])|([Hh][1-2][0-9])' {} \;

You then have to take the output and do the math.
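As a rough sketch of the "do the math" step, the per-file counts can be piped into awk and summed. The function name sum_matching_lines is made up for illustration, and grep -E is used as the standard spelling of egrep. With {} + grep receives several files at once and prints path:count, so the awk part takes the last colon-separated field as the count.

```shell
#!/usr/bin/env bash
# sum_matching_lines DIR   (hypothetical helper)
# Prints the total number of lines containing an H#/h# tag across all
# *.u.clean files under DIR.
sum_matching_lines() {
    find "$1" -name '*.u.clean' \
        -exec grep -Ec '([Hh][1-9])|([Hh][1-2][0-9])' {} + |
        # grep -c prints "path:count" for several files, or just "count" for
        # one file; the last ":"-separated field is the count either way.
        awk -F: '{ total += $NF } END { print total + 0 }'
}

# Usage:
#   sum_matching_lines .
```

Note that grep -c counts matching lines, not individual matches, so a line holding two tags still counts once.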

This assumes there is only one h# per line in the file. If that is not correct, you will need to do a little more work: I would find all the files that have any occurrences and then use egrep -o '([Hh][1-9])|([Hh][1-2][0-9])' | wc -l to get the total for each file.
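A possible way to wire that up per file is sketched below. The helper name count_occurrences and the "count, tab, path" output layout are assumptions made for illustration; grep -E -o is the standard spelling of egrep -o.

```shell
#!/usr/bin/env bash
# count_occurrences DIR   (hypothetical helper)
# For every *.u.clean file under DIR, prints "<count><TAB><path>", where count
# is the number of H#/h# occurrences (not the number of matching lines).
count_occurrences() {
    find "$1" -name '*.u.clean' -exec sh -c '
        for f in "$@"; do
            # -o prints each match on its own line, so wc -l counts matches;
            # tr strips the padding some wc implementations add.
            n=$(grep -Eo "([Hh][1-9])|([Hh][1-2][0-9])" "$f" | wc -l | tr -d " ")
            printf "%s\t%s\n" "$n" "$f"
        done' sh {} +
}

# Usage:
#   count_occurrences . | sort -n
```

Files with no occurrences are listed with a count of 0 rather than skipped, which keeps the output easy to line up against the full file list.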

Solution 1
1 2013-03-04 20:59:52
