Wildcards with grep -F

Question

I have the following file:

0 0
0 0.001
0 0.032
0 0.1241
0 0.2241
0 0.42
0.0142 0
0.0234 0
0.01429 0.01282
0.001 0.224
0.098 0.367
0.129 0
0.123 0.01282
0.149 0.16
0.1345 0.216
0.293 0
0.2439 0.01316
0.2549 0.1316
0.2354 0.5
0.3345 0
0.3456 0.0116
0.3462 0.316
0.3632 0.416
0.429 0
0.42439 0.016
0.4234 0.3
0.5 0
0.5 0.33
0.5 0.5

Notice that the two columns are sorted ascending, first by the first column and then by the second one. The minimum value is 0 and the maximum is 0.5.

I would like to count the number of lines that are:

0 0

and store that number in a file called "0_0". In this case, this file should contain "1".

Then I want to do the same for the lines of the form:

0 0.0*

For example,

0 0.032

and call that file "0_0.0" (it should contain "2"), and so on for all combinations, considering only the first decimal digit (0 0.1*, 0 0.2* ... 0.0* 0, 0.0* 0.0* ... 0.5 0.5).

I am using this loop:

for i in 0 0.0 0.1 0.2 0.3 0.4 0.5
do
    for j in 0 0.0 0.1 0.2 0.3 0.4 0.5
    do
        grep -F ""$i" "$j"" file | wc -l > "$i"_"$j"
    done
done

rm 0_0  # the 0_0 count from the loop is wrong; redo it with the next command, which can match \n
pcregrep -M "0 0\n" file | wc -l > 0_0

The problem is that, for example, the line

0.0142 0

will not be matched by the iteration "0.0 0", since there are digits after the "0.0". Removing the -F option from grep so that it considers all numbers starting with "0.0" will not work either, since the dot is then treated as a wildcard; for example, in the iteration "0.1 0" the line

 0.0142 0

will be counted, because 0.0142 is a 0"anything"1.

I hope I am making myself clear!

Is there any way to use a wildcard with grep -F, as in:

for i in 0 0.0 0.1 0.2 0.3 0.4 0.5
do
    for j in 0 0.0 0.1 0.2 0.3 0.4 0.5
    do
        grep -F ""$i"* "$j"*" file | wc -l > "$i"_"$j"
    done
done

(Please notice the asterisks after the variables in the grep command.)

Thank you!

Answer 1

Don't use shell loops just to manipulate text; that's the job the people who invented shell also invented awk to do. See why-is-using-a-shell-loop-to-process-text-considered-bad-practice.

It sounds like all you need is:

awk '{cnt[substr($1,1,3)"_"substr($2,1,3)]++} END{ for (pair in cnt) {print cnt[pair] > pair; close(pair)} }' file

That will be vastly more efficient than your nested shell loops approach: awk reads the file once, whereas the 7 x 7 loops run grep over the whole file 49 times.
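One difference from the nested loops is worth knowing: the one-liner only creates files for combinations that actually occur in the input, whereas the loops also leave behind files containing 0. If those zero-count files matter, a sketch along the following lines might do. It assumes the same seven prefixes the question loops over, uses a few rows of the question's data as a stand-in for the real file, and works in a scratch directory; the BEGIN block and the `prefix` array are additions for illustration, not part of the one-liner above.

```shell
# Work in a scratch directory so the generated count files don't clutter anything.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Assumption: a small subset of the question's data, just for demonstration.
cat > file <<'EOF'
0 0
0 0.001
0 0.032
0 0.1241
0.0142 0
0.0234 0
0.129 0
0.123 0.01282
0.5 0.5
EOF

awk '
BEGIN {
    # Pre-seed every combination with 0 so each one gets a file,
    # even if no line in the input falls into it.
    n = split("0 0.0 0.1 0.2 0.3 0.4 0.5", prefix, " ")
    for (a = 1; a <= n; a++)
        for (b = 1; b <= n; b++)
            cnt[prefix[a] "_" prefix[b]] = 0
}
# Key each line on the first three characters of both fields:
# "0" stays "0", "0.0142" becomes "0.0".
{ cnt[substr($1, 1, 3) "_" substr($2, 1, 3)]++ }
END {
    for (pair in cnt) {
        print cnt[pair] > pair   # one file per combination
        close(pair)              # avoid running out of open file descriptors
    }
}' file
```

After this runs, `cat 0.0_0` should show the count for lines such as `0.0142 0`, and a combination that never occurs, such as `0.4_0.4`, should contain 0.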

Here is what it will write to the files it creates (shown here as file name and count):

$ awk '{cnt[substr($1,1,3)"_"substr($2,1,3)]++} END{for (pair in cnt) print pair "\t" cnt[pair]}' file
0.0_0.3 1
0_0.4   1
0.5_0   1
0.2_0.5 1
0.4_0.3 1
0.0_0   2
0.1_0.0 1
0.3_0   1
0.1_0.1 1
0.1_0.2 1
0.3_0.0 1
0_0     1
0.1_0   1
0.5_0.3 1
0.4_0   1
0.3_0.3 1
0.2_0.0 1
0_0.0   2
0.5_0.5 1
0.3_0.4 1
0.2_0.1 1
0.0_0.0 1
0_0.1   1
0_0.2   1
0.4_0.0 1
0.2_0   1
0.0_0.2 1
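As for the literal question: grep -F treats the pattern as a fixed string, so it has no wildcards at all. If you would rather stay with grep, one option is to drop -F, make each dot literal, and anchor the pattern to the whole line. The following is only a sketch under those assumptions; `escape_dots` is a hypothetical helper name, and the sample file is a few rows of the question's data written to a scratch directory.

```shell
# Work in a scratch directory with a small stand-in for the real file.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

cat > file <<'EOF'
0 0
0 0.001
0 0.032
0 0.1241
0.0142 0
0.0234 0
0.129 0
0.123 0.01282
0.5 0.5
EOF

# Hypothetical helper: turn "0.1" into the regex "0[.]1" so the dot
# matches only a literal dot instead of any character.
escape_dots() { printf '%s\n' "$1" | sed 's/[.]/[.]/g'; }

for i in 0 0.0 0.1 0.2 0.3 0.4 0.5
do
    for j in 0 0.0 0.1 0.2 0.3 0.4 0.5
    do
        # ^ and $ anchor the match to the whole line; [0-9]* allows
        # further digits after the prefix but never a dot, so the
        # prefix "0" matches the field "0" and not "0.001".
        pattern="^$(escape_dots "$i")[0-9]* $(escape_dots "$j")[0-9]*\$"
        # grep -c prints the count itself (no wc -l needed); it exits
        # non-zero when the count is 0, hence the "|| true".
        grep -c "$pattern" file > "${i}_${j}" || true
    done
done
```

Because the pattern is anchored, the "0 0" case no longer needs the separate pcregrep step. This still scans the file once per combination, so the awk version remains the better choice for large inputs.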
