Processing a database with awk

Question

I have a directory on my computer which contains an entire database I found online for my research. This database contains thousands of files, so to do what I need I've been looking into file I/O. A programmer friend suggested using bash/awk. I've written my code:

    #!/usr/bin/env awk
    ls -l|awk'
    BEGIN {print "Now running"}
    {if(NR == 17 / $1 >= 0.4 / $1 <= 2.5)
    {print $1 > wavelengths.txt;
    print $2 > reflectance.txt;
    print $3 > standardDev.txt;}}END{print "done"}'

When I put this into my console, I'm already in the directory of the files I need to access. The data I need begins on line 17 of EVERY file. The data looks like this:

some number    some number    some number
some number    some number    some number
    .              .              .
    .              .              .
    .              .              .

I want to access the data starting where the first column has a value of approximately 0.4 and keep reading until the first column reaches approximately 2.5. The first column represents wavelengths. I want to verify later that they are the same for every file, so I copy them into a file. The second column represents reflectance, and I want this in a separate file because later I'll take this information and build a data matrix from it. The third column is the standard deviation of the reflectance.

The problem I am having now is that when I run this code, I get the following error: No such file or directory

Please, if anyone can tell me why I might be getting this error, or can guide me as to how to write the code for what I am trying to do... I will be so grateful.

Answer 1

Excellent attempt, but you should never parse the output of ls. Piping ls into awk hands awk the directory listing, not the contents of the files. (If you only wanted the file names, you were probably looking for ls -1, not ls -l.) awk can instead read the files directly if you give it a glob. For example, in the desired directory, you can run:

awk -f /path/to/script.awk *

Contents of script.awk:

BEGIN {
    print "Now running"
}

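# FNR restarts at 1 for every input file (NR keeps counting across files),
# so this selects rows from line 17 onward in each file.
FNR >= 17 && $1 >= 0.4 && $1 <= 2.5 {
    print $1 > "wavelengths.txt"
    print $2 > "reflectance.txt"
    print $3 > "standardDev.txt"
}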

END {
    print "Done"
}
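One detail worth checking when awk is given many files at once: NR counts records across all input files, while FNR restarts at 1 for each file, so a per-file condition such as "from line 17 onward" is normally written with FNR. The following is a minimal sketch rather than the script above: the file names a.dat and b.dat, the two-line header, and the sample values are all made up for illustration, so the data starts on line 3 instead of line 17.

```shell
# Hypothetical demo in a scratch directory; file names and values are invented.
workdir=$(mktemp -d)
cd "$workdir"

# Two small files: 2 header lines, then wavelength / reflectance / std-dev rows.
printf 'header\nheader\n0.3 11 1\n0.4 12 2\n2.5 13 3\n2.6 14 4\n' > a.dat
printf 'header\nheader\n0.4 21 5\n1.0 22 6\n' > b.dat

# Show how the two counters differ on line 3 of each file.
awk 'FNR == 3 { print FILENAME, NR, FNR }' a.dat b.dat
# prints:
#   a.dat 3 3
#   b.dat 9 3

# Per-file start line (FNR) combined with the wavelength range test.
awk 'FNR >= 3 && $1 >= 0.4 && $1 <= 2.5 {
    print $1 > "wavelengths.txt"
    print $2 > "reflectance.txt"
}' a.dat b.dat

cat wavelengths.txt
```

Inside awk, `>` truncates a file only the first time it is opened during a run, so rows from every input file accumulate in the same output file. If the script is rerun in the same directory with `*`, the output files from the earlier run will also match the glob, so a narrower pattern or a separate output directory may be safer.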

Answer 2

The main problem is that you need to quote the output file names, because they are strings, not variables. Use:

print $1 > "wavelengths.txt"

instead of:

print $1 > wavelengths.txt
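To illustrate the difference: in awk a bare word is read as a variable name rather than as literal text, so the target of `>` has to be either a quoted string or a variable that holds the file name. A minimal sketch, assuming a hypothetical input file sample.dat and a hypothetical variable named out (neither appears in the question):

```shell
# Hypothetical demo in a scratch directory; sample.dat and its values are invented.
workdir=$(mktemp -d)
cd "$workdir"
printf '0.5 10 1\n0.6 20 2\n' > sample.dat

# Quoted string literal as the redirection target.
awk '{ print $1 > "wavelengths.txt" }' sample.dat

# A variable as the target: -v sets the awk variable `out` to the file name.
awk -v out="reflectance.txt" '{ print $2 > out }' sample.dat

cat wavelengths.txt reflectance.txt
```

The variable form can be handy when the output name is decided by the calling shell script instead of being hard-coded in the awk program.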

Answer 1
3 2013-11-23 05:57:58

Answer 2
3 2013-11-23 17:50:10
