Finding and sorting datapoints in bash and awk
First of all, let me clarify that I am unfortunately still quite inexperienced in programming, so I really need some help.
What I have:
I have a data file containing 3 columns: $1 = Energy1, $2 = Energy2, $3 = intensity of their frequency in combination. If I plot these data, e.g. in gnuplot with
spl "datafile.dat" u 1:2:3
I obtain a surface plot of my 2D spectrum.
What I want:
Now I would like to select only those data points for which $1 - $2 = 5.7, thus obtaining a line spectrum along a diagonal, with all possible combinations of $1 and $2 yielding this value.
The new data file should then contain the $1 value and the intensity (stored in $3) of each selected line, i.e. each line whose $1 and $2 yield 5.7.
I have tried to do this in bash using awk, but so far I have failed. Please help me! Thank you very much in advance.
You do not need awk for this; gnuplot can do it.
admissible(x,y,value,epsilon)=(abs(x-y-value)<epsilon)
plot 'datafile.dat' using (admissible($1,$2,5.7,1e-5)?$1:1/0):3 with points
The function admissible is evaluated for each line of the data file. If it returns true, the point ($1,$3) is plotted; otherwise the x-coordinate is set to undefined (1/0) and the point is not plotted. The only shortcoming is that you cannot use the lines style with this, since the lines will be interrupted by the non-admissible datapoints.
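If a connected line is wanted, one option is to filter the file before gnuplot reads it, so that no undefined points interrupt the curve. A minimal sketch, assuming whitespace-separated columns; the function name `extract_diagonal` and the output file `diagonal.dat` are made up for illustration:

```shell
# Hypothetical helper: print "$1 $3" for rows where |$1 - $2 - value| < eps,
# sorted numerically by $1 so that consecutive points can be joined by lines.
extract_diagonal() {
    # shell args: $1 = data file, $2 = target difference, $3 = tolerance
    awk -v value="$2" -v eps="$3" '
        {
            d = $1 - $2 - value
            if (d < eps && d > -eps) print $1, $3
        }
    ' "$1" | sort -n
}

# Example use (file names are assumptions):
#   extract_diagonal datafile.dat 5.7 1e-5 > diagonal.dat
# and then, inside gnuplot:
#   plot "diagonal.dat" using 1:2 with lines
```

Since the filtered file holds only admissible rows, the lines style has nothing to skip over.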
Maybe I don't understand all the issues, or maybe you are having a floating-point equality problem as others have noted, but why doesn't a simple filter through the data work?
awk -v s=5.7 -v e=.01 '{d=$1-$2-s} d<e && d>-e {print $1,$3}' datafile.dat
Tack on a sort if you want or need one:
| sort -n
Or is it possible that your data is too sparse, and you are looking for some value interpolation solution?
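Before reaching for interpolation, it may help to check how many rows actually fall near the diagonal for a few tolerances. A rough sketch; the function name `count_near_diagonal` and the tolerance values are arbitrary choices for illustration:

```shell
# Hypothetical diagnostic: for each tolerance e, count rows with |$1 - $2 - s| < e.
count_near_diagonal() {
    # shell args: $1 = data file, $2 = target difference, rest = tolerances
    file=$1; s=$2; shift 2
    for e in "$@"; do
        n=$(awk -v s="$s" -v e="$e" '
            { d = $1 - $2 - s }
            d < e && d > -e { n++ }
            END { print n + 0 }   # n + 0 prints 0 when nothing matched
        ' "$file")
        echo "e=$e matches=$n"
    done
}

# Example use (file name is an assumption):
#   count_near_diagonal datafile.dat 5.7 0.001 0.01 0.1
```

If the count stays at or near zero even for generous tolerances, the grid probably does not sample the diagonal, and filtering alone will not produce a usable line spectrum.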
If you want to compare every $1 against every $2, you need to take 2 passes through the file: once to collect all the $1,$3 pairs, and again to do all the comparisons:
awk -v diff=5.7 '
    NR == FNR {
        # this is the first trip through
        val[$1] = $3
        next
    }
    {
        # second trip: test every stored $1 against this line's $2
        for (v1 in val) {
            if ( (v1 - $2) == diff ) {
                print v1, val[v1]
            }
        }
    }
' file file    # yes, give the same filename twice.
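To see what the two-pass comparison does, here is a small run on made-up integer data (integers sidestep the floating-point equality question). The sample contents and the variable names `sample` and `result` are invented for illustration; the output is piped through sort because the order of awk's `for (v1 in val)` loop is unspecified:

```shell
# Made-up sample: columns are Energy1, Energy2, intensity.
sample=$(mktemp)
cat > "$sample" <<'EOF'
5 1 10
4 3 20
7 2 30
EOF

# Same two-pass logic as above, with diff=2 instead of 5.7.
result=$(awk -v diff=2 '
    NR == FNR { val[$1] = $3; next }   # pass 1: remember $3 for each $1
    {
        for (v1 in val)                # pass 2: try every stored $1 against this $2
            if ((v1 - $2) == diff)
                print v1, val[v1]
    }
' "$sample" "$sample" | sort -n)

printf '%s\n' "$result"
rm -f "$sample"
```

This prints the rows `4 20` and `5 10`: the 5 is matched with the $2 value 3 and the 4 with the $2 value 2, each taken from a different line, so the matches are not restricted to values that share a row.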
To address @Baruchel's comment about floating-point precision, try this:
awk -v diff=5.7 -v epsilon=0.0001 '
    NR == FNR {val[$1] = $3; next}
    {
        for (v1 in val) {
            delta = v1 - $2 - diff
            if (-epsilon <= delta && delta <= epsilon)
                print v1, val[v1]
        }
    }
' file file
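Why the tolerance matters: decimal fractions such as 0.1 or 0.3 generally have no exact binary representation, so an exact `==` can miss rows that look like matches on paper. A small sketch with a made-up one-line input, assuming an awk that uses double-precision arithmetic (the usual case); the values 0.3, 0.1 and 0.2 are chosen only because they expose the effect, and the function names are hypothetical:

```shell
# Hypothetical helpers comparing an exact test with a tolerance test.
exact_match() {
    # shell arg $1 = diff; prints the row only if $1 - $2 equals diff exactly
    awk -v diff="$1" '($1 - $2) == diff { print $1, $3 }'
}

tolerant_match() {
    # shell args: $1 = diff, $2 = epsilon; prints the row if $1 - $2 is within epsilon of diff
    awk -v diff="$1" -v epsilon="$2" '
        { delta = $1 - $2 - diff }
        -epsilon <= delta && delta <= epsilon { print $1, $3 }
    '
}

echo "0.3 0.1 42" | exact_match 0.2              # no output: 0.3 - 0.1 is not exactly 0.2
echo "0.3 0.1 42" | tolerant_match 0.2 0.0001    # prints: 0.3 42
```

The right epsilon depends on the spacing of the energy grid: it should be well below the step between neighbouring values, yet well above rounding error.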