
Get the first 2 numbers after text in only specific lines for multiplication

I have a file that I am pulling data from and thinning out so that I only keep what I need. However, some lines contain numbers that I need to either grab and put in another file so I can multiply them, or multiply in place and output to a .csv. It might also help to put the data into proper columns.

This is a sample of the lines; I am going to do this on 42,000 lines, give or take. And that is a Trumpf machine. :)

ELQADDXP.DAT-*test ADDXP 20GA ASTM A1011 0
ELQADDXP.DAT- 7.75000 14.00000
ELQADDXP.DAT- TRUMP 59.6517 0 3 4
ELQADDXQ.DAT-*1140242-0 ADDXQ 20GA ASTM A1011
ELQADDXQ.DAT- 7.75000 14.00000
ELQADDXQ.DAT- TRUMP 59.6517 0 3 4
ELQADDXR.DAT-*1140242-0A ADDXR 16GA ASTM A1011 0
ELQADDXR.DAT- 7.75000 14.00000
ELQADDXR.DAT- TRUMP 59.6517 0 3 4
ELQADDXS.DAT-*1139977-0 ADDXS 16GA ASTM A1011
ELQADDXS.DAT- 4.00000 8.64848
ELQADDXS.DAT- TRUMP 24.1015 0 3 4
ELQADDXT.DAT-*1137679-0 ADDXT 16GA ASTM A1011
ELQADDXT.DAT- 8.00000 15. .
ELQADDXT.DAT- TRUMP 71.1517 0 3 4
ELQADDXU.DAT-*1139617-0 ADDXU 11GA ASTM A1011
ELQADDXU.DAT- 6.37500 7.63330
ELQADDXU.DAT- TRUMP 30.1449 1 3 1044 0
ELQADDXV.DAT-*1140569-0 ADDXV 11GA ASTM A1011
ELQADDXV.DAT- 6.94190 35.50000
ELQADDXV.DAT- TRUMP 168.3770 1 3 1060 0
ELQADDXW.DAT-*1075665-9 ADDXW 11GA ASTM A1011 0
ELQADDXW.DAT- 10.60339 36.74345
ELQADDXW.DAT- TRUMP 335.6440 1 3 1060 0

The lines with only 2 numbers need to be multiplied by each other, and I need the result included in the .csv.

I tried grep -A1 - but this gets more than I need, since - is in every line. I also tried find . -regex '.*/[0-9]+\myfile but I don't need the other numbers. I assume there is an easy way that I just have not discovered yet.

I need all of the other data for the csv file, but I would like it to look something like:

ELQADDXP.DAT-*test ADDXP 20GA ASTM A1011 0
ELQADDXP.DAT- 7.75000 14.00000 108.500
ELQADDXP.DAT- TRUMP 59.6517

As indicated by Barmar, awk is best for what you are trying to do, and it is very straightforward (modified).

#!/bin/bash

input="input.txt"

cat >"${input}" <<"EnDoFiNpUt"
ELQADDXP.DAT-*test ADDXP 20GA ASTM A1011 0
ELQADDXP.DAT- 7.75000 14.00000
ELQADDXP.DAT- TRUMP 59.6517 0 3 4
ELQADDXQ.DAT-*1140242-0 ADDXQ 20GA ASTM A1011
ELQADDXQ.DAT- 7.75000 14.00000
ELQADDXQ.DAT- TRUMP 59.6517 0 3 4
ELQADDXR.DAT-*1140242-0A ADDXR 16GA ASTM A1011 0
ELQADDXR.DAT- 7.75000 14.00000
ELQADDXR.DAT- TRUMP 59.6517 0 3 4
ELQADDXS.DAT-*1139977-0 ADDXS 16GA ASTM A1011
ELQADDXS.DAT- 4.00000 8.64848
ELQADDXS.DAT- TRUMP 24.1015 0 3 4
ELQADDXT.DAT-*1137679-0 ADDXT 16GA ASTM A1011
ELQADDXT.DAT- 8.00000 15. .
ELQADDXT.DAT- TRUMP 71.1517 0 3 4
ELQADDXU.DAT-*1139617-0 ADDXU 11GA ASTM A1011
ELQADDXU.DAT- 6.37500 7.63330
ELQADDXU.DAT- TRUMP 30.1449 1 3 1044 0
ELQADDXV.DAT-*1140569-0 ADDXV 11GA ASTM A1011
ELQADDXV.DAT- 6.94190 35.50000
ELQADDXV.DAT- TRUMP 168.3770 1 3 1060 0
ELQADDXW.DAT-*1075665-9 ADDXW 11GA ASTM A1011 0
ELQADDXW.DAT- 10.60339 36.74345
ELQADDXW.DAT- TRUMP 335.6440 1 3 1060 0
EnDoFiNpUt

awk '{
    if( NF == 3 ){
        printf("%s %.5f %.5f %.5f\n", $1, $2, $3, $2*$3 ) ;
    }else{
        if( NF == 4 && $4 == "." ){
            printf("%s %.5f %.5f %.5f\n", $1, $2, $3, $2*$3 ) ;
        }else{
            print $0 ;
        } ;
    } ;
}' "${input}"

The output (modified) looks like this:

ELQADDXP.DAT-*test ADDXP 20GA ASTM A1011 0
ELQADDXP.DAT- 7.75000 14.00000 108.50000
ELQADDXP.DAT- TRUMP 59.6517 0 3 4
ELQADDXQ.DAT-*1140242-0 ADDXQ 20GA ASTM A1011
ELQADDXQ.DAT- 7.75000 14.00000 108.50000
ELQADDXQ.DAT- TRUMP 59.6517 0 3 4
ELQADDXR.DAT-*1140242-0A ADDXR 16GA ASTM A1011 0
ELQADDXR.DAT- 7.75000 14.00000 108.50000
ELQADDXR.DAT- TRUMP 59.6517 0 3 4
ELQADDXS.DAT-*1139977-0 ADDXS 16GA ASTM A1011
ELQADDXS.DAT- 4.00000 8.64848 34.59392
ELQADDXS.DAT- TRUMP 24.1015 0 3 4
ELQADDXT.DAT-*1137679-0 ADDXT 16GA ASTM A1011
ELQADDXT.DAT- 8.00000 15.00000 120.00000
ELQADDXT.DAT- TRUMP 71.1517 0 3 4
ELQADDXU.DAT-*1139617-0 ADDXU 11GA ASTM A1011
ELQADDXU.DAT- 6.37500 7.63330 48.66229
ELQADDXU.DAT- TRUMP 30.1449 1 3 1044 0
ELQADDXV.DAT-*1140569-0 ADDXV 11GA ASTM A1011
ELQADDXV.DAT- 6.94190 35.50000 246.43745
ELQADDXV.DAT- TRUMP 168.3770 1 3 1060 0
ELQADDXW.DAT-*1075665-9 ADDXW 11GA ASTM A1011 0
ELQADDXW.DAT- 10.60339 36.74345 389.60513
ELQADDXW.DAT- TRUMP 335.6440 1 3 1060 0
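Since the question asks for a .csv, the same field-count test can emit comma-separated output instead. The following is only a sketch: the function name to_csv is hypothetical, and it assumes no field ever contains a comma or a quote, which holds for the sample data but has not been checked against the full file.

```shell
# Hypothetical helper (not part of the original answer): print every line
# as comma-separated fields, appending the product on two-number lines.
to_csv() {
    awk -v OFS=',' '
    NF == 3 || (NF == 4 && $4 == ".") {
        # Two-number line: normalise both values and append their product.
        printf("%s,%.5f,%.5f,%.5f\n", $1, $2, $3, $2 * $3)
        next
    }
    {
        $1 = $1    # reassigning a field makes awk rebuild $0 using OFS
        print
    }
    ' "$@"
}
```

It could be run as to_csv input.txt > output.csv; with no file argument it reads standard input.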

Also, if the field count might encounter conflicts, you can always add extra conditionals, such as

if( NF == 3 && $0 !~ /ASTM/ && $0 !~ /TRUMP/ ){
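An alternative to excluding keywords is to test positively that fields 2 and 3 look like numbers. The sketch below assumes the values are always unsigned decimals such as 7.75000 or 15. (no signs, no exponents); the function name multiply_pairs is hypothetical.

```shell
# Hypothetical helper: multiply only when fields 2 and 3 are both numeric.
multiply_pairs() {
    awk '
    # Unsigned decimals only: digits, an optional dot, optional digits.
    $2 ~ /^[0-9]+\.?[0-9]*$/ && $3 ~ /^[0-9]+\.?[0-9]*$/ &&
    (NF == 3 || (NF == 4 && $4 == ".")) {
        printf("%s %.5f %.5f %.5f\n", $1, $2, $3, $2 * $3)
        next
    }
    { print }    # every other line passes through unchanged
    ' "$@"
}
```

With this test, a three-field line whose second field is text (for example a short header line) is passed through untouched instead of being multiplied.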

I went a different route and used an awk script:

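{
    fc = substr($0, 1, 1)    # first character of the current line
    if (fc == "@")
    {
        getline              # read the next line into $0
        print $1" "$3" "$4
        getline              # read the next line into $0
        print $2" "$3, $2 * $3
        getline              # read the next line into $0
        print $3
    }
}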

gives

$ awk -f grabq.awk ELQADDXT.DAT
*1137679-0 16GA ASTM
8.00000 15.00000 120
71.1517

I just need a line to remove the * at the beginning.
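One way to drop the leading asterisk is awk's sub() function, sketched here as a standalone filter applied to the script's output. The function name strip_star is hypothetical; inside the script itself, the same sub() call could be applied to $1 before the first print.

```shell
# Hypothetical helper: remove a single leading "*" from each line, if present.
strip_star() {
    awk '{ sub(/^\*/, ""); print }' "$@"
}
```

For example, awk -f grabq.awk ELQADDXT.DAT | strip_star would leave lines without a leading asterisk unchanged.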
