Combining numbers from multiple text files using bash
I'm struggling to combine some data from the txt files generated in my Jenkins job.
Each file contains one line, which looks like this:
testsuite name="mytest" cars="201" users="0" bus="0" bike="0" time="116.103016"
What I have managed to do so far is extract the numbers from each txt file:
awk '/<testsuite name=/{print $3, $4, $5, $6}' my-output*.txt
The results are:
cars="193" users="2" bus="0" bike="0"
cars="23" users="2" bus="10" bike="7"
cars="124" users="2" bus="5" bike="0"
cars="124" users="2" bus="0" bike="123"
Now I have an arbitrary number of files like this:
my-output1.txt
my-output2.txt
my-output7.txt
my-output*.txt
I would like to create a single command, like the one above, that sums the values across all of the files and echoes the following result:
cars=544 users=32 bus=12 bike=44
Is there a way to do that with a single command line?
1st solution: With your shown samples, please try the following awk code, which uses the match function. Since awk can read multiple files in a single invocation and your files have a .txt extension, you can pass *.txt to the awk program directly.
Written and tested in GNU awk, using the capturing-group capability of its match function to store values in an array for use later in the program.
awk -v s1="\"" '
match($0,/[[:space:]]+(cars)="([^"]*)" (users)="([^"]*)" (bus)="([^"]*)" (bike)="([^"]*)"/,tempArr){
  temp=""
  # Odd groups hold the key names, even groups hold the numbers.
  for(i=2;i<=8;i+=2){
    temp=tempArr[i-1]
    values[i]+=tempArr[i]
    indexes[i-1]=temp
  }
}
END{
  # Note: the traversal order of "for (i in values)" is not guaranteed.
  for(i in values){
    val=(val?val OFS:"") (indexes[i-1]"=" s1 values[i] s1)
  }
  print val
}
' *.txt
Explanation:
The awk program starts by creating a variable named s1, set to ", to be used later in the program.
The main block of the program calls the match function.
The regex [[:space:]]+(cars)="([^"]*)" (users)="([^"]*)" (bus)="([^"]*)" (bike)="([^"]*)" (explained at the end of this answer) creates 8 capture groups to be used later on.
When a line matches, the for loop visits the even-numbered groups, adds each number to the values array, and stores the corresponding key name in the indexes array.
The END block of the program traverses the values array and prints the entries from the indexes and values arrays as required.
Explanation of regex:
[[:space:]]+ ##Matching spaces 1 or more occurrences here.
(cars)="([^"]*)" ##Matching cars=" till next occurrence of " here.
(users)="([^"]*)" ##Matching spaces followed by users=" till next occurrence of " here.
(bus)="([^"]*)" ##Matching spaces followed by bus=" till next occurrence of " here.
(bike)="([^"]*)" ##Matching spaces followed by bike=" till next occurrence of " here.
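The three-argument form of match used in this solution is a GNU awk extension. Where only a POSIX awk is available, a rough equivalent might look like the sketch below. It relies on the two-argument match, which sets RSTART and RLENGTH, and it assumes each key is preceded by a space or tab (or starts the line) and has an integer value. The function name sum_counts is hypothetical and used only for illustration.

```shell
# sum_counts [FILE...]: sum the cars/users/bus/bike attributes (hypothetical helper).
# Reads standard input when no files are given.
sum_counts() {
  awk '
  BEGIN { n = split("cars users bus bike", keys, " ") }
  {
    for (k = 1; k <= n; k++) {
      # Two-argument match() sets RSTART and RLENGTH on success.
      if (match($0, "(^|[ \t])" keys[k] "=\"[0-9]+\"")) {
        s = substr($0, RSTART, RLENGTH)
        gsub(/[^0-9]/, "", s)      # keep only the digits of the value
        sum[keys[k]] += s
      }
    }
  }
  END {
    printf "cars=%d users=%d bus=%d bike=%d\n", sum["cars"], sum["users"], sum["bus"], sum["bike"]
  }' "$@"
}

# Demo with two sample lines fed through standard input.
printf '%s\n' \
  'testsuite name="mytest" cars="201" users="0" bus="0" bike="0" time="116.103016"' \
  'testsuite name="mytest" cars="201" users="1" bus="1" bike="2" time="116.103016"' |
  sum_counts
# prints: cars=402 users=1 bus=1 bike=2
```

With real files it would be called as sum_counts my-output*.txt. Unlike the solution above, the output order is fixed and the values are printed without quotes.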
2nd solution: In GNU awk only, using the power of the RT and RS variables. This also ensures that the values appear in the output in the same order in which they appeared in the input.
awk -v s1="\"" -v RS='[[:space:]][^=]*="[^"]*"' '
RT{
  gsub(/^ +|"/,"",RT)
  num=split(RT,arr,"=")
  if(arr[1]!="time" && arr[1]!="name"){
    if(!(arr[1] in values)){
      indexes[++count]=arr[1]
    }
    values[arr[1]]+=arr[2]
  }
}
END{
  for(i=1;i<=count;i++){
    val=(val?val OFS:"") (indexes[i]"=" s1 values[indexes[i]] s1)
  }
  print val
}
' *.txt
Using awk
$ cat script.awk
BEGIN {
    FS="[= ]"
}
{
    gsub(/"/,"")
    for (i=1;i<NF;i++)
        if ($i=="cars") cars+=$(i+1)
        else if($i=="users") users+=$(i+1);
        else if($i=="bus") bus+=$(i+1);
        else if ($i=="bike")bike+=$(i+1)
}
END {
    print "cars="cars,"users="users,"bus="bus,"bike="bike
}
To run the script, you can use:
$ awk -f script.awk my-output*.txt
Or, as an ugly one-liner:
$ awk -F"[= ]" '{gsub(/"/,"");for (i=1;i<NF;i++) if ($i=="cars") cars+=$(i+1); else if($i=="users") users+=$(i+1); else if($i=="bus") bus+=$(i+1); else if ($i=="bike")bike+=$(i+1)}END{print"cars="cars,"users="users,"bus="bus,"bike="bike}' my-output*.txt
I found a way to do it, although it is a bit long:
awk '/<testsuite name=/{print $3, $4, $5, $6}' my-output*.xml | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | awk '{cars+=$1;users+=$2;bus+=$3;bike+=$4 }END{print "cars=" cars " users=" users " bus=" bus " bike=" bike}'
M. Nejat Aydin's answer was a good fit:
awk -F '[ "=]+' '/testsuite name=/{ cars+=$5; users+=$7; buses+=$9; bikes+=$11 } END{ print "cars="cars, "users="users, "buses="buses, "bikes="bikes }' my-output*.xml
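To see why that command reads the counts from $5, $7, $9 and $11, it can help to print every field together with its number, using the same field separator. The snippet below is only an illustrative sketch; show_fields is a hypothetical helper name, and the sample line is the one from the question.

```shell
# show_fields [FILE...]: print each field with its number, splitting on runs of
# space, double quote, or "=" (the same -F value as the one-liner above).
show_fields() {
  awk -F '[ "=]+' '{ for (i = 1; i <= NF; i++) printf "%d:%s\n", i, $i }' "$@"
}

# Sample line taken from the question, fed through standard input.
printf '%s\n' 'testsuite name="mytest" cars="201" users="0" bus="0" bike="0" time="116.103016"' |
  show_fields
```

In the listing, the key names cars, users, bus and bike land in fields 4, 6, 8 and 10, and their values in fields 5, 7, 9 and 11. If the attribute order or count ever changes, these fixed positions stop being valid.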
You can try rquery for this kind of query.
[ rquery]$ echo 'testsuite name="mytest" cars="201" users="0" bus="0" bike="0" time="116.103016"' > files/output1.txt
[ rquery]$ echo 'testsuite name="mytest" cars="201" users="1" bus="1" bike="2" time="116.103016"' > files/output2.txt
[ rquery]$ echo 'testsuite name="mytest" cars="301" users="10" bus="21" bike="23" time="116.103016"' > files/output3.txt
[ rquery]$ ./rq -q "p d/ /|s 'cars='+sum(trim(substr(@3,strlen('cars=')),'\"')),'users='+sum(trim(substr(@4,strlen('users=')),'\"')),'bus='+sum(trim(substr(@5,strlen('bus=')),'\"')),'bikes='+sum(trim(substr(@6,strlen('bikes=')),'\"'))" files/
cars=703 users=11 bus=22 bikes=25
Check out the latest rquery here: https://github.com/fuyuncat/rquery/releases
You may use this awk solution:
awk '{
   for (i=1; i<=NF; ++i)
      if (split($i, a, /=/) == 2) {
         gsub(/"/, "", a[2])
         sums[a[1]] +=a[2]
      }
}
END {
   for (i in sums) print i "=" sums[i]
}' file*
bus=15
cars=464
users=8
bike=130
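Because for (i in sums) visits keys in an unspecified order, the lines above can come out in any order, and on the raw files the name and time attributes would be summed as well. A possible variant, sketched below under the assumption that only integer-valued attributes should be counted, remembers the order in which keys first appear and prints everything on one line. The function name sum_all_counts is hypothetical.

```shell
# sum_all_counts [FILE...]: sum every key="integer" pair, keeping first-seen order
# (hypothetical helper). Reads standard input when no files are given.
sum_all_counts() {
  awk '
  {
    for (i = 1; i <= NF; ++i) {
      if (split($i, a, "=") != 2) continue
      gsub(/"/, "", a[2])
      # Skip non-integer values such as name="mytest" and time="116.103016".
      if (a[2] !~ /^[0-9]+$/) continue
      if (!(a[1] in sums)) order[++n] = a[1]
      sums[a[1]] += a[2]
    }
  }
  END {
    for (k = 1; k <= n; k++)
      printf "%s%s=%d", (k > 1 ? " " : ""), order[k], sums[order[k]]
    if (n) print ""
  }' "$@"
}

# Demo with two of the extracted lines from the question.
printf '%s\n' \
  'cars="193" users="2" bus="0" bike="0"' \
  'cars="23" users="2" bus="10" bike="7"' |
  sum_all_counts
# prints: cars=216 users=4 bus=10 bike=7
```

With real files it would be called as sum_all_counts my-output*.txt.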