
Bash: how to filter an input file

I have a fairly large file that contains a lot of unnecessary output, and I need to filter it. The part I am interested in looks like this:

 iteration # 20     ecut=    36.00 Ry     beta=0.10
 Davidson diagonalization with overlap
 ethr =  4.07E-13,  avg # of iterations =  3.8

 total cpu time spent up to now is   351441.3 secs

 End of self-consistent calculation

 Number of k-points >= 100: set verbosity='high' to print the bands.

 highest occupied, lowest unoccupied level (ev):     2.2896    4.1062

Okay, so I need to calculate Bg = ELUMO − EHOMO, where EHOMO is the highest occupied level and ELUMO is the lowest unoccupied level. I want the output to look like this:

Iteration #<number>
Bg=xxx

My two questions: 1. I can grep for the string 'highest', which gives me every line like:

 highest occupied, lowest unoccupied level (ev):     2.3005    4.0791

But how can I store the highest occupied and lowest unoccupied levels in variables?

2. Since not every iteration prints the highest occupied and lowest unoccupied levels (I want to skip those iterations), how should I grep/find so that I always get the iteration number together with the levels?

Using awk

awk '/^iteration|highest/{if ($0 ~ "iteration") gsub(/ecut.*/ , "", $0); if ($0 ~ "highest") $0=($NF-$(NF-1)); print}' < file.txt

Demo:

$ cat file.txt
iteration # 20     ecut=    36.00 Ry     beta=0.10
 Davidson diagonalization with overlap
 ethr =  4.07E-13,  avg # of iterations =  3.8

 total cpu time spent up to now is   351441.3 secs

 End of self-consistent calculation

 Number of k-points >= 100: set verbosity='high' to print the bands.

 highest occupied, lowest unoccupied level (ev):     2.2896    4.1062
$ awk '/^iteration|highest/{if ($0 ~ "iteration") gsub(/ecut.*/,"",$0); if ($0 ~ "highest") $0=($NF-$(NF-1)); print}' < file.txt
iteration # 20     
1.8166
$

Explanation:

/^iteration|highest/ -- Select only rows that start with "iteration" or contain "highest"
($0 ~ "iteration") -- $0 is the entire row; check whether the row matches the pattern "iteration"
gsub(/ecut.*/ , "", $0) -- Delete everything from "ecut" to the end of the row
NF --> number of fields in the row
$NF --> last field (lowest unoccupied level)
$(NF-1) --> second-to-last field (highest occupied level)
$0=($NF-$(NF-1)) -- Replace the current row with the last field minus the second-to-last field, i.e. Bg = ELUMO − EHOMO
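
To get exactly the "Iteration #<number>" / "Bg=xxx" layout from the question, and to skip iterations that never print the levels, one possible sketch is to remember the most recent iteration number and print only when a "highest occupied, lowest unoccupied" line shows up. Everything named here is an assumption for illustration: the file name sample.out, the helper functions band_gaps and last_levels, the made-up values for iteration 19, and the four-decimal output format. It compares fields rather than anchoring on ^iteration, so leading spaces in the real output should not matter.

```shell
#!/bin/sh
# Hypothetical sample (sample.out is an assumed name); iteration 19 prints no levels.
cat > sample.out <<'EOF'
 iteration # 19     ecut=    36.00 Ry     beta=0.10
 ethr =  1.00E-10,  avg # of iterations =  3.0
 iteration # 20     ecut=    36.00 Ry     beta=0.10
 ethr =  4.07E-13,  avg # of iterations =  3.8
 highest occupied, lowest unoccupied level (ev):     2.2896    4.1062
EOF

# band_gaps (hypothetical helper): remember the latest iteration number and
# print it, with Bg = ELUMO - EHOMO, only when a levels line is found.
band_gaps() {
    awk '
        $1 == "iteration" && $2 == "#" { iter = $3 }
        /highest occupied, lowest unoccupied/ {
            printf "Iteration #%s\nBg=%.4f\n", iter, $NF - $(NF-1)
        }
    ' "$@"
}

# last_levels (hypothetical helper): print "EHOMO ELUMO" from the last levels line.
last_levels() {
    awk '/highest occupied, lowest unoccupied/ { h = $(NF-1); l = $NF }
         END { print h, l }' "$@"
}

band_gaps sample.out

# Question 1: store the two levels in shell variables.
read -r homo lumo <<EOF
$(last_levels sample.out)
EOF
echo "EHOMO=$homo ELUMO=$lumo"
```

On the sample above, iteration 19 is skipped because no levels line follows it before iteration 20 starts. If the levels should be kept for every iteration rather than only the last one, the arithmetic is probably best left inside awk, as in band_gaps.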
