
Search for a specific set of words in a .txt file

I have a .txt file with the following dataset as a list:

Name:AP_A
Ch:0
Ptx:20
CCA:-68
AvgThroughput:{}
Data packets_sent:{}
Data_packets lost:{}
rts_cts_sent:{}
rts_cts_lost:{}
in-degA:0.0006766529737718963
out-degA:1.1814245426625214
-----------------
Name:AP_B
Ch:0
Ptx:5
CCA:-90
AvgThroughput:{}
Data packets_sent:{}
Data_packets lost:{}
rts_cts_sent:{}
rts_cts_lost:{}
in-degB:1.6025829114087657
out-degB:0.0006766529737718963

I need to search for these lines/words and output them as the following dataset:

---AP_A data---
Name:AP_A
in-degA:0.0006766529737718963
out-degA:1.1814245426625214
---AP_B data---
Name:AP_B
in-degB:1.6025829114087657
out-degB:0.0006766529737718963

I have some code that attempts this, but I can't get it to produce what I described:

archivo_ficha= "ficha_nodos_triang28.txt"
with open(archivo_ficha,'r') as inputfile:
     lines = []
     for line in inputfile:
         lines.append(line)

         search_words1=['Name:AP_A','in-degA','out-degA','Name:AP_B','in-degB','out-degB']
         for line in inputfile:
             if any(word in line  for word in search_words1):
                print("---datos_NodoA---")
                print(line)

                print("---datos_NodoB---")
                print(line)

Thanks in advance.

You know that there are two data blocks, A and B, and that every line you are interested in contains either "AP_X" or "degX". You also want to print a header that says which block you are entering.

Each block starts with "Name:AP_X".

Set the "write" flags for A and B to False. When you first meet "Name:AP_A", turn write_A on and keep write_B off, then print the header. The header is not printed twice, because it is printed only when write_A is False and "AP_A" is in the line. After that, write the lines containing the labels of interest.

archivo_ficha = "ficha_nodos_triang28.txt"

with open(archivo_ficha, 'r') as inputfile:

    write_A = False; write_B = False; out_list = []

    for line in inputfile:

        # First line mentioning AP_A: print the header once and switch to block A
        if 'AP_A' in line and not write_A:
            out_list.append("---datos_NodoA---"); print(out_list[-1])
            write_A = True; write_B = False

        # Inside block A, keep the Name line and the in/out degree lines
        if write_A and ('AP_A' in line or 'degA' in line):
            out_list.append(line.strip()); print(out_list[-1])

        # First line mentioning AP_B: print the header once and switch to block B
        if 'AP_B' in line and not write_B:
            out_list.append("---datos_NodoB---"); print(out_list[-1])
            write_B = True; write_A = False

        # Inside block B, keep the Name line and the in/out degree lines
        if write_B and ('AP_B' in line or 'degB' in line):
            out_list.append(line.strip()); print(out_list[-1])

# No explicit close() is needed: the with block closes the file

Output:

---datos_NodoA---
Name:AP_A
in-degA:0.0006766529737718963
out-degA:1.1814245426625214
---datos_NodoB---
Name:AP_B
in-degB:1.6025829114087657
out-degB:0.0006766529737718963
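The same idea can be written without one flag per node by deriving the header from the Name line itself. This is only a minimal sketch: it assumes every record starts with a `Name:AP_<letter>` line and that the wanted fields start with `in-deg` or `out-deg`. `extract_nodes` is a hypothetical helper name, and the sample lines stand in for the real file.

```python
def extract_nodes(lines):
    """Return a header, the Name line and the in/out-degree lines per record."""
    out_list = []
    for raw in lines:
        line = raw.strip()
        if line.startswith("Name:AP_"):
            # A new record starts: build its header, e.g. ---datos_NodoA---
            node = line[len("Name:AP_"):]
            out_list.append(f"---datos_Nodo{node}---")
            out_list.append(line)
        elif line.startswith(("in-deg", "out-deg")):
            out_list.append(line)
    return out_list


# Shortened stand-in for the contents of the input file
sample = """Name:AP_A
Ch:0
in-degA:0.0006766529737718963
out-degA:1.1814245426625214
-----------------
Name:AP_B
Ch:0
in-degB:1.6025829114087657
out-degB:0.0006766529737718963
""".splitlines()

result = extract_nodes(sample)
print("\n".join(result))
```

With a real file, the open file object can be passed directly in place of `sample`, since it also iterates line by line.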

As PaulProgrammer suggested, you can use regular expressions. In Python:

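import re

archivo_ficha = "ficha_nodos_triang28.txt"
# Search every line; the with block makes sure the file is closed afterwards
with open(archivo_ficha, 'r') as inputfile:
    matches = [re.search(r"(Name|(in|out))(.+)", line) for line in inputfile]
# Drop the lines that did not match and keep the matched text of the rest
matches = [m.group() for m in matches if m]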

matches is a list from which you can extract the necessary data:

['Name:AP_A',
 'in-degA:0.0006766529737718963',
 'out-degA:1.1814245426625214',
 'Name:AP_B',
 'in-degB:1.6025829114087657',
 'out-degB:0.0006766529737718963']

These could then be split into groups of 3 and produce the output you desire.
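One possible way to do that grouping is sketched below. It assumes the list always holds complete records of exactly three entries in the order Name, in-deg, out-deg, and it hard-codes the `matches` list shown above so that the snippet runs on its own.

```python
matches = [
    'Name:AP_A',
    'in-degA:0.0006766529737718963',
    'out-degA:1.1814245426625214',
    'Name:AP_B',
    'in-degB:1.6025829114087657',
    'out-degB:0.0006766529737718963',
]

output = []
# Walk the list in steps of 3: each chunk is [Name, in-deg, out-deg]
for i in range(0, len(matches), 3):
    group = matches[i:i + 3]
    node = group[0].split(":", 1)[1]   # "Name:AP_A" -> "AP_A"
    output.append(f"---{node} data---")
    output.extend(group)

print("\n".join(output))
```

If a record could be missing a field, fixed-size chunks would drift out of alignment, so starting a new group at each `Name:` line would be the safer choice.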

Explanation:

re.search scans through a string looking for the first location where the pattern matches. Here the pattern is (Name|(in|out))(.+) .

  • The first part Name|(in|out) means:
    1. Try to match Name
    2. If that fails, try to match in or out
    3. If one of them matches, matching continues with the second part. Otherwise re.search returns None for that line, and the line is filtered out afterwards.
  • The second part (.+) consists of special characters that match the rest of the line:
    • . matches any character (except a newline)
    • + repeats the preceding element ( . ) 1 or more times
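Note that re.search is not anchored to the start of the line, so `in` or `out` can also match in the middle of an unrelated field name. The sketch below compares the pattern from this answer with a stricter variant applied through re.match. The stricter pattern is an assumption about the field names, and the `min_rate:6` line is a hypothetical field that does not appear in the original file.

```python
import re

loose = re.compile(r"(Name|(in|out))(.+)")            # pattern from the answer
strict = re.compile(r"(Name|in-deg|out-deg)\w*:.+")   # assumed stricter variant

lines = [
    "Name:AP_A",
    "Ch:0",
    "in-degA:0.0006766529737718963",
    "out-degA:1.1814245426625214",
    "min_rate:6",   # hypothetical extra field, not in the original file
]

# search() may match anywhere in the line; match() only at its start
loose_hits = [m.group() for m in map(loose.search, lines) if m]
strict_hits = [m.group() for m in map(strict.match, lines) if m]

print(loose_hits)
print(strict_hits)
```

On the sample file from the question both patterns select the same six lines; the difference only shows up once other fields happen to contain `in` or `out`.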
