
Fetching data from a text file in Python using the re module

I have a text file named file.txt which looks like this:

0,  1,  2.     |classes
A0: 1, 2, 3
A1: 1, 2, 3
A2: 1, 2, 3
A3: 1, 2, 3, 4
| Final Pseudo Deletion Count is 0.  Optimisaiton not possible.

From this file I just want to extract the attribute names, that is: A0, A1, A2, A3. How can I do it?

For this particular file the names are only A0, A1, A2, A3, but I want the solution to work for files in general, which can contain A0, A1, ..., An. For example:

0,  1,  2.     |classes
A0: 1, 2, 3
A1: 1, 2, 3
A2: 1, 2, 3
A3: 1, 2, 3, 4
A4: 1, 2, 3
A5: 1, 2, 3, 4
| Final Pseudo Deletion Count is 0.  Optimisaiton not possible.

So in this case the output will contain A0, A1, A2, A3, A4, A5.

This is what I have tried:

f = open('filename1.txt')
attrib1 = f.readline()
    
attrib = []
for i in range(1, len(attrib1)-1):
    v_pos_colon = attrib1[i].find(':')
    attrib.append(attrib[i][0:v_pos_colon])
print(attrib)

You're looping over the characters in the first line of the file, not looping over the lines of the file.

find() returns -1 when the string isn't found. So when there's no : in the line, you're adding the slice [0:-1], which is everything except the last character. (The code also indexes attrib[i], the empty result list, where attrib1[i] was presumably meant.) You should first test whether the character was found.

attrib = []
with open('filename1.txt') as f:
    for line in f:
        if ':' in line:
            attrib.append(line.split(':')[0])
print(attrib)
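
Since the question title asks about the re module, here is a minimal sketch of a regex-based variant. It assumes every attribute name has the form A followed by digits (A0, A1, ..., An) and sits at the start of a line; the names ATTR_PATTERN and attribute_names are made up for this illustration.

```python
import re

# Assumption: attribute names look like A0, A1, ..., An and start a line.
# re.MULTILINE makes ^ match at the start of every line, not just the text.
ATTR_PATTERN = re.compile(r'^(A\d+):', re.MULTILINE)

def attribute_names(text):
    """Return the attribute names found at the start of lines in text."""
    # With a single capture group, findall returns the captured strings.
    return ATTR_PATTERN.findall(text)

sample = """0,  1,  2.     |classes
A0: 1, 2, 3
A1: 1, 2, 3
| Final Pseudo Deletion Count is 0.  Optimisaiton not possible.
"""
print(attribute_names(sample))  # ['A0', 'A1']
```

To run it on the real file, read the whole file into a string with f.read() and pass that to attribute_names. Unlike the split(':') version, this ignores any other line that happens to contain a colon, at the cost of hard-coding the naming scheme.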


 