
How do I read a file line by line and print only the lines that contain a specific string in Python?

I have a text file containing these lines:

wbwubddwo 7::a number1 234 **
/// 45daa;: number2 12

time 3:44

For example, if the program finds the string number1, it should print 234.

I started with the simple script below, but it did not print what I wanted.

with open("test.txt", "rb") as f:
    lines = f.read()
    word = ["number1", "number2", "time"]
    if any(item in lines for item in word):
        val1 = lines.split("number1 ", 1)[1]
        print val1

This returns the following result:

234 **
/// 45daa;: number2 12

time 3:44

Then I tried changing f.read() to f.readlines(), but this time it did not print anything.

Does anyone know another way to do this? Eventually I want to get the value from each line (for example 234, 12 and 3:44) and store it in a database.

Thank you for your help. I really appreciate it.

Explanations given below:

with open("test.txt", "r") as f:
    lines = f.readlines()
    stripped_lines = [line.strip() for line in lines]

words = ["number1", "number2", "time"]
for a_line in stripped_lines:
    for word in words:
        if word in a_line:
            number = a_line.split()[1]
            print(number)

1) First of all, 'rb' opens the file in binary mode, so in Python 3 you get bytes objects (something like b'number1 234'). Use 'r' to get str objects.

2) readlines() returns the lines as a list, something like this:

['number1 234\r\n', 'number2 12\r\n', '\r\n', 'time 3:44']

Notice the \r\n at the end of the lines. It marks the line ending (in text mode you may see just \n). Use strip() to remove it.

3) Take each line from stripped_lines and each word from words, and check whether the word is present in the line using in.

4) a_line would be 'number1 234', but we only want the number part. split() on it gives

['number1', '234'] and split()[1] is the element at index 1 (the second element).

5) You can also check whether a string consists only of digits using your_string.isdigit().
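
To see why the f.readlines() attempt from the question printed nothing, here is a minimal sketch. The sample lines are copied from the question and the variable names are only for illustration. On a string, in does a substring search, but on a list it looks for an element equal to the whole word, so the check has to be applied to each line separately.

```python
# Sample lines as readlines() would return them (content taken from the question).
lines = [
    "wbwubddwo 7::a number1 234 **\n",
    "/// 45daa;: number2 12\n",
    "\n",
    "time 3:44\n",
]
whole_text = "".join(lines)  # roughly what f.read() would return

# On a string, `in` is a substring test.
found_in_text = "number1" in whole_text

# On a list, `in` compares against whole elements; no element equals "number1".
found_in_list = "number1" in lines

# Applying the substring test line by line finds the matching line.
matching = [line.strip() for line in lines if "number1" in line]

print(found_in_text, found_in_list, matching)
```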

UPDATE: Since you updated your question and input file, this works:

import time

def isTimeFormat(input):
    try:
        time.strptime(input, '%H:%M')
        return True
    except ValueError:
        return False

with open("test.txt", "r") as f:
    lines = f.readlines()
    stripped_lines = [line.strip() for line in lines]

words = ["number1", "number2", "time"]
for a_line in stripped_lines:
    for word in words:
        if word in a_line:
            tokens = a_line.split()
            # use the last token if it is a number or a time, otherwise the one before it
            number = tokens[-1] if (tokens[-1].isdigit() or isTimeFormat(tokens[-1])) else tokens[-2]
            print(number)

Why the isTimeFormat() function?

def isTimeFormat(input):
    try:
        time.strptime(input, '%H:%M')
        return True
    except ValueError:
        return False

It checks whether a value such as 3:44 or 4:55 is in a time format, since you are treating those as values too. Final output:

234
12
3:44
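
The time check can also be tried in isolation. This is a small sketch, not part of the answer's script: the snake_case name is_time_format and the results dictionary are introduced here for illustration, and the tokens are taken from the sample file. time.strptime raises ValueError when the text does not match the '%H:%M' format, which is what the function relies on.

```python
import time

def is_time_format(text):
    """Return True if text parses as hours:minutes, e.g. '3:44'."""
    try:
        time.strptime(text, "%H:%M")
        return True
    except ValueError:
        return False

# Tokens that appear at the end of lines in the sample file.
results = {token: is_time_format(token) for token in ["3:44", "234", "**", "12"]}
print(results)
```

Only 3:44 passes the check. The plain numbers are caught by isdigit() instead, and ** fails both tests, which is why the script falls back to the second-to-last token on that line.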

After some trial and error, I found the solution below. It is based on the answer provided by @s_vishnu.

with open("test.txt", "r") as f:
    lines = f.readlines()
    stripped_lines = [line.strip() for line in lines]

    for item in stripped_lines:
        # take the text after each keyword, up to the next space
        if "number1" in item:
            getval = item.split("number1 ")[1].split(" ")[0]
            print(getval)

        if "number2" in item:
            getval2 = item.split("number2 ")[1].split(" ")[0]
            print(getval2)

        if "time" in item:
            getval3 = item.split("time ")[1].split(" ")[0]
            print(getval3)

Output:

234
12
3:44

This way, I can also do other things, for example saving each value to a database.
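
As a rough sketch of the database step, assuming SQLite through Python's built-in sqlite3 module and a hypothetical table named readings (neither is specified anywhere in this thread):

```python
import sqlite3

# Values as extracted from the sample file; the keyword/value pairing is illustrative.
extracted = [("number1", "234"), ("number2", "12"), ("time", "3:44")]

# Assumption: an in-memory SQLite database and a hypothetical table "readings".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (name TEXT, value TEXT)")

# Parameterized inserts keep the extracted text out of the SQL string itself.
conn.executemany("INSERT INTO readings (name, value) VALUES (?, ?)", extracted)
conn.commit()

rows = conn.execute("SELECT name, value FROM readings ORDER BY rowid").fetchall()
print(rows)
conn.close()
```

A real setup would connect to a file or a database server instead of ":memory:", and the column types would depend on how the values are used later.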

I am open to any suggestion to further improve my answer.

You're overthinking this. Assuming the two asterisks at the end of the first line are not really there, and you want to print the value from every line that contains one of your search strings, you can read the file line by line, check whether any of the search strings occurs in the line, and print the last value (the text between the last space and the end of the line). There is no need to parse or split the whole line:

search_values = ["number1", "number2", "time"]  # values to search for

with open("test.txt", "r") as f:  # open your file
    for line in f:  # read it line by line
        if any(value in line for value in search_values):  # check for search_values in line
            print(line[line.rfind(" ") + 1:].rstrip())  # print the last value after space

Which will give you:

234
12
3:44

If you do have the asterisks, you have to define your file format more precisely, as splitting won't necessarily yield the value you want.
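
One way to pin the format down, as a sketch only: assume the value is always the whitespace-separated token that directly follows the keyword. The helper name value_after is made up for this example, the sample lines come from the question, and unlike the code above it matches whole tokens rather than substrings.

```python
def value_after(line, keywords):
    """Return the token that follows the first keyword found in line, or None."""
    tokens = line.split()
    # Skip the last token: nothing can follow it.
    for i, token in enumerate(tokens[:-1]):
        if token in keywords:
            return tokens[i + 1]
    return None

search_values = ["number1", "number2", "time"]
sample_lines = [
    "wbwubddwo 7::a number1 234 **",
    "/// 45daa;: number2 12",
    "",
    "time 3:44",
]

found = (value_after(line, search_values) for line in sample_lines)
values = [value for value in found if value is not None]
print(values)  # ['234', '12', '3:44']
```

Under that assumption the trailing asterisks no longer matter, because the position of the value is defined relative to the keyword and not to the end of the line.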
