
Search external file for specific word and store the very next word in variable in Python

I have a file that has a line similar to this:

"string" "playbackOptions -min 1 -max 57 -ast 1 -aet 57

Now I want to search the file, extract the value after "-aet" (57 in this case), and store it in a variable.

I'm using

import mmap

with open('file.txt') as f:
    s = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    if s.find(b'-aet') != -1:  # In Python 3, mmap.find() needs a bytes pattern
        print('true')

for searching, but I can't get any further than this.

I suggest using regular expressions to extract the value:

import re

# Open the file for reading
with open("file.txt", "r") as f:
    # Loop through all the lines:
    for line in f:
        # Find an exact match
        # ".*" skips other options,
        # (?P<aet_value>\d+) makes a search group named "aet_value"
        # if you need other values from that line just add them here
        line_match = re.search(r"\"string\" \"playbackOptions .* -aet (?P<aet_value>\d+)", line)
        # No match, search next line
        if not line_match:
            continue
        # We know it's a number so it's safe to convert to int
        aet_value = int(line_match.group("aet_value"))
        # Do whatever you need
        print("Found aet_value: {}".format(aet_value))
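
If you would rather stay with the mmap approach from the question, the position returned by find() can be used to slice out the rest of the line and take the first token after the tag. This is only a sketch: the helper name find_value_after is made up for illustration, and it assumes the value sits on the same line as the tag and is a plain integer.

```python
import mmap


def find_value_after(path, tag=b"-aet"):
    """Return the integer that follows the first occurrence of `tag`, or None.

    Hypothetical helper, not part of any library. Assumes the value is on
    the same line as the tag and is separated from it by whitespace.
    """
    with open(path, "rb") as f:
        # Note: mmap raises ValueError if the file is empty.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            pos = mm.find(tag)
            if pos == -1:
                return None  # Tag not present in the file.

            # Slice from just after the tag to the end of that line.
            start = pos + len(tag)
            end = mm.find(b"\n", start)
            if end == -1:
                end = len(mm)

            tokens = mm[start:end].split()
            if not tokens:
                return None  # Tag was the last thing on the line.
            try:
                return int(tokens[0])
            except ValueError:
                return None  # The next token is not an integer.
```

Calling find_value_after("file.txt") then gives you the value to store in a variable, or None if the tag is missing. Note that a plain find() also matches longer options that merely start with "-aet", so the regex above is stricter.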


Here's another approach using native string and list methods, as I usually forget regex syntax when I haven't touched it in a while:

tag = "-aet"  # Define what tag we're looking for.

with open("file.txt", "r") as f:  # Read file.
    for line in f:  # Loop through every line.
        line_split = line.split()  # Split line by whitespace.

        if tag in line_split and line_split[-1] != tag:  # Check if the tag exists and that it's not the last element.
            try:
                index = line_split.index(tag) + 1  # Get the tag's index and increase by one to get its value.
                value = int(line_split[index])  # Convert string to int.
            except ValueError:
                continue  # We use try/except in case the value cannot be cast to an int. This may be omitted if the data is reliable.

            print(value)  # Now you have the value.

I haven't benchmarked it, but regex matching is often slower than plain string methods, so this approach may be faster, especially on a large file.
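
A rough way to test that guess is to time both approaches with timeit on in-memory lines. The sketch below is only illustrative: the function names and the synthetic filler lines are assumptions, and the result will depend on your Python version and on what the real file looks like.

```python
import re
import timeit

TAG = "-aet"
PATTERN = re.compile(r"-aet (\d+)")  # Simplified pattern, compiled once.


def extract_regex(lines):
    """Return the first integer following the tag, using a regex."""
    for line in lines:
        match = PATTERN.search(line)
        if match:
            return int(match.group(1))
    return None


def extract_split(lines):
    """Return the first integer following the tag, using split() and index()."""
    for line in lines:
        parts = line.split()
        # Only look at positions that have a following element.
        if TAG in parts[:-1]:
            try:
                return int(parts[parts.index(TAG) + 1])
            except ValueError:
                continue
    return None


# Synthetic data (an assumption): filler lines, then the line from the question.
lines = ["some other line without the tag"] * 20000
lines.append('"string" "playbackOptions -min 1 -max 57 -ast 1 -aet 57')

if __name__ == "__main__":
    for fn in (extract_regex, extract_split):
        seconds = timeit.timeit(lambda: fn(lines), number=5)
        print("{}: {:.3f}s".format(fn.__name__, seconds))
```

Both functions work on any iterable of lines, so you can also pass them an open file object to measure against your real data.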
