Checking a list of files for a string in Python 3

I'm trying to get my code to go through a list of files and retrieve a piece of information from each. There are 618 .txt files in the list, and 606 of them contain the information I need. I need my code to check if each file contains the string "J magnitude" and if it does, retrieve the relevant value. If the string isn't there, I want the number -99.9 added instead so that my list is still 618 items long.

This is the code I have written so far:

def find_Jmag (files):
    mags = []
    for each in files:
        with open(each) as f:
            if "J magnitude" in f:
                for line in f:
                    if "J magnitude" in line:
                        mag = float((line.split()[4]))
                        mags.append(mag)
            else:
                mag = -99.9
                mags.append(mag)
    return mags
Jmags = np.array(find_Jmag(txtfiles))

The output I am getting now is:

[-99.9 -99.9 -99.9 ... -99.9 -99.9 -99.9]

which means that for some reason, every file is failing to meet the condition of having "J magnitude" in it, which is not right.

This is a sample of what each file looks like:

#  ----------------------------------------------------------------------------------
# SpeX prism spectrum of 2MASP J0345432+254023 (J03454316+2540233)
# Originally observed on 2003 Sep 05
# Average resolution = 75
# Originally published in Burgasser & McElwain (2006) AJ, 131, 1007
#
# PLEASE CITE THE ORIGINAL DATA REFERENCE WHEN PUBLISHING OR PRESENTING THESE DATA
#
# Optical spectral type: L0
# Near infrared spectral type: L1+/-1
# J magnitude = 13.997
# H magnitude = 13.211
# Ks magnitude = 12.672
#
#  Wavelength (micron)   F_lambda (normalized)  Noise (normalized as F_lambda)
#  ----------------------------------------------------------------------------------
0.657669    0.155371    0.0956746
0.659854    0.0718279   0.0411391
0.662031    -0.0147441  0.0684986
0.664202    -0.0543488  0.0497614

I'm not sure where I've gone wrong and any help would be appreciated!

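It looks like your if "J magnitude" in f: check always fails, because in applied to a file object compares each whole line to the string instead of searching inside the line. Instead of that check, set a flag to True inside if "J magnitude" in line: when the string is found, and append mag = -99.9 after the loop if the flag is still False.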

def find_Jmag (files):
    mags = []
    for each in files:
        with open(each) as f:
            is_found = False
            for line in f:
                if "J magnitude" in line:
                    is_found = True
                    mag = float((line.split()[4]))
                    mags.append(mag)
            if not is_found:
                mag = -99.9
                mags.append(mag)
    return mags
Jmags = np.array(find_Jmag(txtfiles))
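
To see why the original "J magnitude" in f test never succeeds, here is a minimal sketch. The temporary file and its contents are made up for illustration. Applied to a file object, in iterates over the file line by line and compares each whole line (trailing newline included) for equality with the string. That comparison never matches here, and the iteration also consumes the file, so a later for line in f: would find nothing left to read.

```python
import os
import tempfile

# Hypothetical two-line file standing in for one of the spectrum files.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("# J magnitude = 13.997\n# H magnitude = 13.211\n")
    path = tmp.name

with open(path) as f:
    # Each full line is compared to the string with ==, so this is False.
    found_in_file = "J magnitude" in f
    # The membership test read to the end of the file, so nothing is left.
    remaining = f.readlines()

with open(path) as f:
    # Substring test against each line, which is what was intended.
    found_in_lines = any("J magnitude" in line for line in f)

os.remove(path)
print(found_in_file, remaining, found_in_lines)  # False [] True
```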
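
Another option is a regular expression:

import re

def find_Jmag(files):
    mags = []
    # Capture whatever follows "J magnitude =" up to the end of the line.
    # The compiled pattern needs its own name; calling it "re" would shadow the module.
    pattern = re.compile(r'J magnitude =(.*)')
    for file in files:
        with open(file) as f:
            data = pattern.findall(f.read())
        if data:
            # The value is a decimal such as 13.997, so convert with float, not int
            mags.append(float(data[0].strip()))
        else:
            mags.append(-99.9)
    return mags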
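
As a quick check of the regular-expression idea, the sketch below runs a pattern against header lines taken from the sample file in the question. The pattern and the name JMAG_PATTERN are illustrative assumptions rather than part of the answer above. The pattern tolerates variable spacing around the equals sign and captures only the number.

```python
import re

# Illustrative pattern: optional sign, digits, optional decimal part.
JMAG_PATTERN = re.compile(r"J magnitude\s*=\s*([-+]?\d+(?:\.\d+)?)")

header = (
    "# Near infrared spectral type: L1+/-1\n"
    "# J magnitude = 13.997\n"
    "# H magnitude = 13.211\n"
)

match = JMAG_PATTERN.search(header)
# Fall back to the -99.9 sentinel when the line is missing.
jmag = float(match.group(1)) if match else -99.9
print(jmag)  # 13.997

# A header without the J magnitude line produces no match.
missing = JMAG_PATTERN.search("# H magnitude = 13.211\n")
print(-99.9 if missing is None else float(missing.group(1)))  # -99.9
```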

This is not working because if "J magnitude" in f: does not search the text of the file. Applied to a file object, in iterates over the lines and tests whether any whole line equals "J magnitude", which never happens, so the else branch runs every time. This can be fixed quite easily. I copied your file example into two files, file1.txt and file2.txt, and removed the "J magnitude" line from file2.txt so that it should return False. The key line for the fix is:

if "J magnitude" in open(each).read():

This reads the whole file into one string and does a substring search on it.

Here is the code:

file_list = ['file1.txt', 'file2.txt']

mags = []
for each in file_list:
    if "J magnitude" in open(each).read():  # substring search over the file's full text
        with open(each) as f:
            for line in f:
                if "J magnitude" in line:
                    mag = float((line.split()[4]))
                    mags.append(mag)
    else:
        mag = -99.9
        mags.append(mag)

print(mags)

Which prints:

[13.997, -99.9]
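
The file does not have to be read twice. A sketch of a single-pass variant follows. The helper name read_jmag and its default parameter are hypothetical, and the code assumes the "# J magnitude = <value>" layout shown in the question. Returning at the first match gives exactly one value per file, so the result stays the same length as the input list.

```python
import os
import tempfile

def read_jmag(path, default=-99.9):
    """Return the J magnitude from one file, or `default` if it is absent."""
    with open(path) as f:
        for line in f:
            if "J magnitude" in line:
                # Take the text after "=" rather than relying on a token index.
                return float(line.split("=", 1)[1])
    return default

def find_Jmag(files):
    return [read_jmag(path) for path in files]

# Small self-check with two throwaway files (contents are made up).
tmpdir = tempfile.mkdtemp()
with_mag = os.path.join(tmpdir, "with_mag.txt")
without_mag = os.path.join(tmpdir, "without_mag.txt")
with open(with_mag, "w") as f:
    f.write("# Optical spectral type: L0\n# J magnitude = 13.997\n")
with open(without_mag, "w") as f:
    f.write("# Optical spectral type: L0\n# H magnitude = 13.211\n")

print(find_Jmag([with_mag, without_mag]))  # [13.997, -99.9]
```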
