Let's say I have a root directory (folder) z with three subdirectories (folders) a, b, and c. Each of a, b, and c contains one CSV file; the files hold similar data and have similar names (a_data, b_data, and c_data). Of the three CSV files, only one contains the integer value 100 inside its data frame.
How can I design a loop that scans all the CSV files inside the three subfolders and tells me which CSV has the value 100?
Thanks a lot!
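import glob
import pandas as pd

val = 100
folder_path = "z"  # assumed: the root directory from the question; adjust to your own path

# Recursively collect every CSV file below the root directory
subdir_files = glob.glob(folder_path + '/**/*.csv', recursive=True)
for file in subdir_files:
    df = pd.read_csv(file)
    # 'column_name' is a placeholder for the column that may hold the value
    if val in df['column_name'].values:
        print(file)
        break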
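The question does not say which column holds the value, so here is a minimal sketch that checks every column instead of a single named one. The function name find_csvs_with_value and its parameters are made up for illustration. It also assumes the value is parsed as a number by pandas; a cell stored as the text "100" in a non-numeric column would not match.

```python
from pathlib import Path

import pandas as pd


def find_csvs_with_value(root, value):
    """Return the paths of all CSV files under root that contain value in any column.

    Hypothetical helper, not part of pandas.
    """
    matches = []
    # rglob walks the root directory and all of its subdirectories
    for path in sorted(Path(root).rglob("*.csv")):
        df = pd.read_csv(path)
        # isin() returns a boolean frame; the first any() reduces each column,
        # the second any() reduces across columns to a single bool
        if df.isin([value]).any().any():
            matches.append(path)
    return matches
```

Calling find_csvs_with_value("z", 100) would return a list of matching paths, which is empty when no file contains the value. Unlike the loop with break, this collects every match rather than stopping at the first one.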
I can't profile my idea at the moment, but I assume it is faster to open each file with pandas than to search through the text of the CSV before opening it in pandas. It will probably read better, too.
So, under the assumption that it's faster to open everything with pandas than to use something like the csv library, let's do:
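import pandas as pd

# Check each file in turn; only move on to the next one if 100 was not found.
# "column" is a placeholder for the column that may hold the value.
# Note: .any() tests whether at least one row equals 100 (.all() would require every row to).
df = pd.read_csv("~/z/a/a_data.csv")
if not df["column"].isin([100]).any():
    df = pd.read_csv("~/z/b/b_data.csv")
    if not df["column"].isin([100]).any():
        df = pd.read_csv("~/z/c/c_data.csv")
        if not df["column"].isin([100]).any():
            print("No value")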
Ultimately, nested ifs aren't pretty, but it's hard to find the right fit without seeing your code. If you can post your code, that would help. Otherwise, I hope the above helps you get started.
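For comparison, the other approach mentioned above (scanning the CSV text without pandas) could be sketched with the standard csv module as follows. This is only an illustration, and I have not timed it against the pandas version. The function names csv_contains and first_csv_with_value are made up, and the comparison is textual, so a cell written as 100.0 would not match 100.

```python
import csv
from pathlib import Path


def csv_contains(path, value):
    """Return True if any cell in the CSV file equals str(value).

    Hypothetical helper; compares cell text, not parsed numbers.
    """
    target = str(value)
    with open(path, newline="") as handle:
        for row in csv.reader(handle):
            # Strip surrounding whitespace so " 100" still counts as a match
            if any(cell.strip() == target for cell in row):
                return True
    return False


def first_csv_with_value(root, value):
    """Return the first CSV path under root containing value, or None."""
    for path in sorted(Path(root).rglob("*.csv")):
        if csv_contains(path, value):
            return path
    return None
```

Here first_csv_with_value("z", 100) returns the path of the first matching file, or None if no file contains the value. Because it stops reading a file at the first matching cell, it never has to load a whole file into memory.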
You can loop over your csv_files list like this, reading each using pandas.read_csv and finding the first one with the desired value. The else clause of the for loop will be executed if the loop ended normally (i.e. not on break), corresponding to none of the files containing the desired value.
import pandas as pd

csv_files = ["a/a.csv", "b/b.csv", "c/c.csv"]
found_df = None
for csv_file in csv_files:
    df = pd.read_csv(csv_file)
    if 100 in df["column"].values:
        found_df = df
        break
else:
    # Runs only when the loop finished without hitting break
    print("No value found")