I would like to concatenate a number of CSV files, keeping only the rows where a column matches a given value, in the fastest way possible. I have some code that works, but it concatenates all the lines of every CSV file before reducing the DataFrame down to the stations I need (by the value in the station_number column). I would like to select the rows I need first, before the concatenation, to improve the running time. Thank you for any suggestions!
import os
import pandas as pd

station = int(input("station number? "))

def Datastations(station, path):
    # gather every CSV file in the directory
    filepaths = [os.path.join(path, f) for f in os.listdir(path) if f.endswith('.csv')]
    # read and concatenate everything, then filter down to the one station
    df = pd.concat(map(pd.read_csv, filepaths))
    df = df[df.station_number == station]
    return df

df1 = Datastations(station, "refdata/obs")
df2 = Datastations(station, "refdata/BoM_ETA_20160501-20170430/obs")
You didn't say what you were having trouble with, so I can only reorder this for you:
import pandas as pd
import os

def Datastations(station, path):
    filepaths = [os.path.join(path, f) for f in os.listdir(path) if f.endswith('.csv')]

    def process_csv(file_name):
        # filter each file down to the station right after reading it,
        # so only matching rows are kept around for the final concat
        df = pd.read_csv(file_name)
        return df[df.station_number == station]

    return pd.concat(map(process_csv, filepaths))
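If the files are large, you can go a step further and stream each CSV in chunks, so rows that don't match the station never accumulate in memory at once. A minimal sketch of that idea (the `station_rows` name, the `temp` column, and the demo files are illustrative assumptions, not part of your data):

```python
import os
import tempfile
import pandas as pd

def station_rows(station, path, chunksize=100_000):
    """Concatenate only the matching rows, reading each CSV in chunks
    so non-matching rows are discarded as the file streams in."""
    parts = []
    for f in os.listdir(path):
        if not f.endswith('.csv'):
            continue
        for chunk in pd.read_csv(os.path.join(path, f), chunksize=chunksize):
            parts.append(chunk[chunk.station_number == station])
    return pd.concat(parts, ignore_index=True)

# demo with two small synthetic files
tmp = tempfile.mkdtemp()
pd.DataFrame({"station_number": [1, 2, 1], "temp": [10.0, 11.0, 12.0]}).to_csv(
    os.path.join(tmp, "a.csv"), index=False)
pd.DataFrame({"station_number": [2, 1], "temp": [13.0, 14.0]}).to_csv(
    os.path.join(tmp, "b.csv"), index=False)

df = station_rows(1, tmp)
print(len(df))  # 3 rows for station 1 across both files
```

Note that the real saving here is memory rather than parse time: pandas still has to read every line, but only the filtered subset is ever held for concatenation.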