
How to filter a pandas DataFrame based on user inputs from a config file (text or CSV), where the config gives the column and the values to filter on

I have a DataFrame created from a CSV file, and I need help filtering it based on inputs from a configuration file (which can be text or CSV). The config file contains the name of the column to filter on and the values or condition to filter by. This is the code I have so far:

import pandas as pd
import os
import time
import csv
import datetime
import sys


file_loc = sys.argv[1]

input_file_1 = 'mapping_config_1.txt'

file_det = os.path.join(file_loc, input_file_1)

file_details = pd.read_csv(file_det, header = 0, delimiter = "\t")

df = pd.read_csv(r'C:\filter\test.txt', sep = "|")

for index, row in file_details.iterrows():

  filter_col = row('Target_Column')

  filter = row['Filter']

  df = df.loc[df['filter_col'].isin(filter)]

  df.head(1000).to_csv(os.path.join(file_loc, 'output.txt'), sep = "|", index = False)

My DataFrame has a column named Client_Product, and my tab-delimited config text file looks like this:

Filter  Target_Column
10170   Client_Product

I am getting TypeError: 'Series' object is not callable

I am looking for any approach that lets me pass filter conditions from a config file to a Python program.

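The code looks basically OK. The error you're getting comes from the line filter_col = row('Target_Column'), where you are using parentheses () instead of square brackets []. Each row yielded by iterrows() is a Series, so row(...) tries to call it like a function.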
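To illustrate the difference, here is a minimal sketch that assumes a one-row config built in memory to mirror the file in the question (the config DataFrame and the call_error variable are made up for this example). It shows that calling the row raises the TypeError, while indexing it with square brackets returns the column name.

```python
import pandas as pd

# Hypothetical one-row config mirroring the file shown in the question.
config = pd.DataFrame({"Filter": [10170], "Target_Column": ["Client_Product"]})

for index, row in config.iterrows():
    # row is a pandas Series, so calling it like a function fails.
    try:
        row("Target_Column")
        call_error = None
    except TypeError as exc:
        call_error = str(exc)

    # Square brackets look the value up by label instead.
    filter_col = row["Target_Column"]

print(call_error)
print(filter_col)
```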

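Other issues: pd.Series.isin expects a list-like of values, but you're giving it a single value, and df['filter_col'] looks up a column literally named filter_col rather than the column named in the config. You can just replace the first three lines of the loop body with df = df.loc[df[row['Target_Column']] == row['Filter']]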

This also lets you avoid making a variable called filter, which shadows a Python builtin and is best avoided.
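Putting the suggestions together, a corrected loop might look like the sketch below. It is only an illustration under assumptions: the config and data are small made-up strings read through io.StringIO instead of the real files, the Amount column is invented, and the output is written to an in-memory buffer rather than output.txt.

```python
import io
import pandas as pd

# Made-up stand-ins for the question's files, kept in memory so the sketch runs anywhere.
config_text = "Filter\tTarget_Column\n10170\tClient_Product\n"
data_text = "Client_Product|Amount\n10170|5\n20300|7\n10170|9\n"

file_details = pd.read_csv(io.StringIO(config_text), sep="\t")
df = pd.read_csv(io.StringIO(data_text), sep="|")

for _, row in file_details.iterrows():
    # Use the column name stored in the config row, not the literal string 'filter_col'.
    df = df.loc[df[row["Target_Column"]] == row["Filter"]]

# Write once, after every config row has been applied.
out = io.StringIO()
df.to_csv(out, sep="|", index=False)
print(out.getvalue())
```

Each config row narrows df further, so several rows act as an AND of conditions. Note that the equality test only matches when the config value and the data column have comparable types; if the data column were read as strings, an integer filter value would match nothing.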
