Python pandas - Filter a data frame based on a pre-defined array

I'm trying to filter a data frame based on the contents of a pre-defined array.

I've tried several examples from Stack Overflow, but I always get an empty output.

I can't figure out what I'm doing incorrectly. Could someone give me some guidance?

import pandas as pd
import numpy as np

csv_path = 'history.csv'
df = pd.read_csv(csv_path)

pre_defined_arr = ["A/B", "C/D", "E/F", "U/Y", "R/E", "D/F"]
distinct_count_column_headers = ['Entity']

distinct_elements = pd.DataFrame(df.drop_duplicates().Entity.value_counts(), columns=distinct_count_column_headers)
filtered_data = distinct_elements[distinct_elements['Entity'].isin(pre_defined_arr)]

print("Filtered data ... ")
print(filtered_data)

OUTPUT

Filtered data ... 
Empty DataFrame
Columns: [Entity]
Index: []

I managed to do that using the filter function: .filter(items=pre_defined_arr)

import pandas as pd
import numpy as np

csv_path = 'history.csv'
df = pd.read_csv(csv_path)

pre_defined_arr = ["A/B", "C/D", "E/F", "U/Y", "R/E", "D/F"]
distinct_count_column_headers = ['Entity']

distinct_elements_filtered = pd.DataFrame(df.drop_duplicates().Entity.value_counts().filter(items=pre_defined_arr), columns=distinct_count_column_headers)
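
For anyone who wants to try this without history.csv, here is a minimal sketch of the same filter approach. The sample data and the Value column are made up for illustration, since the real file's contents are not shown. It also uses rename(...).to_frame() instead of pd.DataFrame(..., columns=...), on the assumption that this is safer across versions: from pandas 2.0 on, value_counts() names its result "count" rather than after the column, so building the frame by column name may not pick up the values.

```python
import pandas as pd

# Hypothetical stand-in for history.csv (the real data is not shown in the post).
df = pd.DataFrame({
    "Entity": ["A/B", "A/B", "C/D", "X/Z", "E/F", "X/Z"],
    "Value": [1, 2, 3, 4, 5, 6],
})

pre_defined_arr = ["A/B", "C/D", "E/F", "U/Y", "R/E", "D/F"]

# value_counts() returns a Series: entity labels in the index, counts as values.
counts = df.drop_duplicates()["Entity"].value_counts()

# Series.filter(items=...) keeps only the index labels that appear in the list.
filtered_counts = counts.filter(items=pre_defined_arr)

# Name the counts column explicitly ("Entity" mirrors the header used above).
filtered_df = filtered_counts.rename("Entity").to_frame()

print(filtered_df)
```

Labels from pre_defined_arr that never occur in the data (such as "U/Y" here) are simply left out rather than raising an error.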

Oddly, I came across only one answer that suggests the filter function. Roughly 9 out of 10 answers out there recommend the .isin function, which didn't work in my case.
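
A likely explanation for why .isin returned nothing: after value_counts(), the entity names live in the index and the values are the counts, so calling .isin on the values compares integers against strings like "A/B" and never matches. Applying .isin to the index (or to the original column before counting) should behave like filter. The sketch below uses made-up sample data, not the contents of history.csv.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"Entity": ["A/B", "A/B", "C/D", "X/Z", "E/F", "X/Z"]})
pre_defined_arr = ["A/B", "C/D", "E/F", "U/Y", "R/E", "D/F"]

counts = df["Entity"].value_counts()

# Compares the counts (integers) with the strings in the list: nothing matches.
by_values = counts[counts.isin(pre_defined_arr)]

# Compares the index labels (entity names) with the list: this is the intended test.
by_index = counts[counts.index.isin(pre_defined_arr)]

# Alternative: filter the original rows first, then count.
rows_first = df[df["Entity"].isin(pre_defined_arr)]["Entity"].value_counts()

print(by_values)
print(by_index)
print(rows_first)
```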
