
Using .str.contains to filter a df of FRED series

I am trying to download a data series for each state from the FRED API. I have loaded all the data series containing 'Housing Inventory: Active Listing Count state' into a DataFrame, but there are still 1,000+ rows. Is there a way I can search the title of each series to see if it contains the name of a state?

I have tried:

df=df.loc[df['title'].str.contains(["Alaska","Alabama",...,"Wyoming"])]

The series ID is ACTLISCOU followed by the state abbreviation (e.g. ACTLISCOUCA for California).

Assuming you have a list of all the states, you can define a custom function to filter your title column and apply it with pd.Series.apply:

state_list = ["Alaska","Alabama",...,"Wyoming"]
def my_filter(value):
    # return True if any state is in the value
    return any(state in value for state in state_list)

# apply returns a boolean Series; use it to keep only the matching rows
df_filtered = df[df['title'].apply(my_filter)]
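
Alternatively, a minimal sketch of what the original attempt was aiming for: str.contains expects a single string or regular expression, not a Python list, so the state names can be joined into one regex alternation. The small df below is a made-up stand-in for the FRED series metadata table; in practice you would use your own df and the full 50-state list.

import pandas as pd

# Hypothetical stand-in for the FRED series metadata DataFrame
df = pd.DataFrame({
    "id": ["ACTLISCOUCA", "ACTLISCOUAK", "ACTLISCOU0001"],
    "title": [
        "Housing Inventory: Active Listing Count in California",
        "Housing Inventory: Active Listing Count in Alaska",
        "Housing Inventory: Active Listing Count in Example County, CA",
    ],
})

state_list = ["Alaska", "Alabama", "California", "Wyoming"]  # full 50-state list in practice

# Join the names into one pattern, e.g. "Alaska|Alabama|California|Wyoming",
# and keep only the rows whose title mentions a full state name
pattern = "|".join(state_list)
df_filtered = df[df["title"].str.contains(pattern, regex=True)]
print(df_filtered)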

The following code returns the state contained in the ACTLISCOUXX dataset, in this case California:

import pandas as pd

# Read the downloaded ACTLISCOUCA.csv; the series title is expected in the
# second column of the first non-empty row
df = pd.read_csv('ACTLISCOUCA.csv', sep=';', header=None)
us_state_list = ["Arizona", "California", "Oregon"]
# Keep the first state name that appears in that title field
state = [s for s in us_state_list if s in df.dropna().iloc[0][1]][0]
print(state)

How it works

  1. The CSV file is imported as a pandas DataFrame.
  2. A list comprehension matches a list of US state names against the series title stored in the second column of the first row of the DataFrame, building an array of the states that appear (see the sketch after this list). This array should contain only one element if only one state is mentioned, and only that first element is saved in the state variable.
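
For illustration, here is step 2 in isolation; the title string below is a hypothetical example of what df.dropna().iloc[0][1] might contain:

us_state_list = ["Arizona", "California", "Oregon"]
title = "Housing Inventory: Active Listing Count in California"  # hypothetical title cell
# Collect every state name that appears in the title, then take the first match
matches = [s for s in us_state_list if s in title]
print(matches)     # ['California']
print(matches[0])  # California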
