Remove non-alphabetic strings in a DataFrame column

Let's pretend I have this column:

energy['Country'] = ['Brazil', 'France (2015)', np.nan, 'USA']

I have a DataFrame with a column of countries, and I want to remove the entries that contain numbers or parentheses. I am having trouble especially in the for loop that removes the value: it complains that an integer is required.

Energy_Supply = [type(x) == str for x in energy['Country']]  
es = energy['Country'].loc[Energy_Supply]
for k in es:
    if k.isalpha() == False:
        es.pop(k)
energy['Country'] = energy['Country'].where(energy['Country'].isin(es))

I would prefer it if you could show me a better and cleaner way to do this, and please explain it.

Try this:

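# Create a boolean indexer for alphabetic-only strings.
# This returns a pandas.Series that is True where the value is alphabetic.
# Non-string values such as NaN are mapped to False; isinstance needs a
# type like str as its second argument, not the float value NaN.
string_selector = energy['Country'].map(lambda x: x.isalpha()
                                                  if isinstance(x, str)
                                                  else False)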

# drop the rows that aren't alpha, notice the ~
energy.drop(energy[~string_selector].index, inplace=True)
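
The same filter can also be written without a lambda. What follows is only a minimal sketch, assuming the sample column from the question; the names `is_alpha`, `cleaned` and `masked` are made up for illustration. `Series.str.isalpha()` checks every string at once, and `.eq(True)` turns missing values into False so the result can be used as a mask. Note that `isalpha()` is also False for names containing spaces, so a value like 'United States' would be filtered out as well.

```python
import numpy as np
import pandas as pd

# Sample data taken from the question.
energy = pd.DataFrame({'Country': ['Brazil', 'France (2015)', np.nan, 'USA']})

# True only where the value is a string made of letters alone.
# Missing values do not compare equal to True, so they become False.
is_alpha = energy['Country'].str.isalpha().eq(True)

# Option 1: keep only the matching rows (returns a new DataFrame).
cleaned = energy[is_alpha]

# Option 2: keep every row but blank out the non-matching values,
# which is what the .where() call in the question was aiming for.
masked = energy['Country'].where(is_alpha)

print(cleaned)
print(masked)
```

Option 1 drops the rows, like the `drop` call above; option 2 keeps the shape of the DataFrame and leaves NaN in place of the rejected values.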
