Dataframes: Dropping rows ending in a specific string?

Question

I have the following dataframe:

nutsgdp
Out[77]: 
              2010       2011       2012  ...       2016       2017       2018
NUTS_ID                                   ...                                 
AT       295896.60  310128.70  318653.00  ...  357299.70  370295.80  385711.90
AT1      131114.27  136271.77  139149.68  ...  155609.11  159879.39  166443.24
AT11       6698.37    7012.58    7365.43  ...    8353.78    8771.65    9005.49
AT111       738.53     784.29     791.16  ...     923.96     996.55     996.55
AT112      3843.03    4028.02    4313.17  ...    4923.69    5165.46    5165.46
           ...        ...        ...  ...        ...        ...        ...
UKN15      3762.30    3604.13    4228.35  ...    5391.50    5089.14    4203.36
UKN16      2169.86    2162.22    2452.28  ...    2801.88    2801.14    2730.28
UKZ       30761.26   33592.50   32090.74  ...   13343.86   12887.29   20225.66
UKZZ      30761.26   33592.50   32090.74  ...   13343.86   12887.29   20225.66
UKZZZ     30761.26   33592.50   32090.74  ...   13343.86   12887.29   20225.66

[1794 rows x 9 columns]

I would like to drop all rows where the index is longer than 2 characters and ends in 'Z'. For example, that means dropping 'UKZ', 'UKZZ' and 'UKZZZ', but keeping 'CZ'. What would be the best way to do this? Thanks in advance for your help.

Answer 1

Use Series.str.contains (available on the index as df.index.str.contains), invert the mask with ~, and filter by boolean indexing:

df = df[~df.index.str.contains(r'.{2,}Z$')]

Or use Series.str.endswith with Series.str.len, keeping the rows whose label does not end in 'Z' or is at most 2 characters long:

df = df[~df.index.str.endswith('Z') | (df.index.str.len() <= 2)]
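To check that the two approaches agree, here is a minimal sketch on a small made-up frame. The column values and the extra label 'CZ0' are assumptions added for illustration, not part of the original data; only the index labels matter.

```python
import pandas as pd

# Hypothetical sample: the values are placeholders, only the index labels matter.
df = pd.DataFrame(
    {"2010": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]},
    index=pd.Index(["AT", "CZ", "CZ0", "UKZ", "UKZZ", "UKZZZ"], name="NUTS_ID"),
)

# Approach 1: regex. At least two characters, then a final 'Z'.
drop_regex = df.index.str.contains(r".{2,}Z$")

# Approach 2: plain string methods. Ends in 'Z' and is longer than 2 characters.
drop_plain = df.index.str.endswith("Z") & (df.index.str.len() > 2)

# Both masks should flag exactly the same rows.
assert (drop_regex == drop_plain).all()

# Keep everything that is not flagged.
filtered = df[~drop_plain]
print(filtered.index.tolist())  # ['AT', 'CZ', 'CZ0']
```

The second mask is the negation of the condition used in the answer: by De Morgan's law, "not (ends in 'Z' and longer than 2)" equals "does not end in 'Z' or at most 2 long".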

solution1 2 ACCEPTED 2020-12-17 08:11:19