I have the following dataframe:
nutsgdp
Out[77]:
2010 2011 2012 ... 2016 2017 2018
NUTS_ID ...
AT 295896.60 310128.70 318653.00 ... 357299.70 370295.80 385711.90
AT1 131114.27 136271.77 139149.68 ... 155609.11 159879.39 166443.24
AT11 6698.37 7012.58 7365.43 ... 8353.78 8771.65 9005.49
AT111 738.53 784.29 791.16 ... 923.96 996.55 996.55
AT112 3843.03 4028.02 4313.17 ... 4923.69 5165.46 5165.46
... ... ... ... ... ... ...
UKN15 3762.30 3604.13 4228.35 ... 5391.50 5089.14 4203.36
UKN16 2169.86 2162.22 2452.28 ... 2801.88 2801.14 2730.28
UKZ 30761.26 33592.50 32090.74 ... 13343.86 12887.29 20225.66
UKZZ 30761.26 33592.50 32090.74 ... 13343.86 12887.29 20225.66
UKZZZ 30761.26 33592.50 32090.74 ... 13343.86 12887.29 20225.66
[1794 rows x 9 columns]
I would like to drop all rows where the index is longer than 2 characters and ends with 'Z'. For example, that means dropping 'UKZ', 'UKZZ' and 'UKZZZ', but keeping 'CZ'. What would be the best way to do this? Thanks in advance for your help.
Use Series.str.contains, invert the mask with ~, and filter by boolean indexing. The pattern matches any label that has at least two characters before a final 'Z':
df = df[~df.index.str.contains(r'.{2,}Z$')]
Or use Series.str.endswith together with Series.str.len:
df = df[~df.index.str.endswith('Z') | (df.index.str.len() <= 2)]
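To check that both approaches agree, here is a minimal sketch on a small made-up frame. The sample labels and values are assumptions for illustration, not the asker's real data, and the regex is written without a capture group because pandas emits a warning when a pattern passed to str.contains contains match groups.

```python
import pandas as pd

# Hypothetical sample standing in for the asker's nutsgdp frame.
df = pd.DataFrame(
    {"2010": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]},
    index=pd.Index(["AT", "AT1", "CZ", "UKZ", "UKZZ", "UKZZZ"], name="NUTS_ID"),
)

# Approach 1: regex, at least two characters followed by a final 'Z'.
by_regex = df[~df.index.str.contains(r".{2,}Z$")]

# Approach 2: keep rows that do not end in 'Z' or have at most two characters.
by_len = df[~df.index.str.endswith("Z") | (df.index.str.len() <= 2)]

print(by_regex.index.tolist())  # ['AT', 'AT1', 'CZ']
print(by_len.index.tolist())    # ['AT', 'AT1', 'CZ']
```

Note that ~ binds more tightly than |, so the second mask reads as "(does not end in 'Z') or (length <= 2)", which is why 'CZ' survives.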