How to select rows where the number of characters is different from 3, or the value contains at least 1 letter, or there is no data (NaN) in Python Pandas?

I have a DataFrame in Python Pandas like the one below (the data type of col1 is string):

col1
-------
1112224
222
111
123DDT
985T
NaN

And I need to select only those rows where:

  • the number of characters is different from 3

or

  • the value contains at least 1 letter

or

  • there is no data (NaN)

So as a result I need something like below:

col1
-------
1112224
123DDT
985T
NaN

How can I do that in Python Pandas?

You can easily use masks and boolean indexing:

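# length is not 3 characters (a NaN length also counts as "not 3")
m1 = df['col1'].str.len().ne(3)
# at least one ASCII letter; na=False keeps the mask strictly boolean
m2 = df['col1'].str.contains('[a-zA-Z]', na=False)
# missing value (NaN)
m3 = df['col1'].isna()

# rows matching any of the above conditions
out = df[m1 | m2 | m3]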

NB. The third condition is already covered by the first one: str.len() returns NaN for a missing value, and NaN is not equal to 3 (NaN does not have 3 chars*), so you could skip m3.
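To see why the NaN check is redundant, here is a minimal sketch on a standalone Series. The sample values mirror the question's column; the variable names (s, lengths, not_three, has_letter) are only illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample column mirroring the question's data (assumed construction)
s = pd.Series(["1112224", "222", "111", "123DDT", "985T", np.nan], name="col1")

# str.len() leaves the missing row as NaN instead of returning a number
lengths = s.str.len()

# NaN compared with 3 via .ne() is True, so the missing row is selected too
not_three = lengths.ne(3)

# na=False makes str.contains report the missing row as "no letter"
has_letter = s.str.contains("[a-zA-Z]", na=False)

print(pd.DataFrame({"col1": s, "len": lengths, "not_three": not_three,
                    "has_letter": has_letter}))
```

Since not_three is already True for the missing row, adding isna() to the combined mask does not change the result; it only makes the intent explicit.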

output:

      col1
0  1112224
3   123DDT
4     985T
5      NaN
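The result above can be reproduced with the following self-contained sketch. The DataFrame construction is an assumption based on the question's sample. The second mask (keep) is an alternative that is not in the original answer: it inverts the conditions and keeps every row that is not exactly three non-letter characters, using Series.str.fullmatch (available from pandas 1.1).

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's DataFrame
df = pd.DataFrame({"col1": ["1112224", "222", "111", "123DDT", "985T", np.nan]})

# Three masks, combined with a logical OR
m1 = df["col1"].str.len().ne(3)                      # length differs from 3
m2 = df["col1"].str.contains("[a-zA-Z]", na=False)   # has at least one ASCII letter
m3 = df["col1"].isna()                               # missing value
out = df[m1 | m2 | m3]

# Alternative: negate "exactly three characters, none of them an ASCII letter".
# na=False marks the missing row as a non-match, so the negation keeps it.
keep = ~df["col1"].str.fullmatch(r"[^a-zA-Z]{3}", na=False)
out_alt = df[keep]

print(out)
print(out.equals(out_alt))
```

Both approaches select the rows with index 0, 3, 4 and 5 on this sample. The three-mask version is easier to read and adjust; the single-regex version is more compact but harder to extend if the conditions change.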

* although, ironically, the name "NaN" does ;)

