
How to filter values in python dataframe?

How can I filter the dataframe df1 to keep only the rows where the SYMBOL column starts with a period (.) immediately followed by a digit?

    df1
    
      SYMBOL           TYPE
    .1E09UOV      Exchange code
    .2E09UP0      Exchange code
    .AT0013F      Exchange code
    .BT0013G      Exchange code
    .CT002MS      Exchange code
    .DT002MT      Exchange code
    .7T003MT      Exchange code
    .7T004MT      Exchange code
    .7T001MT      Exchange code
    .7T003MT      Exchange code
    
    
    
Expected output:
    
      SYMBOL           TYPE
    .1E09UOV      Exchange code
    .2E09UP0      Exchange code
    .7T003MT      Exchange code
    .7T004MT      Exchange code
    .7T001MT      Exchange code
    .7T003MT      Exchange code

Tried code:

df1.loc[(df1['SYMBOL'].re.sub(r'.\d')]

You can use the following:

df1 = df1[df1['SYMBOL'].str.match(r'^\.[0-9].*')]
  • ^ = start of string
  • \. = a literal period (the backslash escapes the dot, which would otherwise match any character)
  • [0-9] = a single digit
  • .* = zero or more of any character
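
To see how the pieces of the pattern behave outside of pandas, here is a small sketch using Python's built-in re module. The sample list is made up for illustration (two symbols from the question plus one without the leading period).

```python
import re

# Same pattern as above; the raw string keeps the backslash intact.
pattern = re.compile(r'^\.[0-9].*')

# Illustrative samples: '1E09UOV' has no leading period, '.AT0013F' has a letter after it.
samples = ['.1E09UOV', '.AT0013F', '1E09UOV', '.7T003MT']

# re.match only succeeds when the pattern matches at the start of the string.
matches = [s for s in samples if pattern.match(s)]
print(matches)  # ['.1E09UOV', '.7T003MT']
```

Since both re.match and pandas' str.match look for a match at the beginning of the string, the leading ^ and trailing .* are optional here; r'\.[0-9]' should select the same rows.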


Here is an example showing the full code:

Code:

import pandas as pd

df1 = pd.DataFrame({ 'SYMBOL': ['.1E09UOV', '.2E09UP0', '.AT0013F', '.BT0013G', '.CT002MS', '.DT002MT', '.7T003MT', '.7T004MT', '.7T001MT', '.7T003MT'],
                    'TYPE': ['Exchange code', 'Exchange code', 'Exchange code', 'Exchange code', 'Exchange code', 'Exchange code', 'Exchange code', 'Exchange code', 'Exchange code', 'Exchange code']})

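# Keep rows whose SYMBOL starts with a period followed by a digit (raw string avoids escape warnings)
df1 = df1[df1['SYMBOL'].str.match(r'^\.[0-9].*')]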

print(df1)

Output:

     SYMBOL           TYPE
0  .1E09UOV  Exchange code
1  .2E09UP0  Exchange code
6  .7T003MT  Exchange code
7  .7T004MT  Exchange code
8  .7T001MT  Exchange code
9  .7T003MT  Exchange code
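
If the SYMBOL column might contain missing values, the boolean mask can end up with NaN entries, which pandas refuses to index with. A possible variation, assuming you want such rows dropped, is to pass na=False to str.match. The None entry and the names mask and filtered below are made up for illustration; reset_index is optional and only renumbers the rows.

```python
import pandas as pd

# Hypothetical data: one SYMBOL is missing to show the na=False behaviour.
df = pd.DataFrame({
    'SYMBOL': ['.1E09UOV', None, '.AT0013F', '.7T003MT'],
    'TYPE': ['Exchange code'] * 4,
})

# na=False treats missing values as "no match" so the mask is purely boolean.
mask = df['SYMBOL'].str.match(r'\.[0-9]', na=False)

# Optional: renumber the surviving rows from 0.
filtered = df[mask].reset_index(drop=True)
print(filtered)
```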
