
Pandas: filter out values with a low number of decimal places

I have a pandas dataframe:

import pandas as pd

df = pd.DataFrame({'start': [50, 100, 50000, 50030, 100000],
                   'end': [51, 101, 50001, 50031, 100001],
                   'value': [1.00, 2.1234567, 3.01, 4.12345, 5.456789]})

I would like to filter on the 'value' column and keep only the rows whose value has more than two decimal places:

start   end        value
100     101        2.1234567
50030   50031      4.12345
100000  100001     5.456789

How can I filter the column by the number of decimal places?

Use Series.astype with Series.str.split, Series.map and Series.gt:

Cast the value column to str, split it on '.' and take the second part, then get the length of that decimal part. Keep the rows where the length is greater than 2.

In [639]: df[df['value'].astype(str).str.split('.').str[1].map(len).gt(2)]
Out[639]: 
    start     end     value
1     100     101  2.123457
3   50030   50031  4.123450
4  100000  100001  5.456789
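
The one-liner above can be unpacked into its intermediate steps. This is only a sketch of the same idea: the names as_text, fraction and n_decimals are made up for illustration, and it uses .str.len() in place of .map(len), on the assumption that you want missing values passed through rather than raising an error.

```python
import pandas as pd

df = pd.DataFrame({'start': [50, 100, 50000, 50030, 100000],
                   'end': [51, 101, 50001, 50031, 100001],
                   'value': [1.00, 2.1234567, 3.01, 4.12345, 5.456789]})

# Hypothetical intermediate names, shown one step at a time.
as_text = df['value'].astype(str)          # '1.0', '2.1234567', '3.01', ...
fraction = as_text.str.split('.').str[1]   # text after the decimal point
n_decimals = fraction.str.len()            # number of decimal digits per row

# Keep only rows with more than two decimal digits.
result = df[n_decimals.gt(2)]
print(result)
```

Note that very small floats are printed in scientific notation (str(1e-07) is '1e-07'), which has no '.' to split on, so the fraction is missing for such rows and they are not selected.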

This is possible by converting to strings, but with real data the solution may fail because of floating-point accuracy:

# Raw string with an escaped dot, so only a literal '.' before the trailing digits matches
df = df[df['value'].astype(str).str.extract(r'\.(\d+)$', expand=False).str.len().gt(2)]
print(df)

    start     end     value
1     100     101  2.123457
3   50030   50031  4.123450
4  100000  100001  5.456789
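
To illustrate the float-accuracy caveat, here is a small sketch, not part of the original answers. It first shows a computed value whose string form picks up extra digits, then a numeric alternative that compares each value with its rounded form. The helper has_more_decimals and the tolerance 1e-9 are assumptions chosen for this example; pick a tolerance that suits your data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'start': [50, 100, 50000, 50030, 100000],
                   'end': [51, 101, 50001, 50031, 100001],
                   'value': [1.00, 2.1234567, 3.01, 4.12345, 5.456789]})

# 0.1 + 0.2 is stored as 0.30000000000000004, so the string-based test
# reports it as having more than two decimal places.
noisy = pd.Series([0.1 + 0.2])
string_mask = noisy.astype(str).str.split('.').str[1].str.len().gt(2)

def has_more_decimals(values, places=2, tol=1e-9):
    """Hypothetical helper: True where rounding to `places` changes the value
    by more than `tol` (an assumed absolute tolerance)."""
    return ~np.isclose(values, values.round(places), rtol=0, atol=tol)

numeric_mask = has_more_decimals(noisy)        # ignores the representation noise
filtered = df[has_more_decimals(df['value'])]  # rows 1, 3 and 4 of the sample
print(filtered)
```

The trade-off is that the numeric check treats any difference smaller than the tolerance as noise, so a value such as 1.0000000001 would be considered to have no extra decimals.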

