
How to select certain values based on a condition in a data frame?

I have a dataframe called df that looks like this:

Date        Reading1 Reading2 Reading3 Reading4
2000-05-01     15        13        14       11
2000-05-02     15        14        18        9
2000-05-03     14        12        15        8
2000-05-04     17        11        16       13

I used df.set_index('Date') to make the date the index. I have 3 questions.

1) How do I display the number of days that had a reading greater than 13 in the entire data frame not just in a single column?

I tried df.[(df.Reading1:df.Reading4>13)].shape[0] but obviously the syntax is wrong.

2) How do I display the values that happened on 2000-05-03 for columns Readings 1, 3, and 4?

I tried df.loc[["20000503"],["Reading1","Reading3","Reading4"]]

but I got the error "None of the Index(['20000503'],dtype='object')] are in the [index]"

3) How do I display the dates for which the values in column Reading1 are twice as much as those in column Reading2? And how do I display those values (the ones in Reading1 that are twice as big) as well?

I have no idea where to even start this one.

Try this:

1. (df > 13).any(axis=1).sum()
This creates a boolean DataFrame, checks whether any value in each row is True, and sums those row-level flags to get the number of days.

2. df.loc['2000-05-03', ['Reading1', 'Reading3', 'Reading4']]
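Use partial string indexing on the DatetimeIndex to select the day, then filter the columns with a list of column headers. The date string has to be one that matches the index labels, which is why '20000503' failed against an index of plain strings.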

3. df.loc[df['Reading1']  > (df['Reading2'] * 2)].index
   df.loc[df['Reading1']  > (df['Reading2'] * 2)].to_numpy().tolist()
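Create a boolean Series (Reading1 greater than twice Reading2) and use it for boolean indexing; .index on the result returns the matching dates. The second line converts the matching rows (all columns, not only Reading1) to a NumPy array and then to a list to get the values. Note that > tests for "more than twice"; use >= or == if "at least twice" or "exactly twice" is what you want.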
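
For reference, here is a minimal, self-contained sketch of the three steps run against the sample data from the question. It assumes the Date column is converted with pd.to_datetime before it becomes the index (the question does not say whether that was done), and the variable names days_over_13, row, mask, dates and values are only illustrative. The last line selects just the Reading1 column, which is one way to read "display those values".

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {
        "Date": ["2000-05-01", "2000-05-02", "2000-05-03", "2000-05-04"],
        "Reading1": [15, 15, 14, 17],
        "Reading2": [13, 14, 12, 11],
        "Reading3": [14, 18, 15, 16],
        "Reading4": [11, 9, 8, 13],
    }
)

# Assumption: parse the dates so that the index is a DatetimeIndex
df["Date"] = pd.to_datetime(df["Date"])
df = df.set_index("Date")

# 1. Number of days with at least one reading greater than 13
days_over_13 = int((df > 13).any(axis=1).sum())
print(days_over_13)  # 4 for the sample data (every day has a reading above 13)

# 2. Readings 1, 3 and 4 on 2000-05-03
row = df.loc["2000-05-03", ["Reading1", "Reading3", "Reading4"]]
print(row)

# 3. Dates where Reading1 is more than twice Reading2, and the Reading1 values
mask = df["Reading1"] > df["Reading2"] * 2
dates = df.loc[mask].index
values = df.loc[mask, "Reading1"].tolist()
print(list(dates), values)  # both empty for the sample data
```

With the four sample rows, no day has Reading1 above twice Reading2, so the third result is empty; the same mask works unchanged on the full data.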
