I have a question on how to create a list of values that are greater than a specific value in a given data frame variable.
       a    b    c
1      100  57   23
2      99   56   23
3      100  56   22
4      101  57   23
...
300    99   50   23
301    99   51   29
302    101  57   22
I want to create a list of all values where a > 100.
I am able to build a boolean index, but that is not a list of the values themselves, since every entry is True or False:
Greater_100 = df['a']>100
How do I turn this into a list?
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0, 200, (10, 3)), columns=list('abc'))
list_a_more_than_hundred = df[df.a > 100].a.tolist()  # select column a, then convert to a list
To get the matching values as a Series, df[df['a'] > 100].loc[:, 'a'] is sufficient.
To get them as a plain Python list, append .tolist(): df[df['a'] > 100].loc[:, 'a'].tolist()
Selecting the values in column a that are greater than 100:
>>> df[df['a'] > 100].loc[:, 'a']
4 101
302 101
Name: a, dtype: int64
>>>
>>> type(df[df['a'] > 100].loc[:, 'a'])
<class 'pandas.core.series.Series'>
Converting the above Series into a list:
>>> l = df[df['a'] > 100].loc[:, 'a'].tolist()
>>> l
[101, 101]
>>>
>>> type(l)
<class 'list'>
>>>
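The chained form above filters the rows first and then picks the column. As a minimal sketch (the sample data is copied from this thread, and the variable names mask, chained and single are only illustrative), the row mask and the column label can also be passed to .loc in a single step, which gives the same list:

```python
import pandas as pd

# Sample data matching the frame used in this thread
df = pd.DataFrame(
    {"a": [100, 99, 100, 101, 99, 99, 101],
     "b": [57, 56, 56, 57, 50, 51, 57],
     "c": [23, 23, 20, 23, 23, 29, 22]},
    index=["1", "2", "3", "4", "300", "301", "302"],
)

mask = df["a"] > 100                      # boolean Series, one entry per row

chained = df[mask].loc[:, "a"].tolist()   # filter rows, then select column a
single = df.loc[mask, "a"].tolist()       # rows and column in one .loc call

print(chained)  # [101, 101]
print(single)   # [101, 101]
```

Both spellings return the same values; the single .loc call simply avoids building the intermediate filtered DataFrame.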
Let's look at the above code in more detail.
>>> import numpy as np
>>> import pandas as pd
>>>
>>> arr = [[100, 57, 23], [99, 56, 23],
... [100, 56, 20], [101, 57, 23], [99, 50, 23],
... [99, 51, 29], [101, 57, 22]]
>>>
>>> columns = [ch for ch in 'abc']
>>> indices = [str(n) for n in [1, 2, 3, 4, 300, 301, 302]]
>>>
>>> df = pd.DataFrame(arr, index=indices, columns=columns)
>>> df
a b c
1 100 57 23
2 99 56 23
3 100 56 20
4 101 57 23
300 99 50 23
301 99 51 29
302 101 57 22
>>>
>>> df['a'] > 100
1 False
2 False
3 False
4 True
300 False
301 False
302 True
Name: a, dtype: bool
>>>
>>> arr2 = df.loc[:,'a']
>>> arr2
1 100
2 99
3 100
4 101
300 99
301 99
302 101
Name: a, dtype: int64
>>>
>>> arr2 = df[df['a'] > 100]
>>> arr2
a b c
4 101 57 23
302 101 57 22
>>>
>>> arr3 = df[df['a'] > 100].loc[:, 'a']
>>> arr3
4 101
302 101
Name: a, dtype: int64
>>>
>>> l = arr3.tolist()
>>> l
[101, 101]
>>>
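If the same steps are needed for other columns or thresholds, they can be wrapped in a small helper. This is only a sketch: the function name values_above and its parameters are hypothetical, not part of pandas.

```python
import pandas as pd


def values_above(df, column, threshold):
    """Return the values of `column` strictly greater than `threshold` as a list.

    Hypothetical helper for illustration; it wraps the mask -> select -> tolist steps.
    """
    mask = df[column] > threshold          # boolean Series
    return df.loc[mask, column].tolist()   # matching values as a Python list


# Small made-up frame for the demonstration
df = pd.DataFrame({"a": [100, 99, 101, 101], "b": [57, 56, 57, 50]})

result = values_above(df, "a", 100)
print(result)  # [101, 101]
```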
To filter your dataframe for rows where a > 100, you can use pd.DataFrame.query:
res_df = df.query('a > 100')
This also works for multiple conditions:
res_df = df.query('a > 100 & b < 57')
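If you wish to extract a list of values from these rows, you can use NumPy, e.g.:
res_lst = df.query('a > 100 & b < 57').values.ravel().tolist()
Note that this flattens every column of the matching rows into one list. For the values of column a alone, select the column first: df.query('a > 100')['a'].tolist()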
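To make the difference between these results concrete, here is a short sketch on a small made-up frame (the data and the variable names are assumptions for illustration). It also shows that query can refer to a local Python variable with the @ prefix, and gives the equivalent boolean-mask form, where each condition needs its own parentheses:

```python
import pandas as pd

# Made-up data for illustration only
df = pd.DataFrame({
    "a": [100, 101, 102, 99],
    "b": [57, 56, 58, 50],
    "c": [23, 22, 21, 20],
})

threshold = 100

# Values of column a only, using a local variable inside the query string
a_values = df.query("a > @threshold")["a"].tolist()

# All columns of the matching rows, flattened row by row into one list
flattened = df.query("a > 100 & b < 57").values.ravel().tolist()

# Same filter written with boolean masks; the parentheses are required
mask_version = df[(df["a"] > 100) & (df["b"] < 57)]["a"].tolist()

print(a_values)      # [101, 102]
print(flattened)     # [101, 56, 22]
print(mask_version)  # [101]
```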