
How to get the filtered values of a data frame in Python?

In a given column "type", I want to find the values that occur more than "n" times.

I did this:

n = 5
df = dataf["type"].value_counts() > n

print(df) will return something like this:

Bike           True
Truck          True
Car            False

How do I get the values "Bike" and "Truck" (the ones marked True)? I want to add them to a set.

You can pass a lambda to .loc for this:

import pandas as pd

df = pd.DataFrame({"vehicle": ["bike"] * 7 + ["truck"] * 8 + ["car"] * 4})
print(df)
print("\nUsing loc...")
print(df["vehicle"].value_counts().loc[lambda x: x > 5])

gives

   vehicle
0     bike
1     bike
2     bike
3     bike
4     bike
5     bike
6     bike
7    truck
8    truck
9    truck
10   truck
11   truck
12   truck
13   truck
14   truck
15     car
16     car
17     car
18     car

Using loc...
truck    8
bike     7
Name: vehicle, dtype: int64
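The question asks for the matching labels in a set, and the filtered Series above keeps those labels in its index. A minimal sketch, reusing the same sample frame (the names counts, frequent and frequent_set are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({"vehicle": ["bike"] * 7 + ["truck"] * 8 + ["car"] * 4})

# Count each value, then keep only the counts above 5.
counts = df["vehicle"].value_counts()
frequent = counts.loc[lambda x: x > 5]

# The labels are the index of the filtered Series; wrap them in a set.
frequent_set = set(frequent.index)
print(sorted(frequent_set))
```

Sorting before printing is only there to make the printed order stable, since sets are unordered.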

Try this:

aux = dataf["type"].value_counts()
greater_than_five = aux[aux > 5]

The first line gets the count of each type, and the second line keeps only the types whose count is greater than five.
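To make the filtering step concrete, here is a sketch that starts from the boolean Series shown in the question and pulls out the labels marked True. The contents of dataf are an assumption made up to match the question's output (7 bikes, 8 trucks, 4 cars):

```python
import pandas as pd

# Hypothetical data standing in for the asker's dataf.
dataf = pd.DataFrame({"type": ["Bike"] * 7 + ["Truck"] * 8 + ["Car"] * 4})

aux = dataf["type"].value_counts()

# Comparing the counts with 5 gives one True/False per type.
mask = aux > 5

# Indexing with the mask keeps only the rows where it is True.
greater_than_five = aux[mask]

# The mask can also be filtered by itself to get just the labels.
labels = set(mask[mask].index)
print(sorted(labels))
```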

Try this:

n = 5
df = dataf["type"].value_counts()[dataf["type"].value_counts() > n]
print(df)
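The one-liner above calls value_counts() twice. A sketch of the same idea wrapped in a small helper that counts once and returns a set; the function name values_repeated_more_than and the sample data are hypothetical, not part of the original answer:

```python
import pandas as pd


def values_repeated_more_than(series, n):
    """Return the set of values that occur more than n times in series."""
    counts = series.value_counts()  # computed once and reused
    return set(counts[counts > n].index)


# Hypothetical data standing in for the asker's dataf.
dataf = pd.DataFrame({"type": ["Bike"] * 7 + ["Truck"] * 8 + ["Car"] * 4})

result = values_repeated_more_than(dataf["type"], 5)
print(sorted(result))
```

Changing n only changes the threshold, so the same helper covers any cutoff.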

The most efficient way is the lambda approach that @user1717828 posted. Another way:

df = pd.DataFrame({"vehicle": ["bike"] * 7 + ["truck"] * 8 + ["car"] * 4})


df2 = df["vehicle"].agg({'count':'value_counts'})
df2[df2['count'] > 5]
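If a one-column frame of counts is what you are after, a sketch that builds it without agg may be easier to follow. This is a substitute technique (value_counts plus rename and to_frame), not the answer's exact call, and the column name "count" is chosen here only to mirror the snippet above:

```python
import pandas as pd

df = pd.DataFrame({"vehicle": ["bike"] * 7 + ["truck"] * 8 + ["car"] * 4})

# Name the counts "count" and turn the Series into a one-column DataFrame.
df2 = df["vehicle"].value_counts().rename("count").to_frame()

# Keep the rows whose count is above 5; the vehicle labels are the index.
filtered = df2[df2["count"] > 5]
print(filtered)
```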

You can add a new column called counter that contains 1 in every row:

df['counter'] = 1

and use groupby:

df = df.groupby(['type']).sum()
df = df[df.counter > n]
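A self-contained sketch of the counter-and-groupby idea, assuming the column is named "type" as in the question and using made-up sample data. It also shows groupby(...).size(), which gives the same counts without the helper column:

```python
import pandas as pd

n = 5
# Hypothetical data standing in for the asker's frame.
df = pd.DataFrame({"type": ["Bike"] * 7 + ["Truck"] * 8 + ["Car"] * 4})

# Helper column of ones; summing it per group counts the rows.
df["counter"] = 1
totals = df.groupby("type")["counter"].sum()
frequent = totals[totals > n]

# Same counts without the helper column.
sizes = df.groupby("type").size()

print(sorted(frequent.index))
```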

