I'm trying to "remove" some rows. This is my code:
popcoun = census_df.copy()
popcoun = popcoun[popcoun['SUMLEV'] == 50]
popcoun = popcoun[['STNAME','CTYNAME','CENSUS2010POP']]
popcoun = popcoun.set_index(['STNAME','CTYNAME'])
popcoun = popcoun.sort_values(by = ['STNAME','CENSUS2010POP'],ascending = False)
In the previous image link, you can see that the information is sorted. For example, under the Wyoming index I only want the first three rows of CENSUS2010POP (the highest values in that state), and likewise for every other state I have. Thank you, I hope somebody can help me.
Add this line:
popcoun = popcoun.groupby(['STNAME']).head(3)
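This should work as long as the rows within each state are already sorted by CENSUS2010POP in descending order, as in your code above, because head(3) simply keeps the first three rows of each group.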
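To make the sort-then-head approach concrete, here is a minimal sketch on a tiny made-up table. The DataFrame below is a hypothetical stand-in for census_df (the state and county names and the populations are invented), so only the shape of the result matters.

```python
import pandas as pd

# Hypothetical toy data standing in for census_df; all values are made up.
census_df = pd.DataFrame({
    "SUMLEV": [50, 50, 50, 50, 50, 50, 40],
    "STNAME": ["A", "A", "A", "A", "B", "B", "A"],
    "CTYNAME": ["a1", "a2", "a3", "a4", "b1", "b2", "A"],
    "CENSUS2010POP": [10, 40, 30, 20, 5, 7, 100],
})

# Same steps as in the question: keep county rows, index by state and county,
# then sort so the largest populations come first within each state.
popcoun = census_df[census_df["SUMLEV"] == 50]
popcoun = popcoun[["STNAME", "CTYNAME", "CENSUS2010POP"]]
popcoun = popcoun.set_index(["STNAME", "CTYNAME"])
popcoun = popcoun.sort_values(by=["STNAME", "CENSUS2010POP"], ascending=False)

# head(3) keeps the first three rows of every state, in their current order.
top3 = popcoun.groupby("STNAME").head(3)
print(top3)
```

State B has only two counties here, so it contributes two rows; head(3) does not pad or fail on short groups.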
If you just want to select the top 3 rows of the table, you can do
df.iloc[:3]
For each state, you can iterate over df["state"].unique()
and do df.loc[df.state == state][:3]
Sorry if I misunderstood. Does this help?
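For what it's worth, the per-state loop could be sketched roughly like this. The column names state, county and pop and all the values are hypothetical, and the frame is assumed to be already sorted by population in descending order within each state.

```python
import pandas as pd

# Hypothetical frame, assumed already sorted by "pop" (descending) per state.
df = pd.DataFrame({
    "state": ["A", "A", "A", "A", "B", "B"],
    "county": ["a2", "a3", "a4", "a1", "b2", "b1"],
    "pop": [40, 30, 20, 10, 7, 5],
})

# Take the first three rows for each state, then stitch the pieces together.
pieces = [df.loc[df["state"] == state].iloc[:3] for state in df["state"].unique()]
top3 = pd.concat(pieces)
print(top3)
```

This gives the same rows as a groupby-based solution, but it filters the whole frame once per state, so it will likely be slower on large tables.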
Regardless of sorting, this would work with .groupby and .nlargest (CTYNAME is already part of the index after set_index, so it does not need to be selected as a column):
popcoun = popcoun.groupby('STNAME', group_keys=False).apply(lambda x: x.nlargest(3, 'CENSUS2010POP'))
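As a quick check that nlargest does not depend on the row order, here is a small sketch on deliberately unsorted, made-up data. The names and values are hypothetical, and group_keys=False is used on the assumption that you want to keep the original (STNAME, CTYNAME) index rather than get an extra STNAME level.

```python
import pandas as pd

# Hypothetical, deliberately unsorted data indexed like popcoun in the question.
popcoun = pd.DataFrame({
    "STNAME": ["A", "B", "A", "A", "B", "A"],
    "CTYNAME": ["a1", "b1", "a2", "a3", "b2", "a4"],
    "CENSUS2010POP": [10, 5, 40, 30, 7, 20],
}).set_index(["STNAME", "CTYNAME"])

# For each state, pick the three rows with the largest population.
top3 = popcoun.groupby("STNAME", group_keys=False).apply(
    lambda g: g.nlargest(3, "CENSUS2010POP")
)
print(top3)
```

Because nlargest does its own ordering inside each group, no prior sort_values call is needed.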