
How to iterate over DataFrame to select rows

I'm new to Python and I'm trying to understand how to select the first n rows for each index value in a DataFrame and build a new DataFrame containing only the selected rows.

My df looks like this:

      Col1 Col2 Col3 etc
   A
   A
   A
   A
   B
   B
   B
   B

I would basically like to take the first two rows for each index value, to get:

     Col1 Col2 Col3 etc.
   A
   A
   B
   B

I tried to do this with a for loop and iloc, as shown below, but the loop stops at index A:

   for i in df:
       sel=df.iloc[:3]

I'm aware this is a basic question, but the more I read, the more confused I get by for, apply, range, etc.

Please help! Thanks

If you want to get the first two rows of each group, you can do the following:

   df.groupby('Col1').head(2)
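As a minimal sketch of the call above, assuming the group labels (A, B) are stored in a regular column named Col1; the sample values in Col2 are made up for illustration and are not from the question:

```python
import pandas as pd

# Hypothetical data: group labels are in the column 'Col1'.
df = pd.DataFrame({
    "Col1": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "Col2": [1, 2, 3, 4, 5, 6, 7, 8],
})

# head(2) keeps the first two rows of every group, in original row order.
first_two = df.groupby("Col1").head(2)
print(first_two)
```

No explicit loop is needed. The loop in the question iterates over the column labels of df (that is what `for i in df` yields), and `df.iloc[:3]` ignores `i`, so it returns the same first three rows on every pass.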

A slight variation on @Chris's answer if A, B, etc. are in the index and not in the first column. You should first reset the index, use groupby and head, then set the index back and remove its name:

   df.reset_index().groupby('index').head(2).set_index('index').rename_axis(None)
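Here is a small sketch of that chain, assuming the labels A and B form an unnamed index (so reset_index creates a column called 'index'); the column values are invented for illustration. It also shows groupby(level=0), a possible shortcut that groups on the index directly:

```python
import pandas as pd

# Hypothetical data: group labels are in the (unnamed) index.
df = pd.DataFrame(
    {"Col1": [1, 2, 3, 4, 5, 6, 7, 8]},
    index=["A", "A", "A", "A", "B", "B", "B", "B"],
)

# The chain from the answer: move the index into a column, take the
# first two rows per group, then restore the index and drop its name.
result = (
    df.reset_index()
      .groupby("index")
      .head(2)
      .set_index("index")
      .rename_axis(None)
)

# Alternative sketch: group on the index level itself.
alt = df.groupby(level=0).head(2)

print(result)
print(alt)
```

Both variants should keep the rows labelled A, A, B, B with their original values; the second avoids the round trip through reset_index.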

