
Iterating through selected columns and rows in Pandas

I have just begun using Pandas for my study and I am facing an issue with the following step.

Suppose I have a Dataframe with 'n' columns and 'm' rows.

I want to iterate over the column at index 2, starting from row 5 and skipping the preceding rows. How do I go about it?

I can select either the required rows or the required columns separately, but I am unable to do both in one step. Can someone help me with an idea here?

The loc and iloc indexers are great for that.

If you only want the single value at row position 5 and column position 2 (positions are zero-based, so this is the 6th row and 3rd column):

df.iloc[5,2]

If you want all rows from position 5 onwards and all columns from position 2 onwards:

df.iloc[5:,2:]

Don't forget to assign the result back to df if you want to keep the change:

df = df.iloc[5:,2:]
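
As a minimal sketch of the two selections above, assuming a small made-up dataframe (the column names and values are hypothetical, chosen so each cell encodes its own row and column position):

```python
import pandas as pd

# Hypothetical 8-row, 4-column frame: the cell at (r, c) holds 10 * r + c.
df = pd.DataFrame(
    [[10 * r + c for c in range(4)] for r in range(8)],
    columns=["a", "b", "c", "d"],
)

single = df.iloc[5, 2]   # scalar at row position 5, column position 2
block = df.iloc[5:, 2:]  # rows from position 5 on, columns from position 2 on

print(single)       # 52
print(block.shape)  # (3, 2)
```

Note that iloc returns a new selection and leaves df itself untouched unless you reassign it.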

Using pd.DataFrame.iloc, you can use integer indexers to isolate part of your dataframe. Given a dataframe df:

res = df.iloc[5:, 2]

Note that indexing in Python begins with 0, so this is the 6th row onwards (or index 5 onwards). Similarly, 2 represents the 3rd column (the column with index 2). The indexing syntax is similar to Python list or NumPy array indexing.

Since we specify only one column index, the output will be a pd.Series object, which can be seen as a column. If you specify multiple column indices, the output will be another dataframe.
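
To illustrate the Series versus dataframe distinction, here is a short sketch on a hypothetical three-column frame (the data is made up for the example). Passing the column position inside a list is one way to get a one-column dataframe instead of a Series:

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame({"a": range(8), "b": range(8, 16), "c": range(16, 24)})

as_series = df.iloc[5:, 2]    # single integer column indexer -> Series
as_frame = df.iloc[5:, [2]]   # list of column positions -> DataFrame

print(type(as_series).__name__)  # Series
print(type(as_frame).__name__)   # DataFrame
```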

In general, iteration isn't the best option with Pandas. You should aim to use vectorised operations. There are plenty of examples in the Pandas docs demonstrating vectorised calculations.
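
As a rough comparison, assuming the goal is something simple like doubling each selected value (the operation and the data here are hypothetical), explicit iteration and the vectorised form might look like this:

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame({
    "a": range(8),
    "b": range(8, 16),
    "c": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
})

col = df.iloc[5:, 2]  # column position 2, rows from position 5 on

# Explicit iteration: Series.items() yields (index label, value) pairs.
looped = []
for idx, value in col.items():
    looped.append(value * 2)

# Vectorised equivalents: operate on the whole Series at once.
doubled = col * 2
total = col.sum()

print(looped)            # [12.0, 14.0, 16.0]
print(doubled.tolist())  # [12.0, 14.0, 16.0]
print(total)             # 21.0
```

Both give the same numbers, but the vectorised form is shorter and usually much faster on large frames.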


If your column index consists of strings, you can use get_loc to extract an integer location given a name:

res = df.iloc[5:, df.columns.get_loc('some_name')]
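
A small sketch of the get_loc approach, using a made-up frame in which 'some_name' happens to be the second column. With the default integer index, the label-based df.loc selection shown at the end returns the same rows, but that equivalence is an assumption that only holds when the row labels are 0, 1, 2, and so on:

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame({
    "x": range(8),
    "some_name": range(100, 108),
    "y": range(8),
})

pos = df.columns.get_loc("some_name")  # integer position of the column
res = df.iloc[5:, pos]

# Label-based alternative; matches res here because the index is the default RangeIndex.
same = df.loc[5:, "some_name"]

print(pos)           # 1
print(res.tolist())  # [105, 106, 107]
```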
