How can I get the max (x) number of values from each column in a pandas dataframe while keeping the index for each?

Question

I'm trying to get the x largest values from each column of a pandas DataFrame. Each column is one date and each row is a different stock ticker (see photo).

Ideally I'd like to see the ticker and value of the top 5 for each date (column).

I have tried a few different ways of iterating over the columns, but none of them worked while keeping the index.

The output I want is a new CSV with each date and its top 5 stock tickers (the index), ranked by their value in that day's column.

import pandas as pd

df = pd.read_csv (see photo)

I haven't been able to get it to turn out right.

Answer 1

Apply pd.Series.nlargest to each column. This keeps the top N values in each column and leaves NaN everywhere else. Then unstack and drop the NaN entries, which gives a Series indexed by (date, ticker). I'll use the top 2 values here for illustration.

Sample Data

import pandas as pd
import numpy as np

np.random.seed(42)
df = pd.DataFrame(np.random.normal(0, 10, (4, 3)), 
                  columns=['Date1', 'Date2', 'Date3'], 
                  index=['Stock1', 'Stock2', 'Stock3', 'Stock4'])
#            Date1     Date2     Date3
#Stock1   4.967142 -1.382643  6.476885
#Stock2  15.230299 -2.341534 -2.341370
#Stock3  15.792128  7.674347 -4.694744
#Stock4   5.425600 -4.634177 -4.657298

Code

df.apply(pd.Series.nlargest, n=2).unstack().dropna()

#Date1  Stock2    15.230299
#       Stock3    15.792128
#Date2  Stock1    -1.382643
#       Stock3     7.674347
#Date3  Stock1     6.476885
#       Stock2    -2.341370
#dtype: float64
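
The question also asks for the result as a CSV. The following is a minimal sketch of one way to do that, not part of the original answer: it rebuilds the sample data, names the two index levels, and writes one row per (date, ticker) pair. The column names 'date', 'ticker' and 'value' and the file name 'top_n_per_date.csv' are assumptions chosen for illustration.

```python
import pandas as pd
import numpy as np

# Same sample data as above
np.random.seed(42)
df = pd.DataFrame(np.random.normal(0, 10, (4, 3)),
                  columns=['Date1', 'Date2', 'Date3'],
                  index=['Stock1', 'Stock2', 'Stock3', 'Stock4'])

# Top 2 values per column as a Series indexed by (date, ticker)
top = df.apply(pd.Series.nlargest, n=2).unstack().dropna()

# Turn the two index levels into ordinary columns.
# 'date', 'ticker' and 'value' are hypothetical names, not from the question.
out = top.rename_axis(['date', 'ticker']).reset_index(name='value')

# Within each date, list the largest value first
out = out.sort_values(['date', 'value'], ascending=[True, False])

# Hypothetical output file name
out.to_csv('top_n_per_date.csv', index=False)
```

For the top 5 per date, as in the question, pass n=5 instead of n=2.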

Answer 1: 0, accepted, 2019-07-02 02:48:13