I have a DataFrame like this:
                  A        B        C        D
2000-01-03 -0.59885  0.18141 -0.68828  0.77572
2000-01-04  0.83935  0.15993  0.95911 -1.12959
2000-01-05  2.80215 -0.10858 -1.62114 -0.20170
2000-01-06  0.71670 -0.26707  1.36029  1.74254
I would like to filter the columns based on the value in the first row. E.g., I want to keep only the columns where the first value is > 0, and the result I expect is this:
                  B        D
2000-01-03  0.18141  0.77572
2000-01-04  0.15993 -1.12959
2000-01-05 -0.10858 -0.20170
2000-01-06 -0.26707  1.74254
Update: Thanks to Jeff's suggestion, I wrote this code:
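def filter_columns(df):  # wrapper name is illustrative; the original snippet only showed the body
    cols = []
    firstRow = df.iloc[0, :]  # .ix has been removed from pandas; .iloc selects by position
    for i in range(len(firstRow)):
        # collect the positions of columns whose first value is positive
        if firstRow.iloc[i] > 0:
            cols.append(i)
    return df.iloc[:, cols].values.copy()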
Is there a more elegant way to do this?
This uses the randomly generated data below, but you can easily apply it to your example. iloc[-2] selects the second-to-last row, and comparing it with < 0 creates a boolean array. loc then takes that boolean array and selects the matching columns.
In [1]: import numpy as np; from pandas import DataFrame, date_range

In [2]: df = DataFrame(np.random.randn(4,4),columns=list('ABCD'),
                       index=date_range('20000103',periods=4))
In [3]: df
Out[3]:
                   A         B         C         D
2000-01-03 -0.132896 -0.151352  0.960943 -0.007701
2000-01-04 -1.653279 -1.101331 -2.083493 -1.920517
2000-01-05 -1.190868  0.983487  0.804209  0.962575
2000-01-06  0.232290  2.152097  0.414457  1.023253
In [6]: df.loc[:,df.iloc[-2]<0]
Out[6]:
                   A
2000-01-03 -0.132896
2000-01-04 -1.653279
2000-01-05 -1.190868
2000-01-06  0.232290
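Applied to the data in the question, the same idea could look like the sketch below. It rebuilds the question's DataFrame by hand (the `mask` and `result` names are just illustrative) and swaps in `iloc[0]` and `> 0`, since the question asks about the first row being positive:

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame(
    {
        "A": [-0.59885, 0.83935, 2.80215, 0.71670],
        "B": [0.18141, 0.15993, -0.10858, -0.26707],
        "C": [-0.68828, 0.95911, -1.62114, 1.36029],
        "D": [0.77572, -1.12959, -0.20170, 1.74254],
    },
    index=pd.date_range("2000-01-03", periods=4),
)

# Boolean Series indexed by column label: True where the first row is > 0.
mask = df.iloc[0] > 0

# Keep every row, but only the columns where the mask is True.
result = df.loc[:, mask]
print(result)
```

This should keep columns B and D, matching the expected result in the question. If you need a plain NumPy array, as in the loop version, `result.to_numpy()` (or `.values`) can be called on the filtered frame.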