I need to check the values of a list of indices on a daily basis, so for easier reading I put them into a DataFrame. I'm using Python 2.7.
First, I collect my results in a list:
index_list = [df1,df2,df3,df4,df5,df6,df7]
value_list = [20,22,28,29,30,31,32,33]
myarray = []
def minimum(dataframe,value):
    return dataframe['Datetime'][(dataframe["IDXType"] == value)].min()
for i in index_list:
    for value_i in value_list:
        myarray.append(minimum(i,value_i))
This produces a list of 56 items (7 DataFrames times 8 values), which I then put into a DataFrame manually:
result = {'df1':pd.Series(myarray[0:8], index=value_list),
'df2':pd.Series(myarray[8:16], index=value_list),
'df3':pd.Series(myarray[16:24], index=value_list),
'df4':pd.Series(myarray[24:32], index=value_list),
'df5':pd.Series(myarray[32:40], index=value_list),
'df6':pd.Series(myarray[40:48], index=value_list),
'df7':pd.Series(myarray[48:56], index=value_list),
}
result = pd.DataFrame(result)
result
It shows an 8×7 DataFrame (8 index values by 7 DataFrames).
Is there a shortcut for this program? For example, can I put the results from the loop directly into a DataFrame?
My list keeps growing, so I can't afford to edit my code every other day.
You can use:
df1 = pd.DataFrame({'Datetime':pd.date_range('2015-01-04','2015-01-08'),
'IDXType':[20,20,33,33,33]})
print (df1)
Datetime IDXType
0 2015-01-04 20
1 2015-01-05 20
2 2015-01-06 33
3 2015-01-07 33
4 2015-01-08 33
df2 = pd.DataFrame({'Datetime':pd.date_range('2015-01-04','2015-01-08'),
'IDXType':[30,30,21,21,10]})
print (df2)
Datetime IDXType
0 2015-01-04 30
1 2015-01-05 30
2 2015-01-06 21
3 2015-01-07 21
4 2015-01-08 10
df3 = pd.DataFrame({'Datetime':pd.date_range('2015-01-04','2015-01-08'),
'IDXType':[20,20,30,31,31]})
print (df3)
Datetime IDXType
0 2015-01-04 20
1 2015-01-05 20
2 2015-01-06 30
3 2015-01-07 31
4 2015-01-08 31
index_list = [df1,df2,df3]
value_list = [20,22,28,29,30,31,32,33]
myarray = []
def minimum(dataframe,value):
    return dataframe.loc[dataframe["IDXType"] == value, 'Datetime'].min()
for i in index_list:
    for value_i in value_list:
        myarray.append(minimum(i,value_i))
#print (myarray)
result = {
'df1':pd.Series(myarray[0:8], index=value_list),
'df2':pd.Series(myarray[8:16], index=value_list),
'df3':pd.Series(myarray[16:24], index=value_list)
}
result = pd.DataFrame(result)
print (result)
df1 df2 df3
20 2015-01-04 NaT 2015-01-04
22 NaT NaT NaT
28 NaT NaT NaT
29 NaT NaT NaT
30 NaT 2015-01-04 2015-01-06
31 NaT NaT 2015-01-07
32 NaT NaT NaT
33 2015-01-06 NaT NaT
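If you would rather keep the loop, the manual slicing of myarray can probably be avoided by building one Series per DataFrame inside a dict comprehension. This is only a minimal sketch: the `frames` mapping from column name to DataFrame is a hypothetical name introduced here, not something from the question, and only two of the sample frames are used.

```python
import pandas as pd

# Sample data in the same shape as df1 and df2 above.
df1 = pd.DataFrame({'Datetime': pd.date_range('2015-01-04', '2015-01-08'),
                    'IDXType': [20, 20, 33, 33, 33]})
df2 = pd.DataFrame({'Datetime': pd.date_range('2015-01-04', '2015-01-08'),
                    'IDXType': [30, 30, 21, 21, 10]})

value_list = [20, 22, 28, 29, 30, 31, 32, 33]

# Hypothetical mapping from column name to DataFrame (an assumption);
# a new DataFrame only needs a new entry here.
frames = {'df1': df1, 'df2': df2}

def minimum(dataframe, value):
    # Earliest Datetime among rows with this IDXType; NaT if there are none.
    return dataframe.loc[dataframe['IDXType'] == value, 'Datetime'].min()

# One Series per DataFrame, built in a comprehension instead of by slicing.
result = pd.DataFrame({
    name: pd.Series([minimum(frame, v) for v in value_list], index=value_list)
    for name, frame in frames.items()
})
print(result)
```

With this layout nothing depends on hard-coded slice positions, so growing value_list or adding a DataFrame does not require touching the rest of the code.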
My solution uses groupby with the min aggregation, then concat and reindex, and finally removes the index name with rename_axis (new in pandas 0.18.0):
print (df1.groupby('IDXType')['Datetime'].min())
IDXType
20 2015-01-04
33 2015-01-06
Name: Datetime, dtype: datetime64[ns]
df = pd.concat([df1.groupby('IDXType')['Datetime'].min(),
df2.groupby('IDXType')['Datetime'].min(),
df3.groupby('IDXType')['Datetime'].min()],
axis=1,
keys=('df1','df2','df3')).reindex(value_list).rename_axis(None)
print (df)
df1 df2 df3
20 2015-01-04 NaT 2015-01-04
22 NaT NaT NaT
28 NaT NaT NaT
29 NaT NaT NaT
30 NaT 2015-01-04 2015-01-06
31 NaT NaT 2015-01-07
32 NaT NaT NaT
33 2015-01-06 NaT NaT
You can also use a more dynamic solution: pass a list comprehension to concat. This needs a separate list of column names for the new DataFrame df5:
index_list = [df1,df2,df3]
value_list = [20,22,28,29,30,31,32,33]
namesdf = ['df1','df2','df3']
df5 = pd.concat([x.groupby('IDXType')['Datetime'].min() for x in index_list],
axis=1,
keys=namesdf).reindex(value_list).rename_axis(None)
print (df5)
df1 df2 df3
20 2015-01-04 NaT 2015-01-04
22 NaT NaT NaT
28 NaT NaT NaT
29 NaT NaT NaT
30 NaT 2015-01-04 2015-01-06
31 NaT NaT 2015-01-07
32 NaT NaT NaT
33 2015-01-06 NaT NaT
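As a possible variation, pd.concat also accepts a mapping, in which case the keys of the mapping are used as the keys argument, so the separate list of names is not needed. The sketch below assumes the DataFrames are kept in a dict called `frames`, which is a hypothetical name introduced here; column order may follow sorted keys rather than insertion order on older pandas versions.

```python
import pandas as pd

# Sample data in the same shape as df1 and df2 above.
df1 = pd.DataFrame({'Datetime': pd.date_range('2015-01-04', '2015-01-08'),
                    'IDXType': [20, 20, 33, 33, 33]})
df2 = pd.DataFrame({'Datetime': pd.date_range('2015-01-04', '2015-01-08'),
                    'IDXType': [30, 30, 21, 21, 10]})

value_list = [20, 22, 28, 29, 30, 31, 32, 33]

# Hypothetical mapping from column name to DataFrame (an assumption).
frames = {'df1': df1, 'df2': df2}

# Earliest Datetime per IDXType for every DataFrame, keyed by its name.
minima = {name: frame.groupby('IDXType')['Datetime'].min()
          for name, frame in frames.items()}

# The dict keys become the column names; reindex keeps only the wanted
# values (in value_list order) and rename_axis(None) drops the index name.
result = pd.concat(minima, axis=1).reindex(value_list).rename_axis(None)
print(result)
```

IDXType values that are not in value_list (21 and 10 in df2) are dropped by reindex, and values that never occur show up as NaT.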