
Write result from loop into dataframe in python

I need to check the values of a list of indexes on a daily basis, and to make the results easier to read I put them into a DataFrame. I'm using Python 2.7.

First, I collect the results in a list:

index_list = [df1,df2,df3,df4,df5,df6,df7] 
value_list = [20,22,28,29,30,31,32,33]
myarray = []

def minimum(dataframe,value):
    return dataframe['Datetime'][(dataframe["IDXType"] == value)].min()

for i in index_list:
    for value_i in value_list:
        myarray.append(minimum(i,value_i))

This produces a list of 56 values (7 DataFrames times 8 values), which I then put into a DataFrame manually:

result = {'df1':pd.Series(myarray[0:8], index=value_list),
  'df2':pd.Series(myarray[8:16], index=value_list),
  'df3':pd.Series(myarray[16:24], index=value_list),
  'df4':pd.Series(myarray[24:32], index=value_list),
  'df5':pd.Series(myarray[32:40], index=value_list),
  'df6':pd.Series(myarray[40:48], index=value_list),
  'df7':pd.Series(myarray[48:56], index=value_list),
  }
result = pd.DataFrame(result)
result

The result is a DataFrame with 8 rows and 7 columns.

Is there a shortcut for this? For example, can I put the results from the loop directly into a DataFrame?

My list keeps growing, so I can't afford to fix my code every other day.

You can use the following. First, some sample data:

import pandas as pd

df1 = pd.DataFrame({'Datetime':pd.date_range('2015-01-04','2015-01-08'),
                    'IDXType':[20,20,33,33,33]})

print (df1)
    Datetime  IDXType
0 2015-01-04       20
1 2015-01-05       20
2 2015-01-06       33
3 2015-01-07       33
4 2015-01-08       33

df2 = pd.DataFrame({'Datetime':pd.date_range('2015-01-04','2015-01-08'),
                   'IDXType':[30,30,21,21,10]})

print (df2)
    Datetime  IDXType
0 2015-01-04       30
1 2015-01-05       30
2 2015-01-06       21
3 2015-01-07       21
4 2015-01-08       10

df3 = pd.DataFrame({'Datetime':pd.date_range('2015-01-04','2015-01-08'),
                   'IDXType':[20,20,30,31,31]})

print (df3)
    Datetime  IDXType
0 2015-01-04       20
1 2015-01-05       20
2 2015-01-06       30
3 2015-01-07       31
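4 2015-01-08       31

With this sample data, the approach from the question gives:

index_list = [df1,df2,df3] 
value_list = [20,22,28,29,30,31,32,33]
myarray = []

# earliest Datetime among rows whose IDXType equals value (NaT if none match)
def minimum(dataframe,value):
    return dataframe.loc[dataframe["IDXType"] == value, 'Datetime'].min()

# one entry per (DataFrame, value) pair, grouped by DataFrame
for i in index_list:
    for value_i in value_list:
        myarray.append(minimum(i,value_i))
#print (myarray)          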

result = {
'df1':pd.Series(myarray[0:8], index=value_list),
'df2':pd.Series(myarray[8:16], index=value_list),
'df3':pd.Series(myarray[16:24],  index=value_list)
}
result = pd.DataFrame(result)
print (result)
          df1        df2        df3
20 2015-01-04        NaT 2015-01-04
22        NaT        NaT        NaT
28        NaT        NaT        NaT
29        NaT        NaT        NaT
30        NaT 2015-01-04 2015-01-06
31        NaT        NaT 2015-01-07
32        NaT        NaT        NaT
33 2015-01-06        NaT        NaT
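
If you want to stay close to the original loop, the manual slicing of myarray can probably be avoided by building one Series per DataFrame inside the loop. This is only a minimal sketch on the sample data above; the names list and the columns dict are assumptions introduced for illustration, not part of the original code.

```python
import pandas as pd

# Sample data, same as above
df1 = pd.DataFrame({'Datetime': pd.date_range('2015-01-04', '2015-01-08'),
                    'IDXType': [20, 20, 33, 33, 33]})
df2 = pd.DataFrame({'Datetime': pd.date_range('2015-01-04', '2015-01-08'),
                    'IDXType': [30, 30, 21, 21, 10]})
df3 = pd.DataFrame({'Datetime': pd.date_range('2015-01-04', '2015-01-08'),
                    'IDXType': [20, 20, 30, 31, 31]})

index_list = [df1, df2, df3]
names = ['df1', 'df2', 'df3']  # hypothetical column labels, one per DataFrame
value_list = [20, 22, 28, 29, 30, 31, 32, 33]

def minimum(dataframe, value):
    # earliest Datetime among rows whose IDXType equals value (NaT if none match)
    return dataframe.loc[dataframe['IDXType'] == value, 'Datetime'].min()

# Build each column as soon as its values are known, so no slice
# boundaries such as myarray[0:8] have to be maintained by hand.
columns = {}
for name, frame in zip(names, index_list):
    columns[name] = pd.Series([minimum(frame, v) for v in value_list],
                              index=value_list)

result = pd.DataFrame(columns)
print(result)
```

Adding another DataFrame then only means appending to index_list and names.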

My solution uses groupby with the min aggregation, then concat and reindex, and finally removes the index name with rename_axis (new in pandas 0.18.0):

print (df1.groupby('IDXType')['Datetime'].min())
IDXType
20   2015-01-04
33   2015-01-06
Name: Datetime, dtype: datetime64[ns]

df = pd.concat([df1.groupby('IDXType')['Datetime'].min(),
                df2.groupby('IDXType')['Datetime'].min(),
                df3.groupby('IDXType')['Datetime'].min()], 
               axis=1, 
               keys=('df1','df2','df3')).reindex(value_list).rename_axis(None)
print (df)       
          df1        df2        df3
20 2015-01-04        NaT 2015-01-04
22        NaT        NaT        NaT
28        NaT        NaT        NaT
29        NaT        NaT        NaT
30        NaT 2015-01-04 2015-01-06
31        NaT        NaT 2015-01-07
32        NaT        NaT        NaT
33 2015-01-06        NaT        NaT
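
To see what the reindex step contributes, here is a small sketch on df2 alone (same sample data as above; the variable names mins and aligned are only for illustration). The groupby result contains only the IDXType values that occur in df2, and reindex then drops the ones not in value_list and adds the missing ones as NaT.

```python
import pandas as pd

value_list = [20, 22, 28, 29, 30, 31, 32, 33]
df2 = pd.DataFrame({'Datetime': pd.date_range('2015-01-04', '2015-01-08'),
                    'IDXType': [30, 30, 21, 21, 10]})

# One row per IDXType value that actually occurs in df2 (10, 21 and 30)
mins = df2.groupby('IDXType')['Datetime'].min()
print(mins)

# Align to value_list: 10 and 21 are dropped, absent values become NaT
aligned = mins.reindex(value_list)
print(aligned)
```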

A more dynamic solution uses a list comprehension inside concat, but it needs an additional list with the column names for the new DataFrame df5:

index_list = [df1,df2,df3] 
value_list = [20,22,28,29,30,31,32,33]
namesdf = ['df1','df2','df3']   
df5 = pd.concat([x.groupby('IDXType')['Datetime'].min() for x in index_list], 
               axis=1, 
               keys=namesdf).reindex(value_list).rename_axis(None)
print (df5)  
          df1        df2        df3
20 2015-01-04        NaT 2015-01-04
22        NaT        NaT        NaT
28        NaT        NaT        NaT
29        NaT        NaT        NaT
30        NaT 2015-01-04 2015-01-06
31        NaT        NaT 2015-01-07
32        NaT        NaT        NaT
33 2015-01-06        NaT        NaT
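
Since the list of DataFrames keeps growing, one possible variation is to keep each name next to its DataFrame in a dict, so the two lists cannot get out of sync. This is a sketch under that assumption; the frames dict and the first_dates helper are hypothetical names, not something from the question.

```python
import pandas as pd

# Hypothetical container: column label -> DataFrame (sample data from above)
frames = {
    'df1': pd.DataFrame({'Datetime': pd.date_range('2015-01-04', '2015-01-08'),
                         'IDXType': [20, 20, 33, 33, 33]}),
    'df2': pd.DataFrame({'Datetime': pd.date_range('2015-01-04', '2015-01-08'),
                         'IDXType': [30, 30, 21, 21, 10]}),
    'df3': pd.DataFrame({'Datetime': pd.date_range('2015-01-04', '2015-01-08'),
                         'IDXType': [20, 20, 30, 31, 31]}),
}
value_list = [20, 22, 28, 29, 30, 31, 32, 33]

def first_dates(frames, value_list):
    """Earliest Datetime per IDXType for every DataFrame, one column each."""
    names = list(frames)
    mins = [frames[n].groupby('IDXType')['Datetime'].min() for n in names]
    # keys are passed explicitly so the column labels follow the order of names
    return (pd.concat(mins, axis=1, keys=names)
              .reindex(value_list)
              .rename_axis(None))

summary = first_dates(frames, value_list)
print(summary)
```

A new DataFrame is then added with a single line such as frames['df4'] = df4, and nothing else has to change.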
