Python: apply a function to a list and return a data frame

I am new to Python. I wrote a function that returns a pandas data frame. I am trying to apply this function to each element of a list, and I would like to merge all the results into one data frame. For example, if my function looks like:

def test(x):
    return pd.DataFrame({'a':[x],'b':['test']})

I want to apply it to the list [1,2,3,4,5] and get a single data frame that looks like:

a b
1 test
2 test
3 test
4 test
5 test

If I do [test(x) for x in [1,2,3,4,5]], it returns a weird list. Could anyone help me with this, please? Thanks!

PS: the function I am actually using:

def cumRet(startDate,endDate=None,symbols=None,inDir=None):

    if endDate is None:
        endDate=startDate
    if inDir is None:
        inDir='E:\\python\\data\\mktData\\'

    dates=dateRange(startDate,endDate)

    if symbols is None:
        adjClose=pd.merge(mktData_R(dates.iloc[0].strftime('%Y-%m-%d'),var=['adjClose'])
                         ,mktData_R(dates.iloc[-1].strftime('%Y-%m-%d'),var=['adjClose'])
                         ,on='symbol'
                         ,how='outer')

    else:
        adjClose=pd.merge(mktData_R(dates.iloc[0].strftime('%Y-%m-%d'),symbols=symbols,var=['adjClose'])
                         ,mktData_R(dates.iloc[-1].strftime('%Y-%m-%d'),symbols=symbols,var=['adjClose'])
                         ,on='symbol'
                         ,how='outer')

    # Replace missing prices with 1 (fillna avoids chained assignment)
    adjClose['adjClose_x']=adjClose['adjClose_x'].fillna(1)
    adjClose['adjClose_y']=adjClose['adjClose_y'].fillna(1)
    adjClose['cumRet']=adjClose['adjClose_y']/adjClose['adjClose_x']-1

    return adjClose[['symbol','cumRet']]

Calling test on the whole list t inside the comprehension produces this:

In [49]:

t = [1,2,3,4,5]

def test(x):
    return pd.DataFrame({'a':[x],'b':['test']})

[test(t) for x in [1,2,3,4,5]]
Out[49]:
[                 a     b
 0  [1, 2, 3, 4, 5]  test,                  a     b
 0  [1, 2, 3, 4, 5]  test,                  a     b
 0  [1, 2, 3, 4, 5]  test,                  a     b
 0  [1, 2, 3, 4, 5]  test,                  a     b
 0  [1, 2, 3, 4, 5]  test]

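This is not what you intended. The list comprehension loops over each element and builds a list of five DataFrames, and because test(t) receives the whole list on every iteration, each of them holds the entire list as a single value in column a. With test(x), as in your question, you would still get a list of five one-row DataFrames rather than a single DataFrame.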

You can avoid all this by passing the list directly to the DataFrame constructor. The values need to be list-like, but since your argument is already a list, you don't need to wrap it in another list. For the b column, the number of values has to match the length of the a column, so repeat the value once per list element:

In [4]:

t = [1,2,3,4,5]

def test(x):
    return pd.DataFrame({'a':x,'b':['test'] * len(x)})

test(t)
Out[4]:
   a     b
0  1  test
1  2  test
2  3  test
3  4  test
4  5  test

In your approach you are creating five DataFrames, not one.

You can do this without creating a list of 'test' strings the same length as your list (as @EdChum suggested), because pandas broadcasts a scalar value to every row:

l = [1,2,3,4,5]

def test(x):
    return pd.DataFrame({'a':x, 'b':'test'})

test(l)

   a     b
0  1  test
1  2  test
2  3  test
3  4  test
4  5  test
